How Go’s New Green Tea GC Slashes CPU Overhead by Up to 40%

The article examines Go 1.25’s experimental Green Tea garbage collector, explains why the traditional mark‑sweep approach hurts modern CPUs, details the page‑oriented redesign and its AVX‑512 vector acceleration, and shows how these changes can cut GC‑related CPU usage by 10‑40%.

BirdNest Tech Talk

Introduction: The Hidden Cost of Convenience

Go programs can spend 20% or more of their CPU time in garbage collection (GC), the price paid for automatic memory management. What if a simple idea could eliminate a large share of that overhead?

1. Traditional Approach: A "Micro‑architectural Disaster"

The classic Go GC uses a mark‑sweep algorithm that performs a "graph flood" starting from root objects (e.g., globals) and marking every reachable object. This requires the CPU to jump around memory to follow pointers, leading to severe cache misses because the algorithm does not ensure that mutually referenced objects reside near each other in memory.
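To make the access pattern concrete, here is a minimal sketch of a worklist-driven graph flood. It is not the Go runtime's code: the `Object` type and `markFromRoots` function are hypothetical names invented for illustration, and a real collector works on raw memory rather than on a Go struct graph.

```go
package main

import "fmt"

// Object is a hypothetical heap object: a mark bit plus outgoing pointers.
type Object struct {
	Name   string
	Marked bool
	Refs   []*Object
}

// markFromRoots performs a worklist-driven graph flood and returns the
// names of the objects in the order they were marked.
func markFromRoots(roots []*Object) []string {
	var order []string
	work := append([]*Object(nil), roots...)
	for len(work) > 0 {
		// Pop the most recently discovered object.
		obj := work[len(work)-1]
		work = work[:len(work)-1]
		if obj == nil || obj.Marked {
			continue
		}
		obj.Marked = true
		order = append(order, obj.Name)
		// Each pointer may lead anywhere in the heap, so the next
		// memory access is hard for the cache to anticipate.
		work = append(work, obj.Refs...)
	}
	return order
}

func main() {
	d := &Object{Name: "d"}
	c := &Object{Name: "c", Refs: []*Object{d}}
	b := &Object{Name: "b", Refs: []*Object{d}}
	a := &Object{Name: "a", Refs: []*Object{b, c}}
	garbage := &Object{Name: "garbage"}

	fmt.Println(markFromRoots([]*Object{a})) // [a c d b]
	fmt.Println(garbage.Marked)              // false
}
```

The visiting order depends only on how pointers happen to be wired, not on where the objects sit in memory, which is why neighbouring objects are rarely processed together.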

Modern CPUs depend on predictable memory access to keep their caches effective, and hardware trends such as NUMA architectures and shrinking per-core memory bandwidth make unpredictable access even more expensive. As a result, this pointer-chasing traversal becomes increasingly inefficient.

"The graph‑flood algorithm is like driving through a city street maze; the CPU cannot anticipate the next turn, so even a fast engine cannot accelerate."

2. A Surprisingly Simple Solution: "Page‑Centric, Not Object‑Centric"

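Green Tea's core idea is to work at the page level rather than the object level. In practice, the GC's work list holds whole memory pages instead of individual objects, and each page is scanned in one pass. Objects still have to be marked, but that bookkeeping stays local to the page they live on.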

Returning to the driving metaphor, this is like leaving the congested city streets for a highway: the scanner now performs fewer, longer left‑to‑right sweeps that align with CPU cache lines. Objects that are physically close are more likely to be scanned together, dramatically improving cache utilization and keeping page metadata in cache.

According to the Go team, this redesign is expected to reduce GC-related CPU cost by 10% to 40% in workloads that make heavy use of the garbage collector.
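The page-level idea can be sketched as a toy model. Everything here is an assumption made for illustration: the `Heap`, `Page`, and `Ref` types, the 64-slot page size, and the per-page `Seen`/`Scanned` bitmaps are hypothetical and far simpler than the real runtime. The point is only that the work list holds pages, and that all pending objects on a page are scanned together in ascending slot order.

```go
package main

import (
	"fmt"
	"math/bits"
)

const slotsPerPage = 64

// Ref names an object by page index and slot index (hypothetical model).
type Ref struct{ Page, Slot int }

// Page holds up to 64 equally sized objects plus page-local metadata.
type Page struct {
	Seen    uint64 // objects known to be reachable
	Scanned uint64 // objects whose pointers were already followed
	Queued  bool   // page is currently on the work list
	Refs    [slotsPerPage][]Ref
}

// Heap is a set of pages plus a work list of page indices, not objects.
type Heap struct {
	Pages []Page
	work  []int
}

// mark records that an object is reachable and queues its page at most once.
func (h *Heap) mark(r Ref) {
	p := &h.Pages[r.Page]
	bit := uint64(1) << uint(r.Slot)
	if p.Seen&bit != 0 {
		return
	}
	p.Seen |= bit
	if !p.Queued {
		p.Queued = true
		h.work = append(h.work, r.Page)
	}
}

// Collect marks everything reachable from roots and returns the number
// of page visits that were needed.
func (h *Heap) Collect(roots []Ref) (pageVisits int) {
	for _, r := range roots {
		h.mark(r)
	}
	for len(h.work) > 0 {
		idx := h.work[0]
		h.work = h.work[1:]
		p := &h.Pages[idx]
		p.Queued = false
		pageVisits++
		// Scan every object seen since the last visit, lowest slot first.
		pending := p.Seen &^ p.Scanned
		p.Scanned |= pending
		for pending != 0 {
			slot := bits.TrailingZeros64(pending)
			pending &= pending - 1 // clear the lowest set bit
			for _, ref := range p.Refs[slot] {
				h.mark(ref)
			}
		}
	}
	return pageVisits
}

func main() {
	h := &Heap{Pages: make([]Page, 2)}
	h.Pages[0].Refs[0] = []Ref{{1, 3}, {1, 5}}
	h.Pages[1].Refs[3] = []Ref{{0, 7}}
	h.Pages[1].Refs[5] = []Ref{{0, 9}}

	visits := h.Collect([]Ref{{0, 0}})
	fmt.Println(visits, bits.OnesCount64(h.Pages[0].Seen), bits.OnesCount64(h.Pages[1].Seen))
	// Prints: 3 3 2
}
```

In this run five objects are marked with three page visits, because objects discovered while a page waits on the work list accumulate and are handled in the same pass.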

3. Unlocking the Hardware "Swiss‑Army Knife"

The page‑oriented method also enables the GC to exploit modern vector hardware, which was impossible with the irregular pattern of the old graph traversal.

Specifically, it leverages AVX‑512 instructions on modern x86 CPUs.

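One key instruction is VGF2P8AFFINEQB, part of the x86 Galois Field New Instructions (GFNI) extension. It applies an affine transformation over GF(2) to every byte of a vector, which is enough to express many bit permutations in a single instruction, hence its description as the "Swiss-army knife of bit operations". Green Tea uses it to accelerate a critical step in the scanning process.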

These vector enhancements are expected to shave an additional ~10% off GC CPU time.
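To give a feel for what such an instruction computes, the sketch below emulates the affine transform for a single byte in plain Go. It is a scalar illustration only, written from the bit ordering in Intel's published pseudocode as I understand it, and it says nothing about how the Go runtime actually uses the instruction. The function name `affineByte` and both matrix constants are introduced here for the example.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	// identityMatrix leaves every byte unchanged.
	identityMatrix uint64 = 0x0102040810204080
	// reverseMatrix reverses the bit order within every byte.
	reverseMatrix uint64 = 0x8040201008040201
)

// affineByte emulates the per-byte affine transform over GF(2):
// output bit i is the parity of (matrix byte 7-i AND x), XORed with
// bit i of imm. The hardware instruction does this for every byte
// of a vector register at once.
func affineByte(matrix uint64, x, imm uint8) uint8 {
	var out uint8
	for i := 0; i < 8; i++ {
		row := uint8(matrix >> (8 * uint(7-i)))
		parity := uint8(bits.OnesCount8(row&x) & 1)
		immBit := (imm >> uint(i)) & 1
		out |= (parity ^ immBit) << uint(i)
	}
	return out
}

func main() {
	x := uint8(0xB4)
	fmt.Println(affineByte(identityMatrix, x, 0) == x)               // true
	fmt.Println(affineByte(reverseMatrix, x, 0) == bits.Reverse8(x)) // true
}
```

Changing only the 64-bit matrix constant changes which bit shuffle is performed, which is what makes a single instruction so versatile.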

4. A Seven‑Year Journey of a "Simple Idea"

Green Tea did not spring from a single flash of genius; it is the culmination of collective work dating back to 2018. The name originates from 2024 when Austin Clements, while hopping between cafés in Japan and drinking large amounts of matcha, prototyped the core mechanism.

This prolonged collaborative effort allowed the team to explore a complex design space and ultimately prove that the simple page‑centric concept is viable.

5. Try It Now: Enabling Green Tea

Green Tea is currently an experimental feature in Go 1.25. Developers can enable it by setting a single environment variable at build time:

GOEXPERIMENT=greenteagc

The Go team plans to make Green Tea the default GC in Go 1.26, including the vector acceleration discussed above. Once it is the default, developers can still opt out with:

GOEXPERIMENT=nogreenteagc
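One rough way to compare the two collectors is to build the same program with and without the experiment and read the runtime's own GC statistics. The sketch below uses `runtime.ReadMemStats` and its `GCCPUFraction` field. The allocation pattern, the `node` type, and the `churn` and `gcReport` helpers are hypothetical, and a synthetic loop like this is no substitute for measuring a real service.

```go
package main

import (
	"fmt"
	"runtime"
)

// Build and run twice to compare, for example with Go 1.25:
//
//	go build -o gc-default . && ./gc-default
//	GOEXPERIMENT=greenteagc go build -o gc-greentea . && ./gc-greentea

// node is a small pointer-carrying object, the kind the GC must trace.
type node struct {
	next *node
	pad  [48]byte
}

// sink keeps the most recent list alive so the allocations are not optimized away.
var sink *node

// churn repeatedly builds linked lists, turning the previous list into garbage.
func churn(rounds, length int) {
	for r := 0; r < rounds; r++ {
		var head *node
		for i := 0; i < length; i++ {
			head = &node{next: head}
		}
		sink = head
	}
}

// gcReport returns the number of completed GC cycles and the fraction
// of available CPU time the GC has used since the program started.
func gcReport() (cycles uint32, cpuFraction float64) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC, m.GCCPUFraction
}

func main() {
	churn(50, 200_000)
	cycles, frac := gcReport()
	fmt.Printf("GC cycles: %d, GC CPU fraction: %.2f%%\n", cycles, frac*100)
}
```

Numbers from a single short run are noisy, so it is worth repeating the measurement several times before drawing conclusions.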

Conclusion: A New Way of Thinking

The breakthrough of Green Tea lies in shifting the perspective from object‑centric to page‑centric garbage collection. By aligning the software algorithm with the actual behavior of modern hardware, this modest conceptual change unlocks substantial performance gains.

Written by

BirdNest Tech Talk

Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
