Understanding CMS and G1 Garbage Collection: Strategies, STW, and Performance Trade‑offs
This article explains the inner workings of Java's CMS and G1 garbage collectors. It details their four‑phase processes, the need for stop‑the‑world (STW) pauses, and strategies such as incremental update and SATB for handling missed marks, then compares the collectors' advantages, drawbacks, and suitable replacement scenarios.
Garbage Collection
CMS Garbage Collection
The CMS (Concurrent Mark Sweep) collector aims for minimal pause times and was the first truly concurrent collector in the HotSpot JVM. A collection cycle consists of four phases:
Initial Mark (STW) – pause all application threads and quickly mark objects directly reachable from GC roots.
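Concurrent Mark – traverses the object graph while application threads keep running. A write barrier records references stored from black objects to white objects so they can be re‑marked later. Because the graph changes during the traversal, this phase can leave floating garbage and, without that record, would miss live objects.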
Remark (STW) – uses an incremental update algorithm to re‑mark objects that were missed during concurrent marking.
Concurrent Sweep – reclaims memory occupied by white (unreachable) objects.
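Issues with CMS include over‑marking (floating garbage, which survives until the next cycle) and the missed‑mark problem that the remark phase has to correct. Because CMS sweeps without compacting, it also fragments the old generation over time.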
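The marking phases above can be sketched with a toy three‑color marker. This is a simplified, single‑threaded illustration and not how HotSpot implements CMS; the class names `Node` and `TriColorDemo` are hypothetical. It shows the gray worklist being drained and how an object that becomes unreachable after it was marked black turns into floating garbage.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy heap object; not a JVM type.
class Node {
    enum Color { WHITE, GRAY, BLACK }
    final String name;
    Color color = Color.WHITE;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

class TriColorDemo {
    // Initial mark: shade only the objects directly referenced by GC roots.
    static Deque<Node> initialMark(List<Node> directlyReachable) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node n : directlyReachable) {
            n.color = Node.Color.GRAY;
            gray.add(n);
        }
        return gray;
    }

    // Mark: drain the gray worklist; a scanned object becomes black.
    static void mark(Deque<Node> gray) {
        while (!gray.isEmpty()) {
            Node n = gray.poll();
            for (Node child : n.refs) {
                if (child.color == Node.Color.WHITE) {
                    child.color = Node.Color.GRAY;
                    gray.add(child);
                }
            }
            n.color = Node.Color.BLACK;
        }
    }

    // Sweep: everything still white is considered unreachable.
    static List<String> sweep(List<Node> heap) {
        List<String> reclaimed = new ArrayList<>();
        for (Node n : heap) {
            if (n.color == Node.Color.WHITE) reclaimed.add(n.name);
        }
        return reclaimed;
    }

    static List<String> run() {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C"), d = new Node("D");
        a.refs.add(b);
        b.refs.add(c);              // D is referenced by nobody
        List<Node> heap = List.of(a, b, c, d);

        mark(initialMark(List.of(a)));

        // The application drops B -> C after C was already marked black.
        // C is now unreachable but stays black: floating garbage that is
        // only reclaimed in the next cycle.
        b.refs.clear();

        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [D]
    }
}
```

Only `D` is swept in this run; `C` survives even though nothing references it any more, which is the over‑marking cost of marking concurrently.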
G1 Garbage Collection
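G1 collects both the young and old generations and has been the default collector since JDK 9. Viewed across the whole heap it behaves like a mark‑compact collector, while between regions it works by copying live objects. Its four phases mirror those of CMS: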
Initial Mark (STW) – quickly marks objects reachable from GC roots.
Concurrent Mark – performs reachability analysis while the application runs; uses SATB (snapshot‑at‑the‑beginning) to record overwritten references and avoid missed marks.
Final Mark (STW) – processes SATB records, turning disconnected white objects gray for a final scan.
Live Data Counting and Evacuation (STW) – updates region statistics, selects a collection set (CSet) based on desired pause time, then copies live objects to free regions.
G1’s advantages: predictable pause‑time model, whole‑heap collection without fragmentation, and automatic young‑generation sizing. Drawback: higher memory overhead (≈10‑20 % of heap).
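The collection‑set choice described above can be sketched as a greedy selection under a pause budget. This is a minimal sketch under stated assumptions, not G1's actual policy code: `RegionInfo`, `CSetChooser`, and the cost figures are hypothetical, and the real collector uses a more elaborate prediction model (the budget corresponds to the goal set with `-XX:MaxGCPauseMillis`).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical per-region summary; field names are illustrative, not HotSpot's.
class RegionInfo {
    final int id;
    final long garbageBytes;       // space that evacuating the region would free
    final double predictedCopyMs;  // estimated time to copy its live objects

    RegionInfo(int id, long garbageBytes, double predictedCopyMs) {
        this.id = id;
        this.garbageBytes = garbageBytes;
        this.predictedCopyMs = predictedCopyMs;
    }

    // Bytes reclaimed per millisecond of pause: higher is better.
    double efficiency() {
        return garbageBytes / predictedCopyMs;
    }
}

class CSetChooser {
    // Pick the most efficient regions until the predicted pause would exceed the budget.
    static List<Integer> choose(List<RegionInfo> candidates, double pauseBudgetMs) {
        List<RegionInfo> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble(RegionInfo::efficiency).reversed());

        List<Integer> cset = new ArrayList<>();
        double predictedPauseMs = 0.0;
        for (RegionInfo r : sorted) {
            if (predictedPauseMs + r.predictedCopyMs > pauseBudgetMs) break;
            predictedPauseMs += r.predictedCopyMs;
            cset.add(r.id);
        }
        return cset;
    }

    static List<Integer> demo() {
        List<RegionInfo> regions = List.of(
            new RegionInfo(0, 900_000, 10.0),   //  90,000 bytes/ms
            new RegionInfo(1, 400_000, 2.0),    // 200,000 bytes/ms
            new RegionInfo(2, 100_000, 5.0),    //  20,000 bytes/ms
            new RegionInfo(3, 1_000_000, 4.0)); // 250,000 bytes/ms
        return choose(regions, 8.0);            // 8 ms pause budget
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [3, 1]
    }
}
```

Region 0 holds a lot of garbage but is too expensive to evacuate within the 8 ms budget, so it is left for a later collection. Trading reclaimed space against pause time in this way is what makes the pause model predictable.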
Strategies for Three‑Color Marking Missed‑Mark Problems
Incremental Update – during concurrent marking, record references from black to white objects; during remark, pause the world, turn those black objects gray, and rescan.
SATB – during concurrent marking, whenever a reference is overwritten or deleted, record the object it used to point to; in final marking (STW), treat those recorded objects as gray and scan them again.
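Golang Approach – Go used an insertion write barrier in earlier releases and, since Go 1.8, a mixed (hybrid) write barrier that combines insertion and deletion barriers. The barrier shades the affected objects gray so that no reachable object stays white, and newly allocated objects are never left white.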
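The first two strategies can be compared on the classic missed‑mark scenario: a black object gains a reference to a white object while the only gray path to that white object is removed. The following is a toy, single‑threaded sketch; `Obj`, `BarrierDemo`, and the per‑object log are hypothetical simplifications (CMS, for instance, records dirty cards rather than individual objects).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy object with a single reference field.
class Obj {
    enum Color { WHITE, GRAY, BLACK }
    final String name;
    Color color = Color.WHITE;
    Obj field;
    Obj(String name) { this.name = name; }
}

class BarrierDemo {
    enum Mode { NONE, INCREMENTAL_UPDATE, SATB }

    final Mode mode;
    final Deque<Obj> gray = new ArrayDeque<>();
    final Deque<Obj> log = new ArrayDeque<>(); // barrier records, replayed in the STW phase

    BarrierDemo(Mode mode) { this.mode = mode; }

    // Write barrier: every reference store by the application goes through here.
    void write(Obj holder, Obj newValue) {
        if (mode == Mode.INCREMENTAL_UPDATE
                && holder.color == Obj.Color.BLACK
                && newValue != null && newValue.color == Obj.Color.WHITE) {
            log.add(holder);        // remember the black object that gained a white referent
        }
        if (mode == Mode.SATB && holder.field != null) {
            log.add(holder.field);  // remember the referent that is about to be lost
        }
        holder.field = newValue;
    }

    void shade(Obj o) {
        if (o != null && o.color == Obj.Color.WHITE) {
            o.color = Obj.Color.GRAY;
            gray.add(o);
        }
    }

    void drain() {
        while (!gray.isEmpty()) {
            Obj o = gray.poll();
            shade(o.field);
            o.color = Obj.Color.BLACK;
        }
    }

    // STW remark (incremental update) or final mark (SATB).
    void remark() {
        for (Obj o : log) {
            if (mode == Mode.INCREMENTAL_UPDATE) {
                o.color = Obj.Color.GRAY; // black -> gray, so it is rescanned
                gray.add(o);
            } else {
                shade(o);                 // old referent is treated as gray
            }
        }
        log.clear();
        drain();
    }

    static boolean cSurvives(Mode mode) {
        BarrierDemo gc = new BarrierDemo(mode);
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C");
        b.field = c;
        // Marking is halfway: A scanned (black), B queued (gray), C unseen (white).
        a.color = Obj.Color.BLACK;
        b.color = Obj.Color.GRAY;
        gc.gray.add(b);

        gc.write(a, c);     // black A now points to white C
        gc.write(b, null);  // the only gray path to C disappears

        gc.drain();         // concurrent marking finishes
        gc.remark();        // STW phase replays the barrier log
        return c.color != Obj.Color.WHITE;
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values()) {
            System.out.println(m + " -> C survives: " + cSurvives(m));
        }
        // NONE -> C survives: false
        // INCREMENTAL_UPDATE -> C survives: true
        // SATB -> C survives: true
    }
}
```

Without a barrier, the live object `C` stays white and would be reclaimed. Incremental update repairs this by rescanning the modified black object, while SATB repairs it by keeping alive whatever was reachable when marking began.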
Why STW Is Required in Certain Phases
Initial marking must be STW because without a pause the set of root references would change continuously, making the marking boundary undefined. The “live data counting and evacuation” phase also needs STW to safely compute region statistics and copy live objects without interference.
Handling New Objects During Concurrent Marking
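An object allocated during concurrent marking may be directly reachable from GC roots, referenced by a black object, or referenced by a white or gray object. If it were left white and the marker never reached it, it would be reclaimed while still in use. Most collectors (e.g., Go 1.8, CMS, G1) therefore treat objects allocated during marking as live for the current cycle, typically by allocating them already marked rather than leaving them white. This avoids missed‑mark bugs at the cost of possible over‑marking, since a new object that dies quickly still survives until the next cycle.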
G1 Algorithm Summary and Replacement Scenarios
G1 employs a mark‑compact algorithm, implemented by copying between regions, and provides Young GC and Mixed GC. When Mixed GC cannot reclaim space fast enough, G1 falls back to a Full GC, which was single‑threaded (Serial Old style) until JDK 10 made it parallel. G1 replaces CMS in high‑concurrency workloads because it automatically adjusts the young‑generation size and offers predictable pauses, which reduces how often a Full GC occurs.
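When evaluating a switch from CMS to G1, it helps to confirm which collector the JVM is actually running and how often each collector fires. The sketch below uses the standard `java.lang.management` API; the class name `GcInspector` is hypothetical. Under G1 the reported names typically include "G1 Young Generation" and "G1 Old Generation", though the exact names depend on the JDK version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcInspector {
    // Names of the collectors registered in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Collection count and accumulated collection time per collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running the same workload once with `-XX:+UseG1GC` and once with the previous collector, then comparing the old‑generation counts and times, gives a rough view of whether Full GC pressure actually dropped.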
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!