What Are the 7 Core Garbage Collection Algorithms and When to Use Them?
This article explains why garbage collection (GC) is essential, defines key GC terminology, compares performance metrics, and provides a detailed overview of seven common GC algorithms—including mark‑sweep, reference counting, copying, mark‑compact, conservative, generational, and incremental GC—along with their advantages, disadvantages, and typical improvements.
Why GC Exists
GC (Garbage Collection) automatically reclaims memory that programs no longer need, eliminating manual memory management errors such as leaks, dangling pointers, and invalid frees.
Basic GC Terminology
Objects are memory blocks with a header and fields. Roots are entry points for object graphs. Active objects are reachable from roots; inactive objects become garbage.
GC Performance Metrics
Throughput – ratio of user code time to total time.
Maximum pause time – longest pause caused by GC.
Heap utilization – amount of heap used per unit time.
Locality – ability to keep related objects close in memory.
1. Mark‑Sweep Algorithm
Consists of a marking phase (traverse and mark reachable objects) and a sweeping phase (reclaim unmarked objects). Advantages: simple, easy to combine with other algorithms. Disadvantages: fragmentation, slower allocation, incompatible with copy‑on‑write.
Improvements
Multiple free lists to speed allocation.
BiBOP (Big Bag Of Pages) to reduce fragmentation.
2. Reference Counting
Each object tracks the number of references to it. When the count drops to zero, the object is reclaimed immediately.
Advantages
Immediate reclamation.
Short pause times.
No need to traverse the object graph.
Disadvantages
Overhead of updating counters.
Requires extra bits per object.
Cannot reclaim cyclic references.
Improvements
Deferred reference counting (ZCT) to batch reclamation.
Sticky reference counting to limit counter size.
3. Copying (Copy‑On‑Collect) Algorithm
Divides the heap into two equal semispaces (From and To). Live objects are copied from From to To; then the spaces are swapped.
Advantages
High throughput.
Fast allocation (bump pointer).
No fragmentation.
Improves cache locality.
Disadvantages
Uses only half of the heap.
Incompatible with conservative GC.
Recursive copying can cause stack overflow.
Improvements
Cheney’s algorithm – breadth‑first copying using scan and free pointers.
Multi‑space copying – uses several semispaces to increase usable heap.
4. Mark‑Compact Algorithm
Combines marking with compaction: after marking live objects, they are moved to eliminate gaps.
Advantages
Full heap utilization.
No fragmentation.
Disadvantages
Three full‑heap scans increase cost.
Lower throughput than pure copying.
Improvements
Two‑finger algorithm – moves objects with two pointers, requiring equal‑size objects.
5. Conservative GC
Treats any value that looks like a pointer as a live reference, avoiding false reclamation.
Advantages
Easy to integrate into languages without precise type information.
Disadvantages
May retain unreachable objects, increasing heap pressure.
Cannot move objects, limiting algorithm choices.
Improvements
Exact GC – requires language support to distinguish pointers.
Indirect references (handles) – enable object movement.
MostlyCopyingGC – moves only objects that are safely identifiable.
6. Generational GC
Exploits the observation that most objects die young. The heap is split into young (nursery) and old generations. Minor GC collects the young generation frequently, while major GC collects the old generation less often.
Advantages
Reduces overall GC time (often 1/4 of full‑heap GC).
Disadvantages
May suffer if many objects survive long.
Requires write barriers to track old‑to‑young references.
Improvements
Multi‑generational GC – adds more generations to reduce promotion pressure.
7. Incremental (Tri‑Color) GC
Interleaves GC work with program execution using three colors (white, gray, black). Write barriers ensure that black objects never point to white objects.
Advantages
Limits maximum pause time, suitable for latency‑sensitive applications.
Disadvantages
Additional overhead from barriers.
Improvements
Steele’s write barrier – marks the source object gray when it points to a white/gray target.
Deletion barrier – marks deleted objects gray.
Summary
Memory is a contiguous space where objects are allocated and reclaimed. No single GC algorithm is perfect; each has trade‑offs. Understanding the seven core algorithms helps engineers choose the most appropriate strategy for their workload, balancing throughput, pause time, fragmentation, and implementation complexity.
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.