Fundamentals of Java Garbage Collection and JVM GC Algorithms
This article explains Java's automatic garbage collection, describes common GC algorithms such as reference counting, mark‑sweep, copying, mark‑compact, incremental and generational collection, classifies JVM collectors, compares their performance impact, and summarizes key JVM GC parameters.
Garbage Collection Basics
Java provides automatic garbage collection, relieving developers from manual memory management, but it also adds overhead to the system.
Unlike C++, where developers must explicitly allocate and free memory, Java's GC handles object reclamation, avoiding memory leaks that can crash programs.
Common GC Algorithms
Reference Counting
Reference counting increments a counter for each reference to an object and decrements it when references disappear; when the count reaches zero, the object can be reclaimed. It cannot handle cyclic references, so Java does not use it.
Example of a cyclic reference: objects A and B reference each other, making their counters non‑zero even though no external references exist, leading to a leak.
Mark‑Sweep
Mark‑Sweep first marks all reachable objects from root references, then sweeps (deletes) unmarked objects. It can cause memory fragmentation because reclaimed space may be non‑contiguous.
Copying
The heap is split into two equal regions; live objects are copied from the active region to the free region during GC, after which the roles of the regions are swapped. This yields compact memory but halves usable heap size.
Java's young‑generation serial collector uses this idea with Eden, From, and To (survivor) spaces.
Mark‑Compact
Mark‑Compact combines marking with moving all live objects to one end of the heap, eliminating fragmentation without needing a second region.
Incremental Collecting
Incremental GC interleaves short GC pauses with application execution to reduce pause times, at the cost of higher overall overhead.
Generational Collecting
Generational GC divides the heap into young and old generations, applying different algorithms (e.g., copying for young, mark‑compact for old) based on object age to improve efficiency.
GC Types and Their Characteristics
GCs can be classified by thread count (serial vs parallel), work mode (concurrent vs stop‑the‑world), fragmentation handling (compact vs non‑compact), and memory region (young vs old).
Key performance metrics include throughput, GC load, pause time, frequency, response time, and heap allocation strategy.
JVM Garbage Collector Classification
Serial Young‑Generation Collector
Uses a single thread and stop‑the‑world pauses; employs the copying algorithm. Example log:
[GC [DefNew: 3468K->150K(9216K), 0.0028638 secs][Tenured: 1562K->1712K(10240K), 0.0084220 secs] 3468K->1712K(19456K), [Perm : 377K->377K(12288K)], 0.0113816 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]Serial Old‑Generation Collector
Uses the mark‑compact algorithm; can cause long pauses for large heaps. Example log:
Heap
def new generation total 4928K, used 1373K [0x27010000, 0x27560000, 0x2c560000)
eden space 4416K, 31% used [0x27010000,0x27167628,0x27460000)
from space 512K, 0% used [0x27460000,0x27460000,0x274e0000)
to space 512K, 0% used [0x274e0000,0x274e0000,0x27560000)
... (truncated for brevity)Parallel Collector
Parallelizes GC work across multiple threads, reducing pause time on multi‑core CPUs but may increase overall overhead.
CMS Collector
Concurrent Mark‑Sweep focuses on minimizing pause times by performing most work concurrently with application threads, using the mark‑sweep algorithm.
G1 Collector
Garbage‑First uses a region‑based approach with concurrent marking and compaction, aiming for predictable pause times and high throughput.
Impact of Collectors on System Performance
A benchmark allocating 512 × 100 bytes in a loop shows that different collectors and thread counts significantly affect total execution time.
Performance Test Code
import java.util.HashMap;
public class GCTimeTest {
static HashMap map = new HashMap();
public static void main(String[] args){
long begintime = System.currentTimeMillis();
for(int i=0;i<10000;i++){
if(map.size()*512/1024/1024>=400){
map.clear();
System.out.println("clean map");
}
byte[] b1;
for(int j=0;j<100;j++){
b1 = new byte[512];
map.put(System.nanoTime(), b1);
}
}
long endtime = System.currentTimeMillis();
System.out.println(endtime-begintime);
}
}Results with various GC options (e.g., -XX:+UseParNewGC, -XX:+UseParallelOldGC, -XX:+UseSerialGC, -XX:+UseConcMarkSweepGC) illustrate the trade‑offs between pause time and throughput.
Summary of GC‑Related JVM Parameters
Parameters for serial, parallel, CMS, and G1 collectors are listed, covering options such as -XX:+UseSerialGC, -XX:ParallelGCThreads, -XX:+UseG1GC, -XX:MaxGCPauseMills, -XX:GCTimeRatio, -XX:+UseAdaptiveSizePolicy, and others that control heap sizing, pause targets, and collection behavior.
Art of Distributed System Architecture Design
Introductions to large-scale distributed system architectures; insights and knowledge sharing on large-scale internet system architecture; front-end web architecture overviews; practical tips and experiences with PHP, JavaScript, Erlang, C/C++ and other languages in large-scale internet system development.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.