Understanding the JVM Memory Model, Garbage Collection Algorithms, and Performance Optimizations
This guide explains the JVM memory layout and object allocation; garbage‑collection algorithms such as mark‑sweep, copying, and mark‑compact; GC roots, reference types, collector types, and tuning parameters; and related Java performance tools, with practical code examples and diagrams.
1. Brief Overview of the JVM Memory Model
The JVM memory is divided into three main areas: the thread‑shared region, the thread‑private region, and the direct memory region.
1.1 Thread‑Shared Region
1.1.1 Heap
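The heap is the largest memory area and is where almost all object instances and arrays are allocated. It is split into the young generation and the old generation; the young generation is further divided into Eden and two survivor spaces (S0, S1) with a default Eden:S0:S1 ratio of 8:1:1.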
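As a rough illustration of this layout, the sketch below lists the memory pools that the running JVM reports through the standard java.lang.management API. The class name MemoryPoolsDemo is made up for this example, and the pool names you see depend on the collector in use (for instance, G1 reports "G1 Eden Space" while the parallel collector reports "PS Eden Space").

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: prints the memory pools of the current JVM.
class MemoryPoolsDemo {
    /** Returns one formatted line per memory pool reported by the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            long usedKiB = (usage == null) ? -1 : usage.getUsed() / 1024;
            lines.add(String.format("%-32s %-16s used=%d KiB",
                    pool.getName(), pool.getType(), usedKiB));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

Heap pools (Eden, survivor, old) are reported with type "Heap memory", while Metaspace and the code cache show up as "Non-heap memory".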
1.1.2 Metaspace
Metaspace replaced the permanent generation (PermGen) in JDK 8 and stores class metadata such as class structures, method data, and the runtime constant pool. Unlike PermGen, metaspace resides in native memory, so its size is limited only by the host OS unless explicitly bounded. In HotSpot, static variables and interned strings have lived on the heap since JDK 7, and JIT‑compiled code is kept in a separate code cache rather than in metaspace.
Metaspace is HotSpot's implementation of the method area defined by the JVM specification.
1.2 Direct Memory Region
Direct memory is native memory outside the JVM heap, typically allocated through NIO (ByteBuffer.allocateDirect). It can improve I/O performance because it avoids copying data between the Java heap and native buffers, although allocating and freeing it is slower than for heap buffers. Its total size can be capped with -XX:MaxDirectMemorySize.
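The difference between heap and direct buffers can be seen with the standard java.nio.ByteBuffer API. This is a minimal sketch; the class and method names (DirectBufferDemo, allocate, writeAndRead) are made up for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class: compares a heap buffer with a direct (off-heap) buffer.
class DirectBufferDemo {
    /** Allocates a 1 KiB buffer either in native memory or on the Java heap. */
    static ByteBuffer allocate(boolean direct) {
        return direct ? ByteBuffer.allocateDirect(1024) : ByteBuffer.allocate(1024);
    }

    /** Writes one int, flips the buffer, and reads the value back. */
    static int writeAndRead(ByteBuffer buffer, int value) {
        buffer.clear();
        buffer.putInt(value);
        buffer.flip();
        return buffer.getInt();
    }

    public static void main(String[] args) {
        ByteBuffer heap = allocate(false);
        ByteBuffer direct = allocate(true);
        System.out.println("heap buffer:   isDirect=" + heap.isDirect());
        System.out.println("direct buffer: isDirect=" + direct.isDirect());
        System.out.println("round trip through direct buffer: " + writeAndRead(direct, 42));
    }
}
```

Both buffers expose the same read and write API; only the backing storage differs, which is why direct buffers are usually allocated once and reused.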
1.3 Thread‑Private Region
Includes the program counter, the JVM stack, and the native method stack, each with the same lifecycle as the thread.
1.3.1 Program Counter
The program counter records the address of the bytecode instruction the thread is currently executing, so that execution resumes at the right place after a context switch. Its value is undefined while the thread runs a native method.
1.3.2 JVM Stack
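Each method call pushes a stack frame containing a local variable table, an operand stack, dynamic linking information, and additional frame data such as the return address. In the class below, calc() keeps a, b, and c in its local variable table and evaluates (a + b) * c on the operand stack.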
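public class ShowByteCode {
    private String xx;
    private static final int TEST = 1;

    public ShowByteCode() {}

    public int calc() {
        // a, b, and c occupy local variable slots 1 to 3 (slot 0 holds "this")
        int a = 100;
        int b = 200;
        int c = 300;
        // operands are pushed onto the operand stack, added, then multiplied
        return (a + b) * c;
    }
}
1.3.3 Native Method Stack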
Serves the same purpose as the JVM stack, but for native methods invoked through JNI.
2. Determining Object Liveness
When the heap is insufficient, garbage collection (GC) is triggered. Two main techniques are used to decide whether an object is alive:
2.1 Reference Counting
Each object maintains a counter that increments on each reference and decrements when a reference is cleared. An object is considered dead when its count reaches zero.
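Pros: simple and fast. Cons: cannot handle cyclic references. In the example below, the two objects still reference each other after step 6, so their counts never reach zero even though neither is reachable from the program.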
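class GcObject {
    public Object instance = null;
}

public class GcDemo {
    public static void main(String[] args) {
        GcObject object1 = new GcObject(); // step 1: first object, count = 1
        GcObject object2 = new GcObject(); // step 2: second object, count = 1
        object1.instance = object2;        // step 3: second object, count = 2
        object2.instance = object1;        // step 4: first object, count = 2
        object1 = null;                    // step 5: first object, count = 1
        object2 = null;                    // step 6: second object, count = 1
        // Both objects are now unreachable, yet neither count is zero.
    }
}
2.2 Reachability Analysis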
Java, like most languages with a tracing collector, uses reachability analysis. Starting from a set of GC roots (e.g., local variables in stack frames, static fields, JNI references), the algorithm traverses the object graph; objects not reachable from any root are considered garbage, including objects that only reference each other in a cycle.
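The traversal can be sketched on a toy object graph. This is an illustration only, not how HotSpot represents objects: Node, reachable, and garbageNames are names invented for the example, and the "heap" is just a list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of reachability analysis.
class ReachabilityDemo {
    /** Toy heap object: a name plus outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Returns every node reachable from the given roots (breadth-first traversal). */
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> queue = new ArrayDeque<>();
        for (Node root : roots) {
            if (seen.add(root)) queue.add(root);
        }
        while (!queue.isEmpty()) {
            Node current = queue.poll();
            for (Node next : current.refs) {
                if (seen.add(next)) queue.add(next);
            }
        }
        return seen;
    }

    /** Builds a small graph and returns the names of the unreachable nodes. */
    static List<String> garbageNames() {
        Node a = new Node("a"), b = new Node("b");
        Node x = new Node("x"), y = new Node("y");
        a.refs.add(b);   // root a -> b
        x.refs.add(y);   // x and y form a cycle ...
        y.refs.add(x);   // ... that no root points to
        List<Node> heap = Arrays.asList(a, b, x, y);
        Set<Node> live = reachable(Arrays.asList(a));
        List<String> garbage = new ArrayList<>();
        for (Node n : heap) {
            if (!live.contains(n)) garbage.add(n.name);
        }
        return garbage;
    }

    public static void main(String[] args) {
        System.out.println("garbage = " + garbageNames()); // prints "garbage = [x, y]"
    }
}
```

The cycle between x and y is reported as garbage, which is exactly the case reference counting cannot handle.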
3. Garbage‑Collection Algorithms
3.1 Mark‑Sweep
Marks all reachable objects, then sweeps away unmarked ones. It can create memory fragmentation.
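A minimal sketch of the two phases on a toy heap, assuming every object occupies exactly one slot and references are stored as slot indices. The class and method names are invented for the example; a real collector works on raw memory and free lists.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical toy model of mark-sweep on a slot-based heap.
class MarkSweepDemo {
    /** Mark phase: sets the mark bit of every slot reachable from the roots. */
    static boolean[] mark(int[][] refs, int[] roots) {
        boolean[] marked = new boolean[refs.length];
        Deque<Integer> stack = new ArrayDeque<>();
        for (int root : roots) stack.push(root);
        while (!stack.isEmpty()) {
            int slot = stack.pop();
            if (marked[slot]) continue;
            marked[slot] = true;
            for (int target : refs[slot]) stack.push(target);
        }
        return marked;
    }

    /** Sweep phase: frees every unmarked slot; live objects stay where they are. */
    static String[] sweep(String[] heap, boolean[] marked) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (!marked[i]) after[i] = null;
        }
        return after;
    }

    /** Length of the longest run of contiguous free (null) slots. */
    static int largestFreeRun(String[] heap) {
        int best = 0, run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Runs one collection on a fixed example heap: A -> C -> E is live, B -> D and F are not. */
    static String[] collect() {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        int[][] refs = {{2}, {3}, {4}, {}, {}, {}};
        return sweep(heap, mark(refs, new int[] {0}));
    }

    public static void main(String[] args) {
        String[] after = collect();
        System.out.println(Arrays.toString(after));                    // [A, null, C, null, E, null]
        System.out.println("largest free run = " + largestFreeRun(after)); // largest free run = 1
    }
}
```

Three slots are free after the sweep, but no two of them are adjacent, so an object needing two contiguous slots could not be allocated. That is the fragmentation problem in miniature.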
3.2 Copying (Young‑Generation) Algorithm
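Divides memory into two equal spaces, only one of which is in use at a time. During collection, live objects are copied to the other space and the original space is cleared in one step, which leaves no fragmentation but keeps half of the space idle.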
3.3 Mark‑Compact
After marking, live objects are moved to one end of the heap, eliminating fragmentation.
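The compaction step can be sketched on the same kind of toy heap, assuming one object per slot, references stored as slot indices, and mark bits already produced by a preceding mark phase. All names are invented for the example; this is a simple sliding compactor, not HotSpot's implementation.

```java
import java.util.Arrays;

// Hypothetical toy model of the compact phase of mark-compact.
class MarkCompactDemo {
    /** Pass 1: computes the new index of every marked slot (-1 for dead slots). */
    static int[] forwardingAddresses(boolean[] marked) {
        int[] forward = new int[marked.length];
        int next = 0;
        for (int i = 0; i < marked.length; i++) {
            forward[i] = marked[i] ? next++ : -1;
        }
        return forward;
    }

    /** Passes 2 and 3: rewrites references to the new indices, then slides live objects down. */
    static void compact(String[] names, int[][] refs, boolean[] marked) {
        int[] forward = forwardingAddresses(marked);
        for (int i = 0; i < refs.length; i++) {
            if (!marked[i]) continue;
            for (int k = 0; k < refs[i].length; k++) {
                refs[i][k] = forward[refs[i][k]]; // live objects only reference live objects
            }
        }
        int live = 0;
        for (int i = 0; i < names.length; i++) {
            if (!marked[i]) continue;
            names[forward[i]] = names[i]; // forward[i] <= i, so nothing live is overwritten
            refs[forward[i]] = refs[i];
            live++;
        }
        for (int i = live; i < names.length; i++) { // everything past the live prefix is free
            names[i] = null;
            refs[i] = new int[0];
        }
    }

    public static void main(String[] args) {
        String[] names = {"A", "B", "C", "D", "E", "F"};
        int[][] refs = {{2}, {3}, {4}, {}, {}, {}};
        boolean[] marked = {true, false, true, false, true, false};
        compact(names, refs, marked);
        System.out.println(Arrays.toString(names));     // [A, C, E, null, null, null]
        System.out.println(Arrays.deepToString(refs));  // [[1], [2], [], [], [], []]
    }
}
```

The free space ends up as one contiguous block at the end of the heap, at the cost of moving objects and updating every reference to them.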
| Metric | Mark‑Sweep | Mark‑Compact | Copying |
|---|---|---|---|
| Speed | Medium | Slow | Fast |
| Space Overhead | Low (but fragments) | Low (no fragments) | ~2× live data |
| Object Movement | No | Yes | Yes |
3.4 Three‑Color Marking and Write/Read Barriers
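Concurrent collectors (e.g., CMS, G1) use a tri‑color marking scheme to mark while the application keeps running: white objects have not been visited, gray objects have been visited but their references have not all been scanned, and black objects are fully scanned. Because the application mutates the object graph during marking, a live object can be missed if a black object gains a reference to a white object while the last gray path to that object is removed. Write barriers prevent such missed marks: CMS uses incremental update, and G1 uses snapshot‑at‑the‑beginning (SATB). Floating garbage, meaning objects that die after they were marked, is tolerated and reclaimed in the next cycle. Collectors such as ZGC rely on read (load) barriers instead.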
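The missed-mark scenario and an incremental-update style barrier can be sketched as follows. This is a single-threaded simulation with invented names (TriColorDemo, Obj, shade, write, markStep); real barriers are emitted by the JIT compiler and run alongside collector threads.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical simulation of tri-color marking with an optional write barrier.
class TriColorDemo {
    enum Color { WHITE, GRAY, BLACK }

    /** Toy object with a single reference field. */
    static final class Obj {
        Obj field;
        Color color = Color.WHITE;
    }

    private final Deque<Obj> grayQueue = new ArrayDeque<>();
    private final boolean barrierEnabled;

    TriColorDemo(boolean barrierEnabled) { this.barrierEnabled = barrierEnabled; }

    /** Turns a white object gray and queues it for scanning. */
    void shade(Obj o) {
        if (o != null && o.color == Color.WHITE) {
            o.color = Color.GRAY;
            grayQueue.add(o);
        }
    }

    /** Scans one gray object: shades its referent, then turns the object black. */
    boolean markStep() {
        Obj o = grayQueue.poll();
        if (o == null) return false;
        shade(o.field);
        o.color = Color.BLACK;
        return true;
    }

    /** Mutator store; with the barrier on, a store into a black object shades the new target. */
    void write(Obj holder, Obj newValue) {
        holder.field = newValue;
        if (barrierEnabled && holder.color == Color.BLACK) shade(newValue);
    }

    /** Runs the scenario and returns the final color of C, which stays reachable throughout. */
    static Color run(boolean barrierEnabled) {
        TriColorDemo gc = new TriColorDemo(barrierEnabled);
        Obj a = new Obj(), b = new Obj(), c = new Obj();
        a.field = b;
        b.field = c;             // A -> B -> C, A is the only root
        gc.shade(a);             // root becomes gray
        gc.markStep();           // A is black, B is gray, C is still white
        gc.write(a, c);          // mutator: black A now references white C
        gc.write(b, null);       // mutator: the only gray path to C disappears
        while (gc.markStep()) { } // collector finishes marking
        return c.color;
    }

    public static void main(String[] args) {
        System.out.println("without barrier: C is " + run(false)); // WHITE (live object missed)
        System.out.println("with barrier:    C is " + run(true));  // BLACK
    }
}
```

In both runs B ends up black although it is no longer reachable; that is floating garbage, which is harmless and reclaimed in the next cycle. The white C in the first run is the dangerous case, because a live object would be freed.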
4. GC Process Overview
Different generations use different algorithms: the young generation typically uses the copying algorithm, while the old generation may use mark‑compact or mark‑sweep.
4.1 Young Generation
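Divided into Eden and two survivor spaces (Survivor‑From and Survivor‑To). New objects are allocated in Eden. During a minor GC, live objects in Eden and Survivor‑From are copied to Survivor‑To and their age is incremented; objects whose age reaches the tenuring threshold (-XX:MaxTenuringThreshold), or that do not fit in Survivor‑To, are promoted to the old generation. The two survivor spaces then swap roles.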
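The following sketch models a minor GC with object ages and survivor swapping. It is a simplification under stated assumptions: liveness is a boolean flag standing in for reachability analysis, the tenuring threshold of 2 is chosen only to keep the demo short, and survivor overflow is not modeled. All names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a generational minor GC.
class MinorGcDemo {
    static final int TENURING_THRESHOLD = 2; // assumption for the demo, not a JVM default

    static final class Obj {
        final String name;
        boolean live = true; // stand-in for the result of reachability analysis
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** Copies survivors of Eden and From into To (or promotes them), then swaps From and To. */
    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) continue;          // dead objects are simply not copied
            o.age++;
            if (o.age >= TENURING_THRESHOLD) old.add(o);
            else to.add(o);
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;               // the survivor spaces swap roles
        from = to;
        to = tmp;
    }

    static List<String> names(List<Obj> space) {
        List<String> out = new ArrayList<>();
        for (Obj o : space) out.add(o.name);
        return out;
    }

    /** Runs two minor GCs and describes the final layout. */
    static String run() {
        MinorGcDemo heap = new MinorGcDemo();
        Obj x = new Obj("x"), y = new Obj("y");
        y.live = false;
        heap.eden.add(x);
        heap.eden.add(y);
        heap.minorGc();                     // x survives with age 1, y is reclaimed
        heap.eden.add(new Obj("z"));
        heap.minorGc();                     // x reaches age 2 and is promoted, z survives
        return "from=" + names(heap.from) + " old=" + names(heap.old);
    }

    public static void main(String[] args) {
        System.out.println(run()); // from=[z] old=[x]
    }
}
```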
4.2 Old Generation
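Holds long‑lived objects and objects promoted from the young generation. A full GC collects the entire heap, young and old generations together, and usually employs mark‑sweep or mark‑compact. The term major GC is used loosely and may refer either to an old‑generation collection or to a full GC.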
4.3 Metaspace
Replaces the permanent generation. -XX:MetaspaceSize sets the initial threshold at which a metaspace‑induced GC is triggered, and -XX:MaxMetaspaceSize caps the total size (unlimited by default).
5. Garbage Collectors
5.1 Collector Overview
JDK 7/8 default: Parallel Scavenge (young) + Parallel Old (old). JDK 9 and later default: G1. A common server‑side combination before G1 was ParNew + CMS; CMS was deprecated in JDK 9 and removed in JDK 14.
5.2 Young‑Generation Collectors
| Name | Type | Algorithm | Use‑Case | Can Pair with CMS |
|---|---|---|---|---|
| Serial | Serial | Copying | Single‑CPU client | Yes |
| ParNew | Parallel (multithreaded version of Serial) | Copying | Multi‑CPU server | Yes |
| Parallel Scavenge | Parallel | Copying | Multi‑CPU, throughput‑oriented | No |
5.3 Old‑Generation Collectors
| Name | Type | Algorithm | Use‑Case | Young‑Gen Pair |
|---|---|---|---|---|
| Serial Old | Serial | Mark‑Compact | Single‑CPU | Serial, ParNew, Parallel Scavenge |
| Parallel Old | Parallel | Mark‑Compact | Multi‑CPU | Parallel Scavenge |
| CMS | Concurrent | Mark‑Sweep | Multi‑CPU, low‑pause server | Serial, ParNew |
5.3.1 CMS Details
CMS aims to minimize stop‑the‑world pauses by performing most work concurrently. Its phases are:
Initial Mark (STW)
Concurrent Mark
Final Remark (STW)
Concurrent Sweep
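Pros: low pause times. Cons: higher CPU usage; floating garbage created during the concurrent phases is not reclaimed until the next cycle; and, because mark‑sweep does not move objects, the old generation may become fragmented.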
5.4 G1 Collector
G1 divides the heap into many equal‑sized regions (1‑32 MiB). It uses a combination of parallel, concurrent, and incremental techniques, providing predictable pause times.
Region types: Eden, Survivor, Old, Humongous. G1 maintains Remembered Sets (RSet) to track cross‑region references and selects a Collection Set (CSet) each cycle for reclamation.
5.4.1 Young GC in G1
All young‑generation regions form the CSet and are reclaimed using a parallel copying algorithm.
5.4.2 Mixed GC in G1
Collects all young regions together with a subset of old regions that promise the highest reclamation benefit, still using parallel copying while trying to keep pauses within the user‑specified target (-XX:MaxGCPauseMillis).
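The "garbage first" idea behind mixed collections can be sketched as a greedy selection. This is a deliberately simplified model and not G1's actual heuristics: the Region fields, the cost constant MS_PER_LIVE_MIB, and the assumption that copy cost is proportional to live data are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy model of collection-set selection for a mixed GC.
class CSetSelectionDemo {
    static final double MS_PER_LIVE_MIB = 1.0; // assumed copy cost, for illustration only

    static final class Region {
        final String id;
        final boolean young;
        final int sizeMiB;
        final int liveMiB;
        Region(String id, boolean young, int sizeMiB, int liveMiB) {
            this.id = id;
            this.young = young;
            this.sizeMiB = sizeMiB;
            this.liveMiB = liveMiB;
        }
        int garbageMiB() { return sizeMiB - liveMiB; }
    }

    /** Takes all young regions, then adds old regions with the most garbage while the budget allows. */
    static List<String> chooseCSet(List<Region> regions, double pauseBudgetMs) {
        List<String> cset = new ArrayList<>();
        List<Region> oldCandidates = new ArrayList<>();
        double predictedMs = 0;
        for (Region r : regions) {
            if (r.young) {
                cset.add(r.id);                          // young regions are always collected
                predictedMs += r.liveMiB * MS_PER_LIVE_MIB;
            } else {
                oldCandidates.add(r);
            }
        }
        oldCandidates.sort(Comparator.comparingInt(Region::garbageMiB).reversed());
        for (Region r : oldCandidates) {
            double cost = r.liveMiB * MS_PER_LIVE_MIB;
            if (predictedMs + cost > pauseBudgetMs) break; // stay within the pause target
            cset.add(r.id);
            predictedMs += cost;
        }
        return cset;
    }

    /** Example heap: one young region and three old regions of 8 MiB each. */
    static List<String> demo() {
        List<Region> regions = Arrays.asList(
                new Region("E1", true, 8, 2),
                new Region("O1", false, 8, 1),
                new Region("O2", false, 8, 6),
                new Region("O3", false, 8, 3));
        return chooseCSet(regions, 7.0);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [E1, O1, O3]
    }
}
```

O2 is left out because it holds little garbage and copying its 6 MiB of live data would exceed the 7 ms budget assumed here.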
6. Object Creation and Lifecycle
6.1 Class Lifecycle
Loading – bytecode is read by a class loader.
Verification – ensures the bytecode conforms to JVM specifications.
Preparation – static fields are allocated and set to default values.
Resolution – symbolic references are replaced with direct references.
Initialization – static initializers and static blocks run.
Usage – instances are allocated, fields are default‑initialized, and constructors execute.
Unloading – class data is reclaimed when no longer referenced.
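The initialization and usage stages above can be observed directly. In this sketch (all class names are invented for the example), a nested class records when its static initializers and constructor run. Reading a compile-time constant does not trigger initialization, because the compiler inlines the value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: records the order of class initialization events.
class ClassInitDemo {
    static final List<String> events = new ArrayList<>();

    static class Lazy {
        static final int CONSTANT = 42;   // compile-time constant, inlined by javac
        static int counter = init();      // runs during initialization, in textual order
        static { events.add("Lazy static block"); }

        static int init() {
            events.add("Lazy static field");
            return 1;
        }

        Lazy() { events.add("Lazy constructor"); }
    }

    /** Touches Lazy step by step and returns the recorded events. */
    static List<String> run() {
        events.clear();
        events.add("before use");
        int c = Lazy.CONSTANT;            // does not initialize Lazy
        events.add("read constant " + c);
        new Lazy();                       // first active use: initialization, then the constructor
        events.add("after new");
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call in a fresh JVM the events are: before use, read constant 42, Lazy static field, Lazy static block, Lazy constructor, after new. On later calls the two static events are absent, because a class is initialized only once.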
6.2 Object Size
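On a 64‑bit HotSpot JVM with compressed class pointers and compressed references (the defaults for typical heap sizes), a plain new Object() occupies 16 bytes: an 8‑byte mark word, a 4‑byte compressed class pointer, and 4 bytes of padding to reach 8‑byte alignment. A simple User object with an int field and a String reference occupies 24 bytes: the 12‑byte header plus 4 + 4 bytes of fields is 20 bytes, padded to 24.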
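The arithmetic can be written down as a small estimator. It assumes the header sizes quoted above and 8-byte object alignment, and it ignores HotSpot's field ordering and per-field alignment rules, so treat it as a back-of-the-envelope estimate rather than a measurement. The class and method names are invented for the example.

```java
// Hypothetical estimator for instance sizes under the assumptions stated in the text.
class ObjectSizeEstimate {
    static final int MARK_WORD = 8;     // bytes, 64-bit JVM
    static final int CLASS_POINTER = 4; // bytes, with compressed class pointers
    static final int ALIGNMENT = 8;     // objects are padded to a multiple of 8 bytes

    /** Rounds a size up to the next multiple of ALIGNMENT. */
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Header plus the given field sizes, padded to the alignment. */
    static long instanceSize(int... fieldSizes) {
        long size = MARK_WORD + CLASS_POINTER;
        for (int field : fieldSizes) size += field;
        return align(size);
    }

    public static void main(String[] args) {
        System.out.println("new Object():             " + instanceSize());     // 16
        System.out.println("User(int, String ref):    " + instanceSize(4, 4)); // 24
    }
}
```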
6.3 Object Access Methods
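A reference can reach its object either through a direct pointer or through a handle. HotSpot uses direct pointers, which saves one indirection on every access; handle‑based access keeps the reference itself stable when the collector moves the object, because only the handle has to be updated.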
7. Stack Allocation via Escape Analysis
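Escape analysis determines whether an object escapes the current method or thread. If it does not, the JIT compiler can apply scalar replacement, keeping the object's fields in registers or on the stack instead of allocating the object on the heap, and can also remove synchronization on it. This reduces heap pressure.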
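The sketch below shows the shape of code that escape analysis can optimize: the Point object is created and consumed inside the loop body and is never stored, returned, or passed elsewhere. Whether the allocation is actually eliminated depends on the JIT compiler and cannot be observed from the result; the class names are invented for the example.

```java
// Hypothetical demo class: a non-escaping allocation inside a hot loop.
class EscapeAnalysisDemo {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Sums x + y over n short-lived points; each Point never leaves the loop body. */
    static long sumCoordinates(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1); // candidate for scalar replacement
            total += p.x + p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumCoordinates(1_000_000));
    }
}
```

Escape analysis is enabled by default in HotSpot; running the same program with -XX:-DoEscapeAnalysis and GC logging turned on is one way to compare allocation behavior.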
8. Class Loaders and the Parent‑Delegation Model
Key methods: loadClass(), findClass(), and defineClass(). The standard hierarchy consists of the Bootstrap, Extension (called Platform since JDK 9), and Application class loaders, plus any custom loaders.
8.1 Parent‑Delegation
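With parent delegation, a class loader first asks its parent to load a class and only tries to load it itself if the parent cannot. This ensures classes are loaded only once and protects core Java classes from being overridden.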
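The delegation chain can be inspected at runtime by walking getParent() from a class's loader. This is a small sketch with an invented class name; the loader class names printed depend on the JDK version, and the bootstrap loader is represented as null by the API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: prints the class loader chain of a given class.
class LoaderChainDemo {
    /** Returns the loader class names from the defining loader up to the bootstrap loader. */
    static List<String> chain(Class<?> type) {
        List<String> names = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        names.add("bootstrap (reported as null)");
        return names;
    }

    public static void main(String[] args) {
        System.out.println("LoaderChainDemo: " + chain(LoaderChainDemo.class));
        System.out.println("String:          " + chain(String.class));
    }
}
```

Core classes such as String report only the bootstrap entry, while application classes show the application loader followed by its parents.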
8.2 Alternative Loading Strategies
Containers like Tomcat may reverse the delegation order, and frameworks use thread‑context class loaders (SPI) to resolve implementation classes.
9. Out‑of‑Memory (OOM) and CPU‑100% Diagnosis
9.1 OOM Causes
Either the JVM heap is too small, or the application leaks memory by retaining references.
9.1.1 Types of OOM
Heap OOM – often due to memory leaks.
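Stack overflow / OOM – deep recursion raises StackOverflowError, while creating too many threads can raise OutOfMemoryError because no new native thread can be created.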
Metaspace OOM – excessive class metadata (e.g., many CGLib proxies).
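The stack case is the easiest to reproduce safely. The sketch below recurses until the thread's stack is exhausted and reports how deep it got; the class name is invented for the example, and the depth varies with the JVM, the frame size, and the -Xss setting.

```java
// Hypothetical demo class: measures how deep recursion goes before the stack overflows.
class StackDepthDemo {
    static int depth;

    static void recurse() {
        depth++;
        recurse(); // no base case: every call pushes another stack frame
    }

    /** Recurses until StackOverflowError and returns the depth reached. */
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // the thread's stack is exhausted; unwinding frees it again
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("stack overflowed after " + measure() + " frames");
    }
}
```

Running it with a larger stack, for example -Xss4m, should report a correspondingly larger depth.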
9.1.2 Diagnostic Commands
jps                                      # list Java processes
jstat -gcutil <pid> <interval> <count>   # GC utilization
jmap -histo <pid> | more                 # heap histogram
jstack <pid> | grep <hex_tid>            # thread dump for a specific thread
9.2 CPU 100% Investigation
Identify the Java process ID (e.g., ps -ef | grep java).
Find the thread consuming CPU: top -Hp <pid>.
Convert the thread ID to hexadecimal: printf '%x\n' <tid>.
Locate the thread in a Java stack dump, where the native thread ID appears as nid=0x<hex_tid>: jstack <pid> | grep -A 20 <hex_tid>.
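The conversion in step 3 is the same as the following one-liner, shown here in Java for reference. The class and method names are invented; the thread ID 12345 is an arbitrary example value.

```java
// Hypothetical helper: formats an OS thread ID the way jstack prints it in the nid field.
class ThreadIdHex {
    static String toNid(long osThreadId) {
        return "0x" + Long.toHexString(osThreadId);
    }

    public static void main(String[] args) {
        System.out.println(toNid(12345)); // 0x3039
    }
}
```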
10. GC Tuning Guidelines
Typical tuning starts with setting -Xms and -Xmx to the same value to avoid heap resizing. Adjust -Xmn for the young‑generation size and -XX:SurvivorRatio for the Eden/survivor ratio. Enable GC logging: on JDK 8 with -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log, and on JDK 9 and later with unified logging, e.g. -Xlog:gc*:file=gc.log. Use -XX:+HeapDumpOnOutOfMemoryError to capture a heap dump on OOM.
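To confirm which settings a running JVM actually picked up, the standard management API can be queried from inside the application. This is a minimal sketch with an invented class name; collector names and counts depend on the JVM and the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: reports JVM arguments, heap limit, and collector statistics.
class GcSettingsDemo {
    /** JVM arguments such as -Xms, -Xmx, and -XX flags passed on the command line. */
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** One line per active collector with its collection count and accumulated time. */
    static List<String> collectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
        System.out.println("JVM arguments:  " + jvmArguments());
        collectors().forEach(System.out::println);
    }
}
```

Starting the program with, for example, -Xms256m -Xmx256m should show both flags in the argument list and a maximum heap close to 256 MiB.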
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
