JVM Garbage Collection Tuning: Reducing FullGC Frequency and Improving Throughput
This article documents a month-long JVM garbage‑collection tuning effort on a 2‑core, 4 GB server cluster, detailing initial problems with frequent FullGC, successive configuration adjustments, memory‑leak investigations, and the final optimizations that cut FullGC occurrences and significantly improved overall throughput.
Over the course of more than a month, the author reduced FullGC occurrences from about 40 times per day to roughly once every ten days and halved YoungGC time by systematically tuning JVM parameters and investigating memory leaks.
Initial Situation
The production servers (2 CPU, 4 GB RAM, four nodes) suffered from extremely frequent FullGC (≈40 per day) and occasional automatic restarts, indicating severe performance issues.
Key JVM startup parameters were:
-Xms1000M
-Xmx1800M
-Xmn350M
-Xss300K
-XX:+DisableExplicitGC
-XX:SurvivorRatio=4
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70
-XX:+CMSParallelRemarkEnabled
-XX:LargePageSizeInBytes=128M
-XX:+UseFastAccessorMethods
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
Explanation of some flags:
-Xmx1800M : sets maximum heap size to 1800 MB.
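-Xms1000M : sets the initial heap size to 1000 MB; setting it equal to -Xmx keeps the JVM from resizing the heap after GC.
-Xmn350M : sets the young generation size to 350 MB; a larger young generation usually means fewer YoungGCs, but with a fixed -Xmx it leaves less room for the old generation.
-Xss300K : sets the per-thread stack size to 300 KB; smaller stacks allow more threads in the same amount of memory.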
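Whether a running node actually has matching -Xms and -Xmx can be checked from inside the JVM. The following is a minimal sketch, not the author's tooling: the class name HeapFlagCheck and its helpers are made up for illustration, and it only reads the startup arguments through the standard RuntimeMXBean.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class HeapFlagCheck {
    /** Parses a JVM size value such as "1800M", "350m", "300K" or "2G" into bytes. */
    static long parseSize(String value) {
        char last = Character.toUpperCase(value.charAt(value.length() - 1));
        if (Character.isDigit(last)) {
            return Long.parseLong(value); // no suffix means plain bytes
        }
        long number = Long.parseLong(value.substring(0, value.length() - 1));
        switch (last) {
            case 'K': return number << 10;
            case 'M': return number << 20;
            case 'G': return number << 30;
            default: throw new IllegalArgumentException("Unknown size suffix in " + value);
        }
    }

    /** Returns the size passed with a flag prefix such as "-Xmx", or -1 if the flag is absent. */
    static long flagBytes(List<String> jvmArgs, String prefix) {
        long result = -1;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                result = parseSize(arg.substring(prefix.length())); // keep the last occurrence
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        long xms = flagBytes(jvmArgs, "-Xms");
        long xmx = flagBytes(jvmArgs, "-Xmx");
        if (xms < 0 || xmx < 0) {
            System.out.println("-Xms or -Xmx not set explicitly");
        } else if (xms == xmx) {
            System.out.println("Heap is fixed at " + (xmx >> 20) + " MB");
        } else {
            System.out.println("Heap may resize between " + (xms >> 20) + " MB and " + (xmx >> 20) + " MB");
        }
    }
}
```

Started with the initial parameters listed above, this would report a heap that may resize between 1000 MB and 1800 MB.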
First Optimization
Observing that the young generation was too small, the author increased it and aligned initial and maximum heap sizes:
-Xmn350M → -Xmn800M
-XX:SurvivorRatio=4 → -XX:SurvivorRatio=8
-Xms1000M → -Xms1800M
After five days, YoungGC frequency had dropped by more than half and total YoungGC time had fallen by 400 s, but the FullGC count had risen by 41, so this optimization was judged unsuccessful.
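Counts and durations like these can be sampled from a live JVM as well as from GC logs. The sketch below is an assumption about how one might collect them, not the measurement method used in the article; the class name GcCounters is illustrative. It uses the standard GarbageCollectorMXBean API, which reports a cumulative collection count and time per collector since JVM start.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

class GcCounters {
    /** Returns collector name -> {collection count, accumulated collection time in ms}. */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> result = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Either value is -1 if the collector does not report it.
            result.put(gc.getName(), new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, long[]> entry : snapshot().entrySet()) {
            System.out.println(entry.getKey()
                    + ": count=" + entry.getValue()[0]
                    + ", timeMs=" + entry.getValue()[1]);
        }
    }
}
```

Taking one snapshot before a parameter change and another after the same number of days, then subtracting, gives a like-for-like comparison. With ParNew and CMS enabled, HotSpot normally lists one bean for the young collector and one for the old collector.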
Second Optimization – Memory Leak Investigation
A large number of instances of a class T (≈10 000, ~20 MB) were found, caused by an anonymous inner‑class listener that retained references and prevented garbage collection.
public void doSmthing(T t) {
    // The anonymous listener captures t; as long as redis holds the listener, t cannot be collected.
    redis.addListener(new Listener() {
        public void onTimeout() {
            if (t.success()) {
                // execute operation
            }
        }
    });
}
Removing the listener and fixing the related error logs eliminated part of the leak but did not fully resolve the issue, and the servers continued to restart.
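Where such a listener cannot simply be dropped, one common alternative is to have it unregister itself once it has run. The sketch below illustrates that idea under stated assumptions: Bus is a hypothetical stand-in for the article's redis client, whose real API is not shown in the source, and removeListener is assumed to exist.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ListenerLeakSketch {
    interface Listener {
        void onTimeout();
    }

    /** Hypothetical stand-in for the client object that stores listeners. */
    static class Bus {
        private final List<Listener> listeners = new CopyOnWriteArrayList<>();

        void addListener(Listener listener) { listeners.add(listener); }
        void removeListener(Listener listener) { listeners.remove(listener); }
        int listenerCount() { return listeners.size(); }

        void fireTimeout() {
            // CopyOnWriteArrayList iterates over a snapshot, so listeners may remove themselves.
            for (Listener listener : listeners) {
                listener.onTimeout();
            }
        }
    }

    /** Leaky pattern: each call adds a listener that keeps payload reachable indefinitely. */
    static void registerLeaky(Bus bus, Object payload) {
        bus.addListener(new Listener() {
            @Override
            public void onTimeout() {
                payload.hashCode(); // placeholder for the real operation
            }
        });
    }

    /** One-shot pattern: the listener removes itself, releasing payload after it has run. */
    static void registerOneShot(Bus bus, Object payload) {
        bus.addListener(new Listener() {
            @Override
            public void onTimeout() {
                try {
                    payload.hashCode(); // placeholder for the real operation
                } finally {
                    bus.removeListener(this);
                }
            }
        });
    }
}
```

A one-shot listener still pins its payload until the timeout fires, so this only helps when every registration is eventually followed by a callback.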
Further Leak Diagnosis
Heap dumps revealed tens of thousands of ByteArrayRow objects, the row holders created by the database driver, which pointed to unusually heavy database query or insert activity. An unexpected traffic spike (≈83 MB/s) was also observed, but after consulting the cloud provider it was deemed normal.
The root cause turned out to be a missing module condition in a database query, which fetched over 400 k rows unintentionally, overwhelming memory and causing the restarts.
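A cheap defence against this class of bug is to refuse to build the query when the filter is missing and to cap the row count in every case. The following is a sketch only: the table name, column names, and the MAX_ROWS cap are invented for illustration and do not come from the article.

```java
class ModuleQueryGuard {
    /** Assumed upper bound on rows fetched per query; tune for the real workload. */
    static final int MAX_ROWS = 1000;

    /** Builds a parameterized query, rejecting a missing module filter or an unbounded limit. */
    static String buildQuery(String module, int limit) {
        if (module == null || module.trim().isEmpty()) {
            throw new IllegalArgumentException("module condition is required");
        }
        if (limit <= 0 || limit > MAX_ROWS) {
            throw new IllegalArgumentException("limit must be between 1 and " + MAX_ROWS);
        }
        // The module value itself is bound later through the "?" placeholder.
        return "SELECT id, payload FROM records WHERE module = ? LIMIT " + limit;
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("billing", 100));
    }
}
```

Failing fast here turns a silent 400 k-row fetch into an immediate, visible error at the call site.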
Second Round of Tuning
After fixing the query, FullGC frequency dropped to five times over three days using the original parameters.
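Further analysis showed FullGC occurring even when old‑generation usage was below 30 %, so old‑generation pressure could not be the trigger. The metaspace had grown to ~200 MB, and without an explicit -XX:MetaspaceSize the initial threshold is much smaller (roughly 20 MB by default on HotSpot), so metaspace growth past the threshold can itself trigger FullGC. This prompted additional tuning: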
-Xmn350M → -Xmn800M (prod1) / -Xmn600M (prod2)
-Xms1000M → -Xms1800M
-XX:MetaspaceSize=200M
-XX:CMSInitiatingOccupancyFraction=75
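The metaspace figure that motivates -XX:MetaspaceSize can be read from the running JVM before a value is chosen. This is a minimal sketch, assuming a HotSpot JVM on Java 8 or later, where the memory pool is named "Metaspace"; the class name MetaspaceUsage is illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class MetaspaceUsage {
    /** Returns used Metaspace in bytes, or -1 if no usable pool named "Metaspace" is found. */
    static long usedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage(); // may be null if the pool is not valid
                return usage == null ? -1 : usage.getUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long used = usedBytes();
        if (used < 0) {
            System.out.println("No Metaspace pool reported by this JVM");
        } else {
            System.out.println("Metaspace used: " + (used >> 20) + " MB");
        }
    }
}
```

Sampling this after the application has warmed up gives a reasonable starting point for the flag; setting it close to the steady-state value avoids repeated threshold expansions.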
Running these settings for ten days on two servers (prod1, prod2) showed a clear reduction in both FullGC and YoungGC compared with the unchanged servers (prod3, prod4). Prod1 achieved the best throughput, with fewer GC pauses and higher thread‑startup counts.
Conclusion
FullGC more than once per day is abnormal.
When FullGC spikes, first investigate memory leaks.
After fixing leaks, JVM tuning opportunities become limited.
High CPU usage should be cross‑checked with the cloud provider.
Unexpected traffic often originates from inefficient database queries.
Regularly monitor GC to detect issues early.
The month‑long tuning process demonstrated that careful parameter adjustment and thorough leak investigation can dramatically improve JVM performance and server stability.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn