
JVM Garbage Collection Tuning: Reducing FullGC Frequency and Improving Throughput

This article documents a month-long JVM garbage‑collection tuning effort on a 2‑core, 4 GB server cluster, detailing initial problems with frequent FullGC, successive configuration adjustments, memory‑leak investigations, and the final optimizations that cut FullGC occurrences and significantly improved overall throughput.


Over the course of more than a month, the author reduced FullGC occurrences from about 40 times per day to roughly once every ten days and halved YoungGC time by systematically tuning JVM parameters and investigating memory leaks.

Initial Situation

The production servers (2 CPU, 4 GB RAM, four nodes) suffered from extremely frequent FullGC (≈40 per day) and occasional automatic restarts, indicating severe performance issues.

Key JVM startup parameters were:

-Xms1000M
-Xmx1800M
-Xmn350M
-Xss300K
-XX:+DisableExplicitGC
-XX:SurvivorRatio=4
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70
-XX:+CMSParallelRemarkEnabled
-XX:LargePageSizeInBytes=128M
-XX:+UseFastAccessorMethods
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC

Explanation of some flags:

-Xmx1800M : sets maximum heap size to 1800 MB.

-Xms1000M : sets initial heap size; matching it to -Xmx avoids re‑allocation after each GC.

-Xmn350M : defines young generation size; increasing it can improve throughput.

-Xss300K : thread stack size; smaller values allow more threads.
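To confirm which of these flags a running JVM actually picked up, the standard `RuntimeMXBean` API exposes the startup arguments. A minimal sketch (class name is mine, not from the article):

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class ShowJvmFlags {
    public static void main(String[] args) {
        // Input arguments are the -X / -XX options the JVM was started with;
        // application arguments are not included.
        List<String> flags = ManagementFactory.getRuntimeMXBean().getInputArguments();
        flags.forEach(System.out::println);
    }
}
```

Printing the live flags is a cheap way to rule out a mismatch between the deploy script and what the process actually runs with.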

First Optimization

Observing that the young generation was too small, the author increased it and aligned initial and maximum heap sizes:

-Xmn350M → -Xmn800M
-XX:SurvivorRatio=4 → -XX:SurvivorRatio=8
-Xms1000M → -Xms1800M

After five days of observation, YoungGC frequency had dropped by more than half and its cumulative time had fallen by about 400 seconds, but FullGC had become markedly more frequent (around 41 occurrences in that period), so this optimization was judged unsuccessful.
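As a sanity check on the new young-generation layout: with -Xmn800M and -XX:SurvivorRatio=8, Eden gets 8 of every 10 parts of the young generation and each of the two survivor spaces gets 1. A minimal sketch of that arithmetic (class and method names are mine, not from the article):

```java
public class SurvivorSizing {
    // SurvivorRatio = Eden : one survivor space, and the young generation
    // holds Eden plus two survivor spaces, so Eden = young * ratio / (ratio + 2).
    static long edenMb(long youngGenMb, int survivorRatio) {
        return youngGenMb * survivorRatio / (survivorRatio + 2);
    }

    static long survivorMb(long youngGenMb, int survivorRatio) {
        return youngGenMb / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        // -Xmn800M with -XX:SurvivorRatio=8: Eden 640 MB, each survivor 80 MB
        System.out.println("Eden: " + edenMb(800, 8)
                + " MB, survivor: " + survivorMb(800, 8) + " MB");
    }
}
```

A larger Eden means fewer YoungGC cycles, which matches the observed drop in YoungGC frequency; the FullGC regression came from elsewhere, as the leak investigation below showed.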

Second Optimization – Memory Leak Investigation

A large number of instances of a class T (≈10 000, ~20 MB) were found, caused by an anonymous inner‑class listener that retained references and prevented garbage collection.

public void doSmthing(T t) {
    // The anonymous Listener captures a reference to 't'; because the redis
    // client keeps a reference to the listener, 't' can never be collected.
    redis.addListener(new Listener() {
        @Override
        public void onTimeout() {
            if (t.success()) {
                // execute operation
            }
        }
    });
}

Removing the listener and fixing related error logs reduced the leak somewhat but did not fully resolve it, and the servers continued to restart.
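One leak-free shape for this kind of registration is to keep a handle to the listener and deregister it once it has fired, so the captured argument becomes collectable. A minimal sketch with stand-in Registry/Listener types (not the project's real redis client):

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerLeakDemo {
    interface Listener { void onTimeout(); }

    // Stand-in for the redis client's listener registry.
    static class Registry {
        private final List<Listener> listeners = new ArrayList<>();
        void addListener(Listener l) { listeners.add(l); }
        void removeListener(Listener l) { listeners.remove(l); }
        int size() { return listeners.size(); }
    }

    static void doSomething(Registry redis, Object t) {
        Listener l = new Listener() {
            @Override public void onTimeout() {
                // ... use 't' here ...
            }
        };
        redis.addListener(l);
        // For the demo we fire and deregister immediately; in real code the
        // removal would happen in the timeout/completion path.
        l.onTimeout();
        redis.removeListener(l);
    }

    public static void main(String[] args) {
        Registry redis = new Registry();
        doSomething(redis, new Object());
        System.out.println("registered listeners after call: " + redis.size());
    }
}
```

Once the registry no longer holds the listener, the listener's captured reference to the argument is dropped and the object becomes eligible for collection.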

Further Leak Diagnosis

Heap dumps revealed tens of thousands of ByteArrayRow objects (a row-holder class in the MySQL JDBC driver), indicating excessive database query/insert activity. An unexpected traffic spike (≈83 MB/s) was observed, but after consulting the cloud provider it was deemed normal.

The root cause was a database query missing a module condition, which unintentionally fetched more than 400,000 rows at once, exhausting memory and triggering the restarts.
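A hypothetical reconstruction of the bug (table and column names are invented for illustration; the article does not show the real SQL): the buggy statement omits the module filter, so it matches every module's rows instead of one module's.

```java
public class QueryFix {
    // Buggy: no module condition, can return 400k+ rows across all modules.
    static final String BUGGY =
            "SELECT * FROM t_record WHERE record_date = ?";

    // Fixed: the module condition restricts the result to the caller's module.
    static final String FIXED =
            "SELECT * FROM t_record WHERE record_date = ? AND module = ?";

    public static void main(String[] args) {
        System.out.println(FIXED);
    }
}
```

Result sets this large materialize entirely in driver row objects on the heap, which is exactly the ByteArrayRow buildup the heap dumps showed.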

Second Round of Tuning

After fixing the query, FullGC frequency dropped to five times over three days using the original parameters.

Further analysis showed FullGC occurring even when old‑generation usage was below 30 %. The metaspace size had grown to ~200 MB, prompting additional tuning:

-Xmn350M → -Xmn800M (prod1) / -Xmn600M (prod2)
-Xms1000M → -Xms1800M
-XX:MetaspaceSize=200M
-XX:CMSInitiatingOccupancyFraction=75

Running these settings for ten days on two servers (prod1, prod2) showed a clear reduction in both FullGC and YoungGC compared with the unchanged servers (prod3, prod4). Prod1 achieved the best throughput, with fewer GC pauses and higher thread‑startup counts.
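-XX:MetaspaceSize sets the initial high-water mark at which metaspace growth triggers a collection, so raising it to 200 MB keeps a ~200 MB metaspace from repeatedly tripping FullGC. Current metaspace usage can be checked in-process with the standard MemoryPoolMXBean API (a minimal sketch; the pool name "Metaspace" is what HotSpot reports):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class MetaspaceUsage {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                // getUsed() is in bytes; convert to MB for readability.
                long usedMb = pool.getUsage().getUsed() / (1024 * 1024);
                System.out.println("Metaspace used: " + usedMb + " MB");
            }
        }
    }
}
```

Watching this number over time shows whether metaspace is still climbing (e.g. from runtime class generation) or has plateaued below the configured threshold.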

Conclusion

FullGC more than once per day is abnormal.

When FullGC spikes, first investigate memory leaks.

After fixing leaks, JVM tuning opportunities become limited.

High CPU usage should be cross‑checked with the cloud provider.

Unexpected traffic often originates from inefficient database queries.

Regularly monitor GC to detect issues early.
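The "monitor GC regularly" advice above can be implemented in-process with the standard GarbageCollectorMXBean API (a minimal sketch; the collector names printed depend on which collectors the JVM is running, e.g. ParNew and ConcurrentMarkSweep under the CMS setup described here):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcMonitor {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Cumulative count and total pause time since JVM start.
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling these counters periodically (and alerting when the old-generation collector's count jumps) catches a FullGC spike long before users notice it.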

The month‑long tuning process demonstrated that careful parameter adjustment and thorough leak investigation can dramatically improve JVM performance and server stability.

Tags: Java, JVM, Garbage Collection, performance tuning, server optimization, FullGC
Written by Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
