Cutting Full GC from 40× Daily to Once Every 10 Days: JVM Tuning Insights

Over a month, we reduced Full GC occurrences on a cluster of 2‑core, 4 GB servers from roughly 40 times per day to once every ten days and halved Young GC duration. The gains came from adjusting heap parameters, fixing memory leaks, and tuning metaspace, and they improved server throughput and stability.

Java Interview Crash Guide

Over a little more than a month, we tuned a JVM cluster (2 CPU, 4 GB RAM, four servers), lowering Full GC frequency from about 40 times per day to roughly once every ten days and cutting Young GC time by more than half.

Initial Situation

The servers ran Full GC more than 40 times per day on average, and the resulting stalls caused automatic restarts, a sign of severe instability.

Key JVM startup parameters were:

-Xms1000M -Xmx1800M -Xmn350M -Xss300K -XX:+DisableExplicitGC -XX:SurvivorRatio=4 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:LargePageSizeInBytes=128M -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC

-Xmx1800M: maximum heap size.

-Xms1000M: initial heap size (setting it equal to -Xmx avoids heap resizing at runtime).

-Xmn350M: young generation size (a common recommendation is 3/8 of the total heap, which would be about 675 MB for an 1800 MB heap).

-Xss300K: thread stack size.
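
Before comparing servers, it can help to confirm which options each JVM was actually started with. The following is a minimal sketch using the standard java.lang.management API; the class name JvmFlags is made up for illustration and is not from the original setup.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical name): lists the -X / -XX options of the running JVM.
final class JvmFlags {
    // Returns the startup arguments that begin with "-X", in the order they were passed.
    static List<String> startupOptions() {
        List<String> options = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-X")) {
                options.add(arg);
            }
        }
        return options;
    }

    public static void main(String[] args) {
        for (String option : startupOptions()) {
            System.out.println(option);
        }
        // maxMemory() is the runtime's own view of the heap ceiling and may be
        // somewhat lower than -Xmx because one survivor space is not counted.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap reported by the runtime: " + maxHeapMb + " MB");
    }
}
```

Running this inside each server's JVM shows whether a parameter change really reached that process.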

First Optimization

We increased the young generation and aligned initial and maximum heap sizes:

-Xmn350M -> -Xmn800M
-XX:SurvivorRatio=4 -> -XX:SurvivorRatio=8
-Xms1000M -> -Xms1800M

After five days on two servers, Young GC frequency had dropped by more than half and Young GC time had decreased by 400 s. Full GC frequency, however, rose dramatically, so the first attempt failed.
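
The arithmetic behind this change can be sketched as follows. -XX:SurvivorRatio is the ratio of eden to one survivor space, so the young generation splits into ratio + 2 parts, and with -Xmn fixed the old generation is whatever heap is left. The class and method names are made up for illustration, and real sizes differ slightly because the JVM aligns each region.

```java
// Rough generation-size arithmetic for the two configurations in this article.
final class HeapLayout {
    // Eden takes `ratio` of the (ratio + 2) equal parts of the young generation.
    static double edenMb(double youngMb, int survivorRatio) {
        return youngMb * survivorRatio / (survivorRatio + 2);
    }

    // Each of the two survivor spaces takes one of those parts.
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    // With -Xmn fixed, the old generation is the rest of the heap.
    static double oldMb(double heapMb, double youngMb) {
        return heapMb - youngMb;
    }

    // Old-generation occupancy at which CMS starts a concurrent cycle.
    static double cmsTriggerMb(double oldMb, int occupancyPercent) {
        return oldMb * occupancyPercent / 100.0;
    }

    static void print(String label, double heapMb, double youngMb, int ratio, int cmsPercent) {
        double old = oldMb(heapMb, youngMb);
        System.out.printf("%s: eden=%.0f MB, survivor=2 x %.0f MB, old=%.0f MB, CMS starts at ~%.0f MB%n",
                label, edenMb(youngMb, ratio), survivorMb(youngMb, ratio), old,
                cmsTriggerMb(old, cmsPercent));
    }

    public static void main(String[] args) {
        print("original", 1800, 350, 4, 70);      // -Xmx1800M -Xmn350M -XX:SurvivorRatio=4
        print("first attempt", 1800, 800, 8, 70); // -Xmx1800M -Xmn800M -XX:SurvivorRatio=8
    }
}
```

Under these assumptions the first attempt grows eden from about 233 MB to 640 MB but shrinks the old generation from 1450 MB to 1000 MB, so the 70% CMS trigger moves from roughly 1015 MB down to 700 MB. That is one plausible reason Full GC became more frequent while leaked objects were still filling the old generation, although the source does not state the cause.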

Second Optimization – Memory Leak Investigation

A large number of instances of object T (≈20 MB) were being retained. Each one was captured by an anonymous inner class listener that was registered on every call and never released after its timeout, so the instances could not be garbage collected.

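// Leaky version: every call registers a listener that is never removed.
public void doSmthing(final T t) {
    redis.addListener(new Listener() {
        @Override
        public void onTimeout() {
            // The anonymous class captures t, so t stays reachable
            // for as long as redis holds this listener.
            if (t.success()) {
                // execute operation
            }
        }
    });
}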

Removing the leak reduced the number of objects but did not fully resolve the issue; the servers still restarted.
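
The source does not show how the leak was removed. One common approach, sketched here under assumptions, is to have the listener deregister itself once its timeout has fired so that nothing keeps a reference to T. TimeoutRegistry is a hypothetical stand-in for the Redis client's listener API, and Payload is a placeholder for T; neither comes from the original code.

```java
import java.util.ArrayList;
import java.util.List;

final class LeakFreeService {
    // Hypothetical placeholder for the large object T from the article.
    static final class Payload {
        boolean success() { return true; }
    }

    private final TimeoutRegistry registry;
    int handled = 0; // counts how often the "execute operation" branch ran

    LeakFreeService(TimeoutRegistry registry) {
        this.registry = registry;
    }

    void doSmthing(final Payload t) {
        registry.addListener(new TimeoutRegistry.Listener() {
            @Override
            public void onTimeout() {
                try {
                    if (t.success()) {
                        handled++; // stands in for "execute operation"
                    }
                } finally {
                    // Deregister so the registry no longer references this listener
                    // and, through it, the captured payload.
                    registry.removeListener(this);
                }
            }
        });
    }

    public static void main(String[] args) {
        TimeoutRegistry registry = new TimeoutRegistry();
        LeakFreeService service = new LeakFreeService(registry);
        service.doSmthing(new Payload());
        System.out.println("registered before timeout: " + registry.size());
        registry.fireTimeouts();
        System.out.println("registered after timeout: " + registry.size());
    }
}

// Hypothetical stand-in for the listener API of the Redis client.
final class TimeoutRegistry {
    interface Listener {
        void onTimeout();
    }

    private final List<Listener> listeners = new ArrayList<>();

    void addListener(Listener listener) { listeners.add(listener); }

    void removeListener(Listener listener) { listeners.remove(listener); }

    int size() { return listeners.size(); }

    // Iterates over a copy so listeners can remove themselves while being fired.
    void fireTimeouts() {
        for (Listener listener : new ArrayList<>(listeners)) {
            listener.onTimeout();
        }
    }
}
```

If the real client offers no way to remove a listener, the alternative is to capture only the small values the callback needs rather than the whole object.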

Further Leak Detection

Heap dumps revealed tens of thousands of ByteArrayRow objects. We traced them to a database query that was missing its module condition and therefore fetched over 400 k rows, generating massive inbound traffic (≈83 MB/s) even though the servers carried no real load.

Fixing the query eliminated the abnormal traffic. Over the following three days, running with the original JVM parameters, Full GC occurred only five times.
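
A guard of the following kind can keep a missing filter from silently turning into a full-table read. This is only a sketch: the article does not show the real query, so the table name, column names, row cap, and the RowQuery class are all hypothetical, and the LIMIT clause assumes MySQL-style SQL.

```java
// Hypothetical query builder; table and column names are illustrative only.
final class RowQuery {
    // Assumed safety cap, not a value from the article.
    static final int MAX_ROWS = 1000;

    // Refuses to build the query when the module filter is missing, so a null
    // or blank argument can never produce an unfiltered query.
    static String forModule(String module) {
        if (module == null || module.trim().isEmpty()) {
            throw new IllegalArgumentException("module condition is required");
        }
        // The module value is bound later as a statement parameter;
        // LIMIT bounds the result even if the filter matches too much.
        return "SELECT id, payload FROM records WHERE module = ? LIMIT " + MAX_ROWS;
    }

    public static void main(String[] args) {
        System.out.println(forModule("billing"));
        try {
            forModule(" ");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at query-build time makes the mistake visible in tests instead of in heap dumps.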

Second Tuning Phase

With the leaks resolved, we turned to metaspace. It had grown to ~200 MB, far above the default initial threshold of about 21 MB, and each time usage crossed the current threshold a Full GC was triggered. The adjusted parameters for two servers (prod1, prod2) were:

-Xmn800M
-Xms1800M
-XX:MetaspaceSize=200M
-XX:CMSInitiatingOccupancyFraction=75

For the other two servers (prod3, prod4) we kept the original settings. After ten days, prod1 and prod2 showed significantly lower Full GC counts and Young GC frequency compared to prod3 and prod4, and overall throughput improved.
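
To pick a value for -XX:MetaspaceSize, it helps to know how much metaspace the application really uses once it has warmed up. A minimal sketch using the standard MemoryPoolMXBean API follows; it assumes a HotSpot JVM on Java 8 or later, where the pool is named "Metaspace", and the class name MetaspaceProbe is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Illustrative helper (hypothetical name): reads current metaspace usage.
final class MetaspaceProbe {
    // Returns used metaspace in bytes, or -1 if this JVM exposes no pool named "Metaspace".
    static long usedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1 : usage.getUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long used = usedBytes();
        if (used < 0) {
            System.out.println("no Metaspace pool found on this JVM");
        } else {
            System.out.println("metaspace used: " + used / (1024 * 1024) + " MB");
        }
    }
}
```

Sampling this value after the service has been under normal traffic for a while gives a baseline for the initial threshold, which is how a figure such as 200 MB could be justified.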

Summary

Full GC more than once per day indicates a serious problem.

When Full GC spikes, prioritize investigating memory leaks.

After fixing leaks, JVM tuning opportunities are limited; invest time wisely.

Persistent high CPU should be checked with the cloud provider after ruling out code issues.

Unexpected high inbound traffic may stem from database queries; verify query conditions.

Regularly monitor GC to detect issues early.
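
For the monitoring point, the JVM already exposes per-collector counters through the standard GarbageCollectorMXBean API. The sketch below takes a snapshot of them; the class name GcSnapshot is made up for illustration, and how the numbers are shipped to an alerting system is left open.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical name): snapshots GC counters of the running JVM.
final class GcSnapshot {
    // Maps collector name to {collection count, accumulated collection time in ms} since JVM start.
    static Map<String, long[]> take() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            stats.put(gc.getName(), new long[] { gc.getCollectionCount(), gc.getCollectionTime() });
        }
        return stats;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, long[]> entry : take().entrySet()) {
            System.out.println(entry.getKey() + ": count=" + entry.getValue()[0]
                    + ", time=" + entry.getValue()[1] + " ms");
        }
    }
}
```

With the collectors used in this article the names are typically ParNew and ConcurrentMarkSweep. Comparing two snapshots taken a day apart gives the per-day counts that the thresholds in this summary refer to.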

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
