
Java GC Optimization Case Study: Reducing GC Frequency and CPU Load

This article details a Java GC optimization performed on a nightly batch service, describing the original high GC and CPU load issues, the JVM parameter adjustments, memory re‑allocation, and the resulting significant reductions in GC frequency, CPU usage, and request latency.

Rare Earth Juejin Tech Community

Background

Service A runs a nightly batch job on Java 8 (ParNew + CMS collectors) on an 8-core, 16 GB machine. Frequent GC warnings and occasional CPU load spikes (>60%) were observed, prompting a focus on lowering CPU usage and GC frequency.

Configuration and Load

Java version: Java 8

GC collector: ParNew + CMS

Hardware: 8 CPU, 16 GB RAM, CentOS 6.8

Peak CPU load often exceeds 50% (historically >70% leads to rapid performance degradation)

Pre‑Optimization GC Situation

Young GC frequency reached 70 times/min with an average pause of 125 ms; Full GC occurred every 3 minutes with an average pause of 610 ms.
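As a sanity check (a back-of-envelope sketch, not from the original article), these figures imply the service spent roughly 15% of wall-clock time in stop-the-world pauses:

```java
// Back-of-envelope: stop-the-world (STW) time implied by the GC figures above.
public class PauseBudget {
    // Fraction of wall-clock time spent paused, given young GCs per minute,
    // average young pause (ms), Full GC interval (min), and full pause (ms).
    static double stwFraction(double youngPerMin, double youngMs,
                              double fullEveryMin, double fullMs) {
        double pausedMsPerMin = youngPerMin * youngMs + fullMs / fullEveryMin;
        return pausedMsPerMin / 60_000.0;
    }

    public static void main(String[] args) {
        // 70 young GCs/min at 125 ms each, one Full GC every 3 min at 610 ms
        double f = stwFraction(70, 125, 3, 610);
        System.out.printf("STW ≈ %.1f%% of wall time%n", f * 100); // ≈ 14.9%
    }
}
```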

GC Parameters and JVM Settings

-Xmx6g -Xms6g: heap size fixed at 6 GB

-XX:NewRatio=4: old generation is 4× the young generation (4.8 GB old, 1.2 GB young)

-XX:SurvivorRatio=8: Eden:Survivor:Survivor = 8:1:1 (Eden ≈0.96 GB, each Survivor ≈0.12 GB)

-XX:ParallelCMSThreads=4: CMS uses 4 threads for its concurrent phases

-XX:CMSInitiatingOccupancyFraction=72: trigger CMS when old-generation usage reaches 72%

-XX:+UseParNewGC: use ParNew for the young generation

-XX:+UseConcMarkSweepGC: use CMS for the old generation
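The generation sizes quoted in parentheses follow from the standard HotSpot sizing rules; a small sketch of the arithmetic (assuming NewRatio means old = N × young and SurvivorRatio splits the young generation as Eden:S0:S1 = S:1:1):

```java
// Sketch: generation sizes implied by -Xmx6g -XX:NewRatio=4 -XX:SurvivorRatio=8.
public class HeapSizing {
    // Returns {young, old, eden, survivor} in GB.
    static double[] sizes(double heapGb, int newRatio, int survivorRatio) {
        double young = heapGb / (newRatio + 1);        // 6 / 5 = 1.2 GB
        double old = heapGb - young;                   // 4.8 GB
        double survivor = young / (survivorRatio + 2); // 1.2 / 10 = 0.12 GB each
        double eden = young - 2 * survivor;            // 0.96 GB
        return new double[] { young, old, eden, survivor };
    }

    public static void main(String[] args) {
        double[] s = sizes(6.0, 4, 8);
        System.out.printf("young=%.2f old=%.2f eden=%.2f survivor=%.2f (GB)%n",
                s[0], s[1], s[2], s[3]);
    }
}
```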

Problem Analysis

2.1 Adding GC Logging Parameters

The existing GC logs lacked detail, so the following flags were added to capture more information:

-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintCommandLineFlags
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintReferenceGC

2.2 Premature Promotion

Log entries such as Desired survivor size 61054720 bytes, new threshold 2 (max 15) revealed that the JVM dynamically lowers the tenuring threshold when Survivor space fills, causing objects to be promoted to the old generation earlier than the default MaxTenuringThreshold of 15.

When the total size of objects of a certain age in Survivor exceeds half of Survivor space, objects of that age and older are promoted directly to the old generation, bypassing the fixed threshold.

This early promotion dramatically increased old‑generation growth.
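The logged value can be read back against the configuration. HotSpot computes the desired survivor size as the survivor capacity scaled by -XX:TargetSurvivorRatio (default 50); working backwards from the log line (a sketch, assuming the default ratio):

```java
// Sketch: working backwards from "Desired survivor size 61054720 bytes".
// HotSpot computes desired = survivorCapacity * TargetSurvivorRatio / 100,
// with TargetSurvivorRatio defaulting to 50 (assumed here).
public class DesiredSurvivor {
    static long impliedSurvivorBytes(long desiredBytes, int targetSurvivorRatio) {
        return desiredBytes * 100 / targetSurvivorRatio;
    }

    public static void main(String[] args) {
        long implied = impliedSurvivorBytes(61_054_720L, 50);
        System.out.printf("implied survivor capacity ≈ %.1f MB%n",
                implied / (1024.0 * 1024.0)); // ≈ 116.4 MB, close to the ≈120 MB configured
    }
}
```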

2.3 Rapid Old‑Generation Growth

Monitoring showed the old generation's growth tracking Survivor overflow, confirming premature promotion. At roughly 100 MB promoted per young GC and 15+ young GCs per minute, the old generation could gain about 1.5 GB per minute, leading to a Full GC within 2–3 minutes.
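A sketch of that arithmetic, using the article's figures (≈100 MB promoted per young GC, 15 young GCs/min, a 4.8 GB old generation, and the 72% CMS trigger):

```java
// Sketch: old-generation growth rate and time until the CMS trigger fires.
public class PromotionRate {
    static double minutesToCmsTrigger(double promotedMbPerGc, double gcPerMin,
                                      double oldGenMb, double triggerFraction) {
        double growthMbPerMin = promotedMbPerGc * gcPerMin; // 1500 MB/min
        return oldGenMb * triggerFraction / growthMbPerMin;
    }

    public static void main(String[] args) {
        double minutes = minutesToCmsTrigger(100, 15, 4.8 * 1024, 0.72);
        System.out.printf("CMS trigger reached in ≈ %.1f min%n", minutes); // ≈ 2.4
    }
}
```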

2.4 Insufficient Young‑Generation Memory

The original young generation (1.2 GB total, 120 MB Survivor) was too small, causing the observed promotions.

Optimization Measures

Memory Re‑allocation

Increase heap to 10 GB: -Xmx10g -Xms10g -Xmn6g

Keep SurvivorRatio=8 (Eden:Survivor:Survivor = 8:1:1)

Heap grew from 6 GB to 10 GB.

Young generation enlarged from 1.2 GB to 6 GB.

Eden increased from 0.96 GB to 4.8 GB.

Survivor space increased from 120 MB to 600 MB.
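The new layout follows from the same sizing rules, noting that an explicit -Xmn fixes the young generation directly and overrides -XX:NewRatio (a sketch of the arithmetic):

```java
// Sketch: sizes implied by -Xmx10g -Xms10g -Xmn6g -XX:SurvivorRatio=8.
// An explicit -Xmn overrides -XX:NewRatio, fixing the young generation at 6 GB.
public class NewHeapSizing {
    // Returns {old, eden, survivor} in GB.
    static double[] sizes(double heapGb, double youngGb, int survivorRatio) {
        double old = heapGb - youngGb;                   // 4 GB
        double survivor = youngGb / (survivorRatio + 2); // 0.6 GB ≈ 600 MB each
        double eden = youngGb - 2 * survivor;            // 4.8 GB
        return new double[] { old, eden, survivor };
    }

    public static void main(String[] args) {
        double[] s = sizes(10.0, 6.0, 8);
        System.out.printf("old=%.1f eden=%.1f survivor=%.1f (GB)%n", s[0], s[1], s[2]);
    }
}
```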

Optimization Results

3.1 GC Frequency Reduction

Young GC dropped from 70 times/min to 12 times/min (≈83% reduction).

Full GC decreased from once every 3 minutes to roughly once per day.

Average pause times for both young and full GC remained unchanged.

3.2 CPU Load Decrease

Peak CPU load fell from >50% to <30%, a 40% reduction during peaks.

Daily average CPU load dropped from 29% to 20% (≈32% reduction).

3.3 Core Interface Performance Improvement

Interface A: 100 TPS peak, 999‑percentile latency reduced from 200 ms to 150 ms (≈25% improvement).

Interface B: 250 TPS peak, 999‑percentile latency reduced from 190 ms to 120 ms (≈37%); 9999‑percentile from 450 ms to 150 ms (≈67%).

Low‑peak latency for Interface B dropped from 80 ms to 10 ms (≈88% reduction).

Further minor JVM tweaks yielded negligible additional gains.

Conclusion

Key takeaways from this GC optimization:

Analyze GC logs to identify bottlenecks.

Adjust JVM memory settings to provide sufficient young‑generation space.

Effective GC tuning reduces request latency and improves service availability.

Lower GC activity translates to significant CPU load reductions and better hardware utilization.

When facing high latency or CPU load, consider examining GC behavior for potential optimization opportunities.

Tags: backend, Java, performance optimization, Garbage Collection, JVM Tuning, CPU Load