Java Performance Tuning in Practice (Part 6): End-to-End Case Study from Full GC to 200 ms Response

The article walks through a real e‑commerce order service where afternoon spikes caused response times to jump to 2‑3 seconds, Full GC to run every few minutes, and Metaspace to hit 99.9%, then demonstrates step‑by‑step analysis using monitoring, GC logs, jstat, jmap, and heap dumps, fixes a custom class‑loader leak, adjusts JVM flags, and achieves sub‑200 ms latency with no Full GC.

Coder Trainee
Coder Trainee
Coder Trainee
Java Performance Tuning in Practice (Part 6): End-to-End Case Study from Full GC to 200 ms Response

Case Background

An e‑commerce order service (four instances, each 4 CPU × 8 GB) sits behind a gateway and uses MySQL and Redis.

Problem Symptoms

Afternoon peak (3‑5 pm) latency rises from ~50 ms to 2‑3 seconds.

Some requests time out, increasing user complaints.

Monitoring alerts show CPU > 80 % and frequent Full GC.

Step 1 – Monitoring Signals

Key metrics compared with normal thresholds:

P99 response time: normal < 100 ms, observed 2.3 s.

GC pause: normal < 50 ms, observed 800 ms‑2 s.

Full GC frequency: normal < 1 time/day, observed 5‑8 times/hour.

Old‑generation usage: normal < 70 %, observed > 95 %.

Step 2 – GC Log Inspection

# 查看 GC 日志
cat gc.log | grep "Full GC"
2025-06-15T14:30:01.123+0800: 1234.567: [Full GC (Metadata GC Threshold)
  2025-06-15T14:30:01.123+0800: 1234.567: [CMS: 8192K->8192K(20480K), 0.8567 secs]
  15360K->8192K(26368K), [Metaspace: 1024K->1024K(2048K)], 0.8570 secs]
[Times: user=0.85 sys=0.01, real=0.86 secs]

Key observations:

Full GC triggered by Metadata GC Threshold (Metaspace shortage).

Each Full GC takes 0.85‑0.91 s.

Heap memory is reclaimed normally, but Metaspace stays at 1024 KB.

Step 3 – jstat Verification

jstat -gcutil 12345 1000

Sample output shows Metaspace (M) usage at 99.9 % while the old generation (O) is only ~12 %.

Step 4 – Class‑Loader Leak Detection

jmap -clstats 12345

Output snippet:

class_loader    classes   bytes   parent_loader   alive?   type
0x00000007f3a4c00 1234 2,456,789 ... true  sun.misc.Launcher$AppClassLoader
...

Many class_loader entries are marked alive? false, indicating they were not reclaimed; this is a classic class‑loader leak.

Root Cause

The application creates a custom class loader to load configuration classes dynamically, but the loader is never closed after the task finishes, preventing the loaded classes from being garbage‑collected and causing Metaspace to fill up, which in turn triggers frequent Full GC.

Step 5 – Heap Dump Confirmation

# Export heap dump
jmap -dump:format=b,file=heap.hprof 12345

Problem Suspect 1:
The class "com.example.DynamicConfig" is loaded by
"com.example.CustomClassLoader@0x7f3a4c00" which is not GC‑eligible.
It occupies 245 MB (78.5 %) of the heap.

The heap analysis (MAT) confirms the massive number of DynamicConfig instances held by the non‑GC‑eligible custom class loader.

Solution

Code Fix

// ❌ Problematic code
public class ConfigLoader {
    public void loadConfig(String path) {
        URLClassLoader loader = new URLClassLoader(new URL[]{new URL(path)});
        // load config classes …
        // loader never closed
    }
}

// ✅ Fixed code
public class ConfigLoader {
    public void loadConfig(String path) {
        try (URLClassLoader loader = new URLClassLoader(new URL[]{new URL(path)})) {
            // load config classes …
        } catch (IOException e) {
            // handle exception
        }
    }
}

JVM Parameter Adjustments

# Before tuning
-Xms4g -Xmx4g -XX:+UseConcMarkSweepGC

# After tuning
-Xms8g -Xmx8g               # increase heap size
-XX:+UseG1GC                # switch to G1
-XX:MaxGCPauseMillis=100    # target pause ≤ 100 ms
-XX:MetaspaceSize=256m      # initial Metaspace
-XX:MaxMetaspaceSize=512m    # max Metaspace with headroom
-XX:ConcGCThreads=4        # 4 concurrent GC threads

Resulting Improvements

Full GC frequency: 5‑8 times/hour → 0 times.

P99 response time: 2.3 s → 186 ms (≈ 92 % reduction).

CPU usage: > 80 % → 45‑55 %.

Metaspace usage: 99.9 % → 45‑60 %.

Full Tuning Process Recap

1. Monitoring alert → problem discovered
2. GC log → frequent Full GC identified
3. jstat → Metaspace near full
4. jmap -clstats → class‑loader leak detected
5. MAT heap dump → pinpoint custom class loader
6. Fix code + adjust JVM flags
7. Gradual rollout → full production deployment

The series concludes with a summary of the six parts covering JVM memory, CPU spikes, memory leaks, thread‑pool tuning, and GC optimization.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

JavaJVMPerformance TuningMetaspaceGCClassloader Leak
Coder Trainee
Written by

Coder Trainee

Experienced in Java and Python, we share and learn together. For submissions or collaborations, DM us.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.