Diagnosing Full GC and Memory Leak Issues in a Java Backend Application
This article walks through a step-by-step investigation of frequent full GC events and high CPU usage in a Java backend service: memory analysis with SGM, heap-dump inspection, identification of large Base64-encoded image strings pinned in memory by a static encryption utility, and remediation through a heap-size increase and custom encryption handling.
Introduction
The team received an intelligent alert on WeChat indicating that a financial application was experiencing frequent full GC events. The initial observation of SGM and online machine metrics prompted a deeper investigation into memory and CPU usage.
Alert Details
Full GC alerts were treated as critical warnings, prompting immediate analysis of the JVM’s old and permanent generation spaces.
Investigation Steps
1. top was used to identify the Java process consuming high CPU (PID 7975).
2. The busiest thread within that process was located with top -H -p [process_id], and its decimal thread ID was converted to hexadecimal with printf "%x\n" [thread_id].
3. The corresponding thread stack was examined via jstack [process_id] | grep -A 10 [hex_thread_id], revealing that a CMS garbage-collection thread was consuming the CPU through continuous memory-reclamation attempts.
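The three steps above can be sketched as a small shell script. The PID (7975) is the example from the article and the thread ID (7998) is a stand-in; substitute the values top reports on your own host.

```shell
#!/bin/sh
# Triage sketch: from a hot process to the stack of its hottest thread.
# PID comes from `top`; TID comes from `top -H -p $PID`.

PID=7975        # process reported by top as the CPU hog (article's example)
TID=7998        # hypothetical hottest thread inside that process

# jstack prints thread IDs as hex in the nid= field, so convert first.
HEX_TID=$(printf "%x" "$TID")
echo "nid to search for: 0x$HEX_TID"

# Against a live process, the stack of that thread is then:
#   jstack "$PID" | grep -A 10 "$HEX_TID"
# In this investigation, the matching thread was a CMS GC thread.
```

The hex conversion matters because grepping jstack output for the decimal thread ID will silently match nothing.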
Memory analysis in SGM showed that the old-generation space was not exhausted, suggesting a cause for the full GCs beyond simple old-gen overflow. A heap dump was exported and examined with IBM Heap Analyzer, which pinpointed the EnterRealNameApplyUploadImgReqModel class holding large Base64-encoded image strings.
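Exporting the heap dump for offline analysis can be sketched as below; the dump path is illustrative, and the command is only echoed here since it needs a live JVM to run.

```shell
#!/bin/sh
# Sketch: capturing a heap dump from the suspect process for offline
# analysis in IBM Heap Analyzer (or Eclipse MAT). PID 7975 is the
# article's example process.

PID=7975

# `live` triggers a full GC first so the dump contains only reachable
# objects, which makes genuine leaks stand out from collectible garbage.
CMD="jmap -dump:live,format=b,file=/tmp/heap-$PID.hprof $PID"
echo "$CMD"

# After running this on the real host, download the .hprof file and open
# it in the analyzer; the largest retained objects pointed at
# EnterRealNameApplyUploadImgReqModel and its Base64 image strings.
```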
Findings
The root cause was identified as oversized image uploads (up to 3 MB) combined with a static encryption/decryption utility that retained references to the large strings, preventing garbage collection and causing old‑gen growth and frequent full GC.
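The retention pattern described above can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the application's real code: the class name, method, and cache field are invented to show how a static utility can pin multi-megabyte strings for the lifetime of the JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the leak pattern: a static encryption utility
// that keeps a reference to every payload it has processed.
public class EncryptUtil {

    // Static field: everything added here stays reachable from a GC root
    // forever, so 3 MB Base64 strings accumulate in the old generation
    // and can never be collected.
    private static final List<String> PROCESSED = new ArrayList<>();

    public static String encrypt(String base64Payload) {
        PROCESSED.add(base64Payload); // the leak: unbounded retention
        // Stand-in for the real cipher work.
        return "enc:" + base64Payload.length();
    }
}
```

With uploads of up to 3 MB per call, even a modest request rate fills a 2 GB heap quickly, which matches the observed old-gen growth and repeated full GCs.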
Remediation
• Increased the JVM heap from 2 GB to 4 GB on the 8 GB production server.
• Modified the image-upload API to bypass the generic encryption flow, handling Base64 data without encryption and limiting image size to 2 MB.
• After the changes, VisualVM monitoring showed a significant reduction in old-gen growth rate and GC frequency.
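The 2 MB upload guard can be sketched as follows. The class and method names are illustrative; the key idea is that the decoded size of a Base64 string is about three quarters of its character count, so the request can be rejected before any decoding or encryption work allocates large objects.

```java
import java.util.Base64;

// Sketch of a pre-decode size guard for Base64 image uploads,
// assuming a 2 MB limit as in the remediation above.
public class ImageUploadGuard {

    private static final long MAX_IMAGE_BYTES = 2L * 1024 * 1024;

    // Estimate the decoded byte count from the Base64 text alone:
    // every 4 Base64 characters encode 3 bytes, minus padding.
    public static boolean withinLimit(String base64Image) {
        int padding = 0;
        if (base64Image.endsWith("==")) padding = 2;
        else if (base64Image.endsWith("=")) padding = 1;
        long decoded = (base64Image.length() / 4L) * 3L - padding;
        return decoded <= MAX_IMAGE_BYTES;
    }

    public static byte[] decodeIfAllowed(String base64Image) {
        if (!withinLimit(base64Image)) {
            throw new IllegalArgumentException("image exceeds 2 MB limit");
        }
        return Base64.getDecoder().decode(base64Image);
    }
}
```

Checking the length before decoding avoids briefly holding both the Base64 string and its byte array for oversized payloads, which is exactly the allocation pressure the fix set out to remove.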
CPU Analysis
Even after the memory leak was fixed, the server's CPU usage remained above 10%. Tracing the high-CPU thread again confirmed it was the CMS GC thread, which kept running because the heap was still under pressure until it was resized.
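One way to confirm that the lingering CPU load tracks heap pressure is to sample GC statistics over time. The command below is only echoed, since it needs the live PID; the column names are standard jstat output.

```shell
#!/bin/sh
# Sketch: correlating CPU load with GC activity. With a live process,
# jstat samples old-gen occupancy (the O column, in percent) and the
# full-GC count (FGC) once per second:
PID=7975
echo "jstat -gcutil $PID 1000"
# A steadily climbing O column with FGC incrementing every few samples
# matches a CMS thread that stays busy because the heap is too small.
```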
Conclusion
JVM issues such as full GC and high CPU often stem from memory leaks, large object allocations, and inefficient encryption utilities. A systematic approach—checking JVM and system metrics, analyzing heap dumps, identifying problematic objects, and adjusting heap size or code paths—helps resolve these performance problems.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.