Investigating a Sudden CPU Spike Caused by Excessive GC in a Containerized Java Application
The article details a production incident where a containerized Java service experienced a CPU surge due to frequent young and full garbage collections, describing step‑by‑step diagnostics using Linux tools, jstack analysis, and code fixes that ultimately resolved the issue.
On a Thursday afternoon, an alert reported that a container's CPU usage had jumped above 90%. JVM monitoring showed one pod performing 61 young GCs and one full GC within two hours, which prompted an urgent investigation.
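The GC counts cited above are the kind of figure a JVM exposes through its standard management beans. The following is a minimal sketch, not the monitoring stack used in the incident, of how such counts could be sampled and compared over a window; the class name `GcCountSnapshot` and its helper methods are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: samples cumulative GC counts and diffs two samples.
class GcCountSnapshot {

    // Collector name -> cumulative collection count since JVM start.
    // getCollectionCount() may return -1 when a collector does not report it.
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    // Collections that happened between two snapshots, per collector.
    static Map<String, Long> delta(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> diff = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long previous = before.getOrDefault(e.getKey(), 0L);
            diff.put(e.getKey(), e.getValue() - previous);
        }
        return diff;
    }

    public static void main(String[] args) {
        Map<String, Long> before = collectionCounts();
        // Allocate short-lived arrays so the young generation has work to do.
        long checksum = 0;
        for (int i = 0; i < 500_000; i++) {
            byte[] junk = new byte[1024];
            checksum += junk.length;
        }
        Map<String, Long> after = collectionCounts();
        System.out.println("allocated bytes: " + checksum);
        System.out.println("collections in window: " + delta(before, after));
    }
}
```

Collector names vary with the garbage collector in use, so a dashboard built on this would need to map names to "young" and "full" for the specific JVM configuration.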
The investigator entered the affected pod and ran top to view process resource usage, identified the Java process (PID 1) with unusually high CPU, and then executed top -H -p pid to locate the thread (tid) consuming the most CPU.
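When shell access to the pod is not available, a rough in-process analogue of top -H can be built on the standard ThreadMXBean. The sketch below is an alternative technique, not what the investigator ran, and the class name `BusiestThreadSketch` is hypothetical. Two caveats apply: it reports Java thread IDs rather than the OS thread IDs shown by top -H, and the CPU time is cumulative since thread start rather than a current rate.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: finds the Java thread with the most accumulated CPU time.
class BusiestThreadSketch {

    // Java thread id -> cumulative CPU time in nanoseconds for live threads.
    static Map<Long, Long> cpuNanosByThread(ThreadMXBean threads) {
        Map<Long, Long> result = new HashMap<>();
        if (!threads.isThreadCpuTimeSupported()) {
            return result; // this JVM cannot measure per-thread CPU time
        }
        for (long id : threads.getAllThreadIds()) {
            long nanos = threads.getThreadCpuTime(id); // -1 if dead or disabled
            if (nanos >= 0) {
                result.put(id, nanos);
            }
        }
        return result;
    }

    // Thread id with the largest CPU time, or -1 when the map is empty.
    static long busiest(Map<Long, Long> cpuNanos) {
        long bestId = -1;
        long bestNanos = -1;
        for (Map.Entry<Long, Long> e : cpuNanos.entrySet()) {
            if (e.getValue() > bestNanos) {
                bestNanos = e.getValue();
                bestId = e.getKey();
            }
        }
        return bestId;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long id = busiest(cpuNanosByThread(threads));
        if (id > 0) {
            ThreadInfo info = threads.getThreadInfo(id, 20); // up to 20 frames
            if (info != null) {
                System.out.println("busiest thread: " + info.getThreadName());
                for (StackTraceElement frame : info.getStackTrace()) {
                    System.out.println("    at " + frame);
                }
            }
        }
    }
}
```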
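After noting the thread ID (e.g., 746), the investigator converted it to hexadecimal with printf "%x\n" 746, which yields 2ea, and extracted the stack trace for that thread using jstack pid | grep 2ea > gcInfo.stack (grep on its own keeps only the matching header line, so a context option such as grep -A is needed to keep the frames below it). Because the file was large, it was served via a temporary Python HTTP server and downloaded with curl -o gcInfo.stack http://<ip>/gcInfo.stack.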
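The conversion and lookup step can also be done in code once a full dump has been saved. The sketch below assumes the header format of older JDKs, where each thread entry starts with a quoted name and carries a field like nid=0x2ea; newer JDK releases may print this field differently. The class `ThreadDumpFilter` and the sample dump lines, including the frame names, are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: pulls one thread's entry out of a saved jstack dump.
class ThreadDumpFilter {

    // Made-up dump excerpt used only for illustration.
    static final List<String> SAMPLE = Arrays.asList(
        "\"worker-1\" #12 prio=5 os_prio=0 tid=0x00007f0000000001 nid=0x2eab runnable",
        "\tat com.example.Other.run(Other.java:10)",
        "",
        "\"worker-2\" #13 prio=5 os_prio=0 tid=0x00007f0000000002 nid=0x2ea runnable",
        "   java.lang.Thread.State: RUNNABLE",
        "\tat com.example.Export.build(Export.java:42)",
        "",
        "\"worker-3\" #14 prio=5 os_prio=0 tid=0x00007f0000000003 nid=0x2f0 waiting"
    );

    // Decimal OS thread id (as shown by top -H) -> hex form used in the nid field.
    static String toNid(long tid) {
        return "0x" + Long.toHexString(tid);
    }

    // Lines of the first thread entry whose header carries the matching nid,
    // from the header line up to (not including) the next blank line.
    static List<String> stackFor(List<String> dumpLines, long tid) {
        // Trailing space keeps 0x2ea from matching a longer id such as 0x2eab.
        String marker = "nid=" + toNid(tid) + " ";
        List<String> block = new ArrayList<>();
        boolean inBlock = false;
        for (String line : dumpLines) {
            if (!inBlock && line.startsWith("\"") && (line + " ").contains(marker)) {
                inBlock = true;
            } else if (inBlock && line.trim().isEmpty()) {
                break;
            }
            if (inBlock) {
                block.add(line);
            }
        }
        return block;
    }

    public static void main(String[] args) {
        System.out.println(toNid(746));
        for (String line : stackFor(SAMPLE, 746)) {
            System.out.println(line);
        }
    }
}
```

Matching on the whole nid field rather than the bare hex digits avoids false hits on memory addresses or other thread IDs that happen to contain the same digits.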
Analyzing the stack revealed that the problem originated in an Excel export feature that reused a shared list query interface; the interface returned at most 200 records per page, but the export attempted to process tens of thousands of records, causing nested loops and massive list allocations that triggered repeated garbage collections.
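The article does not show the corrected code, so the following is only a sketch of one common way to export a large result set through an interface capped at 200 records per page: fetch one page at a time and hand each row to the writer immediately, so that no list larger than a single page is ever held. The names `PagedExportSketch`, `PageQuery`, and `fakeSource` are hypothetical stand-ins, not the service's real interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a page-by-page export that avoids large intermediate lists.
class PagedExportSketch {

    static final int PAGE_SIZE = 200; // page cap mentioned in the article

    // Stand-in for the shared list query interface.
    interface PageQuery<T> {
        List<T> fetch(int pageIndex, int pageSize);
    }

    // Streams every row to rowSink one page at a time and returns the row count.
    // Only the current page is reachable, so earlier pages can be collected cheaply.
    static <T> int export(PageQuery<T> query, Consumer<? super T> rowSink) {
        int total = 0;
        for (int page = 0; ; page++) {
            List<T> rows = query.fetch(page, PAGE_SIZE);
            for (T row : rows) {
                rowSink.accept(row); // e.g. write one spreadsheet row
            }
            total += rows.size();
            if (rows.size() < PAGE_SIZE) {
                break; // a short (or empty) page means the data is exhausted
            }
        }
        return total;
    }

    // In-memory fake data source holding the integers 0..totalRows-1.
    static PageQuery<Integer> fakeSource(int totalRows) {
        return (pageIndex, pageSize) -> {
            int from = Math.min(pageIndex * pageSize, totalRows);
            int to = Math.min(from + pageSize, totalRows);
            List<Integer> rows = new ArrayList<>(to - from);
            for (int i = from; i < to; i++) {
                rows.add(i);
            }
            return rows;
        };
    }

    public static void main(String[] args) {
        int[] written = {0};
        int total = export(fakeSource(30_000), row -> written[0]++);
        System.out.println("rows exported: " + total + ", rows written: " + written[0]);
    }
}
```

For a real Excel export, the row sink would typically be a streaming writer, so that the workbook itself does not become the next large in-memory structure.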
The code was corrected to avoid the shared list misuse, the fix was deployed urgently, and the CPU and GC metrics returned to normal, confirming the resolution.
The post concludes by emphasizing a calm, layered troubleshooting approach for production issues and recommends tools like Arthas to simplify JVM diagnostics.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
