
Diagnosing Full GC and High CPU Issues in Java Services with Arthas, async‑profiler, VisualVM and GCEasy

This article demonstrates how to quickly locate and resolve frequent full GC and CPU spikes in Java backend services by combining Arthas‑integrated async‑profiler flame graphs with VisualVM, GCEasy analysis, and practical step‑by‑step deployment procedures.

JD Retail Technology

Mar 18, 2024


In daily operations, Java backend services often run into full GC, CPU spikes, and memory pressure, and resolving them quickly depends on pinpointing the problematic code. This article walks through a practical workflow that uses Alibaba's Arthas (which integrates async‑profiler for flame graphs), VisualVM, and GCEasy to diagnose such issues.

1. Background

In an order‑domain task system, both master and slave machines suffered frequent full GC and intermittent CPU spikes, with young GC occurring about ten times per minute and thread counts rising to ~1500. The application uses the CMS collector on 4‑CPU, 8 GB machines with 4 GB allocated to the heap; heap and non‑heap memory usage otherwise appeared normal.
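The symptoms described above (GC counts and live thread count) can also be sampled from inside the JVM. The following is a minimal sketch using the standard java.lang.management API; the class name GcSnapshot and its helper methods are hypothetical and not taken from the original service. Under CMS, the collectors typically report as ParNew and ConcurrentMarkSweep.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcSnapshot {

    /** Sums collection counts across all collectors; undefined (-1) values are skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Current number of live threads, daemon and non-daemon. */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        // One line per collector: cumulative count and cumulative time since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTimeMillis());
        }
        System.out.println("live threads: " + liveThreadCount());
    }
}
```

The values are cumulative since JVM start, so sampling twice a minute apart and taking the difference gives a per‑minute rate comparable to the figures quoted above.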

Monitoring charts showed the interval between GCs shrinking from about 20 minutes to 3‑5 minutes, and CPU usage exceeding 75% on some nodes.

2. Tool Selection and Practice

Four tools were evaluated: Arthas + async‑profiler flame graphs, VisualVM (cross‑time heap dump comparison), GCEasy, and traditional GC logs. Each was applied to the problematic environment.

2.1 Arthas analysis

Flame‑graph inspection revealed that deserialization of product‑domain channel configuration objects on the master node consumed significant CPU.
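As a rough complement to the flame graph, the threads that have consumed the most CPU can be listed in‑process with ThreadMXBean. This is only a sketch under stated assumptions: the class name BusyThreads is hypothetical, and the figures are CPU time accumulated since each thread started, not a sampled rate, so they are coarser than what a profiler reports.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class BusyThreads {

    /** Returns up to n (thread name, CPU millis) pairs, highest CPU time first. */
    static List<Map.Entry<String, Long>> top(int n) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Map.Entry<String, Long>> rows = new ArrayList<>();
        // CPU time measurement is optional; return an empty list if unavailable.
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return rows;
        }
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id);   // -1 if the thread has died
            ThreadInfo info = mx.getThreadInfo(id);    // null if the thread has died
            if (cpuNanos < 0 || info == null) {
                continue;
            }
            rows.add(new AbstractMap.SimpleEntry<>(
                    info.getThreadName(), Long.valueOf(cpuNanos / 1_000_000)));
        }
        rows.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return rows.size() > n ? new ArrayList<>(rows.subList(0, n)) : rows;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> row : top(5)) {
            System.out.println(row.getKey() + " " + row.getValue() + " ms");
        }
    }
}
```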

2.2 VisualVM analysis

Two heap dump files, taken about a day apart, were compared. The largest memory consumers were ES query results; the deserialized objects were not the biggest but were still identifiable. The two views differ because Arthas focuses on CPU and thread usage, while heap dumps reflect memory. VisualVM was launched from the JDK's bin directory:

cd /Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home/bin

jvisualvm
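The article does not say how the heap dumps were captured. One option, sketched here as an assumption, is to trigger a dump from inside a HotSpot JVM through HotSpotDiagnosticMXBean and then open the resulting .hprof file in VisualVM. The class name HeapDumper and the file naming are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {

    /**
     * Writes a heap dump to the given path (must not already exist; use a .hprof suffix).
     * liveOnly = true dumps only reachable objects, which keeps the file smaller.
     */
    static File dump(String path, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(path, liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new File(path);
    }

    public static void main(String[] args) {
        // A timestamp in the name makes dumps taken at different times easy to compare.
        File f = dump("heap-" + System.currentTimeMillis() + ".hprof", true);
        System.out.println("wrote " + f.getAbsolutePath() + " (" + f.length() + " bytes)");
    }
}
```

Taking a dump pauses the application while the heap is written, so on a production node it is usually done on one instance at a time.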

VisualVM's configuration file (visualvm.conf) was adjusted to increase its memory allocation.

2.3 GCEasy analysis

The JVM was started with -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:./gc.log to capture GC logs. Comparing the logs in GCEasy's online analyzer showed no memory leak but did not pinpoint the cause of the GCs.

2.4 async‑profiler flame‑graph

Flame‑graph results highlighted two major hotspots: the reverse‑lookup task on the master/slave node and the ES query deserialization, matching the Arthas findings. Downloading the memory flame‑graph confirmed these hotspots and allowed cross‑validation with the VisualVM dumps.

3. Fix and Release

Based on the analysis, the code was modified to address the identified bottlenecks, and the updated version was deployed. Post‑deployment monitoring showed stable system behavior.

4. Usage Steps

1) Request a bastion host with root access.
2) Download the latest Arthas boot JAR: wget https://alibaba.github.io/arthas/arthas-boot.jar
3) Start Arthas with admin rights and attach it to the target Java process: java -jar arthas-boot.jar
4) Use the dashboard command (q to quit) to view CPU usage, and the thread command to view thread information.
5) Start profiling: profiler start begins sampling.
6) Generate the flame‑graph: profiler stop --format html, then download the result.
7) Explore multiple dimensions (lock, allocation, CPU) via the generated HTML flame‑graph.
8) Additional features include decompiling JAR code and measuring method execution times.

The article concludes that integrating Arthas with async‑profiler provides the most intuitive and efficient way to locate performance problems, and encourages colleagues to explore these tools for similar scenarios.
