Mastering Java Service Performance: Diagnose CPU, Memory, IO & Network Issues
This guide walks you through systematic troubleshooting of Java service performance problems—covering CPU spikes, memory leaks, GC pauses, disk I/O anomalies, and network bottlenecks—by explaining key metrics, command‑line tools, visual profilers, and practical code examples.
CPU Usage Diagnosis
Monitor CPU load with top -H -p <pid> to list threads. Identify the thread with the highest %CPU, note its Linux TID, and convert it to hexadecimal using printf '%x\n' <tid>. Search the Java stack trace with jstack <pid> | grep 'nid=0x<hex>' to locate the corresponding Java thread and examine its stack for the hot code path. Visual profilers such as JProfiler can display per‑thread CPU consumption; sustained high load usually points to application‑level problems such as tight loops, excessive GC, or thread contention.
On macOS the native thread IDs are shown directly by jstack, so the conversion step can be skipped.
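The same lookup can be sketched from inside the JVM. The example below is a minimal illustration, not a replacement for top and jstack: the class name HotThreads and both helper methods are hypothetical. toNid formats a decimal Linux TID the way jstack prints it, and topCpuThreads ranks live threads by cumulative CPU time using the standard ThreadMXBean. Note that ThreadMXBean works with Java thread IDs, which are not the OS TIDs shown by top.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class HotThreads {
    /** Formats a decimal Linux TID (as shown by top -H) the way jstack prints it. */
    static String toNid(long tid) {
        return "nid=0x" + Long.toHexString(tid);
    }

    /** Returns up to n lines of "cpuMillis ms  threadName", busiest thread first. */
    static List<String> topCpuThreads(int n) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<long[]> samples = new ArrayList<>(); // each entry: {javaThreadId, cpuNanos}
        if (mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled()) {
            for (long id : mx.getAllThreadIds()) {
                long nanos = mx.getThreadCpuTime(id); // -1 if the thread has died
                if (nanos >= 0) {
                    samples.add(new long[] {id, nanos});
                }
            }
        }
        samples.sort(Comparator.comparingLong((long[] s) -> s[1]).reversed());
        List<String> lines = new ArrayList<>();
        for (long[] s : samples.subList(0, Math.min(n, samples.size()))) {
            ThreadInfo info = mx.getThreadInfo(s[0]);
            if (info != null) {
                lines.add((s[1] / 1_000_000) + " ms  " + info.getThreadName());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(toNid(4660)); // prints nid=0x1234
        topCpuThreads(5).forEach(System.out::println);
    }
}
```

The CPU times are cumulative since thread start, so a sustained hot thread is easier to spot by sampling twice and comparing the difference.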
Memory Tuning and OOM Diagnosis
Common memory issues are OutOfMemoryError, memory leaks, and GC-induced latency. The old-to-young generation ratio can be tuned with -XX:NewRatio=<ratio>; the default value of 2 makes the old generation twice the size of the young generation (young:old = 1:2).
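As a rough worked example of what the ratio means, the sketch below splits a heap according to NewRatio. The class name NewRatioMath and the 3072 MB heap are assumptions chosen for illustration; a real JVM also rounds the sizes to its own alignment, so actual figures can differ slightly.

```java
public class NewRatioMath {
    /** Young generation size when old:young = newRatio:1, the meaning of -XX:NewRatio. */
    static long youngSize(long heapSize, int newRatio) {
        return heapSize / (newRatio + 1);
    }

    public static void main(String[] args) {
        long heapMb = 3072; // hypothetical heap size in MB, as if started with -Xmx3g
        long youngMb = youngSize(heapMb, 2);
        // prints young=1024 MB, old=2048 MB
        System.out.println("young=" + youngMb + " MB, old=" + (heapMb - youngMb) + " MB");
    }
}
```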
Example that deliberately triggers OOM:
import java.util.ArrayList;
import java.util.List;

// Run with a small heap (for example -Xmx64m) so the error appears quickly.
public class OomTests {
    // Each instance pins roughly 640 KB on the heap.
    static class OOMObject {
        public byte[] placeholder = new byte[640 * 1024];
    }

    public static void add(int num) throws InterruptedException {
        // The list keeps every object reachable, so GC cannot reclaim any of them.
        List<OOMObject> list = new ArrayList<>();
        for (int i = 0; i < num; i++) {
            Thread.sleep(500);
            System.out.println(i);
            list.add(new OOMObject());
        }
        System.gc();
    }

    public static void main(String[] args) throws InterruptedException {
        add(1000);
    }
}

When the heap can no longer satisfy an allocation, the JVM throws java.lang.OutOfMemoryError. Capture a heap dump either manually with jmap -dump:live,format=b,file=heap.hprof <pid>, or automatically by starting the JVM with -XX:+HeapDumpOnOutOfMemoryError. Analyze the dump with a profiler (e.g., JProfiler) to find the largest retained objects and their reference chains.
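A heap dump can also be triggered from inside the application, which may help when jmap is not available on the host. The sketch below assumes a HotSpot-based JVM, because it relies on com.sun.management.HotSpotDiagnosticMXBean. The class name HeapDumper, the method dumpToTemp, and the temp-directory location are hypothetical choices for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;

public class HeapDumper {
    /** Writes a heap dump into a fresh temp directory and returns the file. */
    static File dumpToTemp(boolean liveOnly) {
        try {
            // dumpHeap refuses to overwrite an existing file, so use a new directory.
            File dir = Files.createTempDirectory("heapdump").toFile();
            File out = new File(dir, "heap.hprof");
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // liveOnly=true dumps only reachable objects, like jmap -dump:live.
            bean.dumpHeap(out.getAbsolutePath(), liveOnly);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File dump = dumpToTemp(true);
        System.out.println("Wrote " + dump + " (" + dump.length() + " bytes)");
    }
}
```

The resulting .hprof file can be opened in the same profiler used for dumps produced by jmap.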
Memory leaks occur when objects remain reachable after they should have been released (e.g., unreleased connections or streams). The investigation method mirrors OOM analysis: locate unexpectedly large retained objects in the heap dump.
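The unreleased-resource pattern can be illustrated with a small sketch. TrackedResource is a hypothetical stand-in for a connection or stream, and the OPEN counter exists only to make the leak visible; real code would see the same effect as retained objects in a heap dump. The safe variant uses try-with-resources so that close() always runs.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ResourceLeakDemo {
    /** Number of resources opened but not yet closed. */
    static final AtomicInteger OPEN = new AtomicInteger();

    /** Hypothetical stand-in for a connection or stream. */
    static class TrackedResource implements AutoCloseable {
        TrackedResource() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Opens a resource and forgets to close it. */
    static void leaky() {
        TrackedResource r = new TrackedResource();
        // ... work with r, but close() is never called
    }

    /** try-with-resources closes the resource even if the body throws. */
    static void safe() {
        try (TrackedResource r = new TrackedResource()) {
            // ... work with r
        }
    }

    public static void main(String[] args) {
        safe();
        System.out.println("open after safe():  " + OPEN.get()); // prints 0
        leaky();
        System.out.println("open after leaky(): " + OPEN.get()); // prints 1
    }
}
```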
G1 Garbage Collector
From JDK 9 onward G1 is the default collector; on JDK 8 enable it explicitly with -XX:+UseG1GC. The following flags add detailed GC logging on JDK 8:
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-Xloggc:/log/heapTest.log
-XX:+UseG1GC
On JDK 9 and later, unified logging replaces the Print* flags and -Xloggc; -Xlog:gc*:file=/log/heapTest.log is the rough equivalent.
Monitor GC activity using jstat -gc <pid> 1000 10 (sample every second, 10 samples). Observe Young GC frequency, promotion to the old generation, and Full GC counts.
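Collection counts and cumulative pause times are also exposed in-process through the standard GarbageCollectorMXBean, which can be handy when jstat cannot be attached. The following is a minimal sketch; the class name GcStats, the output format, and the one-second sampling in main are assumptions made for this example. The collector names depend on which GC is active.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcStats {
    /** Returns one line per collector with its cumulative count and time. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) throws InterruptedException {
        // Three samples, one second apart, loosely mirroring jstat -gc <pid> 1000 3.
        for (int i = 0; i < 3; i++) {
            snapshot().forEach(System.out::println);
            Thread.sleep(1000);
        }
    }
}
```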
Disk and I/O Inspection
Check disk space with df -lh and I/O statistics with iostat. These commands give a quick view of storage utilization and throughput.
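A service can run a similar space check on itself, for example before writing heap dumps or GC logs. The sketch below uses java.io.File; the class name DiskCheck and the usedPercent helper are hypothetical. It reports space only and does not cover the throughput figures that iostat provides.

```java
import java.io.File;

public class DiskCheck {
    /** Percentage of the volume in use, given total and usable bytes. */
    static double usedPercent(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            return 0.0; // unknown or empty volume
        }
        return 100.0 * (totalBytes - usableBytes) / totalBytes;
    }

    public static void main(String[] args) {
        // Walk the filesystem roots, similar in spirit to df.
        for (File root : File.listRoots()) {
            double used = usedPercent(root.getTotalSpace(), root.getUsableSpace());
            System.out.printf("%s  %.1f%% used%n", root.getPath(), used);
        }
    }
}
```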
Network Troubleshooting
Typical network problems include TCP listen‑queue overflow and unreachable services.
Inspect socket statistics: netstat -s | egrep "listen|LISTEN". The line reporting how many times the listen queue of a socket overflowed (the kernel's ListenOverflows counter) shows how often the accept queue was full.
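On Linux the same counter can be read from /proc/net/netstat, where each group is a header line of counter names followed by a line of values. The sketch below is one way to extract it, assuming that two-line layout; the class name ListenOverflows and both methods are hypothetical, and the code will not find the file on other operating systems.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class ListenOverflows {
    /** Finds one counter given a header line and its value line; -1 if absent. */
    static long parseCounter(String header, String values, String name) {
        String[] names = header.trim().split("\\s+");
        String[] nums = values.trim().split("\\s+");
        // Index 0 is the group prefix (for example "TcpExt:") on both lines.
        for (int i = 1; i < names.length && i < nums.length; i++) {
            if (names[i].equals(name)) {
                return Long.parseLong(nums[i]);
            }
        }
        return -1;
    }

    /** Reads a TcpExt counter from /proc/net/netstat (Linux only). */
    static long read(String name) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/net/netstat"));
        for (int i = 0; i + 1 < lines.size(); i += 2) {
            if (lines.get(i).startsWith("TcpExt:")) {
                return parseCounter(lines.get(i), lines.get(i + 1), name);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("ListenOverflows=" + read("ListenOverflows"));
    }
}
```

A value that keeps growing between two reads suggests the application is not accepting connections fast enough or the backlog is too small.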
Test connectivity to a host and port: telnet <host> <port> (or nc -zv <host> <port>). A successful connection confirms network reachability.
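The same reachability test can be sketched in Java with a plain TCP connect and a timeout, which is useful when telnet or nc is not installed. The class name PortCheck, the helper names, and the timeout values are assumptions for illustration; loopbackDemo opens a temporary local listener only so that the example is self-contained.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat as unreachable.
            return false;
        }
    }

    /** Opens a listener on an ephemeral local port and connects to it. */
    static boolean loopbackDemo() {
        try (ServerSocket server = new ServerSocket(0)) {
            return canConnect("127.0.0.1", server.getLocalPort(), 1000);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("loopback reachable: " + loopbackDemo());
    }
}
```

As with telnet, a successful connect only shows that the port accepts TCP connections; it says nothing about whether the service behind it is healthy.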
Summary
Effective performance troubleshooting combines command-line metrics, JVM diagnostic tools (top, jstack, jmap, jstat), and visual profilers (JProfiler). Adjust JVM flags for GC logging, tune memory ratios, and verify system resources (CPU, memory, disk, network) to isolate the root cause of latency or crashes.
ITPUB