
Comprehensive Guide to Java Online Fault Diagnosis: CPU, Disk, Memory, GC, and Network Issues

This article provides a detailed, step‑by‑step methodology for diagnosing and resolving common Java production problems—including CPU spikes, disk bottlenecks, memory leaks, garbage‑collection anomalies, and network timeouts—by leveraging native Linux tools and JVM utilities such as ps, top, jstack, jmap, jstat, iostat, vmstat, pidstat, and netstat.

Java Captain

CPU

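Typical CPU issues stem from busy loops in business logic, frequent GC, or excessive context switching. To find the cause, use ps to get the PID of the Java process, then top -H -p pid to list that process's threads by CPU usage. Convert the hot thread's ID to hexadecimal with printf '%x\n' tid, because jstack reports native thread IDs (nid) in hex, and pull that thread's stack with jstack pid | grep 'nid' -C5 --color, where nid is the hex value. For a broader view, save the dump to a file such as jstack.log and look at the thread states: a large number of threads in WAITING or TIMED_WAITING (and BLOCKED, which signals lock contention) helps pinpoint problematic threads.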
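The thread-to-stack lookup described above can be sketched as two small shell helpers. This is a minimal sketch, not a definitive script: the function names and the jstack.log file name are illustrative, and it assumes the JDK prints nid values in hexadecimal, as the procedure above does.

```shell
# Convert a decimal thread ID (as shown by `top -H`) to the hex "nid"
# form that jstack prints for each thread.
tid_to_nid() {
  printf '0x%x\n' "$1"
}

# Print the stack of one thread from a saved jstack dump.
# $1 = decimal thread ID, $2 = path to the dump file (hypothetical name).
# The trailing space in the pattern stops 0x3039 from matching 0x30391.
thread_stack() {
  grep -A 10 "nid=$(tid_to_nid "$1") " "$2"
}

# Typical use (needs a running JVM, so it is left commented out):
#   top -H -p "$PID"                # note the hottest thread ID
#   jstack "$PID" > jstack.log
#   thread_stack 12345 jstack.log
```

Saving the dump to a file first keeps the thread IDs from top and the stacks from jstack close together in time, which matters when threads are short-lived.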

Frequent GC

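Use jstat -gc pid 1000 to sample GC statistics every 1000 ms. The YGC and FGC columns are cumulative counts of young and full collections, and YGCT and FGCT are the cumulative time spent in them, in seconds. Counts that climb quickly from one sample to the next indicate GC‑related performance degradation.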
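Because the jstat counters are cumulative, the growth between samples is easier to read than the raw values. A minimal sketch of that calculation follows; gc_deltas is a hypothetical helper name, and it assumes the header line is printed once at the top, which is the jstat default.

```shell
# Read `jstat -gc <pid> <interval>` output on stdin and print how much the
# cumulative YGC and FGC counters grew between consecutive samples.
# Columns are located by header name because their positions vary by JDK.
gc_deltas() {
  awk '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "YGC") y = i
        if ($i == "FGC") f = i
      }
      next
    }
    have_prev { print "YGC +" ($y - py), "FGC +" ($f - pf) }
    { py = $y; pf = $f; have_prev = 1 }
  '
}

# Typical use (needs a running JVM, so it is left commented out):
#   jstat -gc "$PID" 1000 | gc_deltas
```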

Context Switches

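Run vmstat and watch the cs column, which reports system-wide context switches per second. For a specific process, use pidstat -w -p pid; its cswch/s and nvcswch/s columns show voluntary and involuntary switches per second.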
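To spot spikes in a long vmstat run, the cs column can be filtered against a threshold. The sketch below is illustrative only: high_cs is a hypothetical helper, and the default threshold of 10000 is an arbitrary placeholder that should be replaced with a baseline measured on the host in question.

```shell
# Read `vmstat <interval>` output on stdin and print the samples whose
# context-switch rate (cs column) exceeds a threshold.
# $1 = threshold; the default of 10000 is an arbitrary placeholder.
high_cs() {
  awk -v limit="${1:-10000}" '
    # The column header row starts with "r"; find the cs column by name.
    $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "cs") c = i; next }
    # Skip banner rows (non-numeric) and print only samples above the limit.
    c && $c ~ /^[0-9]+$/ && $c + 0 > limit + 0 { print "cs=" $c }
  '
}

# Typical use:
#   vmstat 1 30 | high_cs 20000
```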

Disk

Check filesystem space with df -hl. For performance, run iostat -d -k -x and focus on %util (how busy the device is) together with rrqm/s and wrqm/s (read and write requests merged per second) to locate saturated disks. Identify the offending thread with iotop, translate its TID to a PID with readlink -f /proc/*/task/tid/../.., then examine the process's I/O counters with cat /proc/pid/io and its open files with lsof -p pid.
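Picking saturated devices out of iostat output can be sketched as a small filter. busy_disks is a hypothetical helper and the 80 percent default is an assumption, not a value from the article; the %util column is found by name because its position differs between sysstat versions.

```shell
# Read `iostat -d -k -x` output on stdin and print devices whose %util
# exceeds a threshold. $1 = threshold in percent (default 80, an assumption).
busy_disks() {
  awk -v limit="${1:-80}" '
    # Header row: remember which column holds %util.
    $1 ~ /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") u = i; next }
    # Device rows: print name and utilisation when above the limit.
    u && NF >= u && $u + 0 > limit + 0 { print $1, $u }
  '
}

# Typical use:
#   iostat -d -k -x 1 5 | busy_disks 90
```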

Memory

Memory problems include OOM, GC pressure, and off‑heap leaks. Start with free to view overall usage. For heap OOM, look for log entries such as java.lang.OutOfMemoryError: Java heap space and use jstack and jmap to locate the leak; raise -Xmx only once leaks in the code have been fixed or ruled out. Metaspace OOM (java.lang.OutOfMemoryError: Metaspace) can be mitigated by increasing -XX:MaxMetaspaceSize. A java.lang.StackOverflowError means a thread exhausted its stack, whose size is set by -Xss.
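Since each error type calls for a different fix, a first step is often to count which ones actually appear in the application log. The following is a rough sketch: oom_summary is a hypothetical helper, and it assumes the error message ends the log line, which is the usual case for stack trace headers.

```shell
# Count each kind of OutOfMemoryError / StackOverflowError in a log file.
# $1 = path to the application log (hypothetical).
# Output: one line per error type, most frequent first, as "<count> <error>".
oom_summary() {
  grep -oE 'java\.lang\.(OutOfMemoryError: [A-Za-z ]+|StackOverflowError)' "$1" \
    | sed 's/ *$//' | sort | uniq -c | sort -rn
}

# Typical use:
#   oom_summary /path/to/app.log
```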

Using jmap for Heap Leak Detection

Generate a heap dump with jmap -dump:format=b,file=filename pid and analyze it in MAT (Eclipse Memory Analyzer) via Leak Suspects or Top Consumers.
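A full dump can be large, so a lighter complementary check is to compare two class histograms taken some minutes apart with jmap -histo pid. This is a different technique from the MAT analysis above, not a replacement for it. The sketch below assumes the standard histogram layout (rank, instances, bytes, class name); histo_growth and the file names are hypothetical.

```shell
# Compare two saved `jmap -histo <pid>` outputs and list the classes whose
# retained bytes grew, largest growth first.
# $1 = earlier histogram file, $2 = later histogram file (hypothetical names).
histo_growth() {
  awk '
    # Data rows look like "  1:   <instances>   <bytes>   <class name>".
    $1 ~ /^[0-9]+:$/ {
      if (FNR == NR) before[$4] = $3
      else if ($3 - before[$4] > 0) print $3 - before[$4], $4
    }
  ' "$1" "$2" | sort -rn
}

# Typical use (needs a running JVM, so it is left commented out):
#   jmap -histo "$PID" > histo1.txt; sleep 300; jmap -histo "$PID" > histo2.txt
#   histo_growth histo1.txt histo2.txt | head
```

Classes that keep growing across several comparisons are reasonable candidates to look for first in the Leak Suspects report.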

GC Issues

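Enable detailed GC logging with the following JVM options (these apply to JDK 8 and earlier; JDK 9 and later replace them with unified logging, for example -Xlog:gc*):

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps

For G1, check whether young GCs are too frequent (tune -Xmn and -XX:SurvivorRatio) and what triggers full GCs (e.g., insufficient heap or large object allocation failures). To compare the heap before and after a full GC, use jinfo to switch on the heap-dump flags in the running JVM, for example jinfo -flag +HeapDumpBeforeFullGC pid and jinfo -flag +HeapDumpAfterFullGC pid.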
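Once GC logging is on, a quick way to see what is triggering full collections is to count them by cause. This is a minimal sketch: full_gc_causes is a hypothetical helper, and the patterns assume the common "Full GC (cause)" form from JDK 8 logs and the "Pause Full (cause)" form from unified logging.

```shell
# Count full GCs in a GC log, grouped by the cause printed in parentheses.
# $1 = path to the GC log (hypothetical).
# Note: a cause that itself contains ")" such as "System.gc()" is cut at the
# first ")", which still leaves it recognisable.
full_gc_causes() {
  grep -oE '(Full GC|Pause Full) \([^)]*\)' "$1" | sort | uniq -c | sort -rn
}

# Typical use:
#   full_gc_causes /path/to/gc.log
```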

Network

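Network problems cover timeouts, TCP queue overflows, RST packets, and connections piling up in TIME_WAIT or CLOSE_WAIT. Distinguish connection timeouts from read/write timeouts, and keep client timeouts lower than server limits. Diagnose TCP queue overflows with netstat -s, which reports listen queue overflows and dropped SYNs, and ss -lnt, where for a listening socket Recv-Q is the current accept queue length and Send-Q is its maximum. If the accept queue is overflowing, raise the backlog with acceptCount (Tomcat) or acceptQueueSize (Jetty); the kernel caps the effective value at net.core.somaxconn.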
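The accept-queue check with ss -lnt can be sketched as a filter over its output. The helper name full_listen_queues is hypothetical, and the sketch relies on the Linux convention that, for sockets in LISTEN state, Recv-Q is the current accept queue length and Send-Q is the configured backlog.

```shell
# Read `ss -lnt` output on stdin and print listening sockets whose accept
# queue has reached its backlog limit.
full_listen_queues() {
  awk '
    $1 == "LISTEN" && $3 + 0 > 0 && $2 + 0 >= $3 + 0 {
      print $4, "queued=" $2, "backlog=" $3
    }
  '
}

# Typical use:
#   ss -lnt | full_listen_queues
```

A socket that shows up here repeatedly is a candidate for a larger backlog, or more likely for investigating why the application is slow to accept connections.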

RST Packets

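RSTs arise when a packet is sent to a port that nothing is listening on, when one side terminates a connection abruptly, or when packets arrive for a connection that has already been closed. Capture the traffic with tcpdump -i en0 tcp -w xxx.cap and analyze it in Wireshark.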

TIME_WAIT and CLOSE_WAIT

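Monitor these states via netstat -n or ss -ant (ss prints them as TIME-WAIT and CLOSE-WAIT). Reduce excessive TIME_WAIT with the kernel parameter net.ipv4.tcp_tw_reuse=1. Older guides also set net.ipv4.tcp_tw_recycle=1, but that parameter breaks connections from clients behind NAT and was removed in Linux 4.12, so leave it out on current kernels. CLOSE_WAIT means the peer has closed the connection but the application has not yet closed its own socket, which usually indicates an application‑level socket handling bug; investigate with jstack to find threads stuck in I/O waits.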
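A per-state count makes it obvious when one state dominates. The sketch below summarises ss -ant output; tcp_state_counts is a hypothetical helper, and it assumes the state is the first column and that the first line is a header.

```shell
# Read `ss -ant` output on stdin and print the number of connections in each
# TCP state, most common first, as "<count> <state>".
tcp_state_counts() {
  awk 'NR > 1 { count[$1]++ } END { for (s in count) print count[s], s }' | sort -rn
}

# Typical use:
#   ss -ant | tcp_state_counts
```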
