
Diagnosing Java CPU, Memory, Disk, and Network Issues with JStack, JStat, and Linux Tools

This guide explains how to systematically investigate Java application problems across CPU, disk, memory, and network layers using Linux utilities and Java tools such as jstack, jstat, jmap, iostat, vmstat, and network commands, providing concrete commands and analysis steps to pinpoint root causes.

ITPUB

Overview

Online incidents often involve CPU, disk, memory, and network problems, and many failures span multiple layers. A comprehensive investigation should check each of these four aspects in order, using tools like df, free, top, jstack, and jmap to gather relevant data.

CPU Diagnosis

Typical CPU issues stem from business logic errors (e.g., infinite loops), frequent garbage collection, or excessive context switches. Use ps to locate the process ID, then top -H -p <pid> to identify high‑CPU threads. Convert the thread ID to hexadecimal with

printf '%x\n' <tid>

to obtain the NID, and retrieve the corresponding stack trace:

jstack <pid> | grep '<nid>' -C5 --color
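The two steps above can be combined in a small helper. This is only a sketch: the function names tid_to_nid and stack_for_tid are hypothetical, and it assumes the dump prints thread IDs in the hex form nid=0x..., so check the format of your own dump first, since it can differ between JDK versions.

```shell
# Convert a decimal Linux thread ID (as shown by top -H) into the hex
# form used in jstack output, for example 12345 -> 0x3039.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Print the jstack entry for one thread. Reads a jstack dump on stdin.
# The trailing space in the pattern keeps 0x3039 from matching 0x30391.
stack_for_tid() {
    awk -v nid="nid=$(tid_to_nid "$1") " '
        index($0, nid) { show = 1 }   # header line of the wanted thread
        show && /^$/   { exit }       # a blank line ends the entry
        show           { print }
    '
}
```

Usage would look like jstack <pid> | stack_for_tid <tid>, with <tid> taken from the top -H output.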

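In the matching stack, a RUNNABLE thread that stays in the same frames across several dumps usually points to a busy loop, while many threads in WAITING, TIMED_WAITING, or BLOCKED suggest lock or resource contention. To get an overall view of thread states, save a dump with jstack <pid> > jstack.log and count the states:

grep "java.lang.Thread.State" jstack.log | sort | uniq -c | sort -nr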
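The same per-state count can be produced in a single awk pass that drops the detail in parentheses (such as "(parking)"), so that all WAITING threads land in one bucket. A minimal sketch; the function name thread_state_counts is an assumption made for illustration.

```shell
# Count threads per state from a jstack dump on stdin, most frequent first.
# Only the state keyword (second field) is used as the key.
thread_state_counts() {
    awk '/java.lang.Thread.State:/ { count[$2]++ }
         END { for (s in count) print count[s], s }' | sort -rn
}
```

For example: thread_state_counts < jstack.log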

Disk Diagnosis

Start with filesystem space using df -hl. For performance issues, examine I/O statistics with

iostat -d -k -x

focusing on the %util, rrqm/s, and wrqm/s columns to locate the problematic disk. Identify the responsible process with iotop, then map the I/O thread ID (tid) to a PID:

readlink -f /proc/*/task/<tid>/../..

Finally, inspect the process’s I/O counters and open files:

cat /proc/<pid>/io
lsof -p <pid>
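The counters in /proc/<pid>/io are cumulative, so they are most useful when sampled twice and compared. The sketch below pulls out read_bytes and write_bytes, the two fields that count traffic reaching the storage layer; the function name disk_io_bytes is hypothetical.

```shell
# Print "read_bytes write_bytes" from /proc/<pid>/io-formatted text on stdin.
# Values are echoed as strings so large counters are not reformatted by awk.
disk_io_bytes() {
    awk -F': *' '
        BEGIN { r = "0"; w = "0" }
        $1 == "read_bytes"  { r = $2 }
        $1 == "write_bytes" { w = $2 }   # exact match skips cancelled_write_bytes
        END { print r, w }
    '
}
```

Running disk_io_bytes < /proc/<pid>/io a few seconds apart and comparing the two results shows whether the process is still reading or writing heavily.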

Memory Diagnosis

Memory problems include OOM, GC pauses, and off‑heap leaks. Begin with free to view overall memory usage. For OOM, distinguish between native thread stack exhaustion, Java heap exhaustion, and Metaspace exhaustion:

Native thread OOM: java.lang.OutOfMemoryError: unable to create new native thread. Reduce thread stack size with -Xss or raise OS limits in /etc/security/limits.conf.

Heap OOM: java.lang.OutOfMemoryError: Java heap space. Look for memory leaks using jstack and jmap, then increase -Xmx if necessary.

Metaspace OOM: java.lang.OutOfMemoryError: Metaspace (PermGen space before Java 8). Adjust -XX:MaxMetaspaceSize, or -XX:MaxPermSize on pre‑Java 8 JVMs.
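When scanning logs, the three cases above can be told apart from the message text alone. A minimal sketch, assuming the standard HotSpot message strings; the function name classify_oom and the labels it prints are made up for illustration.

```shell
# Map an OutOfMemoryError log line to one of the categories discussed above.
classify_oom() {
    case "$1" in
        *"unable to create new native thread"*) echo "native-thread" ;;
        *"Java heap space"*)                    echo "heap" ;;
        *"Metaspace"*|*"PermGen space"*)        echo "metaspace" ;;
        *)                                      echo "unknown" ;;
    esac
}
```

For example, classify_oom "$(grep -m1 OutOfMemoryError app.log)" where app.log stands in for your application log.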

Use jmap -dump:format=b,file=heap.hprof <pid> to capture a heap dump and analyze it with Eclipse MAT. For a quicker look without a full dump, jmap -histo:live <pid> prints a histogram of live objects (the live option forces a full GC first). Enable automatic heap dumps on OOM with -XX:+HeapDumpOnOutOfMemoryError.
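For a first pass over the histogram, it can help to total the bytes held by your own packages. The sketch below assumes the usual jmap -histo layout (rank, #instances, #bytes, class name); the function name histo_bytes_for and the package prefix in the example are hypothetical.

```shell
# Sum the #bytes column of jmap -histo output (stdin) for classes whose
# name starts with the given prefix. Header and separator lines are skipped
# because their first field is not a rank such as "1:".
histo_bytes_for() {
    awk -v p="$1" '
        $1 ~ /^[0-9]+:$/ && index($4, p) == 1 { sum += $3 }
        END { printf "%.0f\n", sum }
    '
}
```

For example: jmap -histo:live <pid> | histo_bytes_for com.example.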

GC and Thread Issues

Frequent GC can be monitored with jstat -gc <pid> 1000, which prints generation statistics once per second (e.g., S0C/S0U, EC/EU, YGC/YGCT, FGC/FGCT). Excessive context switches are visible via vmstat (look at the cs column) or per‑process with pidstat -w -p <pid> (showing cswch/s for voluntary and nvcswch/s for involuntary switches).
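Because FGC is a running total, what matters is how fast it grows between samples. The sketch below reads captured jstat -gc output and reports the increase; it locates the column by its header name, since the column set differs between JDK versions. The function name full_gc_delta is an assumption.

```shell
# Read jstat -gc output on stdin (header line followed by samples) and
# print how many full GCs occurred between the first and the last sample.
full_gc_delta() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "FGC") col = i; next }
        col && NF { if (first == "") first = $col; last = $col }
        END { if (first != "") print last - first }
    '
}
```

For example, capture a minute of samples with jstat -gc <pid> 1000 60 > gc.log and then run full_gc_delta < gc.log.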

For deeper analysis of off‑heap memory, enable Native Memory Tracking (NMT), available since Java 7u40, by starting the JVM with

-XX:NativeMemoryTracking=summary

(or =detail for more granular data). Set a baseline with jcmd <pid> VM.native_memory baseline and later compare using jcmd <pid> VM.native_memory summary.diff or detail.diff. Use gdb to dump suspicious native memory regions and inspect them with hexdump -C.

Network Diagnosis

Network problems are diverse. Distinguish between connection timeouts, read/write timeouts, and TCP‑level issues such as queue overflows and RST packets. Use netstat -s | egrep "listen|LISTEN" and ss -lnt to view socket statistics, and check the backlog settings (net.core.somaxconn, net.ipv4.tcp_max_syn_backlog).
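For sockets in the LISTEN state, ss reports the current accept-queue length in Recv-Q and the configured backlog in Send-Q, so a full queue can be spotted directly. A minimal sketch under that assumption; the function name full_accept_queues and its output format are hypothetical.

```shell
# Read ss -lnt output on stdin and print listeners whose accept queue
# (Recv-Q, field 2) has reached the backlog limit (Send-Q, field 3).
full_accept_queues() {
    awk '$1 == "LISTEN" && $3 > 0 && $2 >= $3 {
             print $4, "queued=" $2, "backlog=" $3
         }'
}
```

For example: ss -lnt | full_accept_queues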

Detect TCP queue overflows with:

netstat -s | egrep "overflowed|sockets dropped"

Identify RST bursts by capturing traffic with tcpdump -i <iface> tcp -w capture.cap and analyzing the capture in Wireshark. Monitor TIME_WAIT and CLOSE_WAIT counts using:

netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ss -ant | awk '{++S[$1]} END {for(a in S) print a, S[a]}'
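Returning to the queue-overflow counters from netstat -s: they are cumulative, so a single reading says little and the useful signal is whether the number rises between two readings. The sketch below extracts the listen-queue overflow count from netstat -s text; the function name listen_overflows is hypothetical, and it assumes the English counter wording shown by net-tools.

```shell
# Print the listen-queue overflow counter from netstat -s output on stdin,
# or 0 when the line is absent (the counter is omitted while it is zero).
listen_overflows() {
    awk '/listen queue of a socket overflowed/ { print $1; found = 1 }
         END { if (!found) print 0 }'
}
```

Take one reading with before=$(netstat -s | listen_overflows), wait a few seconds, take a second reading into after, and compare with echo $((after - before)).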

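Adjust kernel parameters to mitigate excessive TIME_WAIT (net.ipv4.tcp_tw_reuse=1, net.ipv4.tcp_tw_recycle=1) and tune tcp_max_tw_buckets if needed. Note that tcp_tw_recycle is known to break connections from clients behind NAT and was removed in Linux 4.12, so on current kernels rely on tcp_tw_reuse instead.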
