Diagnosing Linux CPU Spikes with top, Thread Dumps, and jstack

This guide walks through real‑world Linux performance troubleshooting, showing how to use top to pinpoint high‑CPU processes, convert thread IDs, capture multiple jstack thread dumps, and interpret key top metrics such as load average, task states, and memory usage.

ITPUB

Background

When a service that has been running smoothly suddenly shows a CPU spike, the first step is to identify the offending process. Running top shows which PID is consuming the most CPU, and that process's threads can then be inspected individually.

Investigating with top

Run top -Hp <PID> to list the threads of the high-CPU process; in this mode the PID column shows thread IDs rather than process IDs. In the example, PID 2816 showed high usage, and thread 2825 was the culprit.

top output showing high-CPU PID
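
The thread listing can also be captured non-interactively and sorted in code. The following is a minimal sketch, assuming output from top -b -n 1 -H -p <PID> (batch mode, one iteration). The sample text, its numbers, and the helper name parse_thread_rows are illustrative and not from the original article; only PID 2816 and thread 2825 come from the example above.

```python
# Illustrative sample of `top -b -n 1 -H -p 2816` output (process table only).
# All values except PID 2816 and thread 2825 are made up for this sketch.
SAMPLE_TOP = """\
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 2816 root      20   0 2557160 288976  15812 S   0.0   3.6   0:00.00 java
 2825 root      20   0 2557160 288976  15812 R  99.9   3.6   5:10.32 java
 2826 root      20   0 2557160 288976  15812 S   0.3   3.6   0:01.12 java
"""


def parse_thread_rows(top_output):
    """Return (thread_id, cpu_percent) pairs, busiest thread first."""
    rows = []
    columns = None
    for line in top_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "PID" and "%CPU" in fields:
            columns = fields  # header row of the process table
            continue
        if columns is None:
            continue  # still inside top's summary area
        cpu = float(fields[columns.index("%CPU")])
        rows.append((int(fields[0]), cpu))
    return sorted(rows, key=lambda row: row[1], reverse=True)


print(parse_thread_rows(SAMPLE_TOP)[0])  # (2825, 99.9)
```

The first element of the result is the thread to look up in the thread dump.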

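To correlate thread IDs with Java thread dumps, convert the decimal thread ID to hexadecimal (e.g., using Python's hex() function), because jstack typically reports each thread's native ID in hexadecimal as the nid field. For example, thread 2825 becomes 0xb09.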

decimal to hexadecimal conversion
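
As a small illustration of the conversion, the sketch below formats a decimal thread ID as the search string to look for in a dump. The helper name to_nid is hypothetical, and the nid=0x... form assumes a JDK whose jstack prints the native thread ID in hexadecimal.

```python
def to_nid(thread_id):
    """Format a decimal Linux thread ID as a jstack-style nid field."""
    return f"nid={thread_id:#x}"


print(hex(2825))     # 0xb09
print(to_nid(2825))  # nid=0xb09
```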

Thread Dump Analysis

Capture several jstack dumps of the same PID a few seconds apart, because a single dump is only a snapshot and thread states can change rapidly. Comparing the dumps shows which threads hold a lock and which are waiting for it, which helps pinpoint why the lock is not being released.

jstack thread dump showing lock contention
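
The capture-and-compare step can be sketched as follows. This is a rough sketch, not the article's own tooling: it assumes jstack is on the PATH and prints nid in hexadecimal, and the sample dump, thread names, addresses, and helper names (capture_dumps, find_thread, thread_state) are all invented for illustration.

```python
import re
import subprocess
import time

# Invented two-thread excerpt in the usual jstack layout.
SAMPLE_DUMP = '''
"worker-1" #12 prio=5 os_prio=0 tid=0x00007f3c2c0e1000 nid=0xb09 runnable [0x00007f3c1a5f4000]
   java.lang.Thread.State: RUNNABLE
        at com.example.Worker.run(Worker.java:42)

"worker-2" #13 prio=5 os_prio=0 tid=0x00007f3c2c0e3000 nid=0xb0a waiting for monitor entry [0x00007f3c1a4f3000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.example.Worker.run(Worker.java:40)
        - waiting to lock <0x00000000e0a1b2c8> (a java.lang.Object)
'''


def capture_dumps(pid, count=3, interval=5.0):
    """Run `jstack <pid>` several times, pausing between runs."""
    dumps = []
    for i in range(count):
        result = subprocess.run(["jstack", str(pid)],
                                capture_output=True, text=True, check=True)
        dumps.append(result.stdout)
        if i < count - 1:
            time.sleep(interval)
    return dumps


def find_thread(dump, thread_id):
    """Return the dump block whose header carries the thread's hex nid."""
    nid = re.compile(rf"\bnid={thread_id:#x}\b")
    for block in re.split(r"\n\s*\n", dump.strip()):
        lines = block.splitlines()
        if lines and lines[0].startswith('"') and nid.search(lines[0]):
            return block
    return None


def thread_state(block):
    """Extract the java.lang.Thread.State value from a thread block."""
    match = re.search(r"java\.lang\.Thread\.State: (\w+)", block)
    return match.group(1) if match else None


print(thread_state(find_thread(SAMPLE_DUMP, 2825)))  # RUNNABLE
```

Applying find_thread to each dump returned by capture_dumps shows whether the hot thread stays in the same state and stack frame across snapshots.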

Deep Dive into top

The top command provides a wealth of information:

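First line: current system time and uptime. Pay attention to uptime, because a recent or frequent reboot can mask issues.

Number of logged-in users (check who they are with who or last).

Load averages (1-, 5-, and 15-minute): compare them against the CPU core count to assess load severity. A sustained load above the number of cores means tasks are queuing.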

top header fields
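
The comparison of load average against core count can be expressed as a simple ratio. This is a minimal sketch with a hypothetical helper name (load_per_core) and made-up input values; it is a rule of thumb rather than a precise saturation measure.

```python
def load_per_core(load_averages, cores):
    """Divide each load average by the core count (values near or above 1.0 suggest queuing)."""
    return tuple(round(load / cores, 2) for load in load_averages)


# Example: 1-, 5-, and 15-minute loads of 8.0, 4.0, 2.0 on a 4-core host.
print(load_per_core((8.0, 4.0, 2.0), 4))  # (2.0, 1.0, 0.5)
```

On a live Linux host, the inputs could come from os.getloadavg() and os.cpu_count() in the standard library.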

Second line: total number of tasks, broken down by state (running, sleeping, stopped, zombie). Watch the zombie count.

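Third line: CPU usage breakdown.

Key CPU columns:

US/SY: CPU time spent in user space vs. kernel (system) space.

NI: CPU time spent running user processes with an adjusted nice value (should be low).

ID: idle CPU time. WA: time spent waiting for I/O, which can spike under heavy logging.

HI/SI: time spent servicing hardware vs. software interrupts.

ST: steal time, i.e., CPU time taken from this virtual machine by the hypervisor; relevant only in virtualized environments.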

CPU usage columns
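
To watch these columns over time, the CPU summary line can be parsed into a dictionary. The sketch below assumes the "%Cpu(s): ... us, ... sy, ..." layout printed by recent procps top; the sample line is made up and the helper name parse_cpu_line is hypothetical.

```python
import re

# Made-up sample of top's third summary line.
SAMPLE_CPU = "%Cpu(s): 12.5 us,  3.1 sy,  0.0 ni, 80.2 id,  4.0 wa,  0.0 hi,  0.2 si,  0.0 st"


def parse_cpu_line(line):
    """Map each two-letter CPU field (us, sy, ni, id, wa, hi, si, st) to its percentage."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)\s+(us|sy|ni|id|wa|hi|si|st)\b", line)
    return {name: float(value) for value, name in pairs}


cpu = parse_cpu_line(SAMPLE_CPU)
print(cpu["us"], cpu["wa"])  # 12.5 4.0
```

A high wa with a low us points toward I/O rather than computation, while a high us on a Java process is the case where the thread-level steps earlier in this guide apply.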

Memory and Cache Details

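Top's fourth and fifth rows show physical memory and swap usage, including buffers (kernel buffers used for block-device I/O) and cache (the page cache, which holds file data recently read from or written to disk). Buffer and cache memory can be reclaimed when applications need it, so it does not by itself indicate a shortage. Heavy or growing swap usage, by contrast, indicates insufficient RAM for the workload.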
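
The same figures are available in /proc/meminfo on Linux, which is easier to process than top's summary rows. The following sketch parses that format; the sample values are made up, and the helper names parse_meminfo and swap_used_kb are hypothetical.

```python
# Made-up excerpt in the /proc/meminfo format (values in kB).
SAMPLE_MEMINFO = """\
MemTotal:        8009504 kB
MemFree:          512340 kB
Buffers:          102400 kB
Cached:          2048000 kB
SwapTotal:       2097148 kB
SwapFree:        1048574 kB
"""


def parse_meminfo(text):
    """Return a dict of field name -> integer value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            info[name.strip()] = int(rest.split()[0])
    return info


def swap_used_kb(info):
    """Swap in use is the total minus what is still free."""
    return info["SwapTotal"] - info["SwapFree"]


info = parse_meminfo(SAMPLE_MEMINFO)
print(info["Buffers"], info["Cached"], swap_used_kb(info))  # 102400 2048000 1048574
```

On a live host, the input would be the contents of /proc/meminfo read as text.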

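Process list columns explained: PID (process ID, or thread ID in thread mode), USER (owner), PR (scheduling priority), VIRT (virtual memory size), RES (resident memory), and SHR (shared memory). Note that RES is the physical memory currently resident for the process and includes SHR, the portion that may be shared with other processes; the memory private to the process is therefore approximately RES minus SHR.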
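
The RES minus SHR estimate can be reproduced from /proc/<PID>/statm, whose second and third fields are the resident and shared page counts. This is an approximate sketch: the sample line is made up, the helper name private_resident_bytes is hypothetical, and the exact source of top's SHR value may differ between procps versions.

```python
def private_resident_bytes(statm_text, page_size):
    """Estimate private resident memory as (resident - shared) pages times page size."""
    fields = statm_text.split()
    resident_pages = int(fields[1])  # second statm field: resident set size
    shared_pages = int(fields[2])    # third statm field: resident shared pages
    return (resident_pages - shared_pages) * page_size


# Made-up statm line: size resident shared text lib data dt (all in pages).
print(private_resident_bytes("1000 300 100 1 0 200 0", 4096))  # 819200
```

On a live host, the page size can be obtained with os.sysconf("SC_PAGE_SIZE").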

Conclusion

By combining top for real‑time metrics, converting thread IDs, and analyzing multiple jstack dumps, engineers can quickly isolate the root cause of CPU spikes, such as lock contention or runaway threads, and take corrective actions.
