10 Essential Linux Commands to Diagnose Performance Issues in One Minute
When a Linux server's load spikes, you can quickly pinpoint CPU, memory, disk I/O, and network bottlenecks by running ten short commands built on uptime, dmesg, vmstat, mpstat, pidstat, iostat, free, sar (used twice), and top. Each one provides specific metrics for rapid troubleshooting.
When a Linux server suddenly experiences a load surge, rapid diagnosis is essential to avoid prolonged outages. Netflix’s performance engineering team outlines ten commands that together give a comprehensive view of system health within about a minute.
Command list
uptime
dmesg | tail
vmstat 1
mpstat -P ALL 1
pidstat 1
iostat -xz 1
free -m
sar -n DEV 1
sar -n TCP,ETCP 1
top

uptime
$ uptime
23:51:26 up 21:31, 1 user, load average: 30.02, 26.43, 19.02
The three load-average numbers are moving averages over the last 1, 5, and 15 minutes. On Linux they count processes that are running or waiting for CPU as well as processes blocked in uninterruptible I/O, so a high value alone does not prove CPU saturation. A 1-minute value well above the 15-minute value indicates a recent spike that warrants deeper investigation.
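The comparison between the 1-minute and 15-minute values can be sketched in Python. This is a minimal illustration that parses the sample line above; the function names and the `ratio` threshold of 1.5 are assumptions chosen for the example, not part of `uptime` or of the original checklist.

```python
import re

def parse_load_averages(uptime_line):
    """Return the (1, 5, 15)-minute load averages from one line of uptime output."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in line")
    return tuple(float(value) for value in match.groups())

def load_trend(one_min, fifteen_min, ratio=1.5):
    """Classify the trend; ratio is an arbitrary illustrative threshold."""
    if one_min > fifteen_min * ratio:
        return "rising"
    if fifteen_min > one_min * ratio:
        return "falling"
    return "steady"

# Sample line copied from the uptime output above.
sample = "23:51:26 up 21:31, 1 user, load average: 30.02, 26.43, 19.02"
one, five, fifteen = parse_load_averages(sample)
print(one, five, fifteen, load_trend(one, fifteen))
```

For the sample line, the 1-minute value (30.02) exceeds 1.5 times the 15-minute value (19.02), so the sketch reports a rising load.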
dmesg | tail
$ dmesg | tail
[1880957.563150] perl invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
[1880957.563400] Out of memory: Kill process 18694 (perl) score 246 or sacrifice child
[2320864.954447] TCP: Possible SYN flooding on port 7001. Dropping request.
Shows the last kernel messages, useful for spotting OOM kills, hardware errors, or network anomalies.
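Scanning kernel messages for the kinds of events shown above can be sketched as follows. The two patterns are illustrative assumptions that match only the sample lines; real kernels emit many other warning formats, so treat this as a starting point rather than a complete filter.

```python
import re

# Illustrative patterns only, chosen to match the sample messages above.
PATTERNS = {
    "oom": re.compile(r"out of memory|oom[-_ ]?killer", re.IGNORECASE),
    "syn_flood": re.compile(r"SYN flooding", re.IGNORECASE),
}

def classify_kernel_lines(lines):
    """Map each pattern name to the kernel log lines that match it."""
    hits = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line)
    return hits

# Sample lines copied from the dmesg output above.
sample_lines = [
    "[1880957.563150] perl invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0",
    "[1880957.563400] Out of memory: Kill process 18694 (perl) score 246 or sacrifice child",
    "[2320864.954447] TCP: Possible SYN flooding on port 7001. Dropping request.",
]
for name, matched in classify_kernel_lines(sample_lines).items():
    print(name, len(matched))
```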
vmstat 1
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
34 0 0 200889792 73708 591828 0 0 0 5 6 10 96 1 3 0 0
...
Key columns:
r : processes running on a CPU or waiting for one (if > CPU cores, CPU is saturated).
free : free memory in KB.
si/so : swap I/O (non‑zero indicates swapping).
us, sy, id, wa, st : CPU time spent in user, system, idle, I/O wait, and stolen.
These metrics help identify CPU saturation, memory pressure, and I/O wait.
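The checks on the r and si/so columns can be sketched in Python using the sample row above. The function names are hypothetical, and the CPU count of 32 is an assumption taken from the "(32 CPU)" banner in the later mpstat output.

```python
def parse_vmstat_row(header, row):
    """Pair the column names from the vmstat header with one row of integers."""
    return dict(zip(header.split(), (int(value) for value in row.split())))

def vmstat_findings(stats, cpu_count):
    """Apply the two rules described above to one parsed vmstat row."""
    findings = []
    if stats["r"] > cpu_count:
        findings.append("cpu_saturated")  # more runnable tasks than CPUs
    if stats["si"] > 0 or stats["so"] > 0:
        findings.append("swapping")       # any swap-in or swap-out activity
    return findings

# Header and row copied from the vmstat output above.
header = "r b swpd free buff cache si so bi bo in cs us sy id wa st"
row = "34 0 0 200889792 73708 591828 0 0 0 5 6 10 96 1 3 0 0"
stats = parse_vmstat_row(header, row)
print(vmstat_findings(stats, cpu_count=32))
```

With 34 runnable processes on an assumed 32 CPUs and no swap activity, the only finding for the sample row is CPU saturation.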
mpstat -P ALL 1
$ mpstat -P ALL 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
07:38:49 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
07:38:50 PM all 98.47 0.00 0.75 0.00 0.00 0.00 0.00 0.00 0.00 0.78
...
Displays per-CPU utilization. A single CPU with a high usage percentage often points to a single-threaded workload consuming most cycles.
pidstat 1
$ pidstat 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
07:41:02 PM UID PID %usr %system %guest %CPU CPU Command
07:41:03 PM 0 9 0.00 0.94 0.00 0.94 1 rcuos/0
07:41:03 PM 0 4214 5.66 5.66 0.00 11.32 15 mesos-slave
07:41:03 PM 0 6521 1596.23 1.89 0.00 1598.11 27 java
...
Shows CPU usage per process. The %CPU column is summed across all CPUs, so the java process shown (PID 6521) at about 1598% is using roughly 16 cores.
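Converting %CPU into an equivalent number of cores is simple arithmetic, sketched here against the three sample rows. The parser is a hypothetical helper that assumes the 12-hour timestamp layout shown above (time, AM/PM, then the data columns).

```python
def parse_pidstat_row(row):
    """Split one pidstat row that uses the 12-hour clock layout shown above."""
    fields = row.split()
    # Layout: time, AM/PM, UID, PID, %usr, %system, %guest, %CPU, CPU, Command
    return {
        "pid": int(fields[3]),
        "cpu_percent": float(fields[7]),
        "command": fields[9],
    }

def cores_in_use(cpu_percent):
    """%CPU is summed across CPUs, so 100% corresponds to one full core."""
    return cpu_percent / 100.0

# Rows copied from the pidstat output above.
rows = [
    "07:41:03 PM 0 9 0.00 0.94 0.00 0.94 1 rcuos/0",
    "07:41:03 PM 0 4214 5.66 5.66 0.00 11.32 15 mesos-slave",
    "07:41:03 PM 0 6521 1596.23 1.89 0.00 1598.11 27 java",
]
busiest = max((parse_pidstat_row(r) for r in rows), key=lambda p: p["cpu_percent"])
print(busiest["command"], busiest["pid"], round(cores_in_use(busiest["cpu_percent"]), 2))
```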
iostat -xz 1
$ iostat -xz 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
73.96 0.00 3.73 0.03 0.06 22.21
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.23 0.21 0.18 4.52 2.08 34.37 0.00 9.98 13.80 5.42 2.44 0.09
...
Key fields:
r/s, w/s, rkB/s, wkB/s : read/write operations and throughput.
await : average I/O wait time (ms).
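avgqu-sz : average number of requests queued to the device; values above 1 can indicate saturation, although devices that serve requests in parallel may tolerate more.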
%util : device utilization; values above 60% may impact performance, 100% means fully saturated.
These help locate disk I/O bottlenecks.
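The avgqu-sz and %util thresholds above can be applied to a parsed device row as in the sketch below. The helper names are hypothetical, and the column layout is assumed to match the older sysstat format shown in the sample; newer sysstat releases use different column names.

```python
def parse_iostat_device(header, row):
    """Pair iostat -x column names with one device row (layout as shown above)."""
    names = header.split()[1:]  # drop the leading "Device:" label
    fields = row.split()
    stats = dict(zip(names, (float(value) for value in fields[1:])))
    stats["device"] = fields[0]
    return stats

def disk_findings(stats, util_limit=60.0, queue_limit=1.0):
    """Flag a device using the rule-of-thumb thresholds from the list above."""
    findings = []
    if stats["avgqu-sz"] > queue_limit:
        findings.append("queue_building")
    if stats["%util"] > util_limit:
        findings.append("high_utilization")
    return findings

# Header and row copied from the iostat output above.
header = ("Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s "
          "avgrq-sz avgqu-sz await r_await w_await svctm %util")
row = "xvda 0.00 0.23 0.21 0.18 4.52 2.08 34.37 0.00 9.98 13.80 5.42 2.44 0.09"
xvda = parse_iostat_device(header, row)
print(xvda["device"], xvda["await"], xvda["%util"], disk_findings(xvda))
```

The sample device is nearly idle (0.09% utilization, empty queue), so no findings are reported for it.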
free -m
$ free -m
total used free shared buffers cached
Mem: 245998 24545 221453 83 59 541
-/+ buffers/cache: 23944 222053
Swap: 0 0 0
Shows memory usage in megabytes. The “-/+ buffers/cache” line reflects memory actually available to applications, because Linux uses otherwise idle RAM for buffers and page cache and can reclaim it on demand.
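The free column of the “-/+ buffers/cache” line can be reproduced from the Mem: line by adding buffers and cached to free. A minimal worked example using the sample values; the function name is hypothetical.

```python
def available_to_apps_mb(free_mb, buffers_mb, cached_mb):
    """Memory applications could use: free plus reclaimable buffers and cache."""
    return free_mb + buffers_mb + cached_mb

# Values copied from the Mem: line of the free -m output above.
available = available_to_apps_mb(free_mb=221453, buffers_mb=59, cached_mb=541)
print(available)  # matches the 222053 shown on the -/+ buffers/cache line
```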
sar -n DEV 1
$ sar -n DEV 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
12:16:48 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
12:16:49 AM eth0 18763.00 5032.00 20686.42 478.30 0.00 0.00 0.00 0.00
...
Provides per-interface network throughput. In the example, eth0 receives about 20,686 kB/s (roughly 21 MB/s, or about 170 Mbit/s), well below a 1 Gbps NIC's capacity.
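Comparing the reported throughput with the link speed is a unit conversion, sketched below. It assumes that sar's kB means 1024 bytes and that the interface is a 1 Gbit/s link; both are assumptions for illustration, and the function names are hypothetical.

```python
def kb_per_sec_to_mbit(kb_per_sec):
    """Convert kB/s to Mbit/s, assuming 1 kB = 1024 bytes."""
    return kb_per_sec * 1024 * 8 / 1_000_000

def link_fraction(kb_per_sec, link_mbit):
    """Fraction of the assumed link capacity used in one direction."""
    return kb_per_sec_to_mbit(kb_per_sec) / link_mbit

# rxkB/s and txkB/s copied from the eth0 row above.
rx_mbit = kb_per_sec_to_mbit(20686.42)
tx_mbit = kb_per_sec_to_mbit(478.30)
print(round(rx_mbit, 1), round(tx_mbit, 1), round(link_fraction(20686.42, 1000), 2))
```

Under these assumptions the receive side uses roughly 17% of a 1 Gbit/s link, which supports the conclusion that the network is not the bottleneck in this sample.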
sar -n TCP,ETCP 1
$ sar -n TCP,ETCP 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
12:17:19 AM active/s passive/s iseg/s oseg/s
12:17:20 AM 1.00 0.00 10233.00 18846.00
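...
Shows key TCP metrics: active/s is the number of locally initiated connections per second (e.g. via connect()), passive/s is the number of remotely initiated connections per second (e.g. via accept()), and retrans/s, reported in the ETCP block that follows the TCP block, is the number of retransmitted segments per second. The active and passive counts give a rough measure of connection load, while retransmissions point to a network or server problem such as packet loss or an overloaded host dropping packets.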
top
$ top
top - 00:15:40 up 21:56, 1 user, load average: 31.09, 29.87, 29.92
Tasks: 871 total, 1 running, 868 sleeping, 0 stopped, 2 zombie
%Cpu(s): 96.8 us, 0.4 sy, 2.7 id, 0.1 wa
...
Aggregates many of the above metrics in a live view. It can be sorted by CPU, memory, or other columns to quickly locate the most resource-intensive processes. Because it refreshes continuously, patterns are harder to spot than in the rolling output of vmstat or pidstat, and you may need to pause the display (Ctrl-S to pause, Ctrl-Q to resume) for detailed analysis.
Conclusion
These ten commands—combined with careful interpretation of their output—allow you to pinpoint whether a performance problem stems from CPU saturation, memory pressure, disk I/O bottlenecks, or network issues, and to identify the offending processes (e.g., Java processes consuming many cores) for targeted remediation.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
