Diagnose Linux Server Bottlenecks in 60 Seconds with 10 Essential Commands
When a Linux server suddenly spikes in load, this guide shows how to pinpoint the root cause within a minute by running ten key commands that reveal CPU, memory, disk I/O, and network metrics, enabling rapid performance troubleshooting.
Overview
If a Linux server’s load jumps dramatically and alerts flood your phone, you can identify the performance problem in under a minute by executing a short list of commands recommended by Netflix’s performance engineering team.
Command list
uptime
dmesg | tail
vmstat 1
mpstat -P ALL 1
pidstat 1
iostat -xz 1
free -m
sar -n DEV 1
sar -n TCP,ETCP 1
top
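mpstat, pidstat, iostat, and sar come from the sysstat package, which may need to be installed; uptime, vmstat, free, and top are provided by procps. The metrics they expose support the USE method: for each resource (CPU, memory, disk, network), check utilization, saturation, and errors to locate the bottleneck quickly.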
uptime
$ uptime
23:51:26 up 21:31, 1 user, load average: 30.02, 26.43, 19.02
The three numbers are the 1-, 5-, and 15-minute load averages. On Linux they count processes that are running or waiting for a CPU as well as processes blocked in uninterruptible I/O (usually disk I/O). A 1-minute value well above the 15-minute value, as here, means load has been climbing recently; the remaining commands show which resource is responsible.
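A load average is easier to judge relative to the number of CPUs. The following is a minimal Python sketch, not part of the original command list: the helper name `load_per_cpu` is hypothetical, and the CPU count of 32 is assumed from the "(32 CPU)" banner that appears in the sysstat output later in this article.

```python
import re

def load_per_cpu(uptime_line, cpu_count):
    """Divide the 1-, 5-, and 15-minute load averages by the CPU count."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in input")
    return [round(float(value) / cpu_count, 2) for value in match.groups()]

# Sample line from the uptime output in this article; cpu_count=32 is an
# assumption taken from the "(32 CPU)" banner in the sysstat output.
sample = "23:51:26 up 21:31, 1 user, load average: 30.02, 26.43, 19.02"
ratios = load_per_cpu(sample, cpu_count=32)
print(ratios)
```

For the sample line the ratios come out at about 0.94, 0.83, and 0.59. A ratio near or above 1.0 is only a rough hint of CPU saturation, because the Linux load average also includes tasks in uninterruptible I/O. On a live host, `os.getloadavg()` and `os.cpu_count()` could supply the inputs instead of a parsed string.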
dmesg | tail
$ dmesg | tail
[1880957.563150] perl invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
[1880957.563400] Out of memory: Kill process 18694 (perl) score 246 or sacrifice child
[1880957.563408] Killed process 18694 (perl) total-vm:1972392kB, anon-rss:1953348kB, file-rss:0kB
[2320864.954447] TCP: Possible SYN flooding on port 7001. Dropping request. Check SNMP counters.
dmesg | tail prints the last ten kernel messages, which can reveal errors behind a performance problem. In this example the OOM killer has terminated a perl process, and the kernel is dropping TCP requests because of a possible SYN flood on port 7001.
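Scanning kernel messages for known signatures can be automated. The sketch below is one possible approach, assuming a small hand-written pattern table; the labels, the `PATTERNS` dictionary, and the `classify` helper are hypothetical names chosen for illustration, and the regular expressions only cover the two message types shown in this article.

```python
import re

# Hypothetical pattern table for this sketch; it covers only the message
# types shown in the article and would need extending for real use.
PATTERNS = {
    "oom_kill": re.compile(r"oom-killer|Out of memory|Killed process"),
    "syn_flood": re.compile(r"Possible SYN flooding"),
}

def classify(lines):
    """Group kernel log lines by the first pattern label each one matches."""
    hits = {}
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(line)
                break
    return hits

sample = [
    "[1880957.563150] perl invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0",
    "[1880957.563400] Out of memory: Kill process 18694 (perl) score 246 or sacrifice child",
    "[1880957.563408] Killed process 18694 (perl) total-vm:1972392kB, anon-rss:1953348kB, file-rss:0kB",
    "[2320864.954447] TCP: Possible SYN flooding on port 7001. Dropping request. Check SNMP counters.",
]
hits = classify(sample)
for label, matched in hits.items():
    print(label, len(matched))
```

On a real host the input could come from `subprocess.run(["dmesg"], capture_output=True, text=True).stdout.splitlines()[-10:]`; note that reading the kernel log may require elevated privileges on some systems.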
vmstat 1
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
34 0 0 200889792 73708 591828 0 0 0 5 6 10 96 1 3 0 0
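...
Key columns:
r : processes running on a CPU or waiting for one; a value above the CPU count means the CPUs are saturated.
free : free memory in kilobytes.
si/so : swap-ins and swap-outs; non-zero values mean the system is swapping because it is short of memory.
us, sy, id, wa, st : user, system, idle, I/O wait, and stolen CPU time, as percentages averaged across all CPUs.
A high r or a low id points to CPU pressure; a sustained high wa suggests an I/O bottleneck. The first line of vmstat output shows averages since boot, so judge the current state from the second line onward.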
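The checks described above can be expressed as a few comparisons. This is a minimal sketch under stated assumptions: the sample row has 17 fields matching the vmstat header shown in this article, the CPU count of 32 comes from the sysstat banner, and the 10% I/O-wait threshold (`iowait_warn`) is an arbitrary illustrative value, not a recommendation from the source. The function names are hypothetical.

```python
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()

def parse_vmstat_row(row):
    """Map one vmstat data row onto the column names of the header."""
    values = row.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    return dict(zip(COLUMNS, (int(v) for v in values)))

def findings(stats, cpu_count, iowait_warn=10):
    """Return human-readable notes; iowait_warn is an assumed threshold."""
    notes = []
    if stats["r"] > cpu_count:
        notes.append(f"run queue {stats['r']} exceeds {cpu_count} CPUs")
    if stats["si"] > 0 or stats["so"] > 0:
        notes.append("swap activity detected")
    if stats["wa"] >= iowait_warn:
        notes.append(f"I/O wait at {stats['wa']}%")
    return notes

# Sample row in the layout of the vmstat header (17 fields).
row = "34 0 0 200889792 73708 591828 0 0 0 5 6 10 96 1 3 0 0"
stats = parse_vmstat_row(row)
notes = findings(stats, cpu_count=32)
print(notes)
```

For this row only the run-queue check fires, since r is 34 on a 32-CPU host while si, so, and wa are all zero.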
mpstat -P ALL 1
$ mpstat -P ALL 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
07:38:49 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
all 98.47 0.00 0.75 0.00 0.00 0.00 0.00 0.00 0.00 0.78
0 96.04 0.00 2.97 0.00 0.00 0.00 0.00 0.00 0.00 0.99
1 97.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00
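...
This per-CPU view exposes imbalance: a single CPU that is far busier than the rest usually indicates a single-threaded application monopolizing it. In this sample the all row and the individual CPUs shown are each above 96% user time, so the load is spread across CPUs rather than pinned to one.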
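One way to make "unusually busy" concrete is to compare each CPU's busy percentage with the mean of the others. The sketch below is illustrative only: the function name `busiest_outliers`, the 50-point `gap` threshold, and the four-CPU sample values are all hypothetical and do not come from the mpstat output in this article.

```python
def busiest_outliers(idle_pct_by_cpu, gap=50.0):
    """Return CPUs whose busy % exceeds the mean of the other CPUs by `gap` points."""
    busy = {cpu: 100.0 - idle for cpu, idle in idle_pct_by_cpu.items()}
    outliers = []
    for cpu, pct in busy.items():
        others = [value for other, value in busy.items() if other != cpu]
        if others and pct - sum(others) / len(others) >= gap:
            outliers.append(cpu)
    return outliers

# Hypothetical %idle values: CPU 0 is pegged while the rest are nearly idle,
# the pattern a single-threaded workload tends to produce.
single_threaded = {"0": 0.5, "1": 98.0, "2": 97.5, "3": 99.0}
print(busiest_outliers(single_threaded))

# %idle for CPUs 0 and 1 from the mpstat sample: both busy, so no outlier.
balanced = {"0": 0.99, "1": 2.00}
print(busiest_outliers(balanced))
```

The first call reports CPU 0 and the second reports nothing, which matches the reading that evenly high usage is a capacity problem rather than a single hot thread.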
pidstat 1
$ pidstat 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
07:41:02 PM UID PID %usr %system %guest %CPU CPU Command
07:41:03 PM 0 9 0.00 0.94 0.00 0.94 1 rcuos/0
07:41:03 PM 0 4214 5.66 5.66 0.00 11.32 15 mesos-slave
07:41:03 PM 0 6521 1596.23 1.89 0.00 1598.11 27 java
07:41:03 PM 0 6564 1571.70 7.55 0.00 1579.25 28 java
...
Each line shows one process’s CPU usage. %CPU is summed across all CPUs, so values above 100% mean the process is using more than one CPU. Here each of the two java processes consumes roughly 16 CPUs (1598% and 1579%), so together they account for nearly all 32 CPUs on this host.
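Converting %CPU into a CPU count is simple division by 100. The sketch below applies it to the four pidstat rows shown in this article; the helper name `cpus_used` is hypothetical, and the field positions assume the 12-hour timestamp layout of the sample (a 24-hour locale drops the AM/PM field and shifts every index by one).

```python
def cpus_used(pidstat_rows):
    """Return {"command[pid]": CPUs consumed} from pidstat data rows."""
    usage = {}
    for row in pidstat_rows:
        fields = row.split()
        # Assumed layout: time, AM/PM, UID, PID, %usr, %system, %guest,
        # %CPU, CPU, Command.
        pid, pct_cpu, command = fields[3], float(fields[7]), fields[9]
        usage[f"{command}[{pid}]"] = pct_cpu / 100
    return usage

rows = [
    "07:41:03 PM 0 9 0.00 0.94 0.00 0.94 1 rcuos/0",
    "07:41:03 PM 0 4214 5.66 5.66 0.00 11.32 15 mesos-slave",
    "07:41:03 PM 0 6521 1596.23 1.89 0.00 1598.11 27 java",
    "07:41:03 PM 0 6564 1571.70 7.55 0.00 1579.25 28 java",
]
usage = cpus_used(rows)
for name, cpus in usage.items():
    print(f"{name}: {cpus:.2f} CPUs")
print(f"total: {sum(usage.values()):.2f} CPUs")
```

The total comes to about 31.90 CPUs, nearly all of it from the two java processes.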
iostat -xz 1
$ iostat -xz 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
73.96 0.00 3.73 0.03 0.06 22.21
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 4.52 2.08 34.37 0.00 0.00 2.44 0.09
xvdb 0.00 0.01 0.00 1.02 127.97 598.53 145.79 0.00 1.78 0.00 0.28 0.25 0.25
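...
Important columns:
r/s, w/s, rkB/s, wkB/s : reads, writes, read kilobytes, and write kilobytes delivered to the device per second.
await : average time per I/O in milliseconds, including both time spent queued and time being serviced.
avgqu-sz : average number of requests issued to the device; values above 1 can indicate saturation, although devices that serve requests in parallel tolerate more.
%util : percentage of time the device was busy; above 60% typically degrades performance, and values near 100% usually mean saturation.
Unexpectedly high await or %util points to a disk I/O bottleneck. For virtual devices backed by several physical disks, 100% utilization only means some I/O was in flight all the time, so the back end may still have headroom.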
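The thresholds above can be applied mechanically to each device row. This is a rough sketch, not a definitive rule set: `parse_iostat` and `flag_device` are hypothetical helpers, the default thresholds simply restate the 60% and queue-length-1 guidance from the text, and the two device rows (`sda`, `sdb`) are invented sample values rather than output from the host in this article.

```python
def parse_iostat(header, rows):
    """Turn iostat -x device rows into dictionaries keyed by column name."""
    names = header.split()
    devices = []
    for row in rows:
        fields = row.split()
        record = {"device": fields[0]}
        record.update({name: float(v) for name, v in zip(names[1:], fields[1:])})
        devices.append(record)
    return devices

def flag_device(dev, util_warn=60.0, queue_warn=1.0):
    """Return the reasons a device looks busy; thresholds are assumptions."""
    reasons = []
    if dev["%util"] >= util_warn:
        reasons.append(f"%util {dev['%util']:.1f} >= {util_warn:.0f}")
    # Newer sysstat releases may label this column aqu-sz instead.
    if dev["avgqu-sz"] > queue_warn:
        reasons.append(f"avgqu-sz {dev['avgqu-sz']:.1f} > {queue_warn:.0f}")
    return reasons

header = ("Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz "
          "await r_await w_await svctm %util")
# Invented rows for illustration only.
rows = [
    "sda 0.00 12.00 310.00 95.00 24800.00 7600.00 160.00 8.30 20.50 18.20 28.00 2.40 97.50",
    "sdb 0.00 0.01 0.00 1.02 0.10 8.50 16.86 0.00 1.78 0.00 1.78 0.25 0.03",
]
flagged = {}
for dev in parse_iostat(header, rows):
    reasons = flag_device(dev)
    if reasons:
        flagged[dev["device"]] = reasons
print(flagged)
```

Only the invented `sda` row is flagged, once for utilization and once for queue length. Column names differ between sysstat versions, so the header should always be taken from the actual output rather than hard-coded.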
free -m
$ free -m
total used free shared buffers cached
Mem: 245998 24545 221453 83 59 541
-/+ buffers/cache: 23944 222053
Swap: 0 0 0
The -/+ buffers/cache row shows used and free memory with buffers and page cache discounted, so its free column is the memory effectively available to applications. Linux uses otherwise idle RAM for caching and reclaims it when applications need it.
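The -/+ buffers/cache figures are plain arithmetic on the Mem row, which the following sketch reproduces with the values from the sample. The function name `buffers_cache_row` is hypothetical.

```python
def buffers_cache_row(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row from the Mem row (all in MB)."""
    app_used = used - buffers - cached   # memory applications actually hold
    app_free = free + buffers + cached   # memory applications could claim
    return app_used, app_free

# Values from the Mem row of the free -m sample.
app_used, app_free = buffers_cache_row(used=24545, free=221453, buffers=59, cached=541)
print(app_used, app_free)
```

This yields 23945 and 222053. The free figure matches the sample exactly; the used figure differs by 1 MB from the 23944 shown, most likely because free rounds each value to megabytes independently. Newer versions of free replace this row with an available column, which serves the same purpose.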
sar -n DEV 1
$ sar -n DEV 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
12:16:48 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
12:16:49 AM eth0 18763.00 5032.00 20686.42 478.30 0.00 0.00 0.00 0.00
12:16:49 AM lo 14.00 14.00 1.36 1.36 0.00 0.00 0.00 0.00
Interface throughput (rxkB/s and txkB/s) shows whether a NIC is approaching its limit. In this example eth0 is receiving about 21 MB/s, roughly 170 Mbit/s, which is well below the capacity of a 1 Gbit/s link.
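The comparison against link capacity is a unit conversion. The sketch below works it through for the eth0 row under two assumptions that should be checked against your own environment: that sar's kB means 1024 bytes, and that the link is a full-duplex 1 Gbit/s interface, so the busier direction determines utilization. The function name `link_usage` is hypothetical.

```python
def link_usage(rx_kb_s, tx_kb_s, link_mbit_s=1000.0):
    """Convert sar kB/s figures to Mbit/s and estimate link utilization.

    Assumes 1 kB = 1024 bytes and a full-duplex link, so utilization is
    based on whichever direction carries more traffic.
    """
    rx_mbit = rx_kb_s * 1024 * 8 / 1_000_000
    tx_mbit = tx_kb_s * 1024 * 8 / 1_000_000
    utilization = max(rx_mbit, tx_mbit) / link_mbit_s * 100
    return round(rx_mbit, 2), round(tx_mbit, 2), round(utilization, 1)

# rxkB/s and txkB/s for eth0 from the sar -n DEV sample.
rx_mbit, tx_mbit, utilization = link_usage(20686.42, 478.30)
print(f"rx {rx_mbit} Mbit/s, tx {tx_mbit} Mbit/s, about {utilization}% of 1 Gbit/s")
```

With these inputs the interface is receiving about 169 Mbit/s, or roughly 17% of the assumed link speed.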
sar -n TCP,ETCP 1
$ sar -n TCP,ETCP 1
12:17:20 AM active/s passive/s iseg/s oseg/s
12:17:20 AM 1.00 0.00 10233.00 18846.00
12:17:20 AM atmptf/s estres/s retrans/s isegerr/s orsts/s
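12:17:20 AM 0.00 0.00 0.00 0.00 0.00
active/s counts locally initiated TCP connections per second (typically outbound) and passive/s counts remotely initiated ones (typically inbound), which together give a rough measure of connection load. retrans/s counts TCP retransmissions per second; a non-zero rate points to packet loss from an unreliable network or an overloaded server.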
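A retransmission rate is easier to interpret as a share of segments sent. The sketch below divides retrans/s by oseg/s; this ratio is a common rule-of-thumb calculation rather than something the source article prescribes, the helper name `retransmit_pct` is hypothetical, and the second call uses an invented retransmission rate for contrast.

```python
def retransmit_pct(retrans_per_s, oseg_per_s):
    """Retransmitted segments as a percentage of segments sent."""
    if oseg_per_s == 0:
        return 0.0
    return round(retrans_per_s / oseg_per_s * 100, 2)

healthy = retransmit_pct(0.00, 18846.00)   # values from the sar sample
lossy = retransmit_pct(150.0, 18846.00)    # invented rate for illustration
print(healthy, lossy)
```

The sample interval shows no retransmissions at all, while the invented figure works out to about 0.8% of outgoing segments.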
top
$ top
top - 00:15:40 up 21:56, 1 user, load average: 31.09, 29.87, 29.92
Tasks: 871 total, 1 running, 868 sleeping, 0 stopped, 2 zombie
%Cpu(s): 96.8 us, 0.4 sy, 2.7 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 25190241+total, 24921688 used, 22698073+free, 60448 buffers
KiB Swap: 0 total, 0 used, 0 free. 554208 cached Mem
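...
top aggregates many of the metrics covered by the earlier commands (load, memory, CPU) and allows interactive sorting to find the most resource-hungry processes. Its drawback is that each refresh overwrites the screen, so patterns over time are harder to spot than in the rolling output of vmstat or pidstat. Pause the display with Ctrl-S (resume with Ctrl-Q) to capture a transient state, or cross-check it against the other commands.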
Conclusion
These ten commands provide a rapid, low-overhead way to diagnose Linux server performance problems. By correlating their outputs, especially high CPU usage from pidstat, I/O saturation from iostat, and memory pressure from free, you can pinpoint the offending subsystem and focus subsequent tuning efforts on the relevant application or hardware component.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
