Diagnose Linux Server Performance in 1 Minute with 10 Essential Commands

When load on a Linux server suddenly spikes and alerts flood your phone, ten commands recommended by Netflix's performance engineering team can usually narrow down the cause within a minute. Together they give a quick read on CPU, memory, disk I/O, and network metrics.

MaGe Linux Operations

Apr 10, 2023

Overview

Running the following commands gives you a quick snapshot of system resource usage.

uptime

dmesg | tail

vmstat 1

mpstat -P ALL 1

pidstat 1

iostat -xz 1

free -m

sar -n DEV 1

sar -n TCP,ETCP 1

top

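mpstat, pidstat, iostat, and sar come from the sysstat package, which may need to be installed separately; uptime, vmstat, free, and top are provided by procps. The metrics these commands expose support the USE method: for each resource (CPU, memory, disk, network), check Utilization, Saturation, and Errors to locate the bottleneck.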
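
Before relying on the checklist, it can help to confirm that the tools are actually installed. The following is a minimal sketch, not part of the original checklist; the function name missing_tools is hypothetical, and it relies only on the shell builtin command -v.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The binaries behind the ten commands in the checklist.
missing_tools uptime dmesg vmstat mpstat pidstat iostat free sar top
```

If the call prints nothing, every tool is available; any name it prints usually means the corresponding package still needs to be installed.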

uptime

$ uptime
23:51:26 up 21:31,  1 user,  load average: 30.02, 26.43, 19.02

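This shows the 1-, 5-, and 15-minute load averages. On Linux, load counts tasks that are running or waiting for a CPU as well as tasks blocked in uninterruptible I/O, so a high value signals demand but does not say which resource is short. Comparing the three numbers shows the trend: here the 1-minute value (30.02) is well above the 15-minute value (19.02), so load has been climbing.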
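
The trend comparison can be scripted. The sketch below assumes the uptime line format shown above and a locale that uses a decimal point; load_trend is a hypothetical name, not a standard tool.

```shell
#!/bin/sh
# Hypothetical helper: read one line of `uptime` output on stdin and report
# whether load is rising or falling (1-minute vs 15-minute average).
load_trend() {
    awk -F'load averages?: ' 'NF > 1 {
        split($2, la, /[, ]+/)
        if (la[1] + 0 > la[3] + 0)      print "rising"
        else if (la[1] + 0 < la[3] + 0) print "falling"
        else                            print "steady"
    }'
}

# Usage:
#   LC_ALL=C uptime | load_trend
```

Fed the sample line above, this would report rising load, which is a hint to keep investigating rather than a diagnosis.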

dmesg | tail

$ dmesg | tail
[1880957.563150] perl invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
...
[2320864.954447] TCP: Possible SYN flooding on port 7001. Dropping request.  Check SNMP counters.

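Shows the last ten kernel messages, useful for spotting errors that explain a performance problem. In this sample, the OOM killer was invoked (triggered by a perl process) and the kernel reported possible SYN flooding on port 7001, with requests being dropped.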
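
When the log is noisy, a simple filter can surface the lines worth reading first. This is a sketch under assumptions: kernel_red_flags is a hypothetical name, and the pattern list is illustrative rather than a complete catalogue of kernel error messages.

```shell
#!/bin/sh
# Hypothetical filter: keep only kernel log lines that match a few common
# trouble signatures (case-insensitive). The list is illustrative only.
kernel_red_flags() {
    grep -Ei 'oom-killer|out of memory|syn flooding|i/o error|segfault'
}

# Usage (may need root on systems that restrict dmesg):
#   dmesg | tail -n 50 | kernel_red_flags
```

Both sample lines above would pass this filter. An empty result only means none of the listed patterns matched, so it is still worth reading the raw tail.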

vmstat 1

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so   bi   bo   in   cs us sy id wa st
34  0    0 200889792 73708 591828   0    0    0    5    6   10 96 1 3 0 0
...

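Displays one line of statistics per second; the first line reports averages since boot rather than the last second. Key columns are r (processes running or waiting for a CPU; a value above the CPU count suggests CPU saturation), free (free memory in KB), si/so (swap-ins and swap-outs; nonzero values mean the system is short of memory), and the CPU time breakdown (us, sy, id, wa, st).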
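
The two checks that matter most here, run queue versus CPU count and swap activity, can be sketched as a filter. The name vmstat_flags is hypothetical, and the column positions (r in column 1, si in 7, so in 8) are assumed from the layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `vmstat` output on stdin and flag samples where
# the run queue exceeds the CPU count (first argument) or swapping occurs.
# Column positions assume the layout shown above: r=1, si=7, so=8.
vmstat_flags() {
    awk -v ncpu="$1" '
        $1 ~ /^[0-9]+$/ {    # data rows only; both header lines are skipped
            if ($1 + 0 > ncpu + 0)        print "cpu-saturated r=" $1
            if ($7 + 0 > 0 || $8 + 0 > 0) print "swapping si=" $7 " so=" $8
        }'
}

# Usage (five one-second samples on a Linux host):
#   vmstat 1 5 | vmstat_flags "$(nproc)"
```

With the sample row above and 32 CPUs, the run queue of 34 would be flagged. Keep in mind that the first data row covers the time since boot.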

mpstat -P ALL 1

$ mpstat -P ALL 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
07:38:49 PM  CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
07:38:50 PM  all  98.47 0.00 0.75 0.00 0.00 0.00 0.00 0.00 0.00 0.78
...

Shows the CPU time breakdown for each CPU (the sample row above is the all-CPU average). One CPU that is far busier than the rest often points to a single-threaded bottleneck.

pidstat 1

$ pidstat 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
07:41:02 PM   UID   PID   %usr %system %guest %CPU CPU Command
07:41:03 PM    0    9    0.00   0.94   0.00  0.94  1  rcuos/0
...

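Lists each process's CPU consumption as rolling output, so changes over time stay visible. In the full example (rows elided above), two Java processes consume about 1,600% CPU. The %CPU column is summed across all CPUs, so 1,600% corresponds to roughly 16 fully busy cores.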

iostat -xz 1

$ iostat -xz 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
...
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda   0.00   0.23 0.21 0.18 4.52 2.08 34.37 0.00 9.98 13.80 5.42 2.44 0.09
...

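Provides per-device disk I/O metrics. The important fields are r/s, w/s, rkB/s, and wkB/s (request rate and throughput), await (average time per I/O in milliseconds, including time spent queued), avgqu-sz (average queue length; values above 1 can indicate saturation), and %util (percentage of time the device was busy). Newer sysstat releases rename some columns; for example, avgqu-sz appears as aqu-sz.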
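
Since column positions differ between sysstat versions, a filter for busy devices is safer if it locates %util by name. The sketch below does that; busy_disks is a hypothetical name and the threshold is whatever the caller passes, with 60 used here purely as an illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `iostat -xz` output on stdin and print devices
# whose %util is at or above a threshold (first argument, in percent).
# The %util column is found by name because its position varies by version.
busy_disks() {
    awk -v limit="$1" '
        $1 ~ /^Device/ {     # header row: remember where %util is
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col + 0 >= limit + 0 { print $1, $col }
    '
}

# Usage (three one-second reports):
#   LC_ALL=C iostat -xz 1 3 | busy_disks 60
```

The sample device above, xvda at 0.09% utilization, would not be reported. As with vmstat, the first report covers the period since boot.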

free -m

$ free -m
              total   used   free  shared buffers cached
Mem:        245998  24545 221453    83     59    541
-/+ buffers/cache: 23944 222053
Swap:            0      0      0

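Shows memory usage in megabytes. The “-/+ buffers/cache” line treats buffers and page cache as free, reflecting that Linux uses otherwise idle memory for caching and reclaims it when applications need it: here 221453 + 59 + 541 = 222053 MB is effectively free. Newer versions of free drop this line and report an “available” column instead.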
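
The arithmetic behind the “-/+ buffers/cache” line can be reproduced from the Mem: row. This sketch assumes the older free layout shown above (free in column 4, buffers in 6, cached in 7); effective_free_mb is a hypothetical name, and on newer versions the “available” column should be read directly instead.

```shell
#!/bin/sh
# Hypothetical helper: read old-style `free -m` output on stdin (the layout
# shown above) and print free + buffers + cached from the Mem: row, in MB.
effective_free_mb() {
    awk '$1 == "Mem:" { print $4 + $6 + $7 }'
}

# Usage (only meaningful with the old six-column layout):
#   free -m | effective_free_mb
```

For the sample output this gives the same 222053 MB that free prints on its second line.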

sar -n DEV 1

$ sar -n DEV 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
12:16:48 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
12:16:49 AM eth0 18763.00 5032.00 20686.42 478.30 0.00 0.00 0.00 0.00
...

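Displays per-interface network throughput: rxkB/s and txkB/s give the receive and transmit rates. Compare them against the link speed to judge whether the NIC is near saturation; in the sample, eth0 receives about 20 MB/s (roughly 170 Mbit/s).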
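
Link speeds are quoted in bits per second, so the kB/s figures need converting before comparison. The sketch below is a worked conversion under one assumption: that sar counts a kilobyte as 1024 bytes. The name kbps_to_mbit is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a kB/s figure from sar into Mbit/s,
# assuming 1 kB = 1024 bytes and 1 Mbit = 1,000,000 bits.
kbps_to_mbit() {
    awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb * 1024 * 8 / 1000000 }'
}

# Usage with the rxkB/s value from the sample above:
#   kbps_to_mbit 20686.42
```

The result can then be compared directly with the interface's rated speed.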

sar -n TCP,ETCP 1

$ sar -n TCP,ETCP 1
Linux 3.13.0-49-generic (titanclusters-xxxxx) 07/14/2015 _x86_64_ (32 CPU)
12:17:19 AM active/s passive/s iseg/s oseg/s
12:17:20 AM   1.00   0.00 10233.00 18846.00
...

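Shows key TCP metrics: active/s (locally initiated connections per second), passive/s (remotely initiated connections per second), and, in the ETCP rows elided above, retrans/s (retransmitted segments per second). Retransmissions point to network or remote-server problems, which makes this view useful for diagnosing network‑related performance issues.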

top

$ top
top - 00:15:40 up 21:56, 1 user, load average: 31.09, 29.87, 29.92
...

Combines many of the above metrics in a single, real‑time view and allows sorting by CPU, memory, or other columns.

Conclusion

The ten commands above enable rapid identification of Linux server bottlenecks. In the sample output, Java processes were consuming massive CPU, guiding the next steps of performance tuning.
