Linux Performance Analysis in the First 60 Seconds: Essential Commands
The article outlines a rapid, one-minute health check for Linux servers using ten commands built on nine common command-line tools (uptime, dmesg, vmstat, mpstat, pidstat, iostat, free, sar, and top) to quickly reveal CPU, memory, I/O, and network saturation and error conditions before deeper analysis.
This article, translated from the Netflix Tech Blog "Linux Performance Analysis in 60,000 Milliseconds," explains how to obtain a quick, high‑level view of a Linux server’s health within the first minute after login. It focuses on using standard Linux command‑line tools that are available on most distributions, following the USE (Utilization, Saturation, Errors) methodology.
Summary of the first 60 seconds
Running the ten commands listed below gives insight into CPU load, memory usage, I/O activity, network traffic, and recent kernel messages. The output is easy to read and helps identify saturation points or error conditions.
uptime
dmesg | tail
vmstat 1
mpstat -P ALL 1
pidstat 1
iostat -xz 1
free -m
sar -n DEV 1
sar -n TCP,ETCP 1
top
Some of these commands belong to the sysstat package and may need to be installed first.
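The command list above can be wrapped in a small script so that the whole pass finishes on its own. This is a minimal sketch, not part of the original article: the function names `first60_commands` and `run_first60` and the sample-count argument are hypothetical, each interval-based command is given an explicit count so it exits, and `top` is run in batch mode (`-b -n 1`) instead of interactively.

```shell
#!/bin/sh
# Print the ten commands, one per line.
# $1 is an assumed sample count (default 5) appended to the
# interval-based tools so that they terminate by themselves.
first60_commands() {
    samples="${1:-5}"
    cat <<EOF
uptime
dmesg | tail
vmstat 1 $samples
mpstat -P ALL 1 $samples
pidstat 1 $samples
iostat -xz 1 $samples
free -m
sar -n DEV 1 $samples
sar -n TCP,ETCP 1 $samples
top -b -n 1
EOF
}

# Run each command in turn and skip tools that are not installed.
run_first60() {
    first60_commands "${1:-5}" | while IFS= read -r cmd; do
        tool=${cmd%% *}                      # first word of the command line
        printf '\n=== %s ===\n' "$cmd"
        if command -v "$tool" >/dev/null 2>&1; then
            # stdin is redirected so the command cannot consume the list
            sh -c "$cmd" </dev/null
        else
            echo "skipped: $tool not found (sysstat may need to be installed)"
        fi
    done
}
```

Calling `run_first60` uses five samples per tool, and `run_first60 3` shortens the pass. On some systems `dmesg` requires elevated privileges, in which case that step prints an error and the loop continues.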
Command details
1. uptime – Shows the system’s load average for the past 1, 5, and 15 minutes and the number of users. A high 1‑minute load relative to the 15‑minute load often indicates a recent spike.
2. dmesg | tail – Prints the last kernel messages. Useful for spotting OOM kills, device errors, or network‑related warnings.
3. vmstat 1 – Provides per‑second snapshots of processes, memory, swap, I/O, and CPU statistics. Columns such as r (runnable processes) indicate CPU saturation, while si/so show swap activity.
4. mpstat -P ALL 1 – Shows CPU usage per core. High utilization on a single core may reveal a single‑threaded bottleneck.
5. pidstat 1 – Similar to top but prints per‑process CPU usage continuously, making it easy to track which processes consume many cores (e.g., a Java process showing >1500% CPU means it is using ~15 cores).
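6. iostat -xz 1 – Reports device-level I/O statistics. Important fields include r/s, w/s, rkB/s, wkB/s, await, avgqu-sz (shown as aqu-sz in newer sysstat releases), and %util. A %util above 60% typically leads to poor performance, which should also show up in await, although this depends on the device; values close to 100% usually indicate saturation.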
7. free -m – Displays total, used, and free memory, as well as buffers and cache. On older versions of free, the "-/+ buffers/cache" line gives a clearer picture of the memory actually available to applications; newer versions report this in an "available" column instead.
8. sar -n DEV 1 – Monitors network interface throughput (rxkB/s, txkB/s) and can reveal whether a NIC is approaching its bandwidth limit.
9. sar -n TCP,ETCP 1 – Shows TCP statistics such as active/s (locally initiated connections per second), passive/s (remotely initiated connections per second), and retrans/s (retransmits per second). A high retransmission rate often signals network problems or server overload.
10. top – Provides a real-time view of processes, CPU, and memory usage. While it is useful for a quick glance, its screen-refresh nature makes it harder to spot trends compared with the rolling output of vmstat or pidstat.
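Three of the rules of thumb in the list above can be written as small shell helpers. This is only an illustrative sketch: the function names are hypothetical, the input values are passed in by hand and are not parsed from live tool output, and the run-queue check assumes the common reading that a vmstat r value greater than the CPU count means saturation.

```shell
#!/bin/sh
# Compare the 1-minute and 15-minute load averages (arguments 1 and 3).
# Prints "rising", "falling", or "steady".
load_trend() {
    awk -v one="$1" -v fifteen="$3" 'BEGIN {
        if ((one + 0) > (fifteen + 0))      print "rising"
        else if ((one + 0) < (fifteen + 0)) print "falling"
        else                                print "steady"
    }'
}

# Convert a pidstat or top %CPU figure into an approximate core count.
cpu_pct_to_cores() {
    awk -v pct="$1" 'BEGIN { printf "%.1f\n", pct / 100 }'
}

# Compare the vmstat "r" column (argument 1) with the CPU count (argument 2).
run_queue_state() {
    if [ "$1" -gt "$2" ]; then
        echo "saturated"
    else
        echo "ok"
    fi
}
```

For example, `load_trend 30.02 26.43 19.02` prints `rising`, `cpu_pct_to_cores 1500` prints `15.0`, and `run_queue_state 34 32` prints `saturated`. On Linux the three load averages can be taken from the first three fields of /proc/loadavg.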
After the initial 60‑second snapshot, the article suggests deeper analysis using additional tools (e.g., Brendan Gregg’s Linux performance toolkit) that cover observability, benchmarking, tuning, and tracing.
Tencent Cloud Developer