How to Diagnose Linux Performance in the First 60 Seconds
Learn the ten standard Linux command-line tools to run in the first minute after logging into a server, so you can quickly assess process activity, resource usage, and likely bottlenecks when troubleshooting performance in production.
Netflix’s Performance Engineering team shares a practical, 60‑second checklist for diagnosing Linux performance problems using standard command‑line utilities that are typically available on any server.
Overview
Running ten specific commands gives you a quick snapshot of processes, resource saturation, and error messages, helping you identify bottlenecks early. Some commands require the sysstat package. The output supports the USE method (Utilization, Saturation, Errors) for pinpointing issues.
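As a rough sketch of a pre-flight check, the snippet below lists the ten commands in the order the sections follow and reports which binaries are not on PATH. The names CHECKLIST and missing_tools are illustrative, not part of the original article; mpstat, pidstat, iostat, and sar are the tools shipped in the sysstat package.

```python
import shutil

# The ten checklist commands, in the order the sections below cover them.
CHECKLIST = [
    "uptime",
    "dmesg | tail",
    "vmstat 1",
    "mpstat -P ALL 1",
    "pidstat 1",
    "iostat -xz 1",
    "free -m",
    "sar -n DEV 1",
    "sar -n TCP,ETCP 1",
    "top",
]


def missing_tools(checklist=CHECKLIST):
    """Return the binaries named in the checklist that are not on PATH."""
    binaries = {command.split()[0] for command in checklist}
    return sorted(name for name in binaries if shutil.which(name) is None)


if __name__ == "__main__":
    # An empty list means every tool is installed.
    print(missing_tools())
```

If the list that comes back contains mpstat, pidstat, iostat, or sar, installing sysstat is the likely fix.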
1. uptime
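Shows the load averages over the past 1, 5, and 15 minutes. On Linux these count processes that want to run on a CPU as well as processes blocked in uninterruptible I/O, so they give only a high-level idea of demand. A 1-minute value well above the 15-minute value suggests load is rising; a much lower one suggests the spike may already have passed. Use vmstat or mpstat for deeper CPU analysis.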
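The comparison of the three averages can be sketched as follows. The function names and the sample line are illustrative assumptions, and the regular expression assumes a locale that writes decimals with a dot.

```python
import re


def parse_load_averages(uptime_output):
    """Extract the 1-, 5-, and 15-minute load averages from uptime output."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_output
    )
    if match is None:
        raise ValueError("no load averages found")
    return tuple(float(value) for value in match.groups())


def load_trend(one, five, fifteen):
    """Describe whether load is rising or falling across the three windows."""
    if one > five > fifteen:
        return "rising"
    if one < five < fifteen:
        return "falling"
    return "steady or mixed"


# Illustrative sample line, not captured from a real server.
sample = " 23:51:26 up 21:31,  1 user,  load average: 30.02, 26.43, 19.02"
print(load_trend(*parse_load_averages(sample)))
```

On the machine itself, os.getloadavg() returns the same three numbers without any parsing.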
2. dmesg | tail
Displays the last ten kernel messages, if any, which can reveal errors such as OOM kills or dropped TCP requests. Do not skip this step; dmesg is always worth checking.
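A minimal sketch of scanning those messages for the two error types mentioned above. The patterns and sample lines are illustrative assumptions; real kernel wording varies between versions, so treat the regular expressions as a starting point.

```python
import re

# Hypothetical patterns; kernel message wording differs across versions.
PATTERNS = {
    "oom_kill": re.compile(r"out of memory|oom-killer", re.IGNORECASE),
    "tcp_drop": re.compile(r"TCP:.*(drop|SYN flooding)", re.IGNORECASE),
}


def flag_kernel_messages(lines):
    """Return (label, line) pairs for lines matching a known error pattern."""
    findings = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((label, line.strip()))
    return findings


# Illustrative sample lines, not taken from a real log.
sample_lines = [
    "[1880957.563150] perl invoked oom-killer: gfp_mask=0x280da, order=0",
    "[1880957.563408] Killed process 18694 (perl) total-vm:1972392kB",
    "[2320864.954447] TCP: Possible SYN flooding on port 7001. Dropping request.",
]

for label, line in flag_kernel_messages(sample_lines):
    print(label, "->", line)
```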
3. vmstat 1
Provides per‑second virtual memory statistics. Important columns:
r : number of processes running on a CPU or waiting for a turn; unlike the load averages it does not include I/O, so values greater than the CPU count indicate CPU saturation.
free : free memory in KB (use free -m for a clearer view).
si, so : swap‑in and swap‑out rates; non‑zero values indicate memory pressure.
us, sy, id, wa, st : CPU time breakdown, averaged across all CPUs (user, system, idle, I/O wait, stolen). High wa points to a disk bottleneck; a sy average above 20% is worth investigating, since the kernel may be processing I/O inefficiently.
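The column checks above can be sketched as a small parser. The function and flag names are assumptions made for illustration, and the sample row is made up. Only the thresholds the text names are encoded; wa is left out because no threshold is given for it.

```python
def parse_vmstat_row(header_line, data_line):
    """Pair vmstat column names with the integer values of one sample row."""
    names = header_line.split()
    values = [int(value) for value in data_line.split()]
    return dict(zip(names, values))


def vmstat_flags(row, cpu_count):
    """Apply the rules of thumb from the column list to one vmstat sample."""
    flags = []
    if row["r"] > cpu_count:
        flags.append("cpu_saturation")
    if row["si"] > 0 or row["so"] > 0:
        flags.append("swapping")
    if row["sy"] > 20:
        flags.append("high_system_time")
    return flags


# Illustrative header and row; a 32-CPU machine is assumed.
header = " r  b swpd      free  buff  cache si so bi bo in cs us sy id wa st"
row = "34  0    0 200889792 73708 591828  0  0  0  5  6 10 96  1  3  0  0"

print(vmstat_flags(parse_vmstat_row(header, row), cpu_count=32))
```

The first row vmstat prints shows averages since boot rather than the last second, so it is usually better to feed this a later row.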
4. mpstat -P ALL 1
Shows per‑CPU utilization, useful for spotting imbalanced load or single‑threaded bottlenecks.
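One way to spot a single-threaded bottleneck in per-CPU figures is to look for a CPU that is nearly fully busy while the typical CPU is quiet. The sketch below assumes you have already computed busy percentages as 100 minus %idle; the function name and both thresholds are hypothetical choices, not values from the article.

```python
from statistics import median


def single_hot_cpus(busy_percent, hot=90.0, quiet=20.0):
    """Return CPUs at or above `hot` % busy when the median CPU is below `quiet` %.

    busy_percent maps a CPU number to its busy percentage (100 - %idle).
    """
    if not busy_percent:
        return []
    if median(busy_percent.values()) >= quiet:
        # Load is spread across the machine, so no single CPU stands out.
        return []
    return sorted(cpu for cpu, busy in busy_percent.items() if busy >= hot)


# Illustrative values: CPU 0 is pegged while the others are nearly idle.
print(single_hot_cpus({0: 99.0, 1: 3.0, 2: 2.5, 3: 4.0}))
```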
5. pidstat 1
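Similar to top, but prints a rolling per-process summary instead of clearing the screen, which makes patterns over time easier to see and to record. The %CPU column is summed across all CPUs: in the original article's example, two Java processes were each consuming almost 16 CPUs, one of them shown as 1591%.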
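Because %CPU is a sum over all CPUs, dividing by 100 gives the number of CPUs a process is keeping busy. A tiny sketch of that arithmetic, with an illustrative function name:

```python
def cpus_consumed(percent_cpu):
    """Convert a pidstat %CPU value (summed across all CPUs) into a CPU count."""
    return percent_cpu / 100.0


# 1591% corresponds to just under 16 CPUs.
print(cpus_consumed(1591))
```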
6. iostat -xz 1
Monitors block device performance. Key fields:
r/s, w/s, rkB/s, wkB/s : read/write request rates and throughput.
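await : average I/O time in milliseconds, including both queue time and service time; larger than expected values can indicate device saturation or device problems.
avgqu‑sz : average number of requests issued to the device (named aqu-sz in newer sysstat releases); values above 1 can be evidence of saturation, although devices that serve requests in parallel may sustain more.
%util : percentage of time the device was busy; values above 60% typically lead to poor performance, and values close to 100% usually indicate saturation, though this depends on the device.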
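Remember that poorly performing disk I/O is not necessarily an application problem. Techniques such as read-ahead for reads and buffering for writes perform I/O asynchronously, so the application often does not block on the device or suffer the latency directly.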
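The queue-length and utilization rules of thumb can be sketched as a simple check. The function name, flag names, and the 90% cut-off for "near saturation" are assumptions for illustration; sensible limits depend on the device, and await is left out because no numeric threshold is given for it.

```python
def iostat_flags(avg_queue, util_percent, near_saturation=90.0):
    """Apply the queue-length and %util rules of thumb to one device sample."""
    flags = []
    if avg_queue > 1:
        flags.append("queue_above_1")
    if util_percent > 60:
        flags.append("util_above_60")
    if util_percent >= near_saturation:
        flags.append("util_near_saturation")
    return flags


# Illustrative values for an idle device and a busy one.
print(iostat_flags(avg_queue=0.0, util_percent=0.09))
print(iostat_flags(avg_queue=2.4, util_percent=95.0))
```

A flagged device is a prompt to look closer, not a verdict, particularly for virtual devices backed by several disks.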
7. free -m
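Shows memory usage in megabytes, including buffers (the block device I/O buffer cache) and cached (the page cache used by file systems); check that neither is near zero, which can lead to higher disk I/O. The “-/+ buffers/cache” line gives less confusing used and free values, because Linux uses otherwise free memory for caches and can reclaim it quickly when applications need it. Newer versions of free drop that line and report an available column instead. With ZFS on Linux the numbers can mislead, since ZFS keeps its own file system cache that free -m does not reflect properly.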
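The adjustment behind the “-/+ buffers/cache” line is simple arithmetic: buffers and cache are subtracted from used and added to free. A minimal sketch, with an illustrative function name and made-up values in MB:

```python
def buffers_cache_adjusted(used, free, buffers, cached):
    """Return (used, free) with buffers and cache counted as free memory."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Illustrative values in MB, not taken from a real server.
adjusted_used, adjusted_free = buffers_cache_adjusted(
    used=24545, free=221453, buffers=59, cached=541
)
print(adjusted_used, adjusted_free)
```

The real tool rounds each column separately, so its printed figures can differ from this calculation by a megabyte.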
8. sar -n DEV 1
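Reports network interface throughput (rxkB/s, txkB/s) as a measure of workload and to check whether a limit has been reached. In the original example, eth0 was receiving about 22 MB/s, which is 176 Mbit/s and well below a 1 Gbit/s limit. Where it is reported, the %ifutil column gives interface utilization directly.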
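The throughput-to-bandwidth conversion is worth writing down, since sar reports bytes while link speeds are quoted in bits. This sketch follows the article's arithmetic and treats a megabyte as 10^6 bytes; the function names are illustrative. If the figure is really in MiB, the result is roughly 5% higher.

```python
def mbytes_to_mbits(mbytes_per_s):
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mbytes_per_s * 8


def link_utilization_percent(mbytes_per_s, link_mbits_per_s):
    """Express one direction of traffic as a percentage of the link speed."""
    return 100.0 * mbytes_to_mbits(mbytes_per_s) / link_mbits_per_s


# 22 MB/s on a 1 Gbit/s (1000 Mbit/s) link.
print(mbytes_to_mbits(22))
print(link_utilization_percent(22, 1000))
```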
9. sar -n TCP,ETCP 1
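Summarizes key TCP metrics: active/s (locally initiated, outbound connections per second), passive/s (remotely initiated, inbound connections per second), and retrans/s (retransmits per second). The active and passive rates give a rough measure of server load. Retransmits are a sign of a network or server problem: either the network is unreliable, or the server is overloaded and dropping packets.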
10. top
Provides a real-time view of many of the metrics already covered. It is useful for a quick glance, but patterns over time are harder to spot than in the rolling output of vmstat or pidstat. Press Ctrl-S to pause the output and Ctrl-Q to resume it, since evidence of an intermittent issue is lost once the screen clears.
Further Analysis
For deeper investigation, refer to Brendan Gregg’s Linux performance tools talk from Velocity 2015, which covers over 40 commands for observability, benchmarking, tuning, static analysis, and tracing.