How to Diagnose Linux Server Issues in the First 60 Seconds with 10 Essential Commands

This article explains how Netflix's performance team uses ten standard Linux commands, built from nine tools (uptime, dmesg, vmstat, mpstat, pidstat, iostat, free, sar and top), to quickly assess system health, resource saturation and errors within the first minute of a performance incident.

Open Source Linux

Sep 6, 2021

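When a Linux server shows performance problems, the first minute is critical. Netflix's performance engineering team shares ten standard commands that reveal system health, resource saturation and errors. Several of them (mpstat, pidstat, iostat and sar) ship in the sysstat package, which may need to be installed first.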

1. uptime

uptime

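The command shows the system load averages for the past 1, 5 and 15 minutes. On Linux the load figure counts tasks that are running or waiting for a CPU as well as tasks blocked in uninterruptible I/O (usually disk), so a high value signals demand for resources but not necessarily for CPU. Comparing the three numbers shows whether load is rising or falling.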
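A load average only means something relative to the number of CPUs. As a minimal sketch (the function names and the 10% tolerance are illustrative assumptions, not part of the original checklist), the following Python reads the same three numbers that uptime prints and relates them to the CPU count:

```python
import os


def load_trend(one_min: float, fifteen_min: float, tolerance: float = 0.1) -> str:
    """Classify the direction of load by comparing the 1- and 15-minute averages.

    The 10% tolerance is an arbitrary choice for this sketch.
    """
    if one_min > fifteen_min * (1 + tolerance):
        return "rising"
    if one_min < fifteen_min * (1 - tolerance):
        return "falling"
    return "steady"


def load_per_cpu(load: float, cpus: int) -> float:
    """Normalize a load average by CPU count; values above 1.0 mean more demand than CPUs."""
    return load / max(cpus, 1)


if __name__ == "__main__":
    one, five, fifteen = os.getloadavg()  # the same figures uptime reports
    cpus = os.cpu_count() or 1
    print(f"load {one:.2f} {five:.2f} {fifteen:.2f} on {cpus} CPUs")
    print(f"per-CPU load {load_per_cpu(one, cpus):.2f}, trend {load_trend(one, fifteen)}")
```

A per-CPU value well above 1.0 with a rising trend suggests the problem is still developing, so the remaining commands are worth running right away.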

2. dmesg | tail

dmesg | tail

Displays the last 10 kernel messages, useful for spotting OOM kills, dropped TCP requests and other errors that can explain a performance problem.
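When the kernel log is noisy, it can help to filter it for known trouble signs. The sketch below is one possible approach, not an exhaustive filter: the pattern list is an assumption chosen for illustration, and the sample lines are made up to resemble kernel messages rather than captured from a real system.

```python
import re

# Phrases the Linux kernel commonly logs for performance-relevant events.
# This list is a starting point chosen for this sketch, not a complete catalog.
SUSPICIOUS = re.compile(
    r"out of memory|oom-killer|killed process|possible syn flooding|segfault|i/o error",
    re.IGNORECASE,
)


def flag_kernel_lines(lines):
    """Return only the kernel log lines that match one of the suspicious patterns."""
    return [line for line in lines if SUSPICIOUS.search(line)]


if __name__ == "__main__":
    # Illustrative lines only; real input would come from `dmesg`.
    sample = [
        "[1880957.563150] perl invoked oom-killer: gfp_mask=0x280da, order=0",
        "[1880957.563408] Killed process 18694 (perl) total-vm:1972392kB",
        "[2320864.954447] TCP: Possible SYN flooding on port 7001. Dropping request.",
        "[2320870.000000] eth0: link becomes ready",
    ]
    for line in flag_kernel_lines(sample):
        print(line)
```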

3. vmstat 1

vmstat 1

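Shows virtual memory, CPU and I/O statistics every second. The first line of output reports averages since boot, so read from the second line onward. Key fields include r (tasks running or waiting for a CPU), free (free memory in KB), si/so (swap-ins and swap-outs) and us/sy/id/wa/st (the CPU time breakdown).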
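The rules of thumb for reading a vmstat row can be written down as a small check. This is a sketch under stated assumptions: the function names and the 20% I/O-wait threshold are invented for illustration, and the sample row is an example, not output from a specific server.

```python
def parse_vmstat(header: str, row: str) -> dict:
    """Pair vmstat column names with the integer values from one data row."""
    return dict(zip(header.split(), (int(v) for v in row.split())))


def vmstat_warnings(sample: dict, cpus: int) -> list:
    """Apply rough rules of thumb to one vmstat sample."""
    warnings = []
    if sample["r"] > cpus:
        warnings.append("CPU saturation: more runnable tasks than CPUs")
    if sample["si"] > 0 or sample["so"] > 0:
        warnings.append("swapping: memory is likely exhausted")
    if sample["wa"] > 20:  # arbitrary threshold for this sketch
        warnings.append("high I/O wait: check the disks")
    return warnings


if __name__ == "__main__":
    # Column names as printed by vmstat; the values are illustrative.
    header = "r b swpd free buff cache si so bi bo in cs us sy id wa st"
    row = "34 0 0 200889792 73708 591828 0 0 0 5 6 10 96 1 3 0 0"
    print(vmstat_warnings(parse_vmstat(header, row), cpus=32))
```

Pairing values with the header line, rather than with fixed positions, keeps the parser working if a vmstat version prints an extra column.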

4. mpstat -P ALL 1

mpstat -P ALL 1

Prints per‑CPU utilization, helping to identify single‑threaded bottlenecks.
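A single-threaded bottleneck typically shows up as one CPU near 100% busy while the rest are mostly idle. The following sketch expresses that pattern in Python; the thresholds (90% hot, 30% quiet) and function names are assumptions for illustration, and the input is a mapping of CPU number to busy percentage (100 minus mpstat's %idle).

```python
def hot_cpus(busy_pct: dict, hot: float = 90.0) -> list:
    """Return the CPUs whose utilization is at or above the `hot` threshold."""
    return sorted(cpu for cpu, pct in busy_pct.items() if pct >= hot)


def looks_single_threaded(busy_pct: dict, hot: float = 90.0, quiet: float = 30.0) -> bool:
    """True when exactly one CPU is hot while the others average below `quiet`."""
    hot_list = hot_cpus(busy_pct, hot)
    if len(hot_list) != 1:
        return False
    others = [pct for cpu, pct in busy_pct.items() if cpu != hot_list[0]]
    return bool(others) and sum(others) / len(others) < quiet


if __name__ == "__main__":
    # Illustrative values: CPU 0 is pegged, the rest are nearly idle.
    busy = {0: 98.5, 1: 4.0, 2: 6.5, 3: 3.0}
    print(hot_cpus(busy), looks_single_threaded(busy))
```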

5. pidstat 1

pidstat 1

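Reports CPU usage per process as a rolling summary, one block per interval, so you can watch patterns over time instead of a screen that keeps redrawing. The %CPU column is a total across all CPUs, so a value of 200% means a process is using roughly two CPUs.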

6. iostat -xz 1

iostat -xz 1

Provides detailed block‑device statistics such as r/s, w/s, await and %util to detect disk saturation.
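Column names in iostat output differ between sysstat versions (for example, some print await while others split it into r_await and w_await), so a parser is safer if it follows the header line. The sketch below does that and flags devices above a utilization limit. The 60% default is a commonly quoted rule of thumb used here as an assumption, and the sample text is a trimmed, illustrative table rather than real output.

```python
SAMPLE = """\
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          73.96    0.00    3.73    0.03    0.06   22.21

Device     r/s     w/s   await   %util
xvda      0.00    1.00    0.50    0.09
xvdb    213.00  120.00   12.40   87.00
"""


def parse_iostat(text: str) -> list:
    """Parse device rows from `iostat -x` output using whatever header was printed."""
    rows, columns = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            columns = None            # a blank line ends the current device table
        elif fields[0].rstrip(":") == "Device":
            columns = fields[1:]      # metric names for the rows that follow
        elif columns and len(fields) == len(columns) + 1:
            metrics = dict(zip(columns, map(float, fields[1:])))
            rows.append({"device": fields[0], **metrics})
    return rows


def busy_devices(rows: list, util_limit: float = 60.0) -> list:
    """Return the names of devices whose %util exceeds the limit."""
    return [r["device"] for r in rows if r.get("%util", 0.0) > util_limit]


if __name__ == "__main__":
    print(busy_devices(parse_iostat(SAMPLE)))
```

Keep in mind that %util measures how busy a device was, not how much capacity remains; a device backed by several disks can be 100% busy and still accept more work.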

7. free -m

free -m

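Shows total, used and free memory along with buffers and cache. Because Linux uses otherwise idle memory for caching and releases it when applications need it, the "free" column understates usable memory. Older versions of free print a "-/+ buffers/cache" line that corrects for this; newer versions replace that line with an "available" column.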
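The same "usable memory" figure can be derived from /proc/meminfo, which is where free gets its numbers. This is a minimal sketch: the function names are hypothetical, the sample text is illustrative, and the fallback (free plus buffers plus cache) is only an approximation for kernels that do not export MemAvailable.

```python
import pathlib

# Illustrative /proc/meminfo excerpt, used when the real file is unavailable.
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:         1024000 kB
MemAvailable:    9216000 kB
Buffers:          512000 kB
Cached:          6144000 kB
"""


def parse_meminfo(text: str) -> dict:
    """Turn /proc/meminfo text into a {field: kilobytes} mapping."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def usable_kb(info: dict) -> int:
    """Prefer the kernel's MemAvailable estimate; fall back to free + buffers + cache."""
    if "MemAvailable" in info:
        return info["MemAvailable"]
    return info["MemFree"] + info["Buffers"] + info["Cached"]


if __name__ == "__main__":
    path = pathlib.Path("/proc/meminfo")
    text = path.read_text() if path.exists() else SAMPLE
    print(usable_kb(parse_meminfo(text)) // 1024, "MB usable")
```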

8. sar -n DEV 1

sar -n DEV 1

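Monitors per-interface network throughput (rxkB/s, txkB/s), which you can compare against the link speed to see how close an interface is to its limit; some sysstat versions also print a %ifutil column.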
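Converting sar's throughput figures into a share of link capacity is simple arithmetic. The sketch below assumes that sar's kB means 1024 bytes, that the link is full duplex (so the busier direction sets the limit), and that you supply the link speed yourself; the function name and the example numbers are illustrative.

```python
def link_utilization_pct(rx_kb_s: float, tx_kb_s: float, speed_mbit: float) -> float:
    """Percent of link capacity used by the busier direction of a full-duplex link."""
    busiest_bits_per_s = max(rx_kb_s, tx_kb_s) * 1024 * 8  # assumes 1 kB = 1024 bytes
    return busiest_bits_per_s / (speed_mbit * 1_000_000) * 100


if __name__ == "__main__":
    # Illustrative: about 22 MB/s received on a 1 Gbit/s interface.
    print(f"{link_utilization_pct(21999.10, 1032.39, 1000):.1f}% of link capacity")
```

In this example the interface sits at roughly 18% of a 1 Gbit/s link, which leaves plenty of headroom.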

9. sar -n TCP,ETCP 1

sar -n TCP,ETCP 1

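Summarizes TCP activity: active/s (locally initiated connections per second), passive/s (remotely initiated connections per second) and retrans/s (retransmitted segments per second), along with other TCP error counters.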
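A raw retransmit count is easier to judge as a share of traffic sent. As a small sketch (the function name is hypothetical), the ratio below divides retrans/s from the ETCP report by oseg/s, the segments-sent rate from the TCP report:

```python
def retransmit_pct(retrans_per_s: float, oseg_per_s: float) -> float:
    """Retransmitted segments as a percentage of segments sent; 0.0 when nothing was sent."""
    if oseg_per_s <= 0:
        return 0.0
    return retrans_per_s * 100 / oseg_per_s


if __name__ == "__main__":
    # Illustrative rates: 5 retransmits/s out of 1000 segments/s sent.
    print(f"{retransmit_pct(5, 1000):.2f}% of sent segments were retransmitted")
```

What counts as too high depends on the network, so treat the result as a number to compare against the same server's normal baseline.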

10. top

top

Combines many of the above metrics in a dynamic view; useful for a quick sanity check but may miss intermittent spikes.

By following the USE (Utilization, Saturation, Errors) methodology and checking these metrics in order, you can quickly narrow down the root cause of performance degradation.
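The whole sequence can also be collected in one pass. The following is a sketch, not a recommended production tool: the function names, the default of five one-second samples and the 30-second timeout are assumptions. Each interval command is given a sample count so that it terminates, top runs once in batch mode, and tools that are not installed are skipped.

```python
import shutil
import subprocess


def build_checklist(samples: int = 5) -> list:
    """Return the ten commands as argv lists, bounded so each one terminates."""
    n = str(samples)
    return [
        ["uptime"],
        ["dmesg"],                        # the runner keeps only the last 10 lines
        ["vmstat", "1", n],
        ["mpstat", "-P", "ALL", "1", n],
        ["pidstat", "1", n],
        ["iostat", "-xz", "1", n],
        ["free", "-m"],
        ["sar", "-n", "DEV", "1", n],
        ["sar", "-n", "TCP,ETCP", "1", n],
        ["top", "-b", "-n", "1"],         # batch mode, a single iteration
    ]


def run_checklist(samples: int = 5, timeout: int = 30) -> None:
    """Run each command in order and print its output under a banner."""
    for argv in build_checklist(samples):
        print(f"\n=== {' '.join(argv)} ===")
        if shutil.which(argv[0]) is None:
            print("(not installed)")
            continue
        try:
            result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            print("(timed out)")
            continue
        lines = (result.stdout or result.stderr).splitlines()
        if argv[0] == "dmesg":
            lines = lines[-10:]           # same effect as `dmesg | tail`
        print("\n".join(lines))
```

Calling run_checklist() prints the ten reports in the article's order. With the defaults, the six interval commands take about five seconds each, so the whole run fits in well under a minute.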

