How to Diagnose Linux Server Performance Issues in Minutes
A step‑by‑step guide shows how to use Linux commands like top, vmstat, free, iostat, and ss to quickly identify CPU overload, memory pressure, disk I/O bottlenecks, and network port problems, providing a practical cheat sheet for effective server troubleshooting.
01. System Load (top)
First check the “vital signs” with top and focus on the first five lines.
top - 14:28:14 up 100 days, 3:30, 2 users, load average: 5.15, 4.05, 2.50
Tasks: 201 total, 2 running, 199 sleeping, 0 stopped, 0 zombie
%Cpu(s): 15.2 us, 5.1 sy, 0.0 ni, 45.5 id, 33.9 wa, 0.0 hi, 0.3 si
MiB Mem : 15886.0 total, 2510.5 free, 8213.2 used, 5162.3 buff/cache
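MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 7125.0 avail Mem
Load Average : the three numbers (1, 5, 15‑minute) are 5.15, 4.05, 2.50. On a 4‑core box, a 1‑minute load of 5.15 means more tasks are runnable or waiting on I/O than there are cores, so work is queuing and the system is overloaded. The rising trend (2.50 to 4.05 to 5.15) shows the problem is getting worse.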
%Cpu(s) : us 15.2% (user) is modest, but wa 33.9% is a warning sign that the CPU spends a lot of time waiting for disk I/O, indicating a disk bottleneck.
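The comparison of load average against core count can be scripted. The following is a minimal sketch, assuming Python 3 on Linux; the function name `load_per_core` and the 1.0 threshold are illustrative choices, not something top itself reports.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Return the load average normalized by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15-minute load averages.
    one, five, fifteen = os.getloadavg()
    cores = os.cpu_count() or 1
    for label, value in (("1m", one), ("5m", five), ("15m", fifteen)):
        ratio = load_per_core(value, cores)
        # Assumed rule of thumb: a ratio above 1.0 means tasks are queuing.
        status = "queuing" if ratio > 1.0 else "ok"
        print(f"{label}: load={value:.2f} per-core={ratio:.2f} ({status})")
```

For the sample above, `load_per_core(5.15, 4)` is roughly 1.29, which is over the assumed threshold.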
02. Deep Dive: CPU Queues (vmstat)
If top is inconclusive, run vmstat 1 to refresh every second.
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
6 0 0 251014 120560 516230 0 0 0 40 1205 3452 85 10 5 0 0
7 0 0 250980 120560 516230 0 0 0 0 1340 5102 90 9 1 0 0
r (runnable) : values of 6 and 7 on a 4‑core machine mean more processes want CPU time than there are cores, so CPU capacity is insufficient.
b (blocked) : 0 means no processes are stuck in uninterruptible sleep; a non‑zero value would suggest I/O wait.
us (user) : 85‑90% shows the application (e.g., Java, Python) is consuming most CPU time.
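The r-versus-cores check can be sketched in a few lines. This assumes the 17-column layout shown in the sample header (other vmstat versions may print different columns); the helper names are hypothetical.

```python
# Column names taken from the vmstat header in the sample output.
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat_line(line: str) -> dict:
    """Map one vmstat data row onto the column names from the header."""
    values = [int(v) for v in line.split()]
    if len(values) != len(VMSTAT_COLUMNS):
        raise ValueError("unexpected number of columns")
    return dict(zip(VMSTAT_COLUMNS, values))


def cpu_saturated(sample: dict, cores: int) -> bool:
    """True when more tasks are runnable than there are cores."""
    return sample["r"] > cores


# First data row from the sample output.
sample = parse_vmstat_line("6 0 0 251014 120560 516230 0 0 0 40 1205 3452 85 10 5 0 0")
print(sample["r"], sample["us"], cpu_saturated(sample, cores=4))  # 6 85 True
```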
03. Memory Check (free)
Run free -h to see memory usage.
total used free shared buff/cache available
Mem: 15Gi 4.2Gi 2.5Gi 1.0Gi 8.8Gi 10Gi
Swap: 2.0Gi 0B 2.0Gi
used : 4.2 Gi of memory in use, not counting buff/cache.
buff/cache : 8.8 Gi – Linux uses free memory for cache; this is normal, not a leak.
available : 10 Gi – the key metric; if it is large the system has plenty of usable memory. Do not restart services just because used + buff/cache looks high.
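The same "available" figure can be read programmatically from /proc/meminfo, where it appears as MemAvailable. The sketch below parses meminfo-style text; the sample values and the 10% alert threshold are illustrative assumptions, and on a live system you would pass in the contents of /proc/meminfo instead.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        tokens = rest.split()
        if not sep or not tokens:
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(tokens[0])
    return fields


def available_ratio(fields: dict) -> float:
    """Fraction of total memory that is available without swapping."""
    return fields["MemAvailable"] / fields["MemTotal"]


# Illustrative sample, not taken from a real host.
SAMPLE_MEMINFO = """\
MemTotal:       16267264 kB
MemFree:         2570752 kB
MemAvailable:    7296000 kB
"""

LOW_MEMORY_RATIO = 0.10  # assumed alert threshold

info = parse_meminfo(SAMPLE_MEMINFO)
ratio = available_ratio(info)
print(f"available: {ratio:.0%}", "LOW" if ratio < LOW_MEMORY_RATIO else "ok")
```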
04. Disk I/O (iostat)
High wa from top suggests disk contention. Verify with iostat -xz 1.
Device r/s w/s rkB/s wkB/s %rrqm %wrqm %util await svctm
vda 0.00 450.00 0.00 12500.00 0.00 0.00 99.5 12.5 2.10
scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.0 0.0 0.0
%util : 99.5% means the disk is almost fully saturated; any additional I/O will queue.
await : 12.5 ms – on SSD this is high; on HDD anything above 10 ms indicates a slowdown.
Who is writing to disk? Use iotop -o (it usually needs root) to list only the processes, by PID, that are actually performing I/O, in a view similar to top.
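A saturation check over iostat output might look like the sketch below. It looks columns up by header name because column order differs between sysstat versions; the 90% threshold and the function names are assumptions for illustration.

```python
def parse_iostat(text: str) -> list:
    """Parse an `iostat -x` device table into a list of dicts keyed by column name."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    parsed = []
    for row in rows:
        entry = {"Device": row[0]}
        for name, value in zip(header[1:], row[1:]):
            entry[name] = float(value)
        parsed.append(entry)
    return parsed


def saturated_devices(rows: list, util_threshold: float = 90.0) -> list:
    """Return device names whose %util is at or above the threshold."""
    return [r["Device"] for r in rows if r["%util"] >= util_threshold]


# Device table from the sample output.
SAMPLE = """\
Device r/s w/s rkB/s wkB/s %rrqm %wrqm %util await svctm
vda 0.00 450.00 0.00 12500.00 0.00 0.00 99.5 12.5 2.10
scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.0 0.0 0.0
"""

rows = parse_iostat(SAMPLE)
print(saturated_devices(rows))  # ['vda']
```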
05. Network Ports (ss)
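Replace the slow netstat with the modern ss command. The flags -l, -n, -t, and -p select listening sockets, numeric addresses, TCP only, and the owning process.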
ss -lntp
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=1234,fd=3))
LISTEN 0 100 *:8080 *:* users:(("java",pid=2561,fd=10))
LISTEN 0 50 *:3306 *:* users:(("mysqld",pid=1890,fd=15))
Recv‑Q / Send‑Q : For a socket in LISTEN, Recv‑Q is the number of connections waiting to be accepted and Send‑Q is the maximum backlog. When Recv‑Q reaches Send‑Q, the accept queue is full and new connections may be dropped.
Process : The output shows which binary (e.g., java, mysqld) owns each listening port and its PID.
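Checking listen queues by eye gets tedious on hosts with many ports. The sketch below flags full accept queues, assuming the column order shown in the sample and the common reading that, for LISTEN sockets, Recv-Q is the current accept queue and Send-Q is the backlog limit. The function names and the overflowing line are hypothetical.

```python
def parse_listen_line(line: str) -> dict:
    """Parse one `ss -lnt` row. Assumed layout: State Recv-Q Send-Q Local Peer [Process]."""
    parts = line.split()
    return {
        "state": parts[0],
        "recv_q": int(parts[1]),
        "send_q": int(parts[2]),
        "local": parts[3],
    }


def accept_queue_full(sock: dict) -> bool:
    """True when a listening socket's accept queue has reached its backlog limit."""
    return sock["state"] == "LISTEN" and sock["recv_q"] >= sock["send_q"]


lines = [
    'LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=1234,fd=3))',
    'LISTEN 0 100 *:8080 *:* users:(("java",pid=2561,fd=10))',
    # Hypothetical overflowing socket, added for illustration only.
    "LISTEN 129 128 *:9000 *:*",
]

full = [s["local"] for s in map(parse_listen_line, lines) if accept_queue_full(s)]
print(full)  # ['*:9000']
```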
Summary Cheat Sheet
top → Check load average, us vs wa.
vmstat 1 → Look at r for CPU queue length.
free -h → Ignore used; focus on available.
iostat -xz 1 → Verify %util is not near 100%.
ss -lntp → Inspect listening ports and queue sizes.
Avoid Common Pitfalls
Do not blindly run echo 3 > /proc/sys/vm/drop_caches. It discards the page cache, so subsequent reads must go back to disk, which can trigger an I/O storm and worsen latency.
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
