Essential Linux Server Monitoring Tools and How to Interpret Their Metrics

This article introduces key Linux monitoring utilities (top, vmstat, pidstat, iostat, netstat, sar, and tcpdump), explains the meaning of their output fields, and shows how to use them to diagnose CPU, memory, disk, and network performance issues on production servers.

Efficient Ops

Aug 25, 2019

Linux servers constantly expose a wealth of performance parameters that are crucial for both operations engineers and developers when troubleshooting abnormal program behavior.

Simple command‑line tools read data from /proc and /sys to present these metrics; more advanced analysis may require specialized utilities such as perf or systemtap.

CPU and Memory Monitoring

top

top displays load averages, task states, and CPU usage (press 1 to toggle the per‑CPU view). The first line shows the 1‑, 5‑, and 15‑minute load averages; values that stay above the number of CPU cores suggest saturation. The second line lists task counts (running, sleeping, stopped, zombie). The third line breaks CPU time down into user (us), system (sy), nice (ni), idle (id), iowait (wa), hardware interrupts (hi), software interrupts (si), and steal (st). A high value in a given column points to a specific investigation path: high us calls for locating CPU‑intensive processes, high sy or wa for checking kernel and I/O‑bound activity, and high st for detecting hypervisor over‑commit.
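The percentages on top's CPU line are derived from the cumulative counters in /proc/stat. The following is a minimal sketch of that calculation, assuming two samples of the aggregate "cpu" line taken a short interval apart; the sample values and the helper name cpu_percentages are made up for illustration.

```python
# Order of the first eight counters on the "cpu" line of /proc/stat:
# user, nice, system, idle, iowait, irq, softirq, steal.
FIELDS = ["us", "ni", "sy", "id", "wa", "hi", "si", "st"]

def cpu_percentages(before: str, after: str) -> dict:
    """Share of CPU time per state between two 'cpu ...' lines (hypothetical helper)."""
    a = [int(x) for x in before.split()[1:9]]
    b = [int(x) for x in after.split()[1:9]]
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas) or 1  # avoid dividing by zero on identical samples
    return {name: round(100.0 * d / total, 1) for name, d in zip(FIELDS, deltas)}

# Illustrative samples, not taken from a real host.
t0 = "cpu  1000 0 500 8000 400 0 100 0 0 0"
t1 = "cpu  1060 0 520 8010 408 0 102 0 0 0"
usage = cpu_percentages(t0, t1)
print(usage)
```

On a live system the two strings would come from reading the first line of /proc/stat twice, with a sleep in between.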

The fourth and fifth lines report physical memory and swap. For physical memory, total = free + used + buff/cache. Buffers hold raw block‑device and file‑system metadata, while Cached holds file data (the page cache). avail Mem is an estimate of how much memory can be used to start new applications without swapping. Frequent swap activity signals memory pressure.
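The memory identity can be checked against /proc/meminfo. This sketch assumes that buff/cache is the sum of Buffers, Cached, and SReclaimable and that used is whatever remains; the exact formula varies between procps versions, and the sample values and the name memory_summary are illustrative only.

```python
def memory_summary(meminfo_text: str) -> dict:
    """Derive top-style memory figures (in KiB) from /proc/meminfo text."""
    kib = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            kib[key.strip()] = int(rest.split()[0])
    # Assumption: buff/cache = Buffers + Cached + SReclaimable.
    buff_cache = kib["Buffers"] + kib["Cached"] + kib.get("SReclaimable", 0)
    used = kib["MemTotal"] - kib["MemFree"] - buff_cache
    return {
        "total": kib["MemTotal"],
        "free": kib["MemFree"],
        "used": used,
        "buff_cache": buff_cache,
        "available": kib.get("MemAvailable"),
    }

# Illustrative sample, not taken from a real host.
sample_meminfo = """\
MemTotal:        8000000 kB
MemFree:         1000000 kB
MemAvailable:    5500000 kB
Buffers:          200000 kB
Cached:          4000000 kB
SReclaimable:     300000 kB
"""
mem = memory_summary(sample_meminfo)
print(mem)
```

In this sample only 1,000,000 KiB is free, yet 5,500,000 KiB is available, because most of the cache can be reclaimed on demand.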

Note that top itself consumes resources and is best for short‑term, interactive monitoring.

vmstat

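vmstat reports processes, memory, paging, block I/O, traps, and CPU activity. When run with an interval (for example vmstat 1), its first line of output shows averages since boot and each following line covers one interval. Columns include runnable processes (r), processes in uninterruptible sleep (b), swapped‑out memory (swpd), buffers (buff), page cache (cache), blocks read in and written out per second (bi/bo), interrupts per second (in), and context switches per second (cs).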
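For scripted checks, vmstat output can be turned into records keyed by column name. This is a minimal sketch that assumes the default column layout; the sample output, the core count of 4, and the name parse_vmstat are hypothetical.

```python
def parse_vmstat(output: str) -> list:
    """Turn default vmstat output into a list of {column: value} dicts."""
    lines = [ln.split() for ln in output.strip().splitlines()]
    # The column header is the line whose first token is "r".
    header = next(cols for cols in lines if cols and cols[0] == "r")
    rows = []
    for cols in lines:
        if len(cols) == len(header) and cols[0].isdigit():
            rows.append(dict(zip(header, map(int, cols))))
    return rows

# Illustrative output, not captured from a real host.
sample_vmstat = """
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 812345  10240 204800    0    0     5    12  150  300  3  1 95  1  0
 9  1      0 800000  10240 204800    0    0   900  1200 2500 9000 70 20  2  8  0
"""
rows = parse_vmstat(sample_vmstat)
cores = 4  # assumed core count for this example
saturated = [row for row in rows if row["r"] > cores]
print(saturated)
```

Comparing r with the core count mirrors the load-average rule of thumb: more runnable tasks than cores means some of them are waiting for a CPU.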

Experiments with different -j values during a parallel build (for example make -j) show that the context‑switch rate (cs) stays stable until the parallelism level is pushed high enough, at which point it rises noticeably.
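The context‑switch and interrupt rates can also be derived directly from the cumulative ctxt and intr counters in /proc/stat, which makes it easy to log them during such an experiment. A minimal sketch, with made-up samples and hypothetical helper names:

```python
def counter(stat_text: str, name: str) -> int:
    """Return the first numeric value on the /proc/stat line named `name`."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == name:
            return int(parts[1])
    raise KeyError(name)

def rate_per_second(before: str, after: str, name: str, interval_s: float) -> float:
    """Per-second rate of a cumulative counter between two samples."""
    return (counter(after, name) - counter(before, name)) / interval_s

# Illustrative samples taken 2 seconds apart; on the intr line the first
# number is the total across all interrupt sources.
s0 = "ctxt 1000000\nintr 500000 1 2"
s1 = "ctxt 1012000\nintr 503000 1 2"
cs_rate = rate_per_second(s0, s1, "ctxt", 2.0)
in_rate = rate_per_second(s0, s1, "intr", 2.0)
print(cs_rate, in_rate)
```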

pidstat

pidstat offers per‑process statistics, including page faults (minflt/s for minor faults, majflt/s for major faults), stack usage, CPU usage, and context switches, down to the thread level. Options such as -t (thread view), -r (memory), -s (stack), -u (CPU), and -w (context switches) make it well suited to deep analysis of individual or multithreaded programs.
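The fault rates come from cumulative counters in /proc/<pid>/stat, where minflt is field 10 and majflt is field 12. The sketch below shows one way to compute such rates from two samples; the sample lines, the PID, and the helper names are invented for illustration.

```python
def fault_counts(stat_line: str) -> dict:
    """Extract cumulative minor/major fault counts from a /proc/<pid>/stat line."""
    # The command name may contain spaces or parentheses, so split at the last ")".
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    return {"minflt": int(rest[7]), "majflt": int(rest[9])}

def faults_per_second(before: str, after: str, interval_s: float) -> dict:
    a, b = fault_counts(before), fault_counts(after)
    return {key: (b[key] - a[key]) / interval_s for key in a}

# Illustrative samples for a hypothetical process, taken 2 seconds apart.
p0 = "4242 (my worker) S 1 4242 4242 0 -1 4194304 1500 0 12 0 250 40 0 0 20 0 4 0"
p1 = "4242 (my worker) S 1 4242 4242 0 -1 4194304 2100 0 16 0 260 41 0 0 20 0 4 0"
fault_rates = faults_per_second(p0, p1, 2.0)
print(fault_rates)
```

A sustained non-zero majflt rate is usually the more worrying of the two, since each major fault requires reading a page from disk.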

Other CPU Tools

For per‑CPU inspection on SMP systems, mpstat -P ALL 1 shows load distribution across cores. Filtering top by user (top -u username) or using ps with custom columns can isolate specific processes, and ps axjf visualizes process trees.

Disk I/O Monitoring

iotop shows real‑time disk read/write rates per process. lsof reveals which processes hold files or devices open, which is useful for diagnosing unmount failures. iostat -xz 1 reports key disk metrics: average queue length (avgqu-sz, named aqu-sz in newer sysstat releases), average request latency in milliseconds (await), service time (svctm, which the sysstat documentation marks as unreliable and which recent releases drop), and utilization (%util). As a rule of thumb, avgqu-sz above 1 or %util above 60% indicates potential saturation.
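The await, avgqu-sz, and %util figures can be reproduced from two samples of a device line in /proc/diskstats. The sketch below assumes the standard field layout of that file; the sample lines, the 1‑second interval, and the helper names are illustrative.

```python
def disk_fields(line: str) -> dict:
    """Pick the /proc/diskstats counters needed for iostat-style metrics."""
    f = line.split()
    return {
        "reads": int(f[3]), "read_ms": int(f[6]),
        "writes": int(f[7]), "write_ms": int(f[10]),
        "io_ms": int(f[12]),        # time the device was busy
        "weighted_ms": int(f[13]),  # busy time weighted by queued requests
    }

def disk_metrics(before: str, after: str, interval_ms: float) -> dict:
    a, b = disk_fields(before), disk_fields(after)
    d = {key: b[key] - a[key] for key in a}
    ios = d["reads"] + d["writes"]
    return {
        "await_ms": (d["read_ms"] + d["write_ms"]) / ios if ios else 0.0,
        "avgqu_sz": d["weighted_ms"] / interval_ms,
        "util_pct": 100.0 * d["io_ms"] / interval_ms,
    }

# Illustrative samples for a hypothetical sda, taken 1 second apart.
d0 = "8 0 sda 1000 0 80000 5000 2000 0 160000 20000 0 9000 25000"
d1 = "8 0 sda 1100 0 88000 5400 2100 0 168000 21600 0 9700 27000"
disk = disk_metrics(d0, d1, 1000.0)
print(disk)
```

With these numbers the device is busy 70% of the time with two requests queued on average, so both rule-of-thumb thresholds are exceeded.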

These metrics also apply to network file systems, though kernel I/O caching can mask some performance impacts.

Network Monitoring

Network health is critical for servers. iptraf and sar -n DEV 1 show interface throughput and utilization.
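Interface throughput of the kind sar -n DEV reports can be approximated from the byte counters in /proc/net/dev. This is a rough sketch under stated assumptions: the samples, the interface name eth0, and the helper names are made up, and utilization would additionally require the link speed.

```python
def interface_bytes(net_dev_text: str) -> dict:
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    for line in net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        f = data.split()
        counters[name.strip()] = (int(f[0]), int(f[8]))
    return counters

def throughput_mbps(before: str, after: str, iface: str, interval_s: float) -> tuple:
    rx0, tx0 = interface_bytes(before)[iface]
    rx1, tx1 = interface_bytes(after)[iface]
    to_mbps = 8 / interval_s / 1e6  # bytes per interval -> megabits per second
    return ((rx1 - rx0) * to_mbps, (tx1 - tx0) * to_mbps)

HEADER = "Inter-|   Receive  |  Transmit\n face |bytes packets ...|bytes packets ...\n"
# Illustrative samples taken 1 second apart.
n0 = HEADER + "  eth0: 1000000 100 0 0 0 0 0 0 2000000 200 0 0 0 0 0 0\n"
n1 = HEADER + "  eth0: 13500000 900 0 0 0 0 0 0 4500000 700 0 0 0 0 0 0\n"
rx_mbps, tx_mbps = throughput_mbps(n0, n1, "eth0", 1.0)
print(rx_mbps, tx_mbps)
```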

netstat

netstat -s displays cumulative protocol statistics since boot; netstat -antp lists all TCP sockets together with their owning processes, while netstat -nltp shows only listening sockets.
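A common follow-up to listing connections is counting them by state, for example to spot a pile-up of TIME_WAIT or CLOSE_WAIT sockets. The sketch below does this from /proc/net/tcp, where the fourth column holds the state as a hex code; the sample rows and the function name are illustrative.

```python
from collections import Counter

# Kernel TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV", "04": "FIN_WAIT1",
    "05": "FIN_WAIT2", "06": "TIME_WAIT", "07": "CLOSE", "08": "CLOSE_WAIT",
    "09": "LAST_ACK", "0A": "LISTEN", "0B": "CLOSING",
}

def count_tcp_states(proc_net_tcp: str) -> Counter:
    """Count sockets per TCP state from /proc/net/tcp text."""
    counts = Counter()
    for line in proc_net_tcp.splitlines():
        cols = line.split()
        if len(cols) > 3 and cols[0].endswith(":"):  # skips the header row
            counts[TCP_STATES.get(cols[3].upper(), cols[3])] += 1
    return counts

# Illustrative rows, trimmed to the columns this sketch reads.
sample_tcp = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:0277 00000000:0000 0A 00000000:00000000
   1: 0A00000A:01BB 0A000014:C350 01 00000000:00000000
   2: 0A00000A:01BB 0A000015:C351 01 00000000:00000000
   3: 0A00000A:01BB 0A000016:C352 06 00000000:00000000
"""
states = count_tcp_states(sample_tcp)
print(states)
```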

sar

sar -n TCP,ETCP 1 provides per‑second TCP metrics such as active opens (active/s), passive opens (passive/s), retransmissions (retrans/s), and input errors (isegerr/s). For UDP, sar -n UDP 1 reports datagrams received for ports with no listener (noport/s) and input errors (idgmerr/s), which helps assess reliability.
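Retransmissions are easier to judge relative to the segments sent. The following sketch computes a retransmission percentage from the Tcp counters in /proc/net/snmp; note that this ratio is a derived figure rather than a column sar prints, and the samples and helper names are made up.

```python
def tcp_counters(snmp_text: str) -> dict:
    """Pair the 'Tcp:' header line with the 'Tcp:' value line of /proc/net/snmp."""
    rows = [ln.split()[1:] for ln in snmp_text.splitlines() if ln.startswith("Tcp:")]
    names, values = rows[0], rows[1]
    return dict(zip(names, map(int, values)))

def retrans_percent(before: dict, after: dict) -> float:
    """Retransmitted segments as a percentage of segments sent in the interval."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return 100.0 * retrans / sent if sent else 0.0

TCP_HEADER = "Tcp: ActiveOpens PassiveOpens InSegs OutSegs RetransSegs InErrs\n"
# Illustrative samples with only a subset of the real columns.
snmp0 = TCP_HEADER + "Tcp: 100 200 9000 10000 50 0\n"
snmp1 = TCP_HEADER + "Tcp: 110 230 11000 12000 90 0\n"
retrans = retrans_percent(tcp_counters(snmp0), tcp_counters(snmp1))
print(retrans)
```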

tcpdump

tcpdump captures raw packets; with -w the capture is written to a file for offline analysis in Wireshark. It supports size‑based rotation (-C / -W) and extensive filtering (interface, host, port, protocol). Captured packets include timestamps, enabling precise reconstruction of connection sequences, though the tool adds overhead that must be considered in production.
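A rotating capture combines these options in one command line. The sketch below only assembles the argument list and does not run tcpdump; the interface, output path, sizes, and filter are placeholder values, and tcpdump_command is a hypothetical helper.

```python
import shlex

def tcpdump_command(interface: str, out_file: str, file_mb: int,
                    file_count: int, bpf_filter: str) -> list:
    """Build a tcpdump argv for a size-rotated capture (sketch only)."""
    return [
        "tcpdump", "-n",         # -n: do not resolve addresses to names
        "-i", interface,         # capture interface
        "-C", str(file_mb),      # rotate after roughly this many million bytes
        "-W", str(file_count),   # keep at most this many files
        "-w", out_file,          # write raw packets to a file
        bpf_filter,              # filter expression passed as one argument
    ]

# Placeholder values for illustration.
cmd = tcpdump_command("eth0", "/tmp/web.pcap", 100, 5,
                      "host 192.0.2.10 and tcp port 443")
printable = shlex.join(cmd)
print(printable)
```

Capping both file size and file count bounds the disk space a long-running capture can consume, which matters on production hosts.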
