
Master Linux Server Monitoring: Essential Tools and How to Use Them

This guide explains how to use common Linux monitoring utilities (top, vmstat, pidstat, iostat, sar, netstat, and tcpdump) to observe CPU, memory, disk I/O, and network performance, interpret their output, and troubleshoot system bottlenecks effectively.

Liangxu Linux

CPU and Memory Monitoring

1.1 top

The top command shows real-time system load, task states, and CPU usage (press 1 to toggle the per-CPU view). The first line displays the 1-, 5-, and 15-minute load averages; a load average that stays above the number of CPU cores means more tasks are runnable (or blocked in uninterruptible I/O) than the cores can serve, which points to saturation. The second line lists task counts (running, sleeping, stopped, zombie).

The CPU line breaks time down into us (user), sy (system), ni (nice), id (idle), wa (iowait), hi (hardware interrupt), si (software interrupt), and st (steal) percentages, each pointing to a different class of problem. High us often means a specific process is CPU-bound, while high sy suggests heavy system-call or kernel activity, which frequently accompanies intensive I/O. Elevated wa hints at a storage bottleneck, elevated hi or si at interrupt-heavy devices or hardware problems, and elevated st at virtual-machine over-commitment on the host.

The memory section shows total, free, used, buffers, and cached values; avail Mem estimates how much memory can be given to new applications without swapping.
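The load-average rule of thumb above can be checked with a few lines of Python. This is a minimal sketch, not part of top itself: the helper name load_per_core is hypothetical, and the sample string uses made-up values in the layout of /proc/loadavg.

```python
def load_per_core(loadavg_text, cores):
    """Divide the 1-, 5-, and 15-minute load averages by the core count."""
    one, five, fifteen = (float(x) for x in loadavg_text.split()[:3])
    return tuple(round(v / cores, 2) for v in (one, five, fifteen))

# Illustrative /proc/loadavg content (made-up values, not a measurement).
sample = "6.40 3.20 1.60 3/512 12345"
ratios = load_per_core(sample, cores=4)

# A ratio above 1.0 means more runnable tasks than cores in that window.
saturated = [r > 1.0 for r in ratios]
print(ratios, saturated)
```

On a live Linux host, the text would come from open("/proc/loadavg").read() and the core count from os.cpu_count().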

1.2 vmstat

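vmstat reports processes, memory, paging, block I/O, and CPU activity. Run without arguments it prints a single line of averages since boot; given an interval (for example, vmstat 1) it prints a new sample every second. Columns include r (runnable processes), b (processes blocked in uninterruptible sleep), swpd (used swap), free, buff, cache, blocks read and written per second (bi / bo), interrupts per second (in), and context switches per second (cs). The tool is useful for correlating a build's parallelism setting (for example, make -j) with system load: r persistently above the core count means tasks are queuing for CPU, and a sharp rise in cs indicates excessive context switching.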
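To watch the r and cs columns programmatically, vmstat output can be parsed by matching values to the header row. The sketch below assumes the usual two header lines followed by numeric rows; parse_vmstat is a hypothetical helper, and the sample text contains made-up numbers.

```python
def parse_vmstat(text):
    """Return one dict per sample row, keyed by vmstat's column names."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    # The column-name row is the one whose first field is "r".
    header_idx = next(i for i, f in enumerate(lines) if f and f[0] == "r")
    names = lines[header_idx]
    rows = []
    for fields in lines[header_idx + 1:]:
        if len(fields) == len(names) and fields[0].isdigit():
            rows.append(dict(zip(names, map(int, fields))))
    return rows

# Illustrative vmstat output (made-up values).
sample = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 811520  10240 402000    0    0     5    12  150  300  4  1 95  0  0
 9  1      0 799000  10240 402100    0    0     0   640  900 5200 88 10  0  2  0
"""
rows = parse_vmstat(sample)
cores = 4  # assumed core count for this example
queued = [row for row in rows if row["r"] > cores]
print(len(queued), "sample(s) with more runnable tasks than cores")
```

Keying on the header row rather than fixed positions keeps the parser tolerant of vmstat versions that add or reorder columns.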

1.3 pidstat

pidstat offers per-process and per-thread statistics. Useful flags include:

-r: reports page faults (minor, minflt/s, and major, majflt/s) and memory usage.
-s: shows the stack size reserved for the task (StkSize) and the amount actually used (StkRef).
-u: reports a CPU usage breakdown similar to top.
-w: reports context switches per second, split into voluntary (cswch/s) and involuntary (nvcswch/s).
-t: adds per-thread rows to the selected reports.
-C "pattern": filters tasks by command name.
-l: displays the full command line with its arguments.

Running pidstat -w -t -C "ailaw" -l yields detailed thread‑level metrics, making it superior to ps for multi‑threaded debugging.
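The voluntary and involuntary context-switch counters that pidstat -w reports as rates are also exposed by the kernel in /proc/<pid>/status (and per thread in /proc/<pid>/task/<tid>/status). The following sketch turns two snapshots into per-second rates; the function names and the sample text are hypothetical, and it is not a description of how pidstat itself is implemented.

```python
def context_switches(status_text):
    """Extract the two context-switch counters from /proc/<pid>/status text."""
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            counts[key] = int(value)
    return counts

def switch_rates(before, after, interval_s):
    """Per-second rates between two snapshots taken interval_s apart."""
    return {k: (after[k] - before[k]) / interval_s for k in before}

# Illustrative snapshots taken 2 seconds apart (made-up values).
snap1 = "Name:\tdemo\nvoluntary_ctxt_switches:\t150\nnonvoluntary_ctxt_switches:\t12\n"
snap2 = "Name:\tdemo\nvoluntary_ctxt_switches:\t250\nnonvoluntary_ctxt_switches:\t42\n"
rates = switch_rates(context_switches(snap1), context_switches(snap2), 2)
print(rates)
```

A high involuntary rate usually means the task is being preempted, while a high voluntary rate means it often blocks waiting for a resource.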

1.4 Other CPU Tools

For per‑CPU analysis on SMP systems, mpstat -P ALL 1 shows load distribution across cores. Filtering top by user ( top -u username) or using a custom ps loop can isolate specific processes.
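Per-core figures like those from mpstat -P ALL can be approximated from two snapshots of /proc/stat, which holds cumulative tick counters per CPU. This is a rough sketch under that assumption: the helper names are hypothetical, the sample snapshots are made up, and only the first eight counters of each cpuN line are used.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def parse_cpu_lines(stat_text):
    """Map cpu0, cpu1, ... to their tick counters (the aggregate 'cpu' line is skipped)."""
    cpus = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            cpus[parts[0]] = dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))
    return cpus

def busy_percent(before, after):
    """Percentage of non-idle, non-iowait time per core between two snapshots."""
    result = {}
    for cpu in before:
        delta = {f: after[cpu][f] - before[cpu][f] for f in FIELDS}
        total = sum(delta.values())
        busy = total - delta["idle"] - delta["iowait"]
        result[cpu] = round(100.0 * busy / total, 1) if total else 0.0
    return result

# Illustrative /proc/stat snapshots (made-up values).
t0 = "cpu 300 0 150 1500 50 0 0 0 0 0\ncpu0 100 0 50 800 50 0 0 0 0 0\ncpu1 200 0 100 700 0 0 0 0 0 0\n"
t1 = "cpu 400 0 170 1580 50 0 0 0 0 0\ncpu0 190 0 60 800 50 0 0 0 0 0\ncpu1 210 0 110 780 0 0 0 0 0 0\n"
usage = busy_percent(parse_cpu_lines(t0), parse_cpu_lines(t1))
print(usage)
```

A large gap between cores, as in this sample, suggests a single-threaded hot spot or interrupt handling pinned to one CPU.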

Disk I/O Monitoring

2.1 iostat

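iostat -xz 1 reports extended statistics once per second for every device that had activity. Key fields are:

avgqu-sz (aqu-sz in newer sysstat releases): average request queue length; values consistently above 1 suggest the device is saturated.
await / r_await / w_await: average time (ms) an I/O request takes, including time spent queued and time being serviced.
svctm: average service time; a value close to await means little queuing. Recent sysstat releases deprecate this field and warn that it is unreliable.
%util: percentage of time the device was busy; values above about 60 % usually degrade performance, and values approaching 100 % indicate saturation on devices that serve one request at a time. Devices that serve requests in parallel, such as SSDs and RAID arrays, can show high %util while still having headroom.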

Even if I/O appears slow, kernel caching and asynchronous I/O may mask impact on applications; the same metrics apply to network file systems.
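To make the iostat fields less opaque, the sketch below derives await, average queue length, and %util from two snapshots of /proc/diskstats. It illustrates the arithmetic only and is not iostat's actual source; the helper names are hypothetical and the sample lines use made-up counters for a device assumed to be called sda.

```python
def parse_diskstats(text, device):
    """Pick the cumulative counters needed for await, queue length, and %util."""
    for line in text.splitlines():
        f = line.split()
        if len(f) >= 14 and f[2] == device:
            return {
                "reads": int(f[3]), "read_ms": int(f[6]),
                "writes": int(f[7]), "write_ms": int(f[10]),
                "io_ms": int(f[12]), "weighted_ms": int(f[13]),
            }
    raise ValueError(f"device {device!r} not found")

def derive(before, after, interval_ms):
    """Turn counter deltas over interval_ms into iostat-style figures."""
    d = {k: after[k] - before[k] for k in before}
    ios = d["reads"] + d["writes"]
    return {
        "await_ms": (d["read_ms"] + d["write_ms"]) / ios if ios else 0.0,
        "queue": d["weighted_ms"] / interval_ms,       # average queue length
        "util_pct": 100.0 * d["io_ms"] / interval_ms,  # share of time busy
    }

# Illustrative /proc/diskstats lines one second apart (made-up values).
s0 = "8 0 sda 1000 0 8000 2000 500 0 4000 1500 0 3000 3500"
s1 = "8 0 sda 1100 0 8800 2400 600 0 4800 2100 0 3800 4500"
stats = derive(parse_diskstats(s0, "sda"), parse_diskstats(s1, "sda"), 1000)
print(stats)
```

In this sample the device was busy for 800 ms of a 1000 ms window, which is why util_pct comes out at 80.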

Network Monitoring

3.1 netstat

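netstat -s displays cumulative per-protocol statistics since boot. Use netstat -antp for all TCP connections with their owning processes and netstat -nltp for listening sockets only. The -n flag disables reverse DNS and service-name lookups for faster output; adding --timers (-o) shows the timer state of each socket.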
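A common first step with netstat -ant output is to count connections by state, since a pile-up of TIME_WAIT or CLOSE_WAIT sockets often explains connection problems. The sketch below assumes the standard column layout with the state in the sixth column; count_tcp_states is a hypothetical helper and the sample rows use documentation addresses, not real traffic.

```python
from collections import Counter

def count_tcp_states(netstat_text):
    """Count TCP sockets per state from `netstat -ant` style output."""
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Data rows start with tcp or tcp6; the sixth column is the state.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states

# Illustrative output (made-up rows).
sample = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:22           192.0.2.50:51234        ESTABLISHED
tcp        0      0 192.0.2.10:80           192.0.2.51:40000        TIME_WAIT
tcp6       0      0 :::80                   :::*                    LISTEN
"""
states = count_tcp_states(sample)
print(states.most_common())
```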

3.2 sar

The sar utility can monitor network activity with sar -n TCP,ETCP 1 for TCP and sar -n UDP 1 for UDP. Important counters include:

active/s: connections opened locally per second (outgoing connection attempts).
passive/s: connections accepted per second (incoming connection attempts).
retrans/s (or tcpRetransSegs): TCP segments retransmitted per second.
isegerr/s (or tcpInErrs): TCP segments received with errors per second.
noport/s (or udpNoPorts): UDP datagrams received for ports with no listener.
idgmerr/s (or udpInErrors): UDP datagrams that could not be delivered for other reasons.

These figures help assess network reliability when correlated with application requirements.
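One way to put the retransmission counter in context is to express it as a share of segments sent. The sketch below reads the cumulative RetransSegs and OutSegs counters from text in the format of /proc/net/snmp and compares two snapshots; the helper names are hypothetical, and the sample is abbreviated to a few columns with made-up values.

```python
def tcp_counters(snmp_text):
    """Pair the 'Tcp:' header row with the 'Tcp:' value row."""
    rows = [ln.split() for ln in snmp_text.splitlines() if ln.startswith("Tcp:")]
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, map(int, values)))

def retrans_percent(before, after):
    """Retransmitted segments as a percentage of segments sent in the window."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return round(100.0 * retrans / sent, 2) if sent else 0.0

# Abbreviated, illustrative snapshots (made-up values; the real file has more columns).
s0 = "Tcp: ActiveOpens PassiveOpens InSegs OutSegs RetransSegs InErrs\nTcp: 120 80 9000 10000 50 0\n"
s1 = "Tcp: ActiveOpens PassiveOpens InSegs OutSegs RetransSegs InErrs\nTcp: 130 95 11000 12000 90 0\n"
pct = retrans_percent(tcp_counters(s0), tcp_counters(s1))
print(pct)
```

Whether a given percentage is acceptable depends on the application, which is why the ratio is more useful than the raw count.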

3.3 tcpdump

tcpdump captures live packet traces. Use a filter (e.g., dst port 80) to limit capture size; when writing to a file with -w, the -C and -W options rotate the output by file size and cap the number of files kept. Captured pcap files can be analyzed offline with Wireshark. For example, capturing the TCP three-way handshake of a connection opened by Chrome shows how to isolate connection-setup packets while minimizing the performance impact on the host.
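A rotating capture can be assembled as an argument list before handing it to a process launcher. The sketch below only builds the command and does not run it; tcpdump_command is a hypothetical helper, and the interface name eth0, the file name web.pcap, and the size and count defaults are assumptions chosen for illustration.

```python
import shlex

def tcpdump_command(interface, out_file, bpf_filter, file_size=100, file_count=5):
    """Build a tcpdump argv that writes rotating pcap files for one filter."""
    return [
        "tcpdump", "-i", interface,
        "-n",                    # skip name resolution
        "-w", out_file,          # write raw packets instead of printing them
        "-C", str(file_size),    # rotate after this many millions of bytes
        "-W", str(file_count),   # keep at most this many files
        bpf_filter,              # capture filter passed as a single expression
    ]

cmd = tcpdump_command("eth0", "web.pcap", "dst port 80")
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd) would start the capture, which normally requires root privileges.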

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
