
Master Linux Server Monitoring: Essential Tools & Metrics Explained

This guide walks through essential Linux server monitoring tools—top, vmstat, pidstat, iostat, netstat, sar, and tcpdump—explaining their output fields, what each metric reveals about CPU, memory, disk I/O, and network performance, and how to use them for effective troubleshooting and capacity planning.

Open Source Linux

Preface

A Linux server constantly produces a wealth of parameter data that is crucial for operations staff and system administrators, and also valuable for developers when troubleshooting abnormal program behavior.

Introduction

This article lists simple tools for viewing system parameters; many of them parse data from /proc and /sys, while more advanced performance monitoring and tuning may require specialized tools such as perf or systemtap.

1. CPU and Memory

1.1 top

Command:

top

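The first line shows the 1-, 5-, and 15-minute load averages. On Linux the load average counts tasks that are runnable or in uninterruptible sleep, so values that stay above the number of CPU cores suggest CPU saturation or tasks blocked on I/O.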
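The comparison between load and core count can be sketched as a small helper. This is a minimal sketch: the function name load_per_core is hypothetical, and it assumes the usual /proc/loadavg layout in which the first field is the 1-minute load average.

```shell
# load_per_core LOADAVG_LINE CORES
# Prints the 1-minute load average divided by the core count.
# (Hypothetical helper name; not part of any standard tool.)
load_per_core() {
    printf '%s\n' "$1" | awk -v cores="$2" '{ printf "%.2f\n", $1 / cores }'
}

# Typical use on a live system (nproc is part of GNU coreutils):
#   load_per_core "$(cat /proc/loadavg)" "$(nproc)"
```

A result that stays well above 1.00 means more tasks are waiting than there are cores to run them.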

The second line reports task states: running, sleeping (interruptible/uninterruptible), stopped, zombie, etc.

The third line breaks down CPU time into user (us), system (sy), nice (ni), idle (id), iowait (wa), hardware interrupt (hi), software interrupt (si), and steal (st) percentages.

A high value in one field suggests where to look next: high us points to a CPU-intensive process, high sy to heavy system-call or kernel activity (often I/O related), high wa to an I/O bottleneck, and high st to an over-committed virtualization host.
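These percentages come from the cumulative counters on the first line of /proc/stat. The sketch below shows one way to derive them from two samples of that line. The function name cpu_delta is hypothetical, and it assumes the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); the labels hi and si are the ones top uses for irq and softirq time.

```shell
# cpu_delta FIRST_CPU_LINE SECOND_CPU_LINE
# Takes two samples of the aggregate "cpu" line from /proc/stat and prints
# the share of time spent in each state between them.
# (Hypothetical helper; assumes the standard /proc/stat field order.)
cpu_delta() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) a[i] = $i }
        NR == 2 {
            total = 0
            for (i = 2; i <= 9; i++) { d[i] = $i - a[i]; total += d[i] }
            if (total == 0) exit 1   # identical samples: nothing to report
            printf "us=%.1f ni=%.1f sy=%.1f id=%.1f wa=%.1f hi=%.1f si=%.1f st=%.1f\n",
                100 * d[2] / total, 100 * d[3] / total, 100 * d[4] / total,
                100 * d[5] / total, 100 * d[6] / total, 100 * d[7] / total,
                100 * d[8] / total, 100 * d[9] / total
        }'
}

# Typical use on a live system, sampling one second apart:
#   cpu_delta "$(head -n 1 /proc/stat)" "$(sleep 1; head -n 1 /proc/stat)"
```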

The fourth and fifth lines display physical memory and swap information. total = free + used + buff/cache. Buffers hold raw block-device data such as filesystem metadata, while Cached holds file contents (the page cache). Avail Mem estimates how much memory can be handed to new workloads without swapping.

Swap usage is not inherently bad, but frequent swap‑in/out suggests memory pressure.
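The available-memory figure can also be read directly from /proc/meminfo. The sketch below computes it as a percentage of total memory. The function name mem_available_pct is hypothetical, and it assumes a kernel that exposes the MemAvailable field.

```shell
# mem_available_pct  (reads /proc/meminfo-style text on stdin)
# Prints MemAvailable as a percentage of MemTotal.
# (Hypothetical helper; assumes the kernel reports MemAvailable.)
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) { printf "%.1f\n", 100 * avail / total }
            else           { exit 1 }
        }'
}

# Typical use on a live system:
#   mem_available_pct < /proc/meminfo
```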

Finally, the process list shows per‑process resource consumption; note that running top itself consumes CPU.

1.2 vmstat

vmstat provides another view of system load. Columns include r (runnable processes), b (processes in uninterruptible sleep), swpd (used swap), buff and cache (buffer and page-cache memory), bi/bo (blocks read from and written to block devices per second), in (interrupts per second), and cs (context switches per second).

In the source's example (output not reproduced here), raising the -j compile parallelism has little effect on the context-switch count until the value becomes high.
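When watching vmstat over time, it can help to print only the samples where the run queue exceeds the core count. The filter below is a sketch under the assumption that the default vmstat column order applies (r first, b second, cs twelfth); the function name flag_run_queue is hypothetical.

```shell
# flag_run_queue CORES  (reads vmstat output on stdin)
# Prints one line for every sample whose run queue (r) exceeds CORES.
# (Hypothetical helper; assumes the default vmstat column layout.)
flag_run_queue() {
    awk -v cores="$1" '
        # Skip the two header lines: only rows starting with a number are samples.
        $1 ~ /^[0-9]+$/ && $1 + 0 > cores + 0 {
            printf "r=%d b=%d cs=%d exceeds %d cores\n", $1, $2, $12, cores
        }'
}

# Typical use on a live system:
#   vmstat 1 | flag_run_queue "$(nproc)"
```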

1.3 pidstat

pidstat offers detailed per-process statistics, including page faults, stack usage, CPU usage, and context switches, optionally broken down per thread. Useful options:

-r: page faults (minor minflt/s, major majflt/s)

-s: stack size (StkSize) and usage (StkRef)

-u: CPU usage

-w: context switches, voluntary (cswch/s) and involuntary (nvcswch/s); add -t to report them per thread

-C pattern -l: filter by command name and show full command line

Example:

pidstat -w -t -C "ailaw" -l
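The context-switch figures that pidstat -w reports as rates are also available as cumulative counters in /proc/PID/status. The sketch below extracts them; the function name ctxt_switches is hypothetical, and turning the counters into per-second rates would require two samples.

```shell
# ctxt_switches  (reads /proc/PID/status-style text on stdin)
# Prints the cumulative voluntary and involuntary context-switch counters.
# (Hypothetical helper; pidstat -w shows the same data as per-second rates.)
ctxt_switches() {
    awk '
        $1 == "voluntary_ctxt_switches:"    { vol = $2 }
        $1 == "nonvoluntary_ctxt_switches:" { invol = $2 }
        END { printf "voluntary=%d nonvoluntary=%d\n", vol, invol }'
}

# Typical use on a live system, here for the current shell:
#   ctxt_switches < /proc/$$/status
```

Many voluntary switches usually mean the task often blocks waiting for a resource, while many involuntary switches mean it is being preempted while it still wants the CPU.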

1.4 Other CPU tools

For per‑CPU monitoring, mpstat -P ALL 1 shows load distribution across cores.

To filter processes by user, use top -u taozj; to follow a particular command, use ps with a custom format, e.g.

while :; do ps -eo user,pid,ni,pri,pcpu,psr,comm | grep 'ailawd'; sleep 1; done

The process tree can be displayed with ps axjf.

2. Disk I/O

2.1 iostat

Command: iostat -xz 1. Key metrics:

avgqu-sz (aqu-sz in newer sysstat versions): average request queue length; values above 1 suggest saturation.

await (r_await, w_await): average I/O wait time.

svctm: average service time; deprecated as unreliable and absent from recent sysstat versions.

%util: device utilization; >60% may degrade performance.

Even if I/O appears slow, kernel asynchronous I/O and caching can mask impact on applications.
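Because iostat column names and positions differ between sysstat versions, a script that watches %util is safer if it locates the column from the header line rather than hard-coding its position. The filter below is a sketch of that approach; the function name busy_devices is hypothetical, and the 60 in the usage line simply mirrors the threshold mentioned above.

```shell
# busy_devices LIMIT  (reads iostat -x output on stdin)
# Prints "device %util" for every device whose utilization exceeds LIMIT.
# (Hypothetical helper; finds the %util column from the header line.)
busy_devices() {
    awk -v limit="$1" '
        /^avg-cpu/ { col = 0; next }      # CPU summary: ignore the values that follow
        /%util/ {                          # device header: remember where %util is
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col + 0 > limit + 0 { printf "%s %s\n", $1, $col }'
}

# Typical use on a live system:
#   iostat -xz 1 | busy_devices 60
```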

3. Network

3.1 netstat

Show protocol statistics since boot: netstat -s. For active connections use:

netstat --all --numeric --tcp --udp --timers --listening --program

Common shortcuts: netstat -antp (all TCP), netstat -nltp (listening TCP).
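A common follow-up to netstat -ant is to summarize how many connections sit in each TCP state. The sketch below does this with awk; the function name tcp_state_counts is hypothetical, and it assumes the state is the sixth column, as in the default Linux netstat output.

```shell
# tcp_state_counts  (reads netstat -ant output on stdin)
# Prints each TCP state with the number of sockets in it, sorted by state name.
# (Hypothetical helper; assumes the State column is the sixth field.)
tcp_state_counts() {
    awk '
        $1 ~ /^tcp/ { count[$6]++ }       # matches both tcp and tcp6 rows
        END { for (s in count) printf "%s %d\n", s, count[s] }' | sort
}

# Typical use on a live system:
#   netstat -ant | tcp_state_counts
```

A large and growing TIME_WAIT or CLOSE_WAIT count is often the first visible hint of connection churn or of an application that is not closing its sockets.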

3.2 sar

sar -n TCP,ETCP 1 reports TCP activity (active/s, passive/s, retrans/s, isegerr/s). sar -n UDP 1 reports UDP metrics (noport/s, idgmerr/s).

3.3 tcpdump

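tcpdump captures packets for offline analysis with Wireshark. Use capture filters to limit what is recorded, and the file-rotation options -C (maximum file size) and -W (maximum number of files) to cap the capture size and reduce performance impact.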

When capturing, configure filters carefully to avoid excessive load on the production system.
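One way to combine a filter with size limits is a small wrapper that writes to a fixed ring of capture files. This is a sketch only: the function name capture_ring, the output file name capture.pcap, the DRY_RUN switch, and the specific sizes are all illustrative assumptions, while the tcpdump options themselves (-n, -i, -s, -C, -W, -w) are standard.

```shell
# capture_ring IFACE FILTER
# Captures packets matching FILTER on IFACE into a bounded ring of files.
# (Hypothetical wrapper; file name and limits are illustrative choices.)
capture_ring() {
    iface=$1
    filter=$2
    # -n        no name resolution, which avoids extra DNS traffic
    # -s 128    keep only the first 128 bytes of each packet
    # -C 100    start a new file after about 100 million bytes
    # -W 10     keep at most 10 files, overwriting the oldest
    set -- tcpdump -n -i "$iface" -s 128 -C 100 -W 10 -w capture.pcap "$filter"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"          # only show the command that would be run
    else
        "$@"               # usually requires root privileges
    fi
}

# Preview the command without capturing anything:
#   DRY_RUN=1 capture_ring eth0 'tcp port 443'
```

With these settings the capture can never grow beyond roughly 1 GB on disk, whatever the traffic volume.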

Written by

Open Source Linux

Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
