Master Linux System Monitoring: Deep Dive into CPU, Memory, and I/O Metrics
This comprehensive guide explains how to collect and analyze Linux system metrics (CPU usage, memory consumption, disk I/O, and load average) using native /proc and /sys interfaces, popular command‑line tools, and Prometheus Node Exporter. It includes practical scripts, configuration examples, and troubleshooting case studies for reliable performance monitoring and capacity planning.
Overview
Linux exposes performance data through /proc and /sys. Tools such as vmstat, iostat, top, perf and eBPF‑based collectors read these virtual files to provide CPU, memory, disk I/O and network metrics. Understanding each metric and its collection method is essential for capacity planning, performance tuning, fault diagnosis and SLA‑driven alerting.
Technical Characteristics
Kernel‑level data source : All metrics are real‑time kernel data, offering high precision with minimal overhead.
Full‑stack coverage : CPU scheduling, memory management, block‑device I/O and network stack are all observable.
Mature toolchain : From basic utilities (vmstat, iostat) to advanced profilers (perf, eBPF).
Standardised output : POSIX‑compatible formats simplify integration with Prometheus, Grafana and other monitoring platforms.
Applicable Scenarios
Capacity planning : Analyse historical metrics to predict scaling needs.
Performance tuning : Identify CPU‑ or I/O‑intensive workloads and adjust system parameters.
Fault diagnosis : Correlate multi‑dimensional metrics to locate root causes quickly.
SLA assurance : Build alerting policies based on metric thresholds.
Environment Requirements
Operating System : CentOS 8+/Ubuntu 20.04+/Debian 11+ (kernel ≥ 4.15 for cgroup v2 and eBPF).
procps‑ng ≥ 4.0 – provides ps, top, vmstat, free.
sysstat ≥ 12.7 – provides iostat, mpstat, sar.
htop ≥ 3.3 – interactive process monitor.
iotop ≥ 0.6 – real‑time block I/O monitor.
perf – kernel‑matched version for low‑level profiling.
Detailed Steps
Preparation
System environment check
# Verify kernel version (features require ≥4.15)
uname -r
# Confirm procfs is mounted
mount | grep proc
# List block devices via sysfs
ls /sys/class/block/
# Detect cgroup version (v2 preferred)
cat /sys/fs/cgroup/cgroup.controllers 2>/dev/null && echo "cgroup v2" || echo "cgroup v1"
Install monitoring tools
RHEL/CentOS/Rocky Linux
# Enable EPEL first (htop and iotop come from EPEL), then install the tools
sudo dnf install -y epel-release
sudo dnf install -y sysstat htop iotop perf
# Enable and start sysstat data collection
sudo systemctl enable --now sysstat
Debian/Ubuntu
# Update package index
sudo apt update
# Install the full suite
sudo apt install -y sysstat htop iotop linux-tools-common linux-tools-$(uname -r)
# On Debian, install linux-perf instead of the linux-tools-* packages
# Enable sysstat collection
sudo sed -i 's/ENABLED="false"/ENABLED="true"/' /etc/default/sysstat
sudo systemctl restart sysstat
Key virtual file locations
# CPU information
/proc/stat # CPU time counters
/proc/loadavg # Load average (1/5/15 min)
/proc/cpuinfo # CPU hardware details
# Memory information
/proc/meminfo # Detailed memory usage
/proc/vmstat # Virtual memory statistics
/proc/buddyinfo # Memory fragmentation
# Disk I/O information
/proc/diskstats # Block device counters
/sys/block/*/stat # Per‑device I/O counters
/proc/<pid>/io # Per‑process I/O counters (needs privilege for other users' processes)
# Network information
/proc/net/dev # Interface statistics
/proc/net/snmp # Protocol counters
CPU Monitoring Metrics
CPU time slice distribution
# Raw CPU statistics (first line of /proc/stat)
cat /proc/stat | head -1
# Formatted view per core (sample every second, 5 times)
mpstat -P ALL 1 5
%user : User‑mode time (0‑70 % normal). High values indicate compute‑intensive workloads.
%nice : Time for nice‑adjusted processes (0‑5 %).
%system : Kernel‑mode time (0‑30 %). High values may reveal excessive system calls.
%iowait : CPU waiting for I/O (0‑20 %). Persistent high values point to I/O bottlenecks.
%irq / %softirq : Interrupt handling (≤5 % and ≤10 % respectively). Spike in %softirq often correlates with high network traffic.
%steal : Time stolen by the hypervisor in virtualised environments (≤5 %).
%idle : Idle time (20‑100 %). Low idle combined with high iowait signals I/O saturation.
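Because /proc/stat exposes cumulative tick counters, utilization must be derived as a delta between two samples. A minimal sketch (the script name is illustrative; it assumes the standard field order of the aggregate cpu line shown in the comment):
#!/bin/bash
# cpu_busy.sh – overall CPU busy % from two /proc/stat samples, 1 s apart
# Aggregate line layout: cpu user nice system idle iowait irq softirq steal ...
read -r _ u1 n1 s1 i1 w1 q1 sq1 st1 _ < /proc/stat
sleep 1
read -r _ u2 n2 s2 i2 w2 q2 sq2 st2 _ < /proc/stat
IDLE=$(( (i2 + w2) - (i1 + w1) ))
TOTAL=$(( (u2+n2+s2+i2+w2+q2+sq2+st2) - (u1+n1+s1+i1+w1+q1+sq1+st1) ))
echo "CPU busy: $(echo "scale=1; 100 * ($TOTAL - $IDLE) / $TOTAL" | bc)%"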
System load (Load Average)
# Show load averages
cat /proc/loadavg
# Example output: 0.52 0.48 0.45 2/1089 28754
# Evaluate against CPU core count
nproc # or: grep -c processor /proc/cpuinfo
Load < CPU cores – resources are sufficient.
Load ≈ CPU cores – system is fully loaded.
Load > CPU cores × 1.5 – contention, investigate.
Load > CPU cores × 2 – overload, response time may degrade.
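A small sketch that applies these thresholds automatically (script name and cut-offs mirror the table above):
#!/bin/bash
# load_check.sh – classify the 1-minute load against the core count
CORES=$(nproc)
LOAD1=$(awk '{print $1}' /proc/loadavg)
RATIO=$(echo "scale=2; $LOAD1 / $CORES" | bc)
if   (( $(echo "$RATIO > 2.0" | bc) )); then STATUS="overload"
elif (( $(echo "$RATIO > 1.5" | bc) )); then STATUS="contention"
elif (( $(echo "$RATIO >= 1.0" | bc) )); then STATUS="fully loaded"
else STATUS="ok"; fi
echo "load=${LOAD1} cores=${CORES} ratio=${RATIO} status=${STATUS}"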
Real‑time CPU usage
# Top snapshot in batch mode (in interactive top, press 1 for a per‑core view)
top -bn1 | head -20
# Htop for an interactive UI
htop
# Per‑process CPU usage
pidstat -u 1 5
# Sort processes by CPU consumption
ps aux --sort=-%cpu | head -10
Key fields in top output:
top - 10:30:45 up 45 days, 3:28, 2 users, load average: 1.23, 0.98, 0.76
Tasks: 287 total, 2 running, 285 sleeping, 0 stopped, 0 zombie
%Cpu(s): 5.2 us, 2.1 sy, 0.0 ni, 91.5 id, 0.8 wa, 0.0 hi, 0.4 si, 0.0 st
Memory Monitoring Metrics
Memory overview
# Full memory snapshot
cat /proc/meminfo
# Human‑readable summary
free -h
# Continuous monitoring (refresh every 2 s)
watch -n 2 free -h
MemTotal : Physical memory size – baseline for usage percentages.
MemFree : Completely unused memory – not suitable for alerts.
MemAvailable : Kernel estimate of memory available for new workloads (free + reclaimable cache) – the recommended alert metric.
Buffers / Cached : Kernel buffers and page cache – reclaimable, not true consumption.
SwapTotal / SwapFree : Swap space – continuous decrease indicates memory pressure.
Memory usage calculation script
#!/bin/bash
# mem_usage.sh – compute actual used memory and percentage
MEM_TOTAL=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
MEM_AVAILABLE=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
MEM_USED=$((MEM_TOTAL - MEM_AVAILABLE))
USAGE_PERCENT=$(echo "scale=2; $MEM_USED * 100 / $MEM_TOTAL" | bc)
echo "Total Memory: $((MEM_TOTAL/1024)) MB"
echo "Available Memory: $((MEM_AVAILABLE/1024)) MB"
echo "Used Memory: $((MEM_USED/1024)) MB"
echo "Memory Usage: ${USAGE_PERCENT}%"vmstat memory fields swpd: Used swap (KB). free: Free physical memory (KB). buff: Kernel buffers (KB). cache: Page cache (KB). si / so: Swap‑in / swap‑out rates (KB/s) – non‑zero values signal memory pressure.
# Sample vmstat (1 s interval, 10 samples)
vmstat 1 10
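A hedged one‑liner to surface only the samples that show swap traffic (assuming the standard vmstat column layout, with si and so in columns 7 and 8):
# Print only vmstat samples with non-zero swap-in/swap-out
vmstat 1 10 | awk 'NR>2 && ($7 > 0 || $8 > 0) {print "swap activity: si=" $7 " so=" $8 " KB/s"}'
Disk I/O Monitoring Metrics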
iostat basic monitoring
# Extended mode, refresh every second
iostat -xz 1
r/s, w/s : Read/write requests per second – device‑dependent baseline.
rkB/s, wkB/s : Throughput in KB/s – watch for approaching device limits.
r_await, w_await : Average read/write latency (ms). Typical thresholds: HDD < 20 ms, SSD < 5 ms.
aqu-sz : Average request queue length – values > 2 indicate I/O saturation.
%util : Device utilization – sustained > 70 % suggests the need for optimisation or scaling.
Raw block device statistics
# Show per‑device counters
cat /proc/diskstats
# Example field order (sda):
# major minor name reads_completed reads_merged reads_sectors read_time_ms \
# writes_completed writes_merged writes_sectors write_time_ms current_io io_time_ms weighted_io_time_ms
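Since these counters are cumulative, rates require two samples. A minimal sketch (script name and one‑second interval are illustrative) that derives per‑device r/s and w/s from the fields above:
#!/bin/bash
# disk_iops.sh – per-device read/write IOPS from two /proc/diskstats samples
# Field 4 = reads completed, field 8 = writes completed (layout above)
snap() { awk '$3 ~ /^(sd|nvme|vd)/ {print $3, $4, $8}' /proc/diskstats; }
S1=$(snap); sleep 1; S2=$(snap)
while read -r dev r2 w2; do
    read -r _ r1 w1 <<< "$(grep "^$dev " <<< "$S1")"
    echo "$dev: $((r2 - r1)) r/s, $((w2 - w1)) w/s"
done <<< "$S2"
Process‑level I/O with iotop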
# Real‑time view (requires root)
iotop -o
# Batch mode, 5 samples, 1 s interval
iotop -b -n 5 -d 1
# Sort by I/O and show accumulated values
iotop -o -P -a
Configuration Examples
Prometheus Node Exporter service
# File: /etc/systemd/system/node_exporter.service
[Unit]
Description=Prometheus Node Exporter
Documentation=https://prometheus.io/docs/guides/node-exporter/
After=network-online.target
[Service]
Type=simple
User=node_exporter
Group=node_exporter
ExecStart=/usr/local/bin/node_exporter \
--web.listen-address=:9100 \
--web.telemetry-path=/metrics \
--collector.filesystem.mount-points-exclude="^/(sys|proc|dev|host|etc)($$|/)" \
--collector.netclass.ignored-devices="^(veth|docker|br-).*" \
--collector.diskstats.device-exclude="^(ram|loop|fd|dm-).*" \
--collector.cpu.info \
--collector.meminfo \
--collector.diskstats \
--collector.netdev \
--collector.loadavg \
--collector.vmstat \
--collector.filesystem \
--collector.pressure \
--no-collector.wifi \
--no-collector.nvme \
--no-collector.infiniband
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
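A short bring‑up sequence for the unit above (assuming the binary is already installed at /usr/local/bin/node_exporter; the nologin path may be /sbin/nologin on RHEL‑family systems):
# Create the service account, load the unit, and smoke-test the endpoint
sudo useradd --system --no-create-home --shell /usr/sbin/nologin node_exporter
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
curl -s http://localhost:9100/metrics | grep -m1 node_cpu_seconds_total
Prometheus alerting rules (node_alerts.yml)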
# File: /etc/prometheus/rules/node_alerts.yml
groups:
  - name: node_resource_alerts
    interval: 30s
    rules:
      - alert: HighCpuUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage (instance: {{ $labels.instance }})"
          description: "CPU usage > 85 % for 5 min, current value: {{ $value | printf \"%.1f\" }}%"
      - alert: HighMemoryUsage
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage (instance: {{ $labels.instance }})"
          description: "Memory usage > 90 % for 5 min, current value: {{ $value | printf \"%.1f\" }}%"
      - alert: DiskSpaceLow
        expr: (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes) * 100 > 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space (instance: {{ $labels.instance }})"
          description: "Filesystem {{ $labels.mountpoint }} usage > 85 %, current: {{ $value | printf \"%.1f\" }}%"
      - alert: HighDiskLatency
        expr: |
          rate(node_disk_read_time_seconds_total[5m]) / rate(node_disk_reads_completed_total[5m]) * 1000 > 50 or
          rate(node_disk_write_time_seconds_total[5m]) / rate(node_disk_writes_completed_total[5m]) * 1000 > 50
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High disk latency (instance: {{ $labels.instance }})"
          description: "Device {{ $labels.device }} latency > 50 ms"
      - alert: HighLoadAverage
        expr: node_load15 / count without(cpu, mode) (node_cpu_seconds_total{mode="idle"}) > 1.5
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "High load average (instance: {{ $labels.instance }})"
          description: "15‑min load per core > 1.5, current: {{ $value | printf \"%.2f\" }}"
      - alert: SwapUsageHigh
        expr: (node_memory_SwapTotal_bytes - node_memory_SwapFree_bytes) / node_memory_SwapTotal_bytes * 100 > 50
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High swap usage (instance: {{ $labels.instance }})"
          description: "Swap usage > 50 %, current: {{ $value | printf \"%.1f\" }}%"
Case 1 – MySQL performance diagnosis
#!/bin/bash
# mysql_server_diagnosis.sh – quick snapshot for a MySQL host
echo "========== MySQL Server Performance Diagnosis =========="
echo "Time: $(date '+%Y-%m-%d %H:%M:%S')"
# Load
echo -e "\n>>> System Load Average"
uptime
CORES=$(nproc)
LOAD1=$(awk '{print $1}' /proc/loadavg)
echo "CPU Cores: $CORES, Load/Core Ratio: $(echo "scale=2; $LOAD1/$CORES" | bc)"
# CPU distribution (5 s sample)
echo -e "\n>>> CPU Usage Distribution"
mpstat 1 5 | tail -1
# Memory status
echo -e "\n>>> Memory Status"
free -h
echo "MySQL Process Memory:"
ps aux | grep -E '^mysql|^USER' | head -2
# Disk I/O
echo -e "\n>>> Disk I/O Status"
iostat -xz 1 3 | grep -E '^(Device|sd|nvme|vd)'
# Process I/O (requires root)
echo -e "\n>>> MySQL Process I/O"
if [ "$(id -u)" -eq 0 ]; then
    iotop -b -n 3 -d 1 -P | grep -i mysql
else
    echo "Skip: requires root privilege"
fi
# Swap activity (watch the si/so columns)
echo -e "\n>>> Swap Activity"
vmstat 1 5
# Network connections
echo -e "\n>>> MySQL Network Connections"
ss -tn state established '( dport = :3306 or sport = :3306 )' | wc -l
echo "========== Diagnosis Complete =========="
Typical findings (example):
CPU iowait ≈ 45 % → severe I/O wait.
Available memory ≈ 2 GB on a 62 GB system; swap already in use.
Disk utilization ≈ 89 % with write latency > 45 ms → storage saturation.
Root cause: insufficient RAM forces MySQL buffer‑pool eviction, leading to frequent disk reads.
Remediation: increase RAM to ≥ 128 GB or tune innodb_buffer_pool_size to fit the workload.
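To confirm the buffer‑pool hypothesis, a quick check of the configured size against host RAM (the query form is a sketch; credentials and output handling are omitted):
# Compare innodb_buffer_pool_size with physical memory
mysql -N -e "SELECT ROUND(@@innodb_buffer_pool_size/1024/1024/1024, 1) AS buffer_pool_gb;"
free -g | awk 'NR==2 {print "host_ram_gb: " $2}'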
Case 2 – Web server CPU spike investigation
# Step 1: Overall CPU snapshot
top -bn1 | head -15
# Step 2: Sort processes by CPU usage
ps aux --sort=-%cpu | head -20
# Step 3: Count PHP‑FPM workers
echo "PHP‑FPM worker count: $(ps aux | grep 'php-fpm: pool' | wc -l)"
# Step 4: Show processes consuming > 50 % CPU
ps aux --sort=-%cpu | awk '$3>50 {print $0}'
# Step 5: Detailed per‑process CPU with pidstat
pidstat -u 1 10 | sort -k8 -rn | head -20
Further analysis with strace often reveals heavy regular‑expression processing in a specific API endpoint, confirming a ReDoS vulnerability; a hedged sketch of that step follows.
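The PID selection below is illustrative (it grabs the top CPU consumer matching php-fpm); strace -c prints a syscall time summary when it detaches:
# Summarize syscalls of the busiest PHP-FPM worker for 10 seconds
PID=$(ps aux --sort=-%cpu | awk '/php-fpm/ {print $2; exit}')
sudo timeout -s INT 10 strace -c -p "$PID"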
Best Practices and Caveats
Monitoring Data Collection Optimisation
Balance precision and overhead by adjusting collection intervals.
# Sysstat collection every minute (production)
*/1 * * * * root /usr/lib64/sa/sa1 1 1
# Daily summary at 23:53 (Debian/Ubuntu ships sa1/sa2 under /usr/lib/sysstat/)
53 23 * * * root /usr/lib64/sa/sa2 -A
Real‑time alert metrics: 10‑30 s interval.
Trend analysis: 1‑5 min interval.
Historical archiving: hourly or daily aggregation.
# Run Node Exporter with a minimal set of collectors to reduce load
node_exporter \
--collector.disable-defaults \
--collector.cpu \
--collector.meminfo \
--collector.diskstats \
--collector.filesystem \
--collector.loadavg \
--collector.netdev
Alert Threshold Design
CPU usage : warning ≥ 80 %, critical ≥ 95 % (sustained 5 min).
Memory usage : based on MemAvailable, warning ≥ 85 %, critical ≥ 95 % (5 min).
Disk space : warning ≥ 80 %, critical ≥ 90 % (10 min).
Disk I/O latency : warning ≥ 30 ms, critical ≥ 100 ms (3 min).
System load : warning > CPU cores × 1.2, critical > CPU cores × 2 (10 min).
Swap usage : warning ≥ 30 %, critical ≥ 60 % (10 min).
Data Retention Strategy
# Prometheus scrape configuration (prometheus.yml)
global:
  scrape_interval: 15s
  evaluation_interval: 15s
# Retention is set via command-line flags rather than prometheus.yml:
#   --storage.tsdb.path=/var/lib/prometheus/data
#   --storage.tsdb.retention.time=30d
#   --storage.tsdb.retention.size=100GB
# For long‑term storage, integrate Thanos for down‑sampling and > 1 year retention.
Fault Diagnosis and Performance Monitoring
Common Troubleshooting Flows
Log inspection
# Systemd journal (last 100 lines)
journalctl -xe --no-pager | tail -100
# Time‑range view
journalctl --since "2026-01-15 10:00:00" --until "2026-01-15 12:00:00"
# Kernel messages (OOM, hardware errors)
dmesg -T | tail -50
# OOM Killer records
dmesg -T | grep -i "out of memory"
journalctl -k | grep -i oom
CPU problem investigation
# Locate high‑CPU processes
top -bn1 -o %CPU | head -15
# Thread‑level view for a specific PID
top -H -p $(pgrep -f "process_name")
# Hotspot analysis with perf
perf top -p $(pgrep -f "process_name")
Memory problem investigation
# OOM event details
dmesg -T | grep -A 20 "Out of memory"
# Identify killed process
journalctl -k | grep -E "Killed process|oom-kill"
# Current memory pressure (PSI)
cat /proc/pressure/memory
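PSI files also exist for cpu and io (kernel ≥ 4.20 with CONFIG_PSI enabled); a quick loop over all three resources:
# Pressure Stall Information for all three resources
for r in cpu memory io; do echo "== $r =="; cat /proc/pressure/$r; done
I/O problem investigation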
# Identify I/O wait source
iostat -xz 1 5
# Find processes generating I/O
iotop -o -b -n 5
# Per‑process I/O details
cat /proc/$(pgrep -f "process_name")/io
Comprehensive Health‑Check Script
#!/bin/bash
# system_health_check.sh – quick snapshot of key metrics
echo "===== System Health Check ====="
echo "Time: $(date '+%Y-%m-%d %H:%M:%S')"
# CPU
echo "[CPU]"
mpstat 1 1 | tail -1 | awk '{printf "User: %.1f%% System: %.1f%% IOWait: %.1f%% Idle: %.1f%%\n", $3, $5, $6, $12}'
# Memory
echo "[Memory]"
free -h | awk 'NR==2 {printf "Total: %s Used: %s Available: %s\n", $2, $3, $7}'
# Disk I/O
echo "[Disk IO]"
iostat -xz 1 1 | awk '/^[sv]d|^nvme/ {printf "%s: util=%.1f%% await=%.1fms\n", $1, $NF, $10}'
# Load
echo "[Load]"
awk '{printf "1min: %s 5min: %s 15min: %s\n", $1, $2, $3}' /proc/loadavg
Backup and Restore of Monitoring Data
Prometheus data backup script
#!/bin/bash
# prometheus_backup.sh – TSDB snapshot and compression
BACKUP_DIR="/backup/prometheus"
DATA_DIR="/var/lib/prometheus/data"
DATE=$(date +%Y%m%d)
# Create snapshot via API
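# Note: the snapshot endpoint requires Prometheus to run with --web.enable-admin-api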
curl -X POST http://localhost:9090/api/v1/admin/tsdb/snapshot
# Determine latest snapshot directory
SNAPSHOT=$(ls -t ${DATA_DIR}/snapshots/ | head -1)
# Compress snapshot
tar -czf ${BACKUP_DIR}/prometheus_${DATE}.tar.gz -C ${DATA_DIR}/snapshots ${SNAPSHOT}
# Remove backups older than 7 days
find ${BACKUP_DIR} -name "prometheus_*.tar.gz" -mtime +7 -delete
# Clean up snapshot directory
rm -rf ${DATA_DIR}/snapshots/${SNAPSHOT}
echo "Backup completed: prometheus_${DATE}.tar.gz"Restore procedure
# 1. Stop Prometheus service
systemctl stop prometheus
# 2. Extract backup
tar -xzf /backup/prometheus/prometheus_20260115.tar.gz -C /var/lib/prometheus/data/
# 3. Verify data integrity
promtool tsdb analyze /var/lib/prometheus/data/
# 4. Restart service
systemctl start prometheus
# 5. Confirm Prometheus is serving data
curl -s 'http://localhost:9090/api/v1/query?query=up' | jq .
Conclusion
Technical Takeaways
CPU monitoring must differentiate user, system, iowait and steal time; high iowait signals I/O bottlenecks.
Memory health relies on MemAvailable rather than MemFree; any swap activity indicates pressure.
Disk I/O health is judged by average latency (await) and utilization (%util); NVMe devices require combined IOPS and throughput assessment.
Toolchain hierarchy: quick checks with vmstat, iostat, free; deep analysis with perf or eBPF; production monitoring with Prometheus + Grafana.
Further Learning Paths
eBPF performance analysis – study "BPF Performance Tools" (Brendan Gregg) and experiment with bcc / bpftrace scripts.
Distributed tracing – explore OpenTelemetry, Jaeger or Zipkin to correlate metrics across services.
AIOps – research time‑series anomaly detection algorithms and build models on historical monitoring data for intelligent alerting.
References
Linux Kernel Documentation – procfs
Brendan Gregg’s Linux Performance – authoritative resource
Prometheus Documentation – official guide
Node Exporter GitHub – project homepage
sysstat Documentation – toolset reference