
Master Linux System Monitoring: Deep Dive into CPU, Memory, and I/O Metrics

This comprehensive guide explains how to collect and analyze Linux system metrics—including CPU usage, memory consumption, disk I/O, and load average—using native /proc and /sys interfaces, popular command‑line tools, and Prometheus Node Exporter, with practical scripts, configuration examples, and troubleshooting case studies for reliable performance monitoring and capacity planning.

Ops Community

Overview

Linux exposes performance data through /proc and /sys. Tools such as vmstat, iostat, top, perf and eBPF‑based collectors read these virtual files to provide CPU, memory, disk I/O and network metrics. Understanding each metric and its collection method is essential for capacity planning, performance tuning, fault diagnosis and SLA‑driven alerting.

Technical Characteristics

Kernel‑level data source : All metrics are real‑time kernel data, offering high precision with minimal overhead.

Full‑stack coverage : CPU scheduling, memory management, block‑device I/O and network stack are all observable.

Mature toolchain : From basic utilities (vmstat, iostat) to advanced profilers (perf, eBPF).

Standardised output : POSIX‑compatible formats simplify integration with Prometheus, Grafana and other monitoring platforms.

Applicable Scenarios

Capacity planning : Analyse historical metrics to predict scaling needs.

Performance tuning : Identify CPU‑ or I/O‑intensive workloads and adjust system parameters.

Fault diagnosis : Correlate multi‑dimensional metrics to locate root causes quickly.

SLA assurance : Build alerting policies based on metric thresholds.

Environment Requirements

Operating System : CentOS 8+/Ubuntu 20.04+/Debian 11+ (kernel ≥ 4.15 for cgroup v2 and eBPF).

procps‑ng ≥ 4.0 – provides ps, top, vmstat, free.

sysstat ≥ 12.7 – provides iostat, mpstat, sar.

htop ≥ 3.3 – interactive process monitor.

iotop ≥ 0.6 – real‑time block I/O monitor.

perf – kernel‑matched version for low‑level profiling.

Detailed Steps

Preparation

System environment check

# Verify kernel version (features require ≥4.15)
uname -r
# Confirm procfs is mounted
mount | grep proc
# List block devices via sysfs
ls /sys/class/block/
# Detect cgroup version (v2 preferred)
cat /sys/fs/cgroup/cgroup.controllers 2>/dev/null && echo "cgroup v2" || echo "cgroup v1"

Install monitoring tools

RHEL/CentOS/Rocky Linux

# Enable EPEL first (htop and iotop live there), then install the tools
sudo dnf install -y epel-release
sudo dnf install -y sysstat htop iotop perf
# Enable and start sysstat data collection
sudo systemctl enable --now sysstat

Debian/Ubuntu

# Update package index
sudo apt update
# Install the full suite
sudo apt install -y sysstat htop iotop linux-tools-common linux-tools-$(uname -r)
# Enable sysstat collection
sudo sed -i 's/ENABLED="false"/ENABLED="true"/' /etc/default/sysstat
sudo systemctl restart sysstat

Key virtual file locations

# CPU information
/proc/stat        # CPU time counters
/proc/loadavg     # Load average (1/5/15 min)
/proc/cpuinfo     # CPU hardware details

# Memory information
/proc/meminfo     # Detailed memory usage
/proc/vmstat      # Virtual memory statistics
/proc/buddyinfo   # Memory fragmentation

# Disk I/O information
/proc/diskstats   # Block device counters
/sys/block/*/stat # Per‑device I/O counters
/proc/<pid>/io    # Per-process I/O counters (process owner or root)

# Network information
/proc/net/dev      # Interface statistics
/proc/net/snmp     # Protocol counters
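For before/after comparisons around a workload run, a small helper can snapshot all of these files at once. A minimal sketch; the script name and output path are illustrative choices:

```shell
#!/bin/sh
# snapshot_proc.sh - copy the key /proc metric files into a timestamped
# directory so two snapshots taken around a workload run can be diffed.
OUT="/tmp/proc-snapshot-$(date +%s)"
mkdir -p "$OUT"
for f in /proc/stat /proc/loadavg /proc/meminfo /proc/vmstat /proc/diskstats /proc/net/dev; do
  # cat instead of cp: /proc files report size 0, which confuses some tools
  [ -r "$f" ] && cat "$f" > "$OUT/$(basename "$f")"
done
echo "Snapshot written to $OUT"
```

Run it before and after the workload, then `diff` the two directories to see which counters moved.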

CPU Monitoring Metrics

CPU time slice distribution

# Raw CPU statistics (first line of /proc/stat)
head -1 /proc/stat
# Formatted view per core (sample every second, 5 times)
mpstat -P ALL 1 5

%user : User‑mode time (0‑70 % normal). High values indicate compute‑intensive workloads.

%nice : Time for nice‑adjusted processes (0‑5 %).

%system : Kernel‑mode time (0‑30 %). High values may reveal excessive system calls.

%iowait : CPU waiting for I/O (0‑20 %). Persistent high values point to I/O bottlenecks.

%irq / %softirq : Interrupt handling (≤5 % and ≤10 % respectively). Spikes in %softirq often correlate with high network traffic.

%steal : Time stolen by the hypervisor in virtualised environments (≤5 %).

%idle : Idle time (20‑100 %). Low idle combined with high iowait signals I/O saturation.
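These percentages can be derived directly from two samples of /proc/stat, which is exactly what mpstat does internally. A minimal sketch that treats idle + iowait as idle time:

```shell
#!/bin/sh
# cpu_usage.sh - overall CPU utilisation from two /proc/stat samples.
# Aggregate "cpu" line fields: user nice system idle iowait irq softirq steal ...
read_totals() {
  awk '/^cpu / {total=0; for (i=2; i<=NF; i++) total+=$i; print total, $5+$6}' /proc/stat
}
set -- $(read_totals); T1=$1; I1=$2   # I = idle + iowait ticks
sleep 1
set -- $(read_totals); T2=$1; I2=$2
# busy% = 100 * (delta_total - delta_idle) / delta_total
USAGE=$(awk -v dt="$((T2 - T1))" -v di="$((I2 - I1))" 'BEGIN {printf "%.1f", 100 * (dt - di) / dt}')
echo "CPU usage over 1 s: ${USAGE}%"
```

Lengthen the sleep for a smoother average; the counters are cumulative since boot, so any two samples work.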

System load (Load Average)

# Show load averages
cat /proc/loadavg
# Example output: 0.52 0.48 0.45 2/1089 28754
# Evaluate against CPU core count
nproc   # or grep -c processor /proc/cpuinfo

Load < CPU cores – resources are sufficient.

Load ≈ CPU cores – system is fully loaded.

Load > CPU cores × 1.5 – contention, investigate.

Load > CPU cores × 2 – overload, response time may degrade.
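These thresholds translate directly into a quick check; the cut-offs below are the rules of thumb above, not universal constants:

```shell
#!/bin/sh
# load_check.sh - classify the 1-minute load average against core count.
CORES=$(nproc)
LOAD1=$(awk '{print $1}' /proc/loadavg)
STATE=$(awk -v l="$LOAD1" -v c="$CORES" 'BEGIN {
  r = l / c
  if      (r < 1.0) print "ok"            # resources sufficient
  else if (r < 1.5) print "fully-loaded"  # at capacity
  else if (r < 2.0) print "contention"    # investigate
  else              print "overload"      # response time likely degrading
}')
echo "load=$LOAD1 cores=$CORES state=$STATE"
```

The same classification works for the 5- and 15-minute averages (fields 2 and 3 of /proc/loadavg) when you care about sustained load rather than spikes.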

Real‑time CPU usage

# Top (press 1 to expand per‑core view)
top -bn1 | head -20
# Htop for an interactive UI
htop
# Per‑process CPU usage
pidstat -u 1 5
# Sort processes by CPU consumption
ps aux --sort=-%cpu | head -10

Key fields in top output:

top - 10:30:45 up 45 days,  3:28,  2 users,  load average: 1.23, 0.98, 0.76
Tasks: 287 total,   2 running, 285 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.2 us,  2.1 sy,  0.0 ni, 91.5 id,  0.8 wa,  0.0 hi,  0.4 si,  0.0 st

Memory Monitoring Metrics

Memory overview

# Full memory snapshot
cat /proc/meminfo
# Human‑readable summary
free -h
# Continuous monitoring (refresh every 2 s)
watch -n 2 free -h

MemTotal : Physical memory size – baseline for usage percentages.

MemFree : Completely unused memory – not suitable for alerts.

MemAvailable : Estimate of memory available to new workloads (free plus reclaimable cache) – the recommended alert metric.

Buffers / Cached : Kernel buffers and page cache – reclaimable, not true consumption.

SwapTotal / SwapFree : Swap space – continuous decrease indicates memory pressure.

Memory usage calculation script

#!/bin/bash
# mem_usage.sh – compute actual used memory and percentage
MEM_TOTAL=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
MEM_AVAILABLE=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
MEM_USED=$((MEM_TOTAL - MEM_AVAILABLE))
USAGE_PERCENT=$(echo "scale=2; $MEM_USED * 100 / $MEM_TOTAL" | bc)

echo "Total Memory: $((MEM_TOTAL/1024)) MB"
echo "Available Memory: $((MEM_AVAILABLE/1024)) MB"
echo "Used Memory: $((MEM_USED/1024)) MB"
echo "Memory Usage: ${USAGE_PERCENT}%"

vmstat memory fields

swpd : Used swap (KB).

free : Free physical memory (KB).

buff : Kernel buffers (KB).

cache : Page cache (KB).

si / so : Swap‑in / swap‑out rates (KB/s) – non‑zero values signal memory pressure.

# Sample vmstat (1 s interval, 10 samples)
vmstat 1 10
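Beyond eyeballing the si/so columns, ongoing swapping can be detected from the pswpin/pswpout counters in /proc/vmstat, with no sysstat dependency. A sketch; the 2-second window is an arbitrary choice:

```shell
#!/bin/sh
# swap_activity.sh - detect active swapping from /proc/vmstat counters.
# pswpin/pswpout count pages swapped in/out since boot; a positive delta
# over the sample window means the system is swapping right now.
sample_swap() { awk '/^pswp(in|out) / {s += $2} END {print s + 0}' /proc/vmstat; }
S1=$(sample_swap)
sleep 2
S2=$(sample_swap)
DELTA=$((S2 - S1))
if [ "$DELTA" -gt 0 ]; then
  echo "WARNING: $DELTA pages swapped in the last 2 s - memory pressure"
else
  echo "OK: no swap activity in the last 2 s"
fi
```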

Disk I/O Monitoring Metrics

iostat basic monitoring

# Extended mode, refresh every second
iostat -xz 1

r/s, w/s : Read/write requests per second – device‑dependent baseline.

rkB/s, wkB/s : Throughput in KB/s – watch for approaching device limits.

r_await, w_await : Average read/write latency (ms). Typical thresholds: HDD < 20 ms, SSD < 5 ms.

aqu-sz : Average request queue length – values > 2 indicate I/O saturation.

%util : Device utilization – sustained > 70 % suggests the need for optimisation or scaling.

Raw block device statistics

# Show per‑device counters
cat /proc/diskstats
# Example field order (sda):
# major minor name reads_completed reads_merged reads_sectors read_time_ms \
# writes_completed writes_merged writes_sectors write_time_ms current_io io_time_ms weighted_io_time_ms
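The counters above are cumulative since boot, so rates come from deltas between two samples, which is what iostat computes for you. A minimal sketch; sectors in this file are always 512 bytes regardless of device geometry:

```shell
#!/bin/sh
# disk_iops.sh - per-device IOPS and throughput from two /proc/diskstats
# samples. Defaults to the first listed device; pass a name such as sda as $1.
DEV="${1:-$(awk 'NR==1 {print $3}' /proc/diskstats)}"
# Field 4 = reads completed, 6 = sectors read, 8 = writes completed,
# 10 = sectors written.
sample() { awk -v d="$DEV" '$3 == d {print $4, $6, $8, $10}' /proc/diskstats; }
set -- $(sample); R1=$1; RS1=$2; W1=$3; WS1=$4
sleep 1
set -- $(sample); R2=$1; RS2=$2; W2=$3; WS2=$4
echo "$DEV: read IOPS=$((R2 - R1)) write IOPS=$((W2 - W1))"
echo "$DEV: read KB/s=$(( (RS2 - RS1) / 2 )) write KB/s=$(( (WS2 - WS1) / 2 ))"
```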

Process‑level I/O with iotop

# Real‑time view (requires root)
iotop -o
# Batch mode, 5 samples, 1 s interval
iotop -b -n 5 -d 1
# Sort by I/O and show accumulated values
iotop -o -P -a

Configuration Examples

Prometheus Node Exporter service

# File: /etc/systemd/system/node_exporter.service
[Unit]
Description=Prometheus Node Exporter
Documentation=https://prometheus.io/docs/guides/node-exporter/
After=network-online.target

[Service]
Type=simple
User=node_exporter
Group=node_exporter
ExecStart=/usr/local/bin/node_exporter \
  --web.listen-address=:9100 \
  --web.telemetry-path=/metrics \
  --collector.filesystem.mount-points-exclude="^/(sys|proc|dev|host|etc)($$|/)" \
  --collector.netclass.ignored-devices="^(veth|docker|br-).*" \
  --collector.diskstats.device-exclude="^(ram|loop|fd|dm-).*" \
  --collector.cpu.info \
  --collector.meminfo \
  --collector.diskstats \
  --collector.netdev \
  --collector.loadavg \
  --collector.vmstat \
  --collector.filesystem \
  --collector.pressure \
  --no-collector.wifi \
  --no-collector.nvme \
  --no-collector.infiniband
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
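Once the unit is enabled (useradd for the node_exporter user, then systemctl daemon-reload and systemctl enable --now node_exporter), a quick probe confirms the exporter is serving metrics. The default URL below matches --web.listen-address=:9100 in the unit file; the script name is illustrative:

```shell
#!/bin/sh
# check_node_exporter.sh - probe a running Node Exporter endpoint.
URL="${1:-http://localhost:9100/metrics}"
if METRICS=$(curl -sf --max-time 3 "$URL" 2>/dev/null); then
  STATUS=up
  echo "up: $(echo "$METRICS" | grep -c '^node_') node_* metric lines at $URL"
else
  STATUS=down
  echo "Node Exporter not reachable at $URL"
fi
```

A healthy exporter typically serves several hundred node_* series; zero lines with an "up" status suggests a collector misconfiguration.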

Prometheus alerting rules (node_alerts.yml)

# File: /etc/prometheus/rules/node_alerts.yml
groups:
- name: node_resource_alerts
  interval: 30s
  rules:
  - alert: HighCpuUsage
    expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High CPU usage (instance: {{ $labels.instance }})"
      description: "CPU usage > 85 % for 5 min, current value: {{ $value | printf \"%.1f\" }}%"

  - alert: HighMemoryUsage
    expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High memory usage (instance: {{ $labels.instance }})"
      description: "Memory usage > 90 % for 5 min, current value: {{ $value | printf \"%.1f\" }}%"

  - alert: DiskSpaceLow
    expr: (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes) * 100 > 85
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Low disk space (instance: {{ $labels.instance }})"
      description: "Filesystem {{ $labels.mountpoint }} usage > 85 %, current: {{ $value | printf \"%.1f\" }}%"

  - alert: HighDiskLatency
    expr: |
      rate(node_disk_read_time_seconds_total[5m]) / rate(node_disk_reads_completed_total[5m]) * 1000 > 50 or
      rate(node_disk_write_time_seconds_total[5m]) / rate(node_disk_writes_completed_total[5m]) * 1000 > 50
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High disk latency (instance: {{ $labels.instance }})"
      description: "Device {{ $labels.device }} latency > 50 ms"

  - alert: HighLoadAverage
    expr: node_load15 / count without(cpu, mode) (node_cpu_seconds_total{mode="idle"}) > 1.5
    for: 15m
    labels:
      severity: warning
    annotations:
      summary: "High load average (instance: {{ $labels.instance }})"
      description: "15‑min load per core > 1.5, current: {{ $value | printf \"%.2f\" }}"

  - alert: SwapUsageHigh
    expr: (node_memory_SwapTotal_bytes - node_memory_SwapFree_bytes) / node_memory_SwapTotal_bytes * 100 > 50
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "High swap usage (instance: {{ $labels.instance }})"
      description: "Swap usage > 50 %, current: {{ $value | printf \"%.1f\" }}%"
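Before Prometheus loads a rules file, it can be validated with promtool, which ships in the Prometheus release tarball. A small wrapper that degrades gracefully when promtool is absent:

```shell
#!/bin/sh
# check_rules.sh - validate an alerting rules file before (re)loading it.
RULES="${1:-/etc/prometheus/rules/node_alerts.yml}"
if command -v promtool >/dev/null 2>&1; then
  if promtool check rules "$RULES"; then STATUS=valid; else STATUS=invalid; fi
else
  STATUS=skipped
  echo "promtool not installed - skipping validation"
fi
echo "rule check: $STATUS"
```

After a successful check, apply the rules with `systemctl reload prometheus`, or via `curl -X POST http://localhost:9090/-/reload` if Prometheus runs with --web.enable-lifecycle.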

Real‑World Cases

Case 1 – MySQL performance diagnosis

#!/bin/bash
# mysql_server_diagnosis.sh – quick snapshot for a MySQL host

echo "========== MySQL Server Performance Diagnosis =========="
echo "Time: $(date '+%Y-%m-%d %H:%M:%S')"

# Load
echo -e "\n>>> System Load Average"
uptime
CORES=$(nproc)
LOAD1=$(awk '{print $1}' /proc/loadavg)
echo "CPU Cores: $CORES, Load/Core Ratio: $(echo "scale=2; $LOAD1/$CORES" | bc)"

# CPU distribution (5 s sample)
echo -e "\n>>> CPU Usage Distribution"
mpstat 1 5 | tail -1

# Memory status
echo -e "\n>>> Memory Status"
free -h

echo "MySQL Process Memory:"
ps aux | grep -E '^mysql|^USER' | head -2

# Disk I/O
echo -e "\n>>> Disk I/O Status"
iostat -xz 1 3 | grep -E '^(Device|sd|nvme|vd)'

# Process I/O (requires root)
echo -e "\n>>> MySQL Process I/O"
if [ $(id -u) -eq 0 ]; then
  iotop -b -n 3 -d 1 -P | grep -i mysql
else
  echo "Skip: requires root privilege"
fi

# Swap activity (watch the si/so columns)
echo -e "\n>>> Swap Activity"
vmstat 1 5

# Network connections
echo -e "\n>>> MySQL Network Connections"
ss -tn state established '( dport = :3306 or sport = :3306 )' | wc -l

echo "========== Diagnosis Complete =========="

Typical findings (example):

CPU iowait ≈ 45 % → severe I/O wait.

Available memory ≈ 2 GB on a 62 GB system; swap already in use.

Disk utilization ≈ 89 % with write latency > 45 ms → storage saturation.

Root cause: insufficient RAM forces MySQL buffer‑pool eviction, leading to frequent disk reads.

Remediation: increase RAM to ≥ 128 GB or tune innodb_buffer_pool_size to fit the workload.
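As a starting point for sizing the buffer pool, the target can be derived from /proc/meminfo. The 70 % figure below is a common rule of thumb for a dedicated MySQL host, not a value from this case; tune it to the actual working set:

```shell
#!/bin/sh
# bp_size.sh - suggest an innodb_buffer_pool_size from physical RAM.
MEM_KB=$(awk '/^MemTotal/ {print $2}' /proc/meminfo)
BP_MB=$((MEM_KB * 70 / 100 / 1024))
echo "MemTotal: $((MEM_KB / 1024)) MB"
echo "Suggested innodb_buffer_pool_size: ${BP_MB} MB (~70% of RAM)"
```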

Case 2 – Web server CPU spike investigation

# Step 1: Overall CPU snapshot
top -bn1 | head -15

# Step 2: Sort processes by CPU usage
ps aux --sort=-%cpu | head -20

# Step 3: Count PHP‑FPM workers (pgrep avoids matching the grep process itself)
echo "PHP‑FPM worker count: $(pgrep -fc 'php-fpm: pool')"

# Step 4: Show processes consuming > 50 % CPU
ps aux --sort=-%cpu | awk '$3>50 {print $0}'

# Step 5: Detailed per‑process CPU with pidstat
pidstat -u 1 10 | sort -k8 -rn | head -20

Further analysis with strace often reveals heavy regular‑expression processing in a specific API endpoint, confirming a ReDoS vulnerability.
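A cheap first discriminator before any tracing is /proc/<pid>/stat, which separates user-space from kernel CPU time and tells you whether perf (user-bound, e.g. regex backtracking) or strace -c (syscall-bound) is the right next tool. A sketch; the naive field splitting assumes the process comm name contains no spaces:

```shell
#!/bin/sh
# user_vs_sys.sh - classify a process as user- or kernel-bound from
# /proc/<pid>/stat: field 14 = utime, field 15 = stime (clock ticks).
PID="${1:-$$}"   # defaults to this shell; pass the suspect PID in practice
set -- $(awk '{print $14, $15}' "/proc/$PID/stat")
UTIME=$1; STIME=$2
echo "pid=$PID utime=$UTIME stime=$STIME (clock ticks)"
if [ "$UTIME" -gt "$STIME" ]; then
  echo "mostly user-space work - profile with perf top -p $PID"
else
  echo "mostly kernel/syscall work - summarise with strace -c -p $PID"
fi
```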

Best Practices and Caveats

Monitoring Data Collection Optimisation

Balance precision and overhead by adjusting collection intervals.

# Sysstat collection every minute (production); the binary lives under
# /usr/lib64/sa/ on RHEL-family systems and /usr/lib/sysstat/ on Debian/Ubuntu
* * * * * root /usr/lib64/sa/sa1 1 1
# Daily summary at 23:53
53 23 * * * root /usr/lib64/sa/sa2 -A

Real‑time alert metrics: 10‑30 s interval.

Trend analysis: 1‑5 min interval.

Historical archiving: hourly or daily aggregation.

# Run Node Exporter with a minimal set of collectors to reduce load
node_exporter \
  --collector.disable-defaults \
  --collector.cpu \
  --collector.meminfo \
  --collector.diskstats \
  --collector.filesystem \
  --collector.loadavg \
  --collector.netdev

Alert Threshold Design

CPU usage : warning ≥ 80 %, critical ≥ 95 % (sustained 5 min).

Memory usage : based on MemAvailable, warning ≥ 85 %, critical ≥ 95 % (5 min).

Disk space : warning ≥ 80 %, critical ≥ 90 % (10 min).

Disk I/O latency : warning ≥ 30 ms, critical ≥ 100 ms (3 min).

System load : warning > CPU cores × 1.2, critical > CPU cores × 2 (10 min).

Swap usage : warning ≥ 30 %, critical ≥ 60 % (10 min).

Data Retention Strategy

# Prometheus scrape configuration (prometheus.yml)
global:
  scrape_interval: 15s
  evaluation_interval: 15s

# Retention is configured via command-line flags, not prometheus.yml
# (30 days raw, 100 GB max):
#   --storage.tsdb.path=/var/lib/prometheus/data
#   --storage.tsdb.retention.time=30d
#   --storage.tsdb.retention.size=100GB
# For long‑term storage, integrate Thanos for down‑sampling and > 1 year retention.

Fault Diagnosis and Performance Monitoring

Common Troubleshooting Flows

Log inspection

# Systemd journal (last 100 lines)
journalctl -xe --no-pager | tail -100
# Time‑range view
journalctl --since "2026-01-15 10:00:00" --until "2026-01-15 12:00:00"
# Kernel messages (OOM, hardware errors)
dmesg -T | tail -50
# OOM Killer records
dmesg -T | grep -i "out of memory"
journalctl -k | grep -i oom

CPU problem investigation

# Locate high‑CPU processes
top -bn1 -o %CPU | head -15
# Thread‑level view for a specific PID
top -H -p $(pgrep -f "process_name")
# Hotspot analysis with perf
perf top -p $(pgrep -f "process_name")

Memory problem investigation

# OOM event details
dmesg -T | grep -A 20 "Out of memory"
# Identify killed process
journalctl -k | grep -E "Killed process|oom-kill"
# Current memory pressure (PSI)
cat /proc/pressure/memory
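For scripting, the PSI output can be reduced to a single number. A sketch that extracts the 10-second "some" average (the share of wall time in which at least one task stalled on memory); PSI requires kernel ≥ 4.20:

```shell
#!/bin/sh
# psi_check.sh - extract the 10 s "some" memory-pressure average (PSI).
PSI_FILE=/proc/pressure/memory
if [ -r "$PSI_FILE" ]; then
  # Line format: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
  AVG10=$(awk -F'[= ]' '/^some/ {print $3}' "$PSI_FILE")
  echo "memory pressure (some, avg10): ${AVG10}%"
else
  AVG10=0
  echo "PSI not available on this kernel (requires >= 4.20 with PSI enabled)"
fi
```

A sustained avg10 above a few percent is a much earlier warning than swap activity; the "full" line in the same file reports time when all tasks stalled simultaneously.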

I/O problem investigation

# Identify I/O wait source
iostat -xz 1 5
# Find processes generating I/O
iotop -o -b -n 5
# Per‑process I/O details
cat /proc/$(pgrep -f "process_name")/io

Comprehensive Health‑Check Script

#!/bin/bash
# system_health_check.sh – quick snapshot of key metrics

echo "===== System Health Check ====="
echo "Time: $(date '+%Y-%m-%d %H:%M:%S')"

# CPU (mpstat "Average:" line: $3=%usr $5=%sys $6=%iowait $12=%idle)
echo "[CPU]"
mpstat 1 1 | tail -1 | awk '{printf "User: %.1f%% System: %.1f%% IOWait: %.1f%% Idle: %.1f%%\n", $3, $5, $6, $12}'

# Memory
echo "[Memory]"
free -h | awk 'NR==2 {printf "Total: %s Used: %s Available: %s\n", $2, $3, $7}'

# Disk I/O (field positions vary between sysstat versions; verify locally)
echo "[Disk IO]"
iostat -xz 1 1 | awk '/^[sv]d|^nvme/ {printf "%s: util=%.1f%% await=%.1fms\n", $1, $NF, $10}'

# Load
echo "[Load]"
awk '{printf "1min: %s 5min: %s 15min: %s\n", $1, $2, $3}' /proc/loadavg

Backup and Restore of Monitoring Data

Prometheus data backup script

#!/bin/bash
# prometheus_backup.sh – TSDB snapshot and compression
BACKUP_DIR="/backup/prometheus"
DATA_DIR="/var/lib/prometheus/data"
DATE=$(date +%Y%m%d)

# Create snapshot via API (Prometheus must run with --web.enable-admin-api)
curl -X POST http://localhost:9090/api/v1/admin/tsdb/snapshot

# Determine latest snapshot directory
SNAPSHOT=$(ls -t ${DATA_DIR}/snapshots/ | head -1)

# Compress snapshot
tar -czf ${BACKUP_DIR}/prometheus_${DATE}.tar.gz -C ${DATA_DIR}/snapshots ${SNAPSHOT}

# Remove backups older than 7 days
find ${BACKUP_DIR} -name "prometheus_*.tar.gz" -mtime +7 -delete

# Clean up snapshot directory
rm -rf ${DATA_DIR}/snapshots/${SNAPSHOT}

echo "Backup completed: prometheus_${DATE}.tar.gz"

Restore procedure

# 1. Stop Prometheus service
systemctl stop prometheus

# 2. Extract the backup; the archive holds a snapshot directory whose
#    contents must land directly inside the data directory
mkdir -p /tmp/prometheus-restore
tar -xzf /backup/prometheus/prometheus_20260115.tar.gz -C /tmp/prometheus-restore
cp -a /tmp/prometheus-restore/*/. /var/lib/prometheus/data/

# 3. Verify data integrity
promtool tsdb analyze /var/lib/prometheus/data/

# 4. Restart service
systemctl start prometheus

# 5. Confirm Prometheus is serving data
curl -s http://localhost:9090/api/v1/query?query=up | jq .

Conclusion

Technical Takeaways

CPU monitoring must differentiate user, system, iowait and steal time; high iowait signals I/O bottlenecks.

Memory health relies on MemAvailable rather than MemFree; any swap activity indicates pressure.

Disk I/O health is judged by average latency (await) and utilization (%util); NVMe devices require combined IOPS and throughput assessment.

Toolchain hierarchy: quick checks with vmstat, iostat, free; deep analysis with perf or eBPF; production monitoring with Prometheus + Grafana.

Further Learning Paths

eBPF performance analysis – study "BPF Performance Tools" (Brendan Gregg) and experiment with bcc / bpftrace scripts.

Distributed tracing – explore OpenTelemetry, Jaeger or Zipkin to correlate metrics across services.

AIOps – research time‑series anomaly detection algorithms and build models on historical monitoring data for intelligent alerting.

References

Linux Kernel Documentation – procfs

Brendan Gregg’s Linux Performance – authoritative resource

Prometheus Documentation – official guide

Node Exporter GitHub – project homepage

sysstat Documentation – toolset reference
