Master Linux System Performance: Top, htop, vmstat, iostat & Advanced Tuning Secrets
This guide walks through Linux system performance monitoring with top, htop, vmstat, iostat, sar, netstat, ps, and free. It explains the metrics each tool reports, provides shell scripts for real-time analysis and alerting, and covers tuning strategies for CPU, memory, disk, and network.
System Performance Monitoring Overview
Effective performance monitoring is essential for maintaining service stability, enabling preventive maintenance, rapid fault isolation, capacity planning, and cost optimization.
CPU performance: usage, load average, interrupt time, context switches.
Memory performance: usage, cache, swap activity, leak detection.
Disk I/O performance: read/write speed, IOPS, queue length, error rate.
Network performance: bandwidth, latency, packet loss, connection count.
1. top Command Details
1.1 Basic Syntax
top [options]
1.2 Common Options
# Basic usage
top
# Set refresh interval (seconds)
top -d 2
# Set number of iterations
top -n 5
# Batch mode (suitable for scripts)
top -b
# Show processes of a specific user
top -u username
# Show specific PIDs
top -p 1234,5678
1.3 Interface Explanation
System information line shows current time, uptime, number of users and load averages.
Tasks line displays total, running, sleeping, stopped and zombie processes.
CPU line provides percentages for user, system, nice, idle, I/O wait, hardware and software interrupts, and stolen time.
Memory line shows total, free, used, and buffered/cache memory in MiB, plus swap details.
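The load averages on the system information line are easier to judge relative to the CPU count: a 1-minute load of 4 is heavy on 2 cores but light on 16. A minimal sketch, assuming the `/proc/loadavg` layout (the three averages come first) and a hypothetical helper named `load_per_core` that is not part of top:

```shell
#!/bin/bash
# load_per_core: hypothetical helper, not part of top or procps.
# $1 = a line in /proc/loadavg format, $2 = number of CPUs.
# Prints the 1-minute load divided by the CPU count, two decimals.
load_per_core() {
    printf '%s\n' "$1" | LC_ALL=C awk -v n="$2" '{ printf "%.2f\n", $1 / n }'
}

# Fixed sample line; on a live system pass "$(cat /proc/loadavg)"
# and "$(nproc)" instead.
load_per_core "3.00 2.10 1.50 2/345 6789" 4
```

A result that stays well above 1.0 suggests more runnable (or I/O-blocked) tasks than the CPUs can serve; the exact threshold worth alerting on depends on the workload.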
1.4 Field Definitions
CPU: us (user), sy (system), id (idle), wa (I/O wait), hi (hardware interrupt), si (software interrupt), st (stolen).
Memory: total, free, used, buff/cache, avail Mem.
Swap: total, free, used (swpd, si and so are vmstat fields; see section 3.3).
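The CPU fields above can be combined into a single busy percentage by subtracting idle (id) from 100. The sketch below assumes the procps-ng summary layout ("%Cpu(s): ... id, ...") printed in the C locale; `cpu_busy_from_top` is a hypothetical helper name, and the sample line is made up for illustration:

```shell
#!/bin/bash
# cpu_busy_from_top: hypothetical helper. Reads one "%Cpu(s):" summary
# line on stdin (procps-ng layout assumed) and prints 100 minus idle.
cpu_busy_from_top() {
    LC_ALL=C awk -F'[:,]' '{
        for (i = 2; i <= NF; i++) {
            # each field looks like " 93.5 id": split into value and label
            n = split($i, part, " ")
            if (n == 2 && part[2] == "id") printf "%.1f\n", 100 - part[1]
        }
    }'
}

# Fixed sample line; live usage might look like:
#   LC_ALL=C top -b -n 1 | grep 'Cpu(s)' | cpu_busy_from_top
echo '%Cpu(s):  4.0 us,  2.0 sy,  0.0 ni, 93.5 id,  0.5 wa,  0.0 hi,  0.0 si,  0.0 st' |
    cpu_busy_from_top
```

Matching on the `id` label rather than a fixed position keeps the parse working if a top build adds or reorders fields.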
1.5 Interactive Operations
h – help
q – quit
Space – immediate refresh
k – kill process (requires PID)
r – renice process
f – add/remove fields
o – change field order (older top; in procps-ng, f also handles ordering and o adds a filter)
z – toggle colors
Sorting: P (CPU), M (memory), T (time), N (PID)
Filtering: u (user), n (number of processes), i (ignore idle/zombie)
1.6 Advanced Usage
Batch mode for scripting:
# Output to file
top -b -n 1 > system_status.txt
# Show specific process
top -b -n 1 -p 1234
# Example monitoring script
#!/bin/bash
while true; do
    echo "=== $(date) ===" >> monitor.log
    top -b -n 1 | head -20 >> monitor.log
    sleep 60
done
2. htop Command Details
2.1 Installation
# Ubuntu/Debian
sudo apt-get install htop
# CentOS/RHEL
sudo yum install htop
# Or
sudo dnf install htop
# Build from source
wget https://github.com/htop-dev/htop/archive/main.tar.gz
tar -xzf main.tar.gz
cd htop-main
./autogen.sh && ./configure && make && sudo make install
2.2 Features
Colorful, intuitive display.
Mouse support.
Process tree view.
Real‑time CPU and memory bar graphs.
Horizontal scrolling for full command lines.
2.3 Interactive Keys
# Basic keys
F1 – help
F2 – setup
F3 – search
F4 – filter
F5 – toggle tree view
F6 – sort
F7 – decrease nice (higher priority)
F8 – increase nice (lower priority)
F9 – kill process
F10 – quit
# Advanced keys
Space – mark process
U – unmark all
c – mark process and children
K – hide kernel threads
H – hide user threads
p – toggle full path display
3. vmstat Command Details
3.1 Basic Syntax
vmstat [options] [interval] [count]
3.2 Common Options
# Show average since boot
vmstat
# Every 2 seconds, 5 times
vmstat 2 5
# Show active/inactive memory
vmstat -a
# Show disk stats
vmstat -d
# Show statistics for one partition
vmstat -p /dev/sda1
# Show event counters and memory statistics
vmstat -s
# Show slab allocator statistics (usually requires root)
vmstat -m
3.3 Output Fields
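procs: r (runnable processes, running or waiting for CPU), b (processes blocked in uninterruptible sleep, usually on I/O).
memory: swpd (swap in use), free, buff, cache.
swap: si (memory swapped in from disk per second), so (memory swapped out to disk per second).
io: bi (blocks received from block devices per second), bo (blocks sent to block devices per second).
system: in (interrupts per second), cs (context switches per second).
cpu: us, sy, id, wa, st (percentages of total CPU time).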
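As a rough illustration of how these fields are read together, the sketch below scans vmstat data lines for three common warning signs: a run queue longer than the CPU count, any swap-in or swap-out activity, and high I/O wait. The helper name `vmstat_flags`, the 20% I/O-wait cutoff, and the sample numbers are assumptions for the example; column positions follow the classic layout shown in the comment.

```shell
#!/bin/bash
# vmstat_flags: hypothetical helper. Reads vmstat output on stdin and
# prints a note for each data line that shows a common warning sign.
# $1 = number of CPUs (used to judge the run queue).
# Column positions assume the classic layout:
#   r b swpd free buff cache si so bi bo in cs us sy id wa st
vmstat_flags() {
    awk -v cpus="$1" '
        $1 ~ /^[0-9]+$/ {                 # skips the two header lines
            if ($1 > cpus)   print "run queue " $1 " exceeds " cpus " CPUs"
            if ($7 + $8 > 0) print "swapping: si=" $7 " so=" $8
            if ($16 > 20)    print "high I/O wait: " $16 "%"   # 20% is an assumed cutoff
        }'
}

# Fixed sample line; live usage might look like:
#   vmstat 1 5 | vmstat_flags "$(nproc)"
printf '%s\n' \
    ' 6  1   2048 102400  20480 409600   12    8   300   150  900 1500 40 10 25 25  0' |
    vmstat_flags 4
```

Sustained non-zero si/so is usually a stronger signal of memory pressure than a large swpd value, which can simply reflect pages swapped out long ago.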
3.4 Practical Scripts
Performance monitoring script:
#!/bin/bash
echo "=== System Performance Report $(date) ==="
# Basic info
vmstat 1 1
# Memory details
vmstat -s | grep -E "(total|used|free|buffer|cache|swap)"
# Continuous monitoring (5 min)
log_file="vmstat_$(date +%Y%m%d_%H%M).log"  # compute the name once so both lines agree
vmstat 5 60 > "$log_file"
echo "Data saved to $log_file"
Alert script (CPU, memory, swap thresholds):
#!/bin/bash
CPU_THRESHOLD=80
MEMORY_THRESHOLD=85
SWAP_THRESHOLD=10
# CPU usage (100 minus the idle column of the second sample)
cpu_usage=$(vmstat 1 2 | tail -1 | awk '{print 100-$15}')
if (( $(echo "$cpu_usage > $CPU_THRESHOLD" | bc -l) )); then
    echo "Warning: CPU usage high – $cpu_usage%"
fi
# Memory usage
mem_total=$(free | awk '/Mem/ {print $2}')
mem_used=$(free | awk '/Mem/ {print $3}')
mem_usage=$((mem_used*100/mem_total))
if [ "$mem_usage" -gt "$MEMORY_THRESHOLD" ]; then
    echo "Warning: Memory usage high – $mem_usage%"
fi
# Swap usage (vmstat -s reports KiB, so SWAP_THRESHOLD is compared in KiB)
swap_used=$(vmstat -s | awk '/used swap/ {print $1}')
if [ "$swap_used" -gt "$SWAP_THRESHOLD" ]; then
    echo "Warning: Swap usage abnormal – ${swap_used}KB"
fi
4. iostat Command Details
4.1 Installation
# Ubuntu/Debian
sudo apt-get install sysstat
# CentOS/RHEL
sudo yum install sysstat
# Or
sudo dnf install sysstat
4.2 Basic Syntax
iostat [options] [interval] [count]
4.3 Common Options
# Basic display
iostat
# Extended stats
iostat -x
# CPU stats only
iostat -c
# Disk stats only
iostat -d
# Network FS stats (older sysstat releases only; later versions dropped -n in favour of nfsiostat)
iostat -n
# Human‑readable
iostat -h
# Every 2 seconds, 5 times
iostat -x 2 5
4.4 Output Interpretation
CPU statistics (percentages): user, nice, system, iowait, steal, idle.
Disk statistics fields include r/s, w/s, rkB/s, wkB/s, rrqm/s, wrqm/s, %rrqm, %wrqm, r_await, w_await, aqu-sz, rareq-sz, wareq-sz, svctm, %util. The exact set depends on the sysstat version; svctm in particular is absent from recent releases.
4.5 Key Metrics
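IOPS: r/s + w/s (most relevant for random I/O).
Throughput: rkB/s and wkB/s (most relevant for sequential I/O).
Response time: r_await and w_await, which include time spent queued. svctm is unreliable and missing from recent sysstat releases, so judge await against the latency the device is expected to deliver.
Utilization: %util (close to 100% indicates saturation on a device that serves one request at a time; SSDs and RAID arrays that serve requests in parallel can take more work even at 100%).
Queue length: aqu-sz (a persistently high value indicates a backlog).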
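The metrics above can be pulled out of `iostat -dx` output with a small parser. Because column positions differ between sysstat versions, this sketch looks columns up by header name rather than by number. `iostat_iops` is a hypothetical helper, and the sample report is shortened and made up for illustration:

```shell
#!/bin/bash
# iostat_iops: hypothetical helper. Reads one `iostat -dx` device report
# on stdin and prints "device IOPS %util" for each device row.
iostat_iops() {
    LC_ALL=C awk '
        /^Device/ {
            for (i = 1; i <= NF; i++) col[$i] = i   # header name -> column index
            next
        }
        NF && ("r/s" in col) {
            printf "%s %.1f %s\n", $1, $(col["r/s"]) + $(col["w/s"]), $(col["%util"])
        }'
}

# Fixed sample in an assumed, shortened layout; live usage might look like:
#   iostat -dx 1 2 | awk '/^Device/ {n++} n == 2' | iostat_iops
iostat_iops <<'EOF'
Device            r/s     w/s     rkB/s     wkB/s  aqu-sz  %util
sda            120.00   80.50   4800.00   3220.00    1.20  45.30
sdb              0.00    2.00      0.00     16.00    0.00   0.40
EOF
```

The live-usage pipeline keeps only the second report because the first one iostat prints is an average since boot.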
4.6 Example Disk Monitoring Script
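#!/bin/bash
LOG_FILE="/var/log/disk_performance.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$DATE] Disk performance monitoring start" >> "$LOG_FILE"
# Take two samples and keep the second: the first report is the average since boot.
# -d omits the CPU section so its values cannot be mistaken for a device row.
report=$(iostat -dx 1 2 | awk '/^Device/ {n++} n == 2')
# Check utilization
echo "$report" | grep -E "(Device|sda|sdb)" >> "$LOG_FILE"
# High-util warning (%util is the last column)
high_util=$(echo "$report" | awk 'NR > 1 && NF && $NF + 0 > 80 {print $1, $NF}')
if [ -n "$high_util" ]; then
    echo "[$DATE] Warning: High-util disks:" >> "$LOG_FILE"
    echo "$high_util" >> "$LOG_FILE"
fi
# High latency warning (over 20 ms); the await columns are located by header
# name because their positions differ between sysstat versions
high_latency=$(echo "$report" | awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i ~ /await$/) col[i] = $i; next }
    NF { for (i in col) if ($i + 0 > 20) print $1, col[i], $i }')
if [ -n "$high_latency" ]; then
    echo "[$DATE] Warning: High latency disks:" >> "$LOG_FILE"
    echo "$high_latency" >> "$LOG_FILE"
fi
echo "[$DATE] Disk performance monitoring end" >> "$LOG_FILE"
5. Other Important Monitoring Tools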
5.1 sar (System Activity Reporter)
# CPU usage
sar -u 2 5
# Memory usage
sar -r 2 5
# Disk I/O
sar -d 2 5
# Network statistics
sar -n DEV 2 5
# Load average
sar -q 2 5
# Swap usage
sar -S 2 5
Historical data can be viewed with sar -u -f /var/log/sysstat/sa15 (the data directory is typically /var/log/sa on RHEL-family systems), optionally limited to a time range with -s and -e.
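A time-range query against a daily data file could be assembled as in the sketch below. `sa_file` is a hypothetical helper; the default directory is the Debian/Ubuntu path used above, and the command is only printed so the example has no side effects.

```shell
#!/bin/bash
# sa_file: hypothetical helper. Prints the daily sysstat data file for a
# day of the month. $1 = day (1-31), $2 = data directory (optional).
# /var/log/sysstat is the Debian/Ubuntu location; RHEL-family systems
# typically use /var/log/sa.
sa_file() {
    # ${1#0} strips one leading zero so "08" is not parsed as octal
    printf '%s/sa%02d\n' "${2:-/var/log/sysstat}" "${1#0}"
}

# CPU history for the 15th between 09:00 and 18:00; -s and -e bound the range.
# The command is only printed here; drop the echo to run it.
echo sar -u -f "$(sa_file 15)" -s 09:00:00 -e 18:00:00
```

The same `-f`, `-s` and `-e` options apply to the other report flags listed above, such as `-r` or `-q`.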
5.2 netstat
# All connections
netstat -a
# Listening ports
netstat -l
# TCP only
netstat -t
# UDP only
netstat -u
# Show process IDs
netstat -p
# Numeric output
netstat -n
# Combined view
netstat -tulnp
5.3 ps
# Current terminal processes
ps
# All processes
ps aux
# Process tree
ps auxf
# User‑specific
ps -u username
# Specific PID
ps -p 1234
# Full format
ps -ef
5.4 free
# Basic display
free
# Human‑readable
free -h
# MiB units
free -m
# Continuous monitoring every 2 s
free -s 2
# Wide output (separate buffers and cache columns)
free -w
6. Advanced Monitoring Tools
6.1 atop
# Ubuntu/Debian
sudo apt-get install atop
# CentOS/RHEL
sudo yum install atop
# Run interactively
atop
# Interval of 2 s
atop 2
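# Start in a specific view
atop -d # disk-related process view
atop -m # memory-related process view
atop -n # network-related process view (requires the netatop module)
atop -p # activity accumulated per program name
Historical logs can be read with atop -r /var/log/atop/atop_20231215 -b 09:00 -e 18:00 and exported via atop -P CPU,MEM,DSK -r ... > analysis.txt.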
6.2 nmon
# Install
sudo apt-get install nmon # Debian/Ubuntu
sudo yum install nmon # CentOS/RHEL
# Run interactively
nmon
# Quick keys: c (CPU), m (memory), d (disk I/O), n (network), t (processes), r (resources), q (quit)
Data collection example (30 s interval, 120 samples):
nmon -f -s 30 -c 120 # saves to *.nmon file
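# Convert the capture to RRD graphs (nmon2rrd is a separate tool; the .nmon file itself is comma-separated text that spreadsheets can open)
nmon2rrd filename.nmon
6.3 dstat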
# Install
sudo apt-get install dstat # Debian/Ubuntu
sudo yum install dstat # CentOS/RHEL
# Basic display
dstat
# CPU, memory, disk, network
dstat -cdmn
# Top CPU and memory consuming processes
dstat --top-cpu --top-mem
# Specific disks or interfaces
dstat -d -D sda,sdb
dstat -n -N eth0,eth1
# Custom interval (5 s, 12 times)
dstat 5 12
7. Performance Tuning Strategies
7.1 CPU Tuning
# View CPU info
lscpu
cat /proc/cpuinfo
# Current usage
top -b -n 1 | grep "Cpu(s)"
vmstat 1 5
# Set CPU affinity
taskset -c 0,1 command
taskset -p 0x3 PID
# Adjust nice value
nice -n 10 command
renice -n 5 -p PID
# Enable performance governor (run as root; this sets cpu0 only, repeat for each CPU)
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
7.2 Load Balancing
# Current load
uptime
cat /proc/loadavg
# Diagnose high load: CPU‑bound, I/O‑wait, too many processes
# Optimisation suggestions:
# • Distribute CPU‑intensive jobs
# • Optimise I/O (use faster disks, tune scheduler)
# • Limit concurrent processes
7.3 Memory Tuning
# Detailed memory info
cat /proc/meminfo
free -h
vmstat -s
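# Drop caches (1 = pagecache, 2 = dentries/inodes, 3 = both); pick one value.
# Run sync first so dirty pages are written out. Useful for benchmarking rather than routine tuning.
sync
echo 1 > /proc/sys/vm/drop_caches
echo 2 > /proc/sys/vm/drop_caches
echo 3 > /proc/sys/vm/drop_caches
# List swap areas, then re-initialise and re-enable a swap file
swapon -s
swapoff /swapfile
mkswap /swapfile
swapon /swapfile
# Reduce swappiness (default 60)
echo 10 > /proc/sys/vm/swappiness
# Overcommit control (0 = heuristic, 1 = always allow, 2 = strict accounting)
echo 1 > /proc/sys/vm/overcommit_memory
7.4 Memory Leak Detection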
# Monitor top memory consumers
ps aux --sort=-%mem | head -10
# Inspect a process
pmap -x PID
# Use valgrind for native binaries
valgrind --tool=memcheck --leak-check=full ./program
# Live monitoring
watch -n 1 'cat /proc/meminfo | grep -E "(MemTotal|MemFree|MemAvailable|Buffers|Cached)"'
7.5 Disk I/O Tuning
# Disk layout
lsblk
fdisk -l
df -h
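# Change I/O scheduler (deadline, noop, cfq on older kernels; mq-deadline, none, bfq, kyber on multi-queue kernels)
# cat /sys/block/sda/queue/scheduler lists the schedulers available on this system
echo deadline > /sys/block/sda/queue/scheduler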
# Adjust I/O priority
ionice -c 1 -n 4 command # real‑time
ionice -c 2 -n 7 command # best‑effort
ionice -c 3 command # idle
# Filesystem mount options
mount -o noatime,nodiratime /dev/sda1 /mnt
# ext4 writeback mode
tune2fs -o journal_data_writeback /dev/sda1
# XFS noatime, nobarrier (nobarrier was removed from XFS in kernel 4.19; omit it on newer kernels)
mount -o noatime,nodiratime,nobarrier /dev/sda1 /mnt
7.6 RAID Optimisation
# RAID status
cat /proc/mdstat
mdadm --detail /dev/md0
# Create RAID0 with a 64 KiB chunk size
mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=64 /dev/sda1 /dev/sdb1
# Adjust read-ahead (value is in 512-byte sectors, so 8192 = 4 MiB)
blockdev --setra 8192 /dev/sda
7.7 Network Tuning
# Interface status
ip addr show
ethtool eth0
# Increase socket buffers
echo 'net.core.rmem_max = 16777216' >> /etc/sysctl.conf
echo 'net.core.wmem_max = 16777216' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_rmem = 4096 87380 16777216' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem = 4096 65536 16777216' >> /etc/sysctl.conf
sysctl -p
# Enable TCP window scaling and select the congestion control algorithm (both values are already the defaults on most current kernels)
echo 'net.ipv4.tcp_window_scaling = 1' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_congestion_control = cubic' >> /etc/sysctl.conf
sysctl -p
7.8 Sample Monitoring Scripts
Comprehensive system performance script (generates a timestamped report with CPU, memory, disk, network and process snapshots):
#!/bin/bash
LOG_DIR="/var/log/performance"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p $LOG_DIR
REPORT=$LOG_DIR/report_$DATE.txt
echo "=== System Performance Report $(date) ===" > $REPORT
# CPU
echo "1. CPU Usage:" >> $REPORT
top -b -n 1 | head -5 >> $REPORT
vmstat 1 3 >> $REPORT
# Memory
echo "2. Memory Usage:" >> $REPORT
free -h >> $REPORT
vmstat -s | grep -E "(total|used|free|buffer|cache)" >> $REPORT
# Disk
echo "3. Disk Usage:" >> $REPORT
df -h >> $REPORT
iostat -x 1 3 >> $REPORT
# Network
echo "4. Network Status:" >> $REPORT
netstat -i >> $REPORT
ss -tuln >> $REPORT
# Processes
echo "5. Top Processes:" >> $REPORT
ps aux --sort=-%cpu | head -10 >> $REPORT
ps aux --sort=-%mem | head -10 >> $REPORT
echo "Report saved to $REPORT"
Performance alert script (email notifications for CPU, memory, disk and load thresholds):
#!/bin/bash
CPU_THRESHOLD=80
MEMORY_THRESHOLD=85
DISK_THRESHOLD=85
LOAD_THRESHOLD=5.0
MAIL_TO="admin@example.com" # placeholder recipient; replace with a real address
HOSTNAME=$(hostname)
# CPU: 100 minus idle from the second vmstat sample (the first is the average since boot)
cpu=$(vmstat 1 2 | tail -1 | awk '{print 100-$15}')
if (( $(echo "$cpu > $CPU_THRESHOLD" | bc -l) )); then
    echo "Warning: $HOSTNAME CPU high $cpu%" | mail -s "CPU Alert" "$MAIL_TO"
fi
# Memory
mem=$(free | awk '/Mem/ {printf "%.1f", $3*100/$2}')
if (( $(echo "$mem > $MEMORY_THRESHOLD" | bc -l) )); then
    echo "Warning: $HOSTNAME Memory high $mem%" | mail -s "Memory Alert" "$MAIL_TO"
fi
# Disk (root partition); -P keeps each filesystem on one line for parsing
usage=$(df -P / | awk 'NR==2 {gsub(/%/,"",$5); print $5}')
if [ "$usage" -gt "$DISK_THRESHOLD" ]; then
    echo "Warning: $HOSTNAME Disk usage $usage%" | mail -s "Disk Alert" "$MAIL_TO"
fi
# Load average (1-minute)
load=$(awk '{print $1}' /proc/loadavg)
if (( $(echo "$load > $LOAD_THRESHOLD" | bc -l) )); then
    echo "Warning: $HOSTNAME Load average $load" | mail -s "Load Alert" "$MAIL_TO"
fi
8. Conclusion
Linux system performance monitoring and tuning are core skills for operations engineers. By mastering tools such as top, htop, vmstat, iostat, sar, netstat, ps and free, you can quickly identify bottlenecks, implement preventive maintenance, and perform targeted optimisations across CPU, memory, disk and network layers.
Key Takeaways
Understand and interpret core metrics for CPU, memory, I/O and network.
Use shell scripts to automate data collection, reporting and alerting.
Apply tuning techniques: CPU affinity, scheduler selection, memory swappiness, filesystem mount options, I/O priority and TCP stack parameters.
Build a continuous (24×7) monitoring and alerting pipeline and document all changes.
Further Learning
Deep dive into Linux kernel internals and system calls.
Explore enterprise‑grade monitoring platforms like Prometheus, Zabbix or Nagios.
Automate remediation with configuration management tools (Ansible, Chef, Puppet).
Study advanced storage technologies (SSD, NVMe, RAID levels) and network optimisation (TCP tuning, offloading).
Stay updated with the latest open‑source monitoring utilities and best‑practice guides.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
