How to Supercharge Linux Performance: Diagnose CPU, Memory & Disk I/O Like a Pro
This comprehensive guide walks you through Linux system performance tuning—from CPU, memory, and disk I/O diagnostics to practical optimization techniques, key metric interpretation, real‑world case studies, monitoring scripts, and advanced tips—helping you boost performance by 30‑50%.
Linux System Performance Tuning: Full Diagnosis from CPU, Memory to Disk I/O
Key Points Preview: This article traces Linux performance bottlenecks to their root causes and gives actionable tuning steps; the author reports gains of 30‑50% from applying them.
Core Thinking of Performance Tuning
Many ops engineers fall into the “treat the symptom” trap. Real tuning requires systematic thinking:
Performance Tuning Pyramid Model:
Top : Business metrics (response time, throughput)
Middle : System resources (CPU, memory, disk, network)
Bottom : Kernel parameters & hardware characteristics
CPU Performance Diagnosis and Tuning
1. The truth about CPU utilization
# Multi‑dimensional CPU usage observation
top -p $(pgrep -d ',' your_process_name)
htop
sar -u 1 10
# Deep analysis of CPU wait time
iostat -x 1
vmstat 1
Key Metric Interpretation:
%us : User‑space CPU usage, >70% needs attention
%sy : System‑space CPU usage, >30% may indicate a kernel bottleneck
%wa : I/O wait time, >10% signals a storage bottleneck
%id : Idle time, <10% means the system is near full load
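The thresholds above can be turned into a quick check. The following is a minimal sketch, not a definitive tool: `classify_cpu` is a hypothetical helper name, and the sample values are made up for illustration.

```shell
#!/bin/sh
# Classify one CPU sample against the thresholds listed above.
# Arguments: user% system% iowait% idle% (integers or decimals).
# classify_cpu is an illustrative name, not a standard utility.
classify_cpu() {
    awk -v us="$1" -v sy="$2" -v wa="$3" -v id="$4" 'BEGIN {
        n = 0
        if (us > 70) { print "us high: check user-space hot spots"; n++ }
        if (sy > 30) { print "sy high: possible kernel bottleneck"; n++ }
        if (wa > 10) { print "wa high: possible storage bottleneck"; n++ }
        if (id < 10) { print "id low: system near full load"; n++ }
        if (n == 0)  { print "ok" }
    }'
}

# Made-up sample values, not real measurements.
classify_cpu 45 12 3 40
classify_cpu 20 5 35 40
```

On a live system the four arguments would come from the us, sy, wa and id columns of `vmstat` or from `sar -u`; column positions vary between tool versions, so check the header line before wiring this up.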
2. CPU binding optimization tricks
# View CPU topology
lscpu
grep "physical id" /proc/cpuinfo | sort -u | wc -l
# Process CPU binding (avoid cache invalidation)
taskset -cp 0-3 PID
numactl --cpubind=0 --membind=0 your_command
# Interrupt binding optimization
echo 2 > /proc/irq/24/smp_affinity
Practical case: An e‑commerce system reduced latency by 35% after CPU binding.
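The value written to `smp_affinity` is a hexadecimal CPU bitmask, which is why `2` above selects CPU 1. As a rough sketch of the conversion, assuming CPU numbers below 31 and a hypothetical helper name `cpus_to_mask`:

```shell
#!/bin/sh
# Convert a CPU list such as "0-3" or "1,4-5" into the hexadecimal
# bitmask format used by /proc/irq/<N>/smp_affinity.
# cpus_to_mask is an illustrative name; the sketch assumes CPUs below 31.
cpus_to_mask() {
    echo "$1" | awk -F, '{
        for (i = 1; i <= NF; i++) {
            n = split($i, r, "-")
            lo = r[1] + 0
            hi = (n == 2) ? r[2] + 0 : lo
            for (c = lo; c <= hi; c++) seen[c] = 1   # de-duplicate CPUs
        }
        mask = 0
        for (c in seen) mask += 2 ^ c                # set bit c
        printf "%x\n", mask
    }'
}

cpus_to_mask 1      # CPU 1 only
cpus_to_mask 0-3    # CPUs 0 through 3
```

The same list syntax is what `taskset -c` accepts, so the helper is only needed where the kernel expects the raw mask.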
3. Context switch optimization
# Monitor context switches
# Columns 11 and 12 of vmstat are "in" (interrupts/s) and "cs" (context switches/s)
vmstat 1 | awk '{print $11,$12}'
cat /proc/interrupts
pidstat -w 1
# Optimization strategies
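# Note: on kernels 5.13 and later, sched_migration_cost_ns is exposed under
# /sys/kernel/debug/sched/ instead of sysctl
echo 'kernel.sched_migration_cost_ns = 5000000' >> /etc/sysctl.conf
echo 'kernel.sched_autogroup_enabled = 0' >> /etc/sysctl.conf
Memory Management Deep Optimization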
1. Memory usage pattern analysis
# Detailed memory analysis
free -h
cat /proc/meminfo
smem -t -k
# Process memory consumption inspection
ps aux --sort=-%mem | head -20
pmap -d PID
cat /proc/PID/smaps
Memory Optimization Golden Rules:
Available memory < 20% of total → needs optimization
Swap usage > 10% → memory shortage signal
Cache hit rate < 95% → may need cache strategy adjustment
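The first two rules can be checked directly from `/proc/meminfo`. This is a minimal sketch under stated assumptions: `check_memory` is a hypothetical helper, the sample figures are illustrative, and the cache hit rate rule is left out because it needs application-level counters.

```shell
#!/bin/sh
# Evaluate the available-memory and swap rules from meminfo-formatted
# text on stdin. check_memory is an illustrative name.
check_memory() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { stotal = $2 }
        $1 == "SwapFree:"     { sfree = $2 }
        END {
            if (total == 0) { print "no MemTotal found"; exit 1 }
            avail_pct = avail * 100 / total
            swap_pct = (stotal > 0) ? (stotal - sfree) * 100 / stotal : 0
            printf "available=%.1f%% swap_used=%.1f%%\n", avail_pct, swap_pct
            if (avail_pct < 20) print "WARN: available memory below 20% of total"
            if (swap_pct > 10)  print "WARN: swap usage above 10%"
        }'
}

# Illustrative values in kB, not real measurements.
check_memory <<'EOF'
MemTotal:       16000000 kB
MemAvailable:    2400000 kB
SwapTotal:       4000000 kB
SwapFree:        3000000 kB
EOF
```

On a real host the call would be `check_memory < /proc/meminfo`.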
2. Swap optimization strategies
# Swap usage monitoring
swapon -s
cat /proc/swaps
# Smart swap tuning
echo 'vm.swappiness = 10' >> /etc/sysctl.conf
echo 'vm.vfs_cache_pressure = 50' >> /etc/sysctl.conf
echo 'vm.dirty_ratio = 15' >> /etc/sysctl.conf
echo 'vm.dirty_background_ratio = 5' >> /etc/sysctl.conf
3. Huge page memory optimization
# Configure transparent huge pages
echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
echo defer+madvise > /sys/kernel/mm/transparent_hugepage/defrag
# Static huge page configuration
echo 1024 > /proc/sys/vm/nr_hugepages
echo 'vm.nr_hugepages = 1024' >> /etc/sysctl.conf
Performance boost: In database scenarios, huge pages can improve performance by 15‑25%.
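The `1024` used for `nr_hugepages` is not a universal value: it is the size of the memory region to back divided by the huge page size. A small sketch of that arithmetic, with `hugepages_needed` as a hypothetical helper name and a 2 GB region assumed for the example:

```shell
#!/bin/sh
# Number of static huge pages needed to back a region of the given size.
# Arguments: region size in MB, huge page size in kB.
# hugepages_needed is an illustrative name.
hugepages_needed() {
    awk -v mb="$1" -v page_kb="$2" 'BEGIN {
        kb = mb * 1024
        pages = int((kb + page_kb - 1) / page_kb)   # round up to whole pages
        print pages
    }'
}

# Assumed example: a 2048 MB region with 2048 kB (2 MB) huge pages.
hugepages_needed 2048 2048
```

The page size for the running system is reported on the `Hugepagesize` line of `/proc/meminfo`.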
Disk I/O Performance Ultimate Optimization
1. I/O performance deep diagnosis
# I/O performance monitoring toolkit
iostat -x 1
iotop -o
dstat -d
blktrace /dev/sda
# Disk queue depth analysis
cat /sys/block/sda/queue/nr_requests
echo 256 > /sys/block/sda/queue/nr_requests
Key I/O Metrics:
%util : Disk utilization, >80% needs optimization
await : Average wait time, SSD < 10ms, HDD < 20ms
svctm : Service time, should match actual disk access time (reported only by older sysstat releases; the iostat manual has long warned that it is unreliable)
r/s, w/s : Read/write IOPS, must meet business requirements
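As a rough illustration of how the `%util` and `await` limits above combine, here is a sketch that classifies one device sample. `check_io` is a hypothetical helper and the sample numbers are invented.

```shell
#!/bin/sh
# Classify one iostat sample against the limits listed above.
# Arguments: device type (ssd or hdd), %util, await in ms.
# check_io is an illustrative name.
check_io() {
    awk -v type="$1" -v util="$2" -v await="$3" 'BEGIN {
        limit = (type == "ssd") ? 10 : 20      # await ceiling in ms
        status = "ok"
        if (util > 80) status = "util high"
        if (await >= limit) {
            status = (status == "ok") ? "await high" : (status ", await high")
        }
        print status
    }'
}

# Invented sample values, not real measurements.
check_io ssd 35 2.5
check_io hdd 92 31
```

Recent sysstat versions split `await` into `r_await` and `w_await`, so the value passed in may need to be the larger of the two.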
2. File system tuning
# ext4 tuning
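# barrier=0 disables write barriers; without a battery‑backed write cache a power loss can corrupt data
mount -o noatime,nodiratime,barrier=0 /dev/sda1 /data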
tune2fs -o journal_data_writeback /dev/sda1
# XFS tuning
mount -o noatime,nodiratime,logbufs=8,logbsize=256k /dev/sda1 /data
xfs_info /data
3. I/O scheduler optimization
# View current I/O scheduler
cat /sys/block/sda/queue/scheduler
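# SSD: use noop or deadline (on multi-queue kernels, 5.0 and later, the names are none and mq-deadline)
echo noop > /sys/block/sda/queue/scheduler
# HDD: use cfq (bfq is the closest replacement on multi-queue kernels)
echo cfq > /sys/block/sda/queue/scheduler
# Persist setting
echo 'echo noop > /sys/block/sda/queue/scheduler' >> /etc/rc.local
System‑Level Performance Tuning Practice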
1. Kernel parameter ultimate configuration
# Network optimization
echo 'net.core.rmem_max = 16777216' >> /etc/sysctl.conf
echo 'net.core.wmem_max = 16777216' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_rmem = 4096 87380 16777216' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem = 4096 65536 16777216' >> /etc/sysctl.conf
# File descriptor optimization
echo 'fs.file-max = 1000000' >> /etc/sysctl.conf
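# ulimit affects only the current shell; persist the limit in /etc/security/limits.conf
ulimit -n 1000000
# Process scheduling optimization (on kernels 5.13 and later these tunables moved to /sys/kernel/debug/sched/)
echo 'kernel.sched_min_granularity_ns = 2000000' >> /etc/sysctl.conf
echo 'kernel.sched_wakeup_granularity_ns = 3000000' >> /etc/sysctl.conf
2. Performance monitoring script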
#!/bin/bash
# One‑click performance monitor
while true; do
echo "=== $(date) ==="
echo "CPU: $(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)%"
echo "MEM: $(free | grep Mem | awk '{printf "%.2f%%", $3/$2 * 100.0}')"
echo "DISK: $(iostat -x 1 1 | grep -v '^$' | tail -n +4 | awk '{print $1,$10}' | head -5)"
echo "LOAD: $(uptime | awk -F'load average:' '{print $2}')"
echo "---"
sleep 5
done
Performance Tuning Effect Quantification
Real‑world case analysis
Case 1: E‑commerce system
Before: response time 2.5s, CPU usage 85%
After: response time 0.8s, CPU usage 45%
Performance improvement: response time down 68%, CPU usage down 47%
Case 2: Database server
Before: QPS 1200, memory usage 90%
After: QPS 2100, memory usage 65%
Performance improvement: QPS up 75%, memory efficiency up 38%
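The percentages in these cases are relative changes, (after - before) / before. A minimal sketch for reproducing them from the before and after figures, with `pct_change` as a hypothetical helper name:

```shell
#!/bin/sh
# Relative change between two measurements, as a signed whole percentage.
# pct_change is an illustrative name.
pct_change() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%+.0f%%\n", (after - before) * 100 / before
    }'
}

pct_change 2.5 0.8     # Case 1 response time in seconds
pct_change 1200 2100   # Case 2 QPS
```

A negative result is an improvement for latency and resource usage, while a positive result is an improvement for throughput, so the sign needs to be read against the metric.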
Performance baseline establishment
#!/bin/bash
# Baseline script
LOGFILE="/var/log/performance_baseline.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')
{
echo "[$DATE] Performance Baseline Check"
echo "CPU: $(grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$3+$4+$5)} END {print usage "%"}')"
echo "Memory: $(free | grep Mem | awk '{printf "Used: %.1f%% Available: %.1fGB", $3*100/$2, $7/1024/1024}')"
echo "Disk I/O: $(iostat -x 1 1 | awk '/^[a-z]/ {print $1":"$10"ms"}' | head -3)"
echo "Load Average: $(uptime | awk -F'load average:' '{print $2}')"
echo "Network: $(sar -n DEV 1 1 | grep Average | grep -v lo | awk '{print $2":"$5"KB/s in, "$6"KB/s out"}' | head -2)"
echo "=================================="
} >> "$LOGFILE"
Advanced Tuning Techniques
1. NUMA architecture optimization
# View NUMA info
numactl --hardware
numastat
cat /proc/buddyinfo
# NUMA binding strategy
numactl --cpubind=0 --membind=0 your_application
echo 1 > /proc/sys/kernel/numa_balancing
2. Container environment performance optimization
# Docker resource limits
docker run --cpus="2.0" --memory="4g" --memory-swap="4g" your_app
# cgroup tuning (cgroup v1 paths; cgroup v2 uses cpu.weight and cpu.max instead)
echo 1024 > /sys/fs/cgroup/cpu/docker/cpu.shares
echo 50000 > /sys/fs/cgroup/cpu/docker/cpu.cfs_quota_us
3. Real‑time system tuning
# Real‑time kernel configuration
echo 'kernel.sched_rt_runtime_us = 950000' >> /etc/sysctl.conf
echo 'kernel.sched_rt_period_us = 1000000' >> /etc/sysctl.conf
# Process priority adjustment
chrt -f -p 99 PID
nice -n -20 your_critical_process
Performance Tuning Best Practices
1. Progressive optimization strategy
Establish performance baseline : record pre‑optimization metrics
Single‑point breakthrough : adjust one parameter at a time
Effect verification : thoroughly test the impact
Rollback preparation : keep original configuration
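The "one parameter at a time" and "rollback preparation" steps can be sketched as a pair of small helpers. Everything here is an assumption for illustration: `apply_one`, `rollback` and the `CONF` variable are hypothetical names, and the example edits a scratch file rather than `/etc/sysctl.conf`.

```shell
#!/bin/sh
# Hypothetical helpers: apply_one and rollback are illustrative names.
# CONF is the sysctl-style file being edited; the example below uses a
# scratch file so nothing under /etc is touched.

apply_one() {
    cp -p "$CONF" "$CONF.bak"                 # rollback preparation
    printf '%s = %s\n' "$1" "$2" >> "$CONF"   # one parameter at a time
}

rollback() {
    cp -p "$CONF.bak" "$CONF"                 # restore the saved copy
}

CONF=$(mktemp)
printf 'vm.swappiness = 60\n' > "$CONF"
apply_one vm.swappiness 10
cat "$CONF"       # original line plus the new setting
rollback
cat "$CONF"       # back to the original content
```

Effect verification sits between `apply_one` and the decision to call `rollback`; what to measure there depends on the baseline metrics recorded in the first step.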
2. Monitoring and alert system
# Critical metric thresholds
CPU_THRESHOLD=80
MEM_THRESHOLD=85
DISK_THRESHOLD=90
LOAD_THRESHOLD=5.0
# Auto‑alert script
if [ "$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1 | cut -d'.' -f1)" -gt "$CPU_THRESHOLD" ]; then
echo "CPU usage exceeds threshold" | mail -s "Performance Alert" [email protected]
fi
3. Performance tuning checklist
Basic checks:
System load < CPU core count
Memory usage < 80%
Disk I/O wait < 20ms
Network connections within reasonable range
Advanced checks:
CPU cache hit rate optimization
NUMA affinity configuration
Interrupt load balancing
Kernel parameter validation
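The first basic check, system load below CPU core count, involves a decimal load average, which shell integer tests such as `-lt` cannot compare. One possible sketch delegates the comparison to awk; `load_ok` is a hypothetical helper name and the numbers are illustrative.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the load average is below the core count.
# awk does the comparison because shell test operators such as -lt
# only handle integers. load_ok is an illustrative name.
load_ok() {
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load < cores) }'
}

# Illustrative values: a 1-minute load of 3.2 on a 4-core machine.
if load_ok 3.2 4; then
    echo "load ok"
else
    echo "load too high"
fi
```

On a live system the two arguments could be taken from the first field of `/proc/loadavg` and from `nproc`.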
Conclusion and Outlook
🚀 Performance can improve 30‑50% with scientific tuning.
🎯 Precise bottleneck identification using multi‑dimensional diagnostics.
🛠️ All techniques validated in production environments.
📈 Build a complete monitoring system for continuous optimization.
MaGe Linux Operations
