
How to Supercharge Linux Performance: Diagnose CPU, Memory & Disk I/O Like a Pro

This comprehensive guide walks you through Linux system performance tuning—from CPU, memory, and disk I/O diagnostics to practical optimization techniques, key metric interpretation, real‑world case studies, monitoring scripts, and advanced tips—helping you boost performance by 30‑50%.

MaGe Linux Operations

Jul 31, 2025


Linux System Performance Tuning: Full Diagnosis from CPU, Memory to Disk I/O

Key Points Preview: This article analyzes the root causes of Linux performance bottlenecks in depth and provides actionable tuning steps; the cases described later report gains of 30‑50% or more.

Core Thinking of Performance Tuning

Many ops engineers fall into the “treat the symptom” trap. Real tuning requires systematic thinking:

Performance Tuning Pyramid Model:

Top : Business metrics (response time, throughput)

Middle : System resources (CPU, memory, disk, network)

Bottom : Kernel parameters & hardware characteristics

CPU Performance Diagnosis and Tuning

1. The truth about CPU utilization

# Multi‑dimensional CPU usage observation
 top -p $(pgrep -d ',' your_process_name)
 htop
 sar -u 1 10

# Deep analysis of CPU wait time
 iostat -x 1
 vmstat 1

Key Metric Interpretation:

%us : User‑space CPU usage, >70% needs attention

%sy : System‑space CPU usage, >30% may indicate kernel bottleneck

%wa : I/O wait time, >10% signals storage bottleneck

%id : Idle time, <10% means the system is near full load
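
The thresholds above can be turned into a quick check. The following is a minimal sketch, not part of the original article: `cpu_flags` is a hypothetical helper name, and it assumes you pass the four percentages yourself (for example, copied from the `top` summary line).

```shell
# Flag CPU fields that cross the thresholds listed above.
# Arguments: us sy wa id (percentages, as shown by top or vmstat).
# cpu_flags is a hypothetical helper name used for illustration.
cpu_flags() {
  awk -v us="$1" -v sy="$2" -v wa="$3" -v id="$4" 'BEGIN {
    if (us > 70) print "us high: user-space load needs attention"
    if (sy > 30) print "sy high: possible kernel bottleneck"
    if (wa > 10) print "wa high: possible storage bottleneck"
    if (id < 10) print "id low: system near full load"
  }'
}

# Example: cpu_flags 75 10 2 13
```

The function prints nothing when every value is inside its threshold, which makes it easy to use in a cron job that only mails on output.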

2. CPU binding optimization tricks

# View CPU topology
 lscpu
 cat /proc/cpuinfo | grep "physical id" | sort | uniq | wc -l

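# Process CPU binding (avoid cache invalidation)
 taskset -cp 0-3 PID
# --cpunodebind is the current option name; the older --cpubind spelling is deprecated
 numactl --cpunodebind=0 --membind=0 your_command

# Interrupt binding optimization
# The value is a hexadecimal CPU bitmask: 2 = binary 10 = CPU 1. IRQ 24 is an example; look up the real number in /proc/interrupts.
# If irqbalance is running, it may overwrite this setting.
 echo 2 > /proc/irq/24/smp_affinity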
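
The value written to `smp_affinity` is a hexadecimal bitmask with one bit per CPU. As a sketch of how that mask is derived, the hypothetical helper `cpus_to_mask` below converts a list of CPU indexes into the hex string. It assumes CPUs 0 to 31 only, because larger systems use comma-separated 32-bit groups that this example does not handle.

```shell
# Convert CPU indexes into the hex bitmask expected by /proc/irq/N/smp_affinity.
# cpus_to_mask is a hypothetical helper; it only handles CPUs 0-31.
cpus_to_mask() {
  local mask=0 cpu
  for cpu in "$@"; do
    if [ "$cpu" -lt 0 ] || [ "$cpu" -gt 31 ]; then
      echo "cpu index out of range (0-31): $cpu" >&2
      return 1
    fi
    # Set the bit that corresponds to this CPU
    mask=$(( mask | (1 << cpu) ))
  done
  printf '%x\n' "$mask"
}

# Example: cpus_to_mask 1        prints 2 (the value used above)
# Example: cpus_to_mask 0 1 2 3  prints f
```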

Practical case: An e‑commerce system reduced latency by 35% after CPU binding.

3. Context switch optimization

# Monitor context switches
 vmstat 1 | awk '{print $11,$12}'   # columns 11 and 12 are in (interrupts/s) and cs (context switches/s)
 cat /proc/interrupts
 pidstat -w 1

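# Optimization strategies (apply with sysctl -p after editing)
# sched_migration_cost_ns is a sysctl only on kernels before 5.13; later kernels expose it under /sys/kernel/debug/sched/
 echo 'kernel.sched_migration_cost_ns = 5000000' >> /etc/sysctl.conf
 echo 'kernel.sched_autogroup_enabled = 0' >> /etc/sysctl.conf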
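
Appending with `>>` adds a duplicate line every time the command is rerun. A possible alternative, sketched here under the assumption that you keep your tuning in a dedicated drop-in file, is to replace any existing entry before writing the new one. `set_sysctl_line` and the file name in the usage comment are hypothetical.

```shell
# Idempotently set "key = value" in a sysctl-style config file.
# set_sysctl_line is a hypothetical helper, not a standard tool.
set_sysctl_line() {
  local key=$1 value=$2 file=$3 tmp
  tmp=$(mktemp) || return 1
  [ -f "$file" ] || : > "$file"
  # Copy every line except an existing definition of this key
  awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' "$file" > "$tmp"
  printf '%s = %s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example (file name is an assumption):
#   set_sysctl_line kernel.sched_autogroup_enabled 0 /etc/sysctl.d/99-tuning.conf
#   sysctl --system
```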

Memory Management Deep Optimization

1. Memory usage pattern analysis

# Detailed memory analysis
 free -h
 cat /proc/meminfo
 smem -t -k

# Process memory consumption inspection
 ps aux --sort=-%mem | head -20
 pmap -d PID
 cat /proc/PID/smaps

Memory Optimization Golden Rules:

Available memory < 20% of total → needs optimization

Swap usage > 10% → memory shortage signal

Cache hit rate < 95% → may need cache strategy adjustment
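
The first two rules can be computed directly from `/proc/meminfo`. This is a minimal sketch: `mem_report` is a hypothetical helper that reads meminfo-formatted text on standard input, so it can also be run against a saved copy of the file.

```shell
# Print available memory and swap usage as percentages.
# Reads /proc/meminfo-formatted text on stdin; mem_report is a hypothetical helper.
mem_report() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    /^SwapTotal:/    { stotal = $2 }
    /^SwapFree:/     { sfree = $2 }
    END {
      if (total > 0) printf "available_pct=%.1f\n", avail * 100 / total
      if (stotal > 0) printf "swap_used_pct=%.1f\n", (stotal - sfree) * 100 / stotal
      else print "swap_used_pct=0.0"
    }'
}

# Example: mem_report < /proc/meminfo
```

An `available_pct` below 20 or a `swap_used_pct` above 10 corresponds to the warning conditions listed above.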

2. Swap optimization strategies

# Swap usage monitoring
 swapon -s
 cat /proc/swaps

# Smart swap tuning (run sysctl -p afterwards to apply the new values)
 echo 'vm.swappiness = 10' >> /etc/sysctl.conf
 echo 'vm.vfs_cache_pressure = 50' >> /etc/sysctl.conf
 echo 'vm.dirty_ratio = 15' >> /etc/sysctl.conf
 echo 'vm.dirty_background_ratio = 5' >> /etc/sysctl.conf

3. Huge page memory optimization

# Configure transparent huge pages
 echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
 echo defer+madvise > /sys/kernel/mm/transparent_hugepage/defrag

# Static huge page configuration
 echo 1024 > /proc/sys/vm/nr_hugepages
 echo 'vm.nr_hugepages = 1024' >> /etc/sysctl.conf

Performance boost: In database scenarios, huge pages can improve performance by 15‑25%.
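
The `nr_hugepages` value depends on how much memory the application should get from the pool. With the common 2 MB huge page size, 1024 pages reserve 2 GB. The sketch below shows the arithmetic; `hugepages_needed` is a hypothetical helper, and the 2048 kB default is an assumption you should check against the `Hugepagesize` line in `/proc/meminfo`.

```shell
# Number of huge pages needed to cover a memory size, rounded up.
# Arguments: size in MB, optional huge page size in kB (default 2048).
# hugepages_needed is a hypothetical helper name.
hugepages_needed() {
  local size_mb=$1 page_kb=${2:-2048}
  echo $(( (size_mb * 1024 + page_kb - 1) / page_kb ))
}

# Example: hugepages_needed 2048   prints 1024
```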

Disk I/O Performance Ultimate Optimization

1. I/O performance deep diagnosis

# I/O performance monitoring toolkit
 iostat -x 1
 iotop -o
 dstat -d
 blktrace /dev/sda

# Disk queue depth analysis
 cat /sys/block/sda/queue/nr_requests
 echo 256 > /sys/block/sda/queue/nr_requests

Key I/O Metrics:

%util : Disk utilization, >80% needs optimization

await : Average wait time, SSD < 10ms, HDD < 20ms

svctm : Service time, should match actual disk access time (deprecated; recent sysstat releases no longer report it)

r/s, w/s : Read/write IOPS, must meet business requirements
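
The column layout of `iostat -x` differs between sysstat versions, so picking a metric by position is fragile. One way around this, sketched below, is to find the column by its header name. `iostat_col` is a hypothetical helper that reads `iostat -x` output on standard input.

```shell
# Print "device value" for one named column of iostat -x output.
# The column is located by header name, not by position.
# iostat_col is a hypothetical helper name.
iostat_col() {
  awk -v col="$1" '
    /^avg-cpu/ { c = 0 }
    /^Device/  { c = 0; for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
    c && NF    { print $1, $c }'
}

# Example: iostat -dx 1 1 | iostat_col %util
```

If the requested column does not exist in your sysstat version (for example `svctm`), the function prints nothing instead of a wrong value.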

2. File system tuning

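# ext4 tuning
# Note: barrier=0 and journal_data_writeback trade crash safety for speed; use them only on storage with power-loss protection
 mount -o noatime,nodiratime,barrier=0 /dev/sda1 /data
 tune2fs -o journal_data_writeback /dev/sda1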

# XFS tuning
 mount -o noatime,nodiratime,logbufs=8,logbsize=256k /dev/sda1 /data
 xfs_info /data

3. I/O scheduler optimization

# View current I/O scheduler
 cat /sys/block/sda/queue/scheduler

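# SSD: use noop or deadline (on blk-mq kernels, 5.0 and later, the equivalents are none and mq-deadline)
 echo noop > /sys/block/sda/queue/scheduler

# HDD: use cfq (removed in kernel 5.0; bfq is the closest blk-mq replacement)
 echo cfq > /sys/block/sda/queue/scheduler

# Persist setting (writes to sysfs do not survive a reboot)
 echo 'echo noop > /sys/block/sda/queue/scheduler' >> /etc/rc.local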
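
Scheduler names differ between kernel generations, so a script that hardcodes one name can fail on another host. The sketch below picks the first supported scheduler from a preference list. `pick_scheduler` is a hypothetical helper, and the preference order (including bfq as the stand-in for cfq on rotational disks) is an assumption, not a recommendation from the original article.

```shell
# Choose a scheduler that the kernel actually offers.
# $1: contents of /sys/block/DEV/queue/rotational (1 = HDD, 0 = SSD)
# $2: contents of /sys/block/DEV/queue/scheduler, e.g. "mq-deadline kyber [bfq] none"
# pick_scheduler and its preference lists are illustrative assumptions.
pick_scheduler() {
  local rotational=$1 available prefs cand name
  # Strip the brackets that mark the active scheduler
  available=$(printf '%s' "$2" | sed 's/[][]//g')
  if [ "$rotational" = "1" ]; then
    prefs="bfq cfq mq-deadline deadline"
  else
    prefs="none noop mq-deadline deadline"
  fi
  for cand in $prefs; do
    for name in $available; do
      if [ "$name" = "$cand" ]; then
        echo "$cand"
        return 0
      fi
    done
  done
  return 1
}

# Example:
#   dev=sda
#   pick_scheduler "$(cat /sys/block/$dev/queue/rotational)" "$(cat /sys/block/$dev/queue/scheduler)"
```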

System‑Level Performance Tuning Practice

1. Kernel parameter ultimate configuration

# Network optimization
 echo 'net.core.rmem_max = 16777216' >> /etc/sysctl.conf
 echo 'net.core.wmem_max = 16777216' >> /etc/sysctl.conf
 echo 'net.ipv4.tcp_rmem = 4096 87380 16777216' >> /etc/sysctl.conf
 echo 'net.ipv4.tcp_wmem = 4096 65536 16777216' >> /etc/sysctl.conf

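# File descriptor optimization
 echo 'fs.file-max = 1000000' >> /etc/sysctl.conf
# ulimit only affects the current shell and its children; set nofile in /etc/security/limits.conf to make it persistent
 ulimit -n 1000000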

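# Process scheduling optimization
# These two knobs are sysctls only on kernels before 5.13; later kernels moved them to debugfs (/sys/kernel/debug/sched/)
 echo 'kernel.sched_min_granularity_ns = 2000000' >> /etc/sysctl.conf
 echo 'kernel.sched_wakeup_granularity_ns = 3000000' >> /etc/sysctl.conf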

2. Performance monitoring script

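#!/bin/bash
# One-click performance monitor (prints a snapshot every 5 seconds; stop with Ctrl+C)
while true; do
  echo "=== $(date) ==="
  # User-space CPU from top's summary line (field 2 is the "us" value)
  echo "CPU: $(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)%"
  # Used memory as a share of total
  echo "MEM: $(free | awk '/^Mem:/ {printf "%.2f%%", $3/$2 * 100.0}')"
  # Device name and %util; the column is found by header because its position varies across sysstat versions
  echo "DISK (%util):"
  iostat -dx 1 1 | awk '/^Device/ {for (i = 1; i <= NF; i++) if ($i == "%util") c = i; next} c && NF {print "  " $1, $c}' | head -5
  echo "LOAD: $(uptime | awk -F'load average:' '{print $2}')"
  echo "---"
  sleep 5
done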
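
A single `top -bn1` sample can be misleading. A more stable CPU figure comes from the difference between two readings of the `cpu` line in `/proc/stat`. The following is a sketch of that calculation; `cpu_busy_pct` is a hypothetical helper, and it treats idle plus iowait as non-busy time.

```shell
# CPU busy percentage between two "cpu ..." lines taken from /proc/stat.
# Fields: user nice system idle iowait irq softirq steal (guest time is already in user).
# cpu_busy_pct is a hypothetical helper name.
cpu_busy_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      idle[NR] = $5 + $6
      total[NR] = 0
      for (i = 2; i <= NF && i <= 9; i++) total[NR] += $i
    }
    END {
      dt = total[2] - total[1]
      di = idle[2] - idle[1]
      if (dt > 0) printf "%.1f\n", (dt - di) * 100 / dt
      else print "0.0"
    }'
}

# Example:
#   a=$(grep '^cpu ' /proc/stat); sleep 1; b=$(grep '^cpu ' /proc/stat)
#   cpu_busy_pct "$a" "$b"
```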

Performance Tuning Effect Quantification

Real‑world case analysis

Case 1: E‑commerce system

Before: response time 2.5s, CPU usage 85%

After: response time 0.8s, CPU usage 45%

Performance improvement: response time down 68%, CPU usage down 47%

Case 2: Database server

Before: QPS 1200, memory usage 90%

After: QPS 2100, memory usage 65%

Performance improvement: QPS up 75%, memory usage down 28%
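
When quantifying results like these, it helps to compute every figure the same way. The sketch below uses relative change, (after − before) / before, with a sign so that direction is explicit. `pct_change` is a hypothetical helper name.

```shell
# Relative change from a before value to an after value, in percent.
# Negative means the metric went down. pct_change is a hypothetical helper.
pct_change() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%+.1f\n", (after - before) * 100 / before }'
}

# Case 1: pct_change 2.5 0.8    response time
#         pct_change 85 45      CPU usage
# Case 2: pct_change 1200 2100  QPS
#         pct_change 90 65      memory usage
```

Applied to the figures above, this gives about −68% for response time, −47% for CPU usage, +75% for QPS, and −28% for memory usage.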

Performance baseline establishment

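#!/bin/bash
# Baseline script: appends one snapshot to the log file per run
LOGFILE="/var/log/performance_baseline.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')
{
  echo "[$DATE] Performance Baseline Check"
  # Average since boot: (user + system) / (user + nice + system + idle)
  echo "CPU: $(grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$3+$4+$5)} END {print usage "%"}')"
  echo "Memory: $(free | awk '/^Mem:/ {printf "Used: %.1f%% Available: %.1fGB", $3*100/$2, $7/1024/1024}')"
  # Read latency per device; r_await is found by header because column positions vary across sysstat versions
  echo "Disk I/O: $(iostat -dx 1 1 | awk '/^Device/ {for (i = 1; i <= NF; i++) if ($i == "r_await") c = i; next} c && NF {print $1":"$c"ms"}' | head -3)"
  echo "Load Average: $(uptime | awk -F'load average:' '{print $2}')"
  # Skip the header row and the loopback interface
  echo "Network: $(sar -n DEV 1 1 | awk '$1 == "Average:" && $2 != "IFACE" && $2 != "lo" {print $2":"$5"KB/s in, "$6"KB/s out"}' | head -2)"
  echo "=================================="
} >> "$LOGFILE"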

Advanced Tuning Techniques

1. NUMA architecture optimization

# View NUMA info
 numactl --hardware
 numastat
 cat /proc/buddyinfo

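# NUMA binding strategy
# --cpunodebind is the current option name; the older --cpubind spelling is deprecated
 numactl --cpunodebind=0 --membind=0 your_application
# Enable automatic NUMA balancing (the kernel migrates pages toward the CPUs that use them)
 echo 1 > /proc/sys/kernel/numa_balancing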

2. Container environment performance optimization

# Docker resource limits
 docker run --cpus="2.0" --memory="4g" --memory-swap="4g" your_app

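# cgroup tuning (cgroup v1 paths; on cgroup v2 the equivalents are cpu.weight and cpu.max)
 echo 1024 > /sys/fs/cgroup/cpu/docker/cpu.shares
# 50000 µs of quota per default 100000 µs period = half of one CPU
 echo 50000 > /sys/fs/cgroup/cpu/docker/cpu.cfs_quota_us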
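
The CFS quota is expressed in microseconds of CPU time per scheduling period, so a CPU count has to be converted before it is written. A minimal sketch of that conversion follows; `cpus_to_quota` and `cpu_max_line` are hypothetical helpers, and the 100000 µs default period is the usual kernel default, assumed here.

```shell
# Convert a CPU count (may be fractional) into a CFS quota in microseconds.
# Arguments: cpus, optional period in microseconds (default 100000).
# Both helpers are hypothetical names used for illustration.
cpus_to_quota() {
  awk -v cpus="$1" -v period="${2:-100000}" 'BEGIN { printf "%d\n", cpus * period + 0.5 }'
}

# Same limit in the "quota period" form used by the cgroup v2 cpu.max file.
cpu_max_line() {
  local period=${2:-100000}
  echo "$(cpus_to_quota "$1" "$period") $period"
}

# Example: cpus_to_quota 0.5   prints 50000 (the value written above)
# Example: cpu_max_line 2.0    prints 200000 100000
```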

3. Real‑time system tuning

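# Real-time kernel configuration
# 950000 / 1000000 are the kernel defaults: real-time tasks may use at most 95% of each 1-second period
 echo 'kernel.sched_rt_runtime_us = 950000' >> /etc/sysctl.conf
 echo 'kernel.sched_rt_period_us = 1000000' >> /etc/sysctl.conf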

# Process priority adjustment
 chrt -f -p 99 PID
 nice -n -20 your_critical_process

Performance Tuning Best Practices

1. Progressive optimization strategy

Establish performance baseline : record pre‑optimization metrics

Single‑point breakthrough : adjust one parameter at a time

Effect verification : thoroughly test the impact

Rollback preparation : keep original configuration
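
For the rollback step, one option is to record the current values of the parameters you are about to change in a file that `sysctl -p` can load later. The sketch below reads values from the `/proc/sys` tree. `snapshot_sysctl` is a hypothetical helper; it takes the tree root as its first argument so it can be tried against a copy, and it assumes key names in which every dot is a path separator.

```shell
# Print "key = value" lines for the current values of the given sysctl keys.
# $1: root of the sysctl tree (normally /proc/sys); remaining args: keys.
# snapshot_sysctl is a hypothetical helper name.
snapshot_sysctl() {
  local root=$1 key path
  shift
  for key in "$@"; do
    # vm.swappiness -> vm/swappiness
    path="$root/$(printf '%s' "$key" | tr . /)"
    if [ -r "$path" ]; then
      printf '%s = %s\n' "$key" "$(tr '\t' ' ' < "$path")"
    else
      printf '# %s not present on this kernel\n' "$key"
    fi
  done
}

# Example (output file name is an assumption):
#   snapshot_sysctl /proc/sys vm.swappiness vm.dirty_ratio > /root/sysctl-before.conf
#   sysctl -p /root/sysctl-before.conf    # roll back
```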

2. Monitoring and alert system

# Critical metric thresholds
 CPU_THRESHOLD=80
 MEM_THRESHOLD=85
 DISK_THRESHOLD=90
 LOAD_THRESHOLD=5.0

# Auto-alert script (the recipient address is obscured in the source; set ALERT_EMAIL to your own)
 CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1 | cut -d'.' -f1)
 if [ "${CPU_USAGE:-0}" -gt "$CPU_THRESHOLD" ]; then
   echo "CPU usage exceeds threshold" | mail -s "Performance Alert" "$ALERT_EMAIL"
 fi
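
The shell's `-gt` operator only compares integers, so a threshold such as `LOAD_THRESHOLD=5.0` needs a floating-point comparison. A small sketch using awk follows; `exceeds` is a hypothetical helper name.

```shell
# Succeed (exit status 0) when value > threshold; works with decimals.
# exceeds is a hypothetical helper name.
exceeds() {
  awk -v v="$1" -v t="$2" 'BEGIN { exit !(v > t) }'
}

# Example:
#   load=$(cut -d' ' -f1 /proc/loadavg)
#   if exceeds "$load" "$LOAD_THRESHOLD"; then echo "Load average exceeds threshold"; fi
```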

3. Performance tuning checklist

Basic checks:

System load < CPU core count

Memory usage < 80%

Disk I/O wait < 20ms

Network connections within reasonable range

Advanced checks:

CPU cache hit rate optimization

NUMA affinity configuration

Interrupt load balancing

Kernel parameter validation

Conclusion and Outlook

🚀 Performance can improve 30‑50% with scientific tuning.

🎯 Precise bottleneck identification using multi‑dimensional diagnostics.

🛠️ All techniques validated in production environments.

📈 Build a complete monitoring system for continuous optimization.
