
Master Linux Performance: Proven Monitoring & Tuning Techniques to Boost System Speed 300%

This comprehensive guide shares a seasoned sysadmin's proven Linux performance monitoring and tuning methods—including CPU, memory, disk I/O, and network optimization, real‑world case studies, and ready‑to‑run shell scripts—so you can transform from a firefighting engineer into a performance expert.

MaGe Linux Operations

Linux Performance Optimization: Comprehensive System Monitoring and Tuning Techniques

Three years of seasoned sysadmin experience: From rookie to expert, these performance tuning tricks boosted my system performance by 300%.

Preface: The True Value of Performance Tuning

As a frontline operations engineer, I have been woken up at 2 am by alerts when CPU spikes to 100 %, memory runs out, or disk I/O becomes a bottleneck. Every ops professional knows this anxiety.

In this article I will share the most effective Linux performance monitoring and tuning techniques from real‑world practice, turning you from a "firefighter" into a "performance expert".

1. Golden Rules of Performance Monitoring

Monitoring the Four Key Dimensions

Before any optimization, establish a complete monitoring system. The four dimensions are:

1. CPU Performance Monitoring

# Real‑time view of CPU usage
top -p $(pgrep -d ',' your_process_name)

# Detailed CPU statistics
sar -u 1 10

# Per‑process CPU usage
pidstat -u -p PID 1

Key metrics:

%usr: User-space CPU usage

%sys: Kernel-space CPU usage

%iowait: CPU time waiting for I/O

%idle: CPU idle percentage

If %iowait stays above 20 % it usually indicates a disk I/O bottleneck.
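
The percentages above can also be derived without sar by differencing two samples of the aggregate `cpu` line in /proc/stat. The sketch below is a minimal illustration, not what sar does internally: `cpu_breakdown` is a hypothetical helper name, `usr` counts only the user field (nice time is left out), and `sys` is the raw system field.

```shell
# cpu_breakdown PREV CUR
# PREV and CUR are two "cpu ..." summary lines from /proc/stat taken a few
# seconds apart. Prints the share of each state over that interval.
# Field order in /proc/stat: user nice system idle iowait irq softirq steal
cpu_breakdown() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i; next }
        {
            total = 0
            for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; total += d[i] }
            if (total <= 0) { print "no CPU time elapsed between samples"; exit 1 }
            printf "usr=%.1f sys=%.1f iowait=%.1f idle=%.1f\n",
                100 * d[2] / total, 100 * d[4] / total,
                100 * d[6] / total, 100 * d[5] / total
        }'
}

# Example on a live system:
#   prev=$(grep '^cpu ' /proc/stat); sleep 5; cur=$(grep '^cpu ' /proc/stat)
#   cpu_breakdown "$prev" "$cur"
```

A longer gap between the two samples smooths out short spikes, which matters when judging whether %iowait is persistently high.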

2. Memory Performance Monitoring

# Show memory usage details
free -h

# Real‑time memory changes
watch -n 1 'free -h'

# Top memory‑hungry processes
ps aux --sort=-%mem | head -10

Important indicators:

Available memory : the most critical metric, not the "Free" column

Buffer/Cache usage : Linux caches intelligently; this memory can be reclaimed

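Swap usage : sustained swap activity (pages moving in and out, visible as si/so in vmstat) degrades performance sharply; a small amount of idle swapped-out memory is usually harmless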
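
To track the "available" figure rather than "free", the value can be read straight from /proc/meminfo. This is a small sketch under the assumption that the kernel reports `MemAvailable` (present on reasonably modern kernels); `mem_available_pct` is a hypothetical helper name.

```shell
# mem_available_pct: reads /proc/meminfo-formatted text on stdin and prints
# MemAvailable as a percentage of MemTotal.
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2; seen = 1 }
        END {
            if (!seen || total <= 0) { print "MemAvailable not reported"; exit 1 }
            printf "%.1f\n", 100 * avail / total
        }'
}

# Example on a live system:
#   mem_available_pct < /proc/meminfo
```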

3. Disk I/O Monitoring

# I/O statistics
iostat -x 1 5

# Real‑time I/O activity
iotop

# Filesystem usage
df -h

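Critical alert thresholds (these figures suit spinning disks; SSDs and NVMe devices sustain far higher IOPS at lower latency):

%util > 80: Disk may become a bottleneck

await > 10ms: I/O response time too long

r/s + w/s > 1000: IOPS too high, needs optimization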
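
The thresholds above can be turned into a simple check. Because iostat column positions differ between sysstat versions, this sketch takes the already-extracted numbers as arguments instead of parsing iostat output; `check_io_thresholds` and its argument order are assumptions made for illustration.

```shell
# check_io_thresholds DEVICE UTIL_PCT AWAIT_MS IOPS
# Prints one line per exceeded threshold and returns 1 if any was exceeded.
check_io_thresholds() {
    awk -v dev="$1" -v util="$2" -v await="$3" -v iops="$4" 'BEGIN {
        status = 0
        if (util + 0 > 80)   { printf "%s: %%util %.1f > 80\n", dev, util; status = 1 }
        if (await + 0 > 10)  { printf "%s: await %.1f ms > 10 ms\n", dev, await; status = 1 }
        if (iops + 0 > 1000) { printf "%s: %d IOPS > 1000\n", dev, iops; status = 1 }
        if (!status) printf "%s: within thresholds\n", dev
        exit status
    }'
}

# Example:
#   check_io_thresholds sda 85.2 4.0 300
```

Adjust the three limits to the device class before relying on the result; the defaults mirror the list above.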

4. Network Performance Monitoring

# Show network connections
ss -tuln

# Monitor traffic
iftop

# Network statistics
cat /proc/net/dev

Key actions include establishing a baseline for each metric and comparing against it.
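
One possible way to put the baseline idea into practice is to average a series of samples taken during normal operation and then express later readings as a deviation from that average. The helper names `baseline_mean` and `deviation_pct` are hypothetical, and a plain mean is only one of several reasonable baseline definitions.

```shell
# baseline_mean: reads one number per line on stdin and prints the mean.
baseline_mean() {
    awk '{ sum += $1; n++ } END { if (n == 0) exit 1; printf "%.2f\n", sum / n }'
}

# deviation_pct BASELINE CURRENT: signed percentage difference from baseline.
deviation_pct() {
    awk -v b="$1" -v c="$2" 'BEGIN {
        if (b + 0 == 0) exit 1
        printf "%.1f\n", 100 * (c - b) / b
    }'
}

# Example:
#   printf '10\n20\n30\n' | baseline_mean    # mean of three samples
#   deviation_pct 20 25                      # current reading vs. baseline
```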

2. CPU Performance Tuning in Practice

Three Powerful CPU Tuning Techniques

1. Adjust Process Priority

# Lower priority of CPU‑intensive task
nice -n 19 your_cpu_intensive_command

# Change priority of a running process
renice -n 10 -p PID

# Lower I/O priority to the idle class (ionice affects disk I/O scheduling, not CPU)
ionice -c3 -p PID

Example: a backup job caused CPU usage to jump to 90 %; setting its nice value to 19 reduced usage to 30 % and restored normal response times.

2. Set CPU Affinity

# View process CPU affinity
taskset -cp PID

# Bind process to specific cores
taskset -cp 0,1 PID

# Start program with affinity
taskset -c 0-3 your_program

Additional strategies:

Bind network interrupts to specific CPU cores

Bind applications to other cores

Avoid frequent migration between cores

3. Interrupt Optimization

# View interrupt distribution
cat /proc/interrupts

# Manually set NIC interrupt affinity (value is a hex CPU bitmask: 2 = CPU1; IRQ 24 is an example)
echo 2 > /proc/irq/24/smp_affinity

# Enable irqbalance for automatic optimization
systemctl enable irqbalance
systemctl start irqbalance
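
The value written to smp_affinity is a hexadecimal bitmask in which bit N selects CPU N, so `2` means CPU 1 only. A small converter from a CPU list to that mask can avoid mistakes; `cpu_list_to_mask` is a hypothetical helper, and this sketch only handles CPU numbers that fit in the shell's integer width (below 62 or so).

```shell
# cpu_list_to_mask LIST
# LIST is a comma-separated set of CPU numbers and ranges, e.g. "0,2-3".
# Prints the matching hex bitmask (bit N set means CPU N is allowed).
cpu_list_to_mask() (
    mask=0
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;
            *)   first=$part; last=$part ;;
        esac
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            mask=$(( mask | (1 << cpu) ))
            cpu=$(( cpu + 1 ))
        done
    done
    printf '%x\n' "$mask"
)

# Example (requires root; IRQ 24 is only an illustration):
#   cpu_list_to_mask 0-3 > /proc/irq/24/smp_affinity
```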

Validate CPU Tuning Effects

# Stress test
stress-ng --cpu 4 --timeout 60s

# Compare before/after data
sar -u 1 10 > after_optimization.log

3. Memory Optimization Secrets

Four‑Step Memory Tuning

Step 1: Memory Usage Analysis

# Detailed memory info
cat /proc/meminfo

# Top memory‑consuming processes
ps aux --sort=-%mem | head -20

# Shared memory usage
ipcs -m

Step 2: Swap Optimization

# Current swap usage
swapon -s

# Reduce swap tendency (important!)
echo 10 > /proc/sys/vm/swappiness

# Permanent setting
echo 'vm.swappiness = 10' >> /etc/sysctl.conf

Recommended swappiness values:

Database servers: 1‑5

Web servers: 10‑20

General servers: 10‑30
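
Appending with `>>` adds another line every time the command is rerun, which leaves duplicate and possibly conflicting keys in the file. The sketch below shows one way to make the permanent setting idempotent by replacing an existing key or appending it once. `set_sysctl_conf` is a hypothetical helper, and the target file is passed explicitly so it can be tried on a scratch copy first.

```shell
# set_sysctl_conf KEY VALUE FILE
# Ensures FILE contains exactly one "KEY = VALUE" line, replacing any
# existing uncommented assignment of KEY and appending if none exists.
set_sysctl_conf() (
    key=$1 value=$2 file=$3
    tmp=$(mktemp) || exit 1
    [ -f "$file" ] || : > "$file"
    awk -v k="$key" -v v="$value" '
        {
            split($0, parts, "=")
            name = parts[1]
            gsub(/[ \t]/, "", name)
            if (name == k) {
                if (!done) { print k " = " v; done = 1 }
                next
            }
            print
        }
        END { if (!done) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    exit "$rc"
)

# Example (requires root for the real file):
#   set_sysctl_conf vm.swappiness 10 /etc/sysctl.conf && sysctl -p
```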

Step 3: Memory Reclamation Strategies

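# Release caches (emergency only; sync first so dirty pages are written back)
sync
echo 3 > /proc/sys/vm/drop_caches

# Optimize allocation policy
# Note: overcommit_ratio only takes effect when overcommit_memory = 2
echo 0 > /proc/sys/vm/overcommit_memory
echo 50 > /proc/sys/vm/overcommit_ratio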

# Permanent settings
cat >> /etc/sysctl.conf <<EOF
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
EOF

Step 4: Huge Pages Optimization

# View huge page status
grep -i huge /proc/meminfo

# Set huge pages
echo 1024 > /proc/sys/vm/nr_hugepages

# Permanent setting
echo 'vm.nr_hugepages = 1024' >> /etc/sysctl.conf
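
The figure 1024 corresponds to 2 GiB when the huge page size is 2 MiB (1024 × 2 MiB). For a different target size, the page count is a rounded-up division, as in this sketch. `hugepages_needed` is a hypothetical helper, and the 2048 kB default is an assumption that should be checked against the `Hugepagesize` line in /proc/meminfo.

```shell
# hugepages_needed SIZE_MB [PAGE_KB]
# Prints how many huge pages of PAGE_KB (default 2048 kB) are needed to
# cover SIZE_MB mebibytes, rounding up.
hugepages_needed() (
    size_mb=$1
    page_kb=${2:-2048}
    echo $(( (size_mb * 1024 + page_kb - 1) / page_kb ))
)

# Example on a live system:
#   page_kb=$(awk '/^Hugepagesize:/ {print $2}' /proc/meminfo)
#   hugepages_needed 2048 "${page_kb:-2048}"
```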

Memory Leak Detection

# Use valgrind to detect leaks
valgrind --tool=memcheck --leak-check=full your_program

# Monitor memory trend
while true; do
    ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head -10
    echo "---"
    sleep 5
done
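
Watching the list scroll by makes a slow leak hard to spot. A rough alternative is to sample one process's resident set size at intervals and check whether it only ever grows. The sketch below is a heuristic under that assumption, not a leak detector: `rss_trend` is a hypothetical helper, and steady growth can also come from legitimate caching.

```shell
# rss_trend: reads RSS samples in kB, one per line, oldest first.
# Reports whether the series never decreased and ended higher than it began.
rss_trend() {
    awk '
        NR == 1 { first = $1; prev = $1; mono = 1; next }
        { if ($1 < prev) mono = 0; prev = $1 }
        END {
            if (NR < 2) { print "need at least two samples"; exit 1 }
            verdict = (mono && prev > first) ? "steadily growing" : "not steadily growing"
            printf "%s (%d kB change over %d samples)\n", verdict, prev - first, NR
        }'
}

# Example: sample a process every 5 seconds for a minute, then judge the trend
#   for i in 1 2 3 4 5 6 7 8 9 10 11 12; do ps -o rss= -p PID; sleep 5; done | rss_trend
```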

4. Disk I/O Optimization in Practice

Three‑Blade IO Tuning

1. Filesystem Choice and Options

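# Recommended mount options
# Caution: barrier=0 disables write barriers and risks corruption on power loss;
# use it only with a battery-backed write cache. noatime already implies nodiratime.
mount -o noatime,nodiratime,barrier=0 /dev/sdb1 /data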

# Permanent entry
echo '/dev/sdb1 /data ext4 defaults,noatime,nodiratime,barrier=0 0 0' >> /etc/fstab

Performance comparison (based on tests):

XFS : Best for large files, ideal for data storage

EXT4 : Best compatibility, excellent for small‑to‑medium files

Btrfs : Feature‑rich but average performance, suited for special needs

2. I/O Scheduler Optimization

# View current scheduler
cat /sys/block/sda/queue/scheduler

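# Set scheduler (SSD example)
# On kernels 5.0 and later the legacy schedulers are gone; use none, mq-deadline, or bfq
# in place of noop, deadline, and cfq (check the list printed by the command above)
echo noop > /sys/block/sda/queue/scheduler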

# Permanent setting
echo 'echo noop > /sys/block/sda/queue/scheduler' >> /etc/rc.local

Scheduler recommendations:

SSD: noop or deadline

HDD: cfq or deadline

Virtualized: noop
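
Before changing anything, it helps to see which scheduler each device currently uses and whether the kernel considers the device rotational. The scheduler file lists all available schedulers with the active one in square brackets. The sketch below extracts it; `active_scheduler` is a hypothetical helper name.

```shell
# active_scheduler "LINE"
# LINE is the content of /sys/block/<dev>/queue/scheduler, for example
# "noop deadline [cfq]". Prints the bracketed (active) entry.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# List every block device with its rotational flag (1 = HDD, 0 = SSD)
# and active scheduler. Devices without a scheduler file are skipped.
for sched_file in /sys/block/*/queue/scheduler; do
    [ -r "$sched_file" ] || continue
    dev=${sched_file#/sys/block/}
    dev=${dev%%/*}
    rot=$(cat "/sys/block/$dev/queue/rotational" 2>/dev/null)
    echo "$dev rotational=${rot:-?} scheduler=$(active_scheduler "$(cat "$sched_file")")"
done
```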

3. Disk Parameter Tuning

# Adjust read‑ahead
blockdev --setra 4096 /dev/sda

# Queue depth
echo 32 > /sys/block/sda/queue/nr_requests

# Disable power‑saving mode
hdparm -B 255 /dev/sda

IO Monitoring Script

#!/bin/bash
while true; do
    clear
    echo "=== Disk I/O Real‑time Monitoring ==="
    echo "Time: $(date)"
    # -y skips the since-boot summary so the figures cover the last second only
    iostat -dxy 1 1 | grep -E "^(Device|sd|nvme|vd)"
    echo "=== Top IO Processes ==="
    iotop -b -n1 -a | head -15
    sleep 2
done

5. Network Performance Optimization Secrets

Core TCP Parameter Tweaks

# Append to /etc/sysctl.conf
cat >> /etc/sysctl.conf <<EOF
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 65536 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
EOF
sysctl -p
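
A common way to sanity-check maximum buffer sizes such as 16777216 bytes (16 MiB) is the bandwidth-delay product: the amount of data that can be in flight on a path is its bandwidth multiplied by its round-trip time. The sketch below does that arithmetic; `bdp_bytes` is a hypothetical helper, and the link speeds in the example are illustrative values, not measurements from the article.

```shell
# bdp_bytes MBIT_PER_S RTT_MS
# Bandwidth-delay product in bytes:
#   (Mbit/s * 1,000,000 / 8) bytes/s * (RTT_MS / 1000) s = Mbit/s * 125 * RTT_MS
bdp_bytes() {
    echo $(( $1 * 125 * $2 ))
}

# Example: a 1 Gbit/s path with 100 ms RTT
#   bdp_bytes 1000 100
# Compare the result with the third value of net.ipv4.tcp_rmem / tcp_wmem.
```

If the product exceeds the configured maximum, a single connection cannot fill the path; if it is far smaller, larger buffers are unlikely to help.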

Network Interrupt Optimization

# View interrupt distribution
cat /proc/interrupts | grep eth0

# Set interrupt CPU affinity
echo 2 > /proc/irq/24/smp_affinity

# Enable multi‑queue NIC
ethtool -L eth0 combined 4

Firewall Optimization

# Prioritize common rules
iptables -I INPUT 1 -p tcp --dport 80 -j ACCEPT

# Use ipset for large IP lists
ipset create blacklist hash:ip
ipset add blacklist 192.168.1.100
iptables -A INPUT -m set --match-set blacklist src -j DROP

6. Comprehensive Performance Tuning Case Study

Background

Page response time grew from 200 ms to 5 s

CPU usage stayed above 90 %

Frequent DB query timeouts

User complaints surged

Analysis Process

# System-wide analysis
top -c
sar -u -r -b 1 10

# DB analysis
mysqladmin processlist
mysql -e 'SHOW FULL PROCESSLIST;'

# Network connections
ss -tuln | wc -l
netstat -an | grep TIME_WAIT | wc -l
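
Rather than grepping for one state at a time, all TCP states can be tallied in a single pass. This sketch assumes the input looks like the output of `ss -tan`, which has a header line followed by rows whose first column is the state; `count_tcp_states` is a hypothetical helper name.

```shell
# count_tcp_states: reads `ss -tan` style output on stdin and prints
# "STATE COUNT" for each state seen, sorted by state name.
count_tcp_states() {
    awk 'NR > 1 { count[$1]++ } END { for (s in count) print s, count[s] }' | sort
}

# Example on a live system:
#   ss -tan | count_tcp_states
```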

Optimization Measures & Effects

CPU

# Raise MySQL priority (a negative nice value raises priority; requires root)
renice -n -10 -p $(pgrep mysqld)
# Bind MySQL to cores 0‑3
taskset -cp 0-3 $(pgrep mysqld)
# Reduce Apache workers
# MaxRequestWorkers 400 → 200

Result: CPU usage dropped from 90 % to 60 %.

Memory

# Increase InnoDB buffer pool
# innodb_buffer_pool_size = 8G → 12G
# Reduce swap usage
echo 5 > /proc/sys/vm/swappiness

Result: DB query time reduced by 40 %.

Disk I/O

# Change scheduler to deadline
echo deadline > /sys/block/sda/queue/scheduler
# Optimize mount options
mount -o remount,noatime,nodiratime /dev/sda1 /var/lib/mysql

Result: I/O wait fell from 30 % to 5 %.

Final outcomes:

Page response time: 5 s → 300 ms

System load: 4.5 → 1.2

User satisfaction markedly improved

Successfully handled double the concurrent traffic

7. Automation Scripts

One‑Click Performance Check

#!/bin/bash
# Linux performance one‑click report
echo "================== Linux Performance Check =================="
echo "Check Time: $(date)"
echo "Hostname: $(hostname)"
echo "Kernel: $(uname -r)"

# CPU
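# 100 minus the idle column of a one-second vmstat sample; the top-based
# parse only captured user time and broke when the field layout shifted
CPU_USAGE=$(vmstat 1 2 | tail -1 | awk '{print 100 - $15}')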
CPU_CORES=$(nproc)
LOAD_1=$(uptime | awk -F'load average:' '{print $2}' | awk '{print $1}' | tr -d ',')

echo "CPU cores: $CPU_CORES"
echo "CPU usage: ${CPU_USAGE}%"
echo "1‑min load: $LOAD_1"
if (( $(echo "$LOAD_1 > $CPU_CORES" | bc -l) )); then
    echo "⚠️ Warning: High system load!"
fi

# Memory
TOTAL=$(free -m | awk 'NR==2{print $2}')
USED=$(free -m | awk 'NR==2{print $3}')
AVAILABLE=$(free -m | awk 'NR==2{print $7}')
MEM_USAGE=$(echo "scale=1; $USED*100/$TOTAL" | bc)

echo "Total Memory: ${TOTAL}MB"
echo "Used: ${USED}MB (${MEM_USAGE}%)"
echo "Available: ${AVAILABLE}MB"
if (( $(echo "$MEM_USAGE > 80" | bc -l) )); then
    echo "⚠️ Warning: High memory usage!"
fi

# Disk
echo "【Disk Performance】"
df -hP | grep '^/dev/' | while read -r line; do
    USAGE=$(echo "$line" | awk '{print $5}' | tr -d '%')
    MOUNT=$(echo "$line" | awk '{print $6}')
    echo "$line"
    if [ "$USAGE" -gt 85 ]; then
        echo "⚠️ Warning: $MOUNT disk usage high ($USAGE%)!"
    fi
done

# Network
echo "【Network Connections】"
EST=$(ss -an | grep ESTAB | wc -l)
TIMEWAIT=$(ss -an | grep TIME-WAIT | wc -l)

echo "Established connections: $EST"
echo "TIME_WAIT connections: $TIMEWAIT"
if [ "$TIMEWAIT" -gt 5000 ]; then
    echo "⚠️ Warning: Too many TIME_WAIT connections!"
fi

# Top resource‑hungry processes
echo "【Top 10 CPU Processes】"
ps aux --sort=-%cpu | head -11 | tail -10

echo "【Top 10 Memory Processes】"
ps aux --sort=-%mem | head -11 | tail -10

echo "================== Check Complete =================="

Performance Report Generator

#!/bin/bash
REPORT_DATE=$(date +%Y%m%d_%H%M%S)
REPORT_FILE="/tmp/performance_report_${REPORT_DATE}.html"
cat > "$REPORT_FILE" <<EOF
<!DOCTYPE html>
<html>
<head>
    <title>Linux Performance Report</title>
    <style>
        body {font-family: Arial, sans-serif; margin: 20px;}
        .warning {color: #ff6b6b; font-weight: bold;}
        .normal {color: #51cf66;}
        .info {color: #339af0;}
        table {border-collapse: collapse; width: 100%;}
        th, td {border: 1px solid #ddd; padding: 8px; text-align: left;}
        th {background-color: #f2f2f2;}
    </style>
</head>
<body>
    <h1>🚀 Linux Performance Report</h1>
    <p>Generated at: $(date)</p>
    <p>Hostname: $(hostname)</p>
    <h2>📊 System Overview</h2>
    <table>
        <tr><th>Metric</th><th>Current Value</th><th>Status</th></tr>
        <tr><td>CPU Usage</td><td>$(top -bn1 | grep "Cpu(s)" | awk '{print $2}')%</td><td class="normal">Normal</td></tr>
        <tr><td>Memory Usage</td><td>$(free | awk 'NR==2{printf "%.1f%%", $3*100/$2}')</td><td class="normal">Normal</td></tr>
        <tr><td>System Load</td><td>$(uptime | awk -F'load average:' '{print $2}')</td><td class="info">Monitoring</td></tr>
    </table>
    <p>💡 <strong>Optimization Tip</strong>: Run regular checks, establish baselines, and address bottlenecks promptly.</p>
</body>
</html>
EOF

echo "Report generated: $REPORT_FILE"
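
The report above always prints "Normal" in the status column regardless of the measured value. One way to make the column reflect the data is to generate the cell from a value and a threshold, reusing the `warning` and `normal` CSS classes already defined in the page. `status_cell` is a hypothetical helper, and the threshold passed to it is up to the operator.

```shell
# status_cell VALUE THRESHOLD
# Prints an HTML table cell marked "warning" when VALUE exceeds THRESHOLD,
# otherwise "normal". Both arguments are compared numerically.
status_cell() {
    awk -v v="$1" -v t="$2" 'BEGIN {
        if (v + 0 > t + 0) print "<td class=\"warning\">Warning</td>"
        else               print "<td class=\"normal\">Normal</td>"
    }'
}

# Example inside the heredoc, replacing a hard-coded status cell:
#   <tr><td>Memory Usage</td><td>${MEM}%</td>$(status_cell "$MEM" 85)</tr>
```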

Custom Alert Script

#!/bin/bash
CPU_THRESHOLD=80
MEM_THRESHOLD=85
DISK_THRESHOLD=90
LOAD_THRESHOLD=4
# Placeholder recipient; replace with your own alert address
ALERT_EMAIL="admin@example.com"

check_cpu() {
    CPU=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
    if (( $(echo "$CPU > $CPU_THRESHOLD" | bc -l) )); then
        echo "CPU alert: $CPU% exceeds $CPU_THRESHOLD%" | mail -s "Server CPU Alert" "$ALERT_EMAIL"
    fi
}

check_mem() {
    MEM=$(free | awk 'NR==2{printf "%.1f", $3*100/$2}')
    if (( $(echo "$MEM > $MEM_THRESHOLD" | bc -l) )); then
        echo "Memory alert: $MEM% exceeds $MEM_THRESHOLD%" | mail -s "Server Memory Alert" "$ALERT_EMAIL"
    fi
}

main() {
    check_cpu
    check_mem
    # Additional checks can be added here
}

main
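
The script defines DISK_THRESHOLD and LOAD_THRESHOLD but never uses them. The sketch below shows how the two missing checks might look. Both helpers are hypothetical and read their data from stdin so they can be tried on sample input; mount points containing spaces are not handled.

```shell
# disks_over_threshold LIMIT
# Reads `df -P` output on stdin and prints "MOUNT USE%" for every /dev/
# filesystem whose usage exceeds LIMIT percent.
disks_over_threshold() {
    awk -v limit="$1" 'NR > 1 && $1 ~ /^\/dev\// {
        use = $5
        sub(/%/, "", use)
        if (use + 0 > limit + 0) print $6, use "%"
    }'
}

# load_over_threshold LIMIT
# Reads /proc/loadavg-formatted text on stdin; succeeds (exit 0) when the
# 1-minute load average exceeds LIMIT.
load_over_threshold() {
    awk -v limit="$1" '{ exit !($1 + 0 > limit + 0) }'
}

# Example on a live system:
#   df -P | disks_over_threshold 90
#   load_over_threshold 4 < /proc/loadavg && echo "load is high"
```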

8. Advanced Tuning Techniques

Kernel Parameter Optimization

# /etc/sysctl.conf high‑performance settings
cat >> /etc/sysctl.conf <<'EOF'
# Network
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 5000
# bbr requires kernel 4.9 or later with the tcp_bbr module available
net.ipv4.tcp_congestion_control = bbr

# Filesystem
fs.file-max = 1048576
fs.nr_open = 1048576

# Process
kernel.pid_max = 4194304

# Memory
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
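# 1 = always overcommit; this conflicts with the heuristic mode (0) set in the memory section, so pick one
vm.overcommit_memory = 1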
EOF
sysctl -p

Process Limits

# /etc/security/limits.conf
cat >> /etc/security/limits.conf <<'EOF'
* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535
* soft memlock unlimited
* hard memlock unlimited
EOF
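
Entries in limits.conf are applied by pam_limits to new login sessions, so already-running shells keep their old limits, and services started by systemd take their limits from the unit file (for example `LimitNOFILE=`) instead. A quick check of what the current shell actually has can confirm the change took effect; `limit_at_least` is a hypothetical helper name.

```shell
# limit_at_least CURRENT WANTED
# Succeeds when CURRENT (as printed by ulimit) is "unlimited" or is a
# number greater than or equal to WANTED.
limit_at_least() {
    [ "$1" = "unlimited" ] && return 0
    [ "$1" -ge "$2" ]
}

# Example, run in a fresh login session:
#   limit_at_least "$(ulimit -n)" 65535 || echo "nofile limit is still $(ulimit -n)"
```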

9. Performance Optimization Best Practices

The "Ten‑Point Scripture"

Monitoring First : No data, no direction.

Establish Baselines : Know normal metric values.

Iterative Tuning : Change one parameter at a time.

Validate Results : Record before/after data.

Rollback Plan : Always have a safe revert.

Document Changes : Keep detailed records.

Regular Review : Periodically assess impact.

Automation : Script common checks and fixes.

Knowledge Sharing : Share lessons within the team.

Continuous Learning : Stay updated with new tools and techniques.

Performance Check Checklist

Daily :

Is system load normal?

Is memory usage within limits?

Is disk space sufficient?

Are critical services running?

Weekly :

Review performance trends.

Check logs for anomalies.

Validate backup and restore.

Update baseline data.

Monthly :

Full performance assessment.

Capacity planning adjustments.

Fine‑tune parameters.

Disaster‑recovery drills.

Conclusion: From "Firefighter" to "Performance Expert"

Performance tuning is a continuous journey, not a one‑off task. Build solid monitoring, establish baselines, iterate carefully, verify improvements, keep rollback options, document everything, review regularly, automate where possible, share knowledge, and keep learning.
