
10 Essential Linux Kernel Tweaks to Supercharge System Performance

This guide walks through ten critical Linux kernel parameters—explaining why defaults can cripple performance, showing real‑world problem scenarios, providing exact sysctl commands and application‑level adjustments, and culminating in a full e‑commerce flash‑sale case study with measurable results and safety precautions.

Ops Community

Linux Kernel Parameter Tuning: 10 Key Configurations to Boost System Performance

Opening Hook

At 2 AM an alarm rang: Load Average spiked over 200 while CPU usage stayed at 30%. Restarting services and scaling machines did nothing until a seemingly minor kernel parameter vm.dirty_ratio was tuned, instantly restoring stability. Many similar "performance killers" exist; here are ten hard‑won kernel tuning lessons.

Why Do Default Kernel Parameters Become Performance Killers?

Root Cause Analysis

Linux defaults aim for universality, trying to suit everything from a Raspberry Pi to a supercomputer. Like a one‑size‑fits‑all garment, they fit, but not comfortably.

In an e‑commerce system during Double 11, P99 latency hit 3 seconds. CPU, memory, and disk I/O looked normal, yet the bottleneck was net.core.somaxconn, still at its default of 128 (the default on kernels before 5.4; newer kernels default to 4096). Under high concurrency the accept queue overflowed and many incoming connections were dropped.

The truth: In specific business scenarios, default parameters can be the biggest performance bottleneck.

10 Key Kernel Parameters Deep Dive and Practice

1. TCP Connection Queue Optimization: net.core.somaxconn

Problem Scenario: Nginx logs show many "connection refused" messages despite ample server resources.

# Check the current value
sysctl net.core.somaxconn
net.core.somaxconn = 128  # the default is too small!

# Check how many times the TCP accept (full-connection) queue has overflowed
netstat -s | grep -i listen
    21567 times the listen queue of a socket overflowed

Optimization Solution:

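# Takes effect immediately (lost on reboot)
sudo sysctl -w net.core.somaxconn=65535

# Persists across reboots (use tee so the write itself runs under sudo)
echo "net.core.somaxconn = 65535" | sudo tee -a /etc/sysctl.conf

# The application layer must be adjusted as well, e.g. Nginx.
# Nginx's listen backlog defaults to 511 on Linux, so raise it explicitly.
events {
    worker_connections 65535;
}
http {
    server {
        listen 80 backlog=65535;  # port 80 is only an example
    }
}

Key Tip: Adjusting the kernel parameter alone is not enough. The effective accept‑queue length is the smaller of somaxconn and the backlog the application passes to listen(), so the application must be tuned in step or it will remain the bottleneck.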
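
To see whether the accept queue is actually under pressure, the output of ss -lnt can be filtered. The following is a minimal sketch, not part of the original tuning procedure: the function name listen_queue_pressure and the 80% default threshold are assumptions chosen for illustration. For sockets in LISTEN state, ss reports the current accept‑queue length in Recv-Q and the configured backlog limit in Send-Q.

```shell
# Print listening sockets whose accept queue is at least THRESHOLD percent full.
# Reads `ss -lnt` output on stdin. For LISTEN sockets, column 2 (Recv-Q) is the
# current accept-queue length and column 3 (Send-Q) is the backlog limit.
listen_queue_pressure() {
    threshold="${1:-80}"   # hypothetical default: report queues that are 80% full
    awk -v t="$threshold" '
        $1 == "LISTEN" && $3 > 0 {
            pct = $2 * 100 / $3
            if (pct >= t) printf "%s %d/%d (%.0f%%)\n", $4, $2, $3, pct
        }'
}

# Usage on a live host:
#   ss -lnt | listen_queue_pressure 80
```

A socket that repeatedly shows up here while the overflow counter keeps growing is a likely candidate for a larger backlog.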

2. Memory Dirty Page Tuning: vm.dirty_ratio and vm.dirty_background_ratio

Problem Scenario: System I/O wait spikes intermittently, causing database write jitter.

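# Default configuration (can lead to I/O storms)
vm.dirty_background_ratio = 10  # background writeback starts when dirty pages reach 10%
vm.dirty_ratio = 20            # at 20%, writing processes block and flush synchronously

# Tuned configuration (suited to database servers)
sudo sysctl -w vm.dirty_background_ratio=5
sudo sysctl -w vm.dirty_ratio=10

# Also shorten the dirty-page expiry time
sudo sysctl -w vm.dirty_expire_centisecs=1000  # 10 seconds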

Measured Effect: Average write‑performance test time dropped from 45.3 s to 32.1 s, smoothing out write latency.
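
Before and after changing the ratios, it helps to watch how much dirty data is actually pending. A minimal sketch, assuming the standard Dirty and Writeback fields of /proc/meminfo; the function name dirty_snapshot is made up for this example.

```shell
# Print the Dirty and Writeback counters (in kB) from /proc/meminfo-style
# input on stdin. The exact match on "Writeback:" skips WritebackTmp.
dirty_snapshot() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { printf "%s %s\n", $1, $2 }'
}

# Usage on a live host, sampling once per second:
#   while sleep 1; do dirty_snapshot < /proc/meminfo; done
```

Large, sawtooth‑shaped swings in Dirty that coincide with I/O wait spikes suggest the thresholds are letting too much data accumulate before writeback starts.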

3. TCP Fast Reuse and Recycle: net.ipv4.tcp_tw_reuse

Problem Scenario: The server, acting as a client, creates many connections; TIME_WAIT connections exceed ten thousand.

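# Count connections in TIME_WAIT
ss -ant | grep TIME-WAIT | wc -l
15847

# Tuned configuration
sudo sysctl -w net.ipv4.tcp_tw_reuse=1      # only affects outgoing connections
sudo sysctl -w net.ipv4.tcp_fin_timeout=15  # FIN_WAIT_2 timeout; the 60-second default is too long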

Note: do not use tcp_tw_recycle. It breaks connections from clients behind NAT, and it was removed from the kernel in Linux 4.12.
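
When the server acts as a client, the practical risk of a large TIME_WAIT count is running out of ephemeral ports. The sketch below compares a TIME_WAIT count with the size of the local port range. The function names are hypothetical, and the comparison is a rough upper bound only: ports are consumed per destination address and port, so the real pressure depends on how many distinct backends are contacted.

```shell
# Number of ports in a "low high" range string, as stored in
# /proc/sys/net/ipv4/ip_local_port_range.
port_range_size() {
    set -- $1                      # split "low high" into $1 and $2
    echo $(( $2 - $1 + 1 ))
}

# TIME_WAIT count as an integer percentage of the ephemeral port range.
time_wait_share() {
    size="$(port_range_size "$2")"
    echo $(( $1 * 100 / $size ))
}

# Usage on a live host:
#   tw="$(ss -ant | grep -c TIME-WAIT)"
#   time_wait_share "$tw" "$(cat /proc/sys/net/ipv4/ip_local_port_range)"
```

With the count shown above (15847) and a range of 32768 to 60999, the share works out to 56%.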

4. File Descriptor Limits: fs.file-max and fs.nr_open

# System-wide limits
sudo sysctl -w fs.file-max=2000000
sudo sysctl -w fs.nr_open=2000000

# Per-process limits (/etc/security/limits.conf)
* soft nofile 1000000
* hard nofile 1000000

# Verify (in a new login session)
ulimit -n
1000000
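
To judge whether the system‑wide limit is anywhere near being hit, /proc/sys/fs/file-nr can be read: its three fields are the number of allocated file handles, the number of allocated but unused handles, and the maximum (fs.file-max). A minimal sketch, with fd_usage_percent as an illustrative name:

```shell
# Allocated file handles as a percentage of fs.file-max.
# Reads the three fields of /proc/sys/fs/file-nr on stdin:
#   allocated  unused  maximum
fd_usage_percent() {
    awk '{ printf "%.2f\n", $1 * 100 / $3 }'
}

# Usage on a live host:
#   fd_usage_percent < /proc/sys/fs/file-nr
```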

5. Network Buffer Optimization: net.core.rmem_max and net.core.wmem_max

Applicable Scenario: High Bandwidth‑Delay Product (BDP) networks, such as cross‑region transfers.

# BDP (KB) = bandwidth (Mbps) * RTT (ms) / 8
# Example: 1 Gbps bandwidth, 100 ms RTT → BDP = 1000 * 100 / 8 = 12500 KB (12.5 MB)

# Tuned configuration
sudo sysctl -w net.core.rmem_max=134217728  # 128 MB
sudo sysctl -w net.core.wmem_max=134217728

# TCP buffer autotuning (min, default, max in bytes)
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 134217728"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"
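
The BDP arithmetic in the comment above can be wrapped in a small helper so that buffer maxima are sized for a specific link rather than copied. This is a sketch under the same units as the formula (bandwidth in Mbps, RTT in ms); the name bdp_bytes is an assumption of this example.

```shell
# Bandwidth-delay product in bytes.
#   $1 = bandwidth in Mbit/s, $2 = round-trip time in ms
# Mbit/s * 1,000,000 / 8 gives bytes per second; multiplying by ms / 1000
# gives bytes in flight, which simplifies to mbps * rtt_ms * 125.
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))
}

# Usage:
#   bdp_bytes 1000 100    # 1 Gbps link, 100 ms RTT
```

For the 1 Gbps, 100 ms example this gives 12500000 bytes (12.5 MB), so the 128 MB maximum configured above leaves generous headroom; on memory‑constrained hosts a value closer to the computed BDP may be the safer choice.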

6. SYN Queue Protection: net.ipv4.tcp_max_syn_backlog

# Protect against SYN flood attacks
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=8192
sudo sysctl -w net.ipv4.tcp_syncookies=1
sudo sysctl -w net.ipv4.tcp_synack_retries=2
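
Whether these settings are doing anything can be checked against the kernel's TcpExt counters in /proc/net/netstat, which is laid out as a header line of counter names followed by a line of values. The helper below is a minimal sketch with a made‑up name (tcpext_counter); ListenOverflows, ListenDrops, and SyncookiesSent are real counter names in that file.

```shell
# Print the value of one TcpExt counter from /proc/net/netstat-style input.
# The first "TcpExt:" line holds the counter names, the second the values.
tcpext_counter() {
    awk -v name="$1" '
        $1 == "TcpExt:" && !seen {
            for (i = 2; i <= NF; i++) if ($i == name) col = i
            seen = 1
            next
        }
        $1 == "TcpExt:" && seen { if (col) print $col; exit }'
}

# Usage on a live host:
#   tcpext_counter ListenDrops < /proc/net/netstat
#   tcpext_counter SyncookiesSent < /proc/net/netstat
```

Sampling a counter twice and comparing the values shows whether drops are still occurring after tuning, since the counters are cumulative since boot.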

7. Process Scheduling Optimization: kernel.sched_migration_cost_ns

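# Reduce CPU cache invalidation caused by task migration
# Note: newer kernels (around 5.13 and later) expose this knob under debugfs instead of sysctl
sudo sysctl -w kernel.sched_migration_cost_ns=5000000  # 5 ms

# Combine with CPU/NUMA affinity
numactl --hardware  # show the NUMA topology
taskset -c 0-7 ./your-application  # pin to CPUs 0-7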

8. Memory Swapping Control: vm.swappiness

# For database servers, a value of 1-10 is recommended
sudo sysctl -w vm.swappiness=10

# Disable swap entirely (use with caution)
sudo swapoff -a

9. TCP Keepalive Optimization

# Detect dead connections faster
sudo sysctl -w net.ipv4.tcp_keepalive_time=600   # 10 minutes
sudo sysctl -w net.ipv4.tcp_keepalive_intvl=30   # 30 seconds
sudo sysctl -w net.ipv4.tcp_keepalive_probes=3   # 3 probes
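
The three values combine into a worst‑case detection time: the idle time before the first probe, plus one interval for each unanswered probe. A small sketch of that arithmetic follows (the function name is illustrative). Note that keepalive only applies to sockets on which the application has enabled SO_KEEPALIVE.

```shell
# Seconds until an idle, dead peer is detected:
#   keepalive_time + keepalive_intvl * keepalive_probes
keepalive_detect_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# Usage:
#   keepalive_detect_seconds 600 30 3     # the tuned values above
#   keepalive_detect_seconds 7200 75 9    # the kernel defaults
```

With the tuned values a dead peer is detected after 690 seconds, compared with 7875 seconds (over two hours) under the kernel defaults.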

10. Kernel Semaphore Settings: kernel.sem

# Suited to database servers (e.g. PostgreSQL)
sudo sysctl -w kernel.sem="500 2048000 200 4096"  # format: SEMMSL SEMMNS SEMOPM SEMMNI

Case Study: E‑commerce Flash‑Sale System Optimization

Background

The platform expected 100 k QPS but only achieved 30 k under load, with P99 latency over 2 s.

Optimization Process

#!/bin/bash
# One‑click tuning script tune_kernel.sh
# Run as root. Each run appends to /etc/sysctl.conf, so run it only once.

echo "=== Starting kernel parameter tuning ==="

# 1. Network tuning
cat >> /etc/sysctl.conf <<EOF
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.core.netdev_max_backlog = 8192
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
EOF

# 2. File system tuning
cat >> /etc/sysctl.conf <<EOF
fs.file-max = 2000000
fs.nr_open = 2000000
EOF

# 3. Memory tuning
cat >> /etc/sysctl.conf <<EOF
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.swappiness = 10
EOF

# 4. Apply the configuration
sysctl -p

echo "=== Tuning complete, starting verification ==="
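
The script ends by announcing verification but does not show it. One way to do it, sketched here under assumptions, is to compare every "key = value" line of the config file with the live value reported by sysctl -n. The function name verify_sysctl_file is hypothetical; whitespace is normalised because multi‑value keys such as net.ipv4.tcp_rmem are printed tab‑separated.

```shell
# Compare each "key = value" line in a sysctl-style file with the live value.
# Prints one line per mismatch and returns non-zero if any were found.
verify_sysctl_file() {
    status=0
    while IFS='=' read -r key want; do
        key="$(echo "$key" | tr -d '[:space:]')"
        case "$key" in ''|'#'*) continue ;; esac    # skip blanks and comments
        want="$(echo "$want" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')"
        have="$(sysctl -n "$key" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')"
        if [ "$want" != "$have" ]; then
            echo "MISMATCH $key: want '$want', have '$have'"
            status=1
        fi
    done < "$1"
    return "$status"
}

# Usage on a live host:
#   verify_sysctl_file /etc/sysctl.conf && echo "all parameters applied"
```

A mismatch usually means the key does not exist on this kernel, the value was clamped, or the environment (for example a container or restricted cloud VM) refused the write.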

Optimization Results

Key metrics before and after tuning:

QPS: 30 k → 120 k (300% increase)

P99 latency: 2000 ms → 200 ms (90% ↓)

CPU utilization: 30% → 75% (150% increase)

TIME_WAIT connections: >50 k → <5 k (90% ↓)

Optimization Boundaries and Precautions

When Not to Tune?

Virtualization limits: Cloud VMs may restrict certain parameters.

Insufficient memory: Enlarging buffers can backfire.

Unstable network: Aggressive timeout settings may raise failure rates.

Monitoring and Rollback Mechanism

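# Monitoring script for parameter tuning
# Save the original configuration
sysctl -a > /tmp/sysctl_backup_$(date +%Y%m%d).conf

# Watch key metrics
watch -n 1 '
    echo "=== TCP connection states ==="
    ss -ant | awk "NR > 1 {print \$1}" | sort | uniq -c
    echo "=== Memory usage ==="
    free -h
    echo "=== Load Average ==="
    uptime
'

# Roll back on anomalies (example; read-only keys in the backup will report errors)
# sysctl -p /tmp/sysctl_backup_20240315.conf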
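
Because a full sysctl -a dump contains many keys that were never touched, it can be easier to roll back only what changed. The sketch below lists keys whose values differ between two snapshots taken with sysctl -a. The function name sysctl_changes and the second snapshot file are assumptions for illustration.

```shell
# List keys whose value differs between two `sysctl -a` snapshots.
#   $1 = snapshot taken before tuning, $2 = snapshot taken after
# Output format: key: old -> new
sysctl_changes() {
    awk -F ' = ' '
        NR == FNR { old[$1] = $2; next }          # first file: remember old values
        ($1 in old) && old[$1] != $2 {
            printf "%s: %s -> %s\n", $1, old[$1], $2
        }
    ' "$1" "$2"
}

# Usage on a live host (the "after" file name is hypothetical):
#   sysctl -a > /tmp/sysctl_after.conf
#   sysctl_changes /tmp/sysctl_backup_20240315.conf /tmp/sysctl_after.conf
```

Each reported key can then be restored individually with sysctl -w, using the old value shown on the left of the arrow.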

Core Takeaways

Immediate performance gains: Proper tuning can deliver 30%‑300% improvements.

Prevention beats cure: Early tuning avoids production incidents.

One‑time tuning, long‑term benefit: Establishes a reusable configuration template.

Understand the why, not just the what: Grasping underlying principles outweighs memorizing values.

Monitoring‑driven continuous optimization: Build a quantitative evaluation system for tuning effects.

Remember: there is no silver bullet, only trade‑offs. Adjust each parameter based on your actual business scenario; never copy blindly.
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
