Boost Linux Server Throughput: Practical TCP/IP Stack Tuning Guide
This article presents a comprehensive, step‑by‑step guide for Linux network performance optimization, covering real‑world issues, sysctl parameter tweaks for TCP and IP stacks, queue and interrupt tuning, high‑concurrency strategies, monitoring scripts, a detailed e‑commerce case study, best‑practice recommendations, and safety precautions.
Introduction
In high‑concurrency, high‑traffic Internet environments, network performance often becomes the system bottleneck. The author, an experienced operations engineer, shares a complete Linux network performance tuning solution to eliminate these bottlenecks.
Common Performance Problems
High‑concurrency connections: during e‑commerce promotions, connection counts surge and many sockets linger in TIME_WAIT.
Large file transfers: backup tasks suffer from insufficient throughput.
Micro‑service calls: frequent inter‑service requests cause latency jitter and unstable response times.
These issues usually stem from default Linux kernel TCP/IP parameters that cannot satisfy high‑performance demands.
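Before changing anything, it helps to confirm which of these symptoms a host is actually showing. A quick triage with iproute2's `ss` and `nstat` (the counter names are the kernel's standard netstat/SNMP ones):

```shell
#!/bin/bash
# Quick triage: match the symptom before picking a tuning below.

# Connection churn: how many sockets are sitting in TIME_WAIT?
ss -s | grep -i tcp

# Full accept queues (listen backlog too small under load)
nstat -az TcpExtListenOverflows TcpExtListenDrops

# Retransmissions point at throughput / congestion problems
nstat -az TcpRetransSegs
```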
TCP Stack Core Parameter Optimization
1. TCP Connection Management
# /etc/sysctl.conf configuration file
# TCP connection queue length optimization
net.core.somaxconn = 65535 # increase listen queue length
net.core.netdev_max_backlog = 30000 # NIC receive queue length
net.ipv4.tcp_max_syn_backlog = 65535 # SYN queue length
# TIME_WAIT optimization
net.ipv4.tcp_tw_reuse = 1 # reuse TIME_WAIT sockets for new outbound connections
net.ipv4.tcp_fin_timeout = 30 # reduce FIN_WAIT_2 time
net.ipv4.tcp_max_tw_buckets = 10000 # limit TIME_WAIT count
# Keepalive settings
net.ipv4.tcp_keepalive_time = 600 # start keepalive probes after 10 min
net.ipv4.tcp_keepalive_probes = 3 # number of keepalive probes
net.ipv4.tcp_keepalive_intvl = 15 # interval between probes

2. TCP Buffer Optimization
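The 16 MB (16777216-byte) ceilings used in this section are not arbitrary: the useful upper bound for a socket buffer is the path's bandwidth‑delay product (BDP), the amount of data in flight on a full pipe. A rough calculator (the helper name `bdp_bytes` is ours, not a standard tool):

```shell
#!/bin/bash
# BDP in bytes = bandwidth (bits/s) * RTT (s) / 8
bdp_bytes() {
    # $1 = bandwidth in Mbit/s, $2 = RTT in ms
    awk -v bw="$1" -v rtt="$2" 'BEGIN { printf "%d\n", bw * 1000000 / 8 * rtt / 1000 }'
}

bdp_bytes 1000 100   # 1 Gbit/s at 100 ms RTT -> 12500000 bytes (~12 MB)
```

A 16 MB maximum therefore comfortably covers a 1 Gbit/s path at 100 ms RTT; paths with a larger BDP need proportionally larger rmem_max/wmem_max.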
# TCP receive/send buffer optimization
net.core.rmem_default = 262144 # default receive buffer
net.core.rmem_max = 16777216 # max receive buffer
net.core.wmem_default = 262144 # default send buffer
net.core.wmem_max = 16777216 # max send buffer
# Automatic socket buffer scaling
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_mem = 94500000 915000000 927000000 # caution: units are pages, not bytes - scale to installed RAM
# Enable TCP window scaling
net.ipv4.tcp_window_scaling = 1

3. TCP Congestion Control Optimization
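Before selecting an algorithm, check what the running kernel actually offers; on many distributions BBR ships as a loadable module:

```shell
#!/bin/bash
# List congestion control algorithms compiled in or currently loaded
cat /proc/sys/net/ipv4/tcp_available_congestion_control

# Load BBR if it is built as a module (no-op if already built in)
modprobe tcp_bbr 2>/dev/null

# On kernels before 4.13, BBR relies on the fq qdisc for pacing:
# net.core.default_qdisc = fq
```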
# Choose congestion control algorithm
net.ipv4.tcp_congestion_control = bbr # recommended BBR algorithm (requires kernel 4.9+)
# Alternatives: cubic, reno, bic
# Fast retransmit and recovery
net.ipv4.tcp_frto = 2 # F‑RTO detects false timeouts
net.ipv4.tcp_dsack = 1 # enable DSACK
net.ipv4.tcp_fack = 1 # enable FACK (a no-op on kernels 4.15 and later)
# Disable slow start after idle
net.ipv4.tcp_slow_start_after_idle = 0

IP Stack Parameter Optimization
1. IP Layer Processing
# IP forwarding and routing
net.ipv4.ip_forward = 0 # disable forwarding on non‑router hosts
net.ipv4.conf.default.rp_filter = 1 # enable reverse path filtering
net.ipv4.conf.all.rp_filter = 1
# IP fragmentation handling
net.ipv4.ipfrag_high_thresh = 262144 # high threshold (bytes)
net.ipv4.ipfrag_low_thresh = 196608 # low threshold (bytes)
net.ipv4.ipfrag_time = 30 # reassembly timeout (seconds)
# ICMP optimizations
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1

2. Port Range Optimization
# Expand local port range
net.ipv4.ip_local_port_range = 1024 65535 # widen ephemeral range; protect fixed service ports via ip_local_reserved_ports
# UDP port optimizations
net.ipv4.udp_mem = 94500000 915000000 927000000 # units are pages, not bytes - scale to installed RAM
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192

Network Queue and Interrupt Optimization
1. Device Queue Optimization
# Increase network device processing queues
echo 'echo 4096 > /proc/sys/net/core/netdev_budget' >> /etc/rc.local # max packets per softirq poll
echo 'echo 8000 > /proc/sys/net/core/netdev_budget_usecs' >> /etc/rc.local # poll time budget in microseconds (default 2000)
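Whether the budget actually needs raising can be read from /proc/net/softnet_stat: the third column counts "time squeeze" events, i.e. polls cut short because the budget ran out (one hex row per CPU):

```shell
#!/bin/bash
# Sum time-squeeze events across all CPUs; a steadily growing total
# means netdev_budget / netdev_budget_usecs are too small.
total=0
while read -r _ _ squeezed _; do
    total=$(( total + 0x$squeezed ))
done < /proc/net/softnet_stat
echo "time-squeeze events: $total"
```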
# RPS/RFS for multi‑core load balancing
echo 'f' > /sys/class/net/eth0/queues/rx-0/rps_cpus # adjust according to CPU count

2. Interrupt Balancing Script
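Both the `rps_cpus` file and the per-IRQ `smp_affinity` files take hexadecimal CPU bitmasks, where bit N selects CPU N ('f' = binary 1111 = CPUs 0–3). A small helper to build a mask covering the first K CPUs (the function name is ours):

```shell
#!/bin/bash
# Mask with the low K bits set: CPUs 0 .. K-1
first_cpus_mask() {
    printf '%x\n' $(( (1 << $1) - 1 ))
}

first_cpus_mask 4    # -> f   (CPUs 0-3)
first_cpus_mask 8    # -> ff  (CPUs 0-7)
# e.g.: first_cpus_mask "$(nproc)" > /sys/class/net/eth0/queues/rx-0/rps_cpus
```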
#!/bin/bash
# network_irq_balance.sh – network interrupt balancing
# Get IRQ numbers for eth0
IRQ_LIST=$(grep eth0 /proc/interrupts | awk -F: '{print $1}' | xargs)
CPU_COUNT=$(nproc)
i=0
for irq in $IRQ_LIST; do
cpu_mask=$((1 << (i % CPU_COUNT)))
printf "%x" $cpu_mask > /proc/irq/$irq/smp_affinity
echo "IRQ $irq -> CPU $((i % CPU_COUNT))"
((i++))
done

High‑Concurrency Scenario Optimizations
1. Large Connection Count
# File descriptor limits
echo '* soft nofile 1048576' >> /etc/security/limits.conf
echo '* hard nofile 1048576' >> /etc/security/limits.conf
# Process limits
echo '* soft nproc 1048576' >> /etc/security/limits.conf
echo '* hard nproc 1048576' >> /etc/security/limits.conf
# systemd service limits
echo 'DefaultLimitNOFILE=1048576' >> /etc/systemd/system.conf
echo 'DefaultLimitNPROC=1048576' >> /etc/systemd/system.conf

2. Memory Management
# Virtual memory settings
vm.swappiness = 10 # reduce swap usage
vm.dirty_ratio = 15 # dirty page write‑back ratio
vm.dirty_background_ratio = 5 # background write‑back ratio
vm.overcommit_memory = 1 # allow memory overcommit

Performance Monitoring and Validation
1. Key Metric Monitoring Script
#!/bin/bash
# network_monitor.sh – network performance monitoring
echo "=== Network Connection Summary ==="
ss -s
echo -e "\n=== TCP Connection State Distribution ==="
ss -tan | awk 'NR>1{state[$1]++} END{for(i in state) print i, state[i]}'
echo -e "\n=== Network Throughput ==="
sar -n DEV 1 1 | grep -E "eth0|Average"
echo -e "\n=== Memory Usage ==="
free -h
echo -e "\n=== System Load ==="
uptime

2. Stress Test Commands
# HTTP load test with wrk
wrk -t12 -c400 -d30s --latency http://your-server-ip/
# Bandwidth test with iperf3
iperf3 -s # server side
iperf3 -c server-ip -t 60 -P 10 # client side
# TCP connection count test
ab -n 100000 -c 1000 http://your-server-ip/

Real‑World Case: E‑Commerce System Optimization
Before/after performance numbers demonstrate the impact of the tuning:
QPS increased from 15,000 to 45,000 (200% improvement).
Average latency dropped from 120 ms to 35 ms (71% reduction).
99th‑percentile latency fell from 800 ms to 150 ms (81% reduction).
Concurrent connections grew from 10,000 to 50,000 (400% increase).
CPU usage fell from 85% to 45% (47% relative reduction).
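The percentages quoted above follow the usual relative-change formula, (after − before) / before × 100; a one-liner to reproduce them (the helper name is ours):

```shell
#!/bin/bash
pct_change() {
    # $1 = before, $2 = after; positive = increase, negative = decrease
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (b - a) / a * 100 }'
}

pct_change 15000 45000   # QPS          -> 200
pct_change 120 35        # avg latency  -> -71
pct_change 10000 50000   # connections  -> 400
```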
Key Optimization Points
BBR congestion control: enabling BBR raised throughput by roughly 40%.
TCP buffer tuning: significantly reduced latency jitter.
Connection reuse optimization: TIME_WAIT sockets decreased by about 90%.
Interrupt balancing: improved multi‑core CPU utilization.
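Whether a change like the BBR rollout actually took effect can be verified per connection; `ss -ti` prints the congestion-control algorithm each established socket is using:

```shell
#!/bin/bash
# Global default algorithm
cat /proc/sys/net/ipv4/tcp_congestion_control

# Count established connections currently running bbr
ss -ti state established | grep -cw bbr
```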
Best‑Practice Recommendations
1. Scenario‑Specific Tuning
High‑concurrency web servers
net.ipv4.tcp_tw_reuse = 1
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535

Large‑file transfer servers
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_window_scaling = 1

Database servers
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_retries2 = 5

2. Production Deployment Workflow
Test environment verification.
Canary rollout on a subset of machines.
Continuous monitoring of key metrics.
Full deployment after validation.
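The workflow above can be sketched as a small push script; the host names and config path are placeholders, and the actual copy/apply commands are left commented so nothing fires by accident:

```shell
#!/bin/bash
# Canary rollout sketch: apply tuned sysctls to a small host group first.
CONF=/etc/sysctl.d/99-network-tuning.conf   # placeholder path
CANARY_HOSTS="web01 web02"                  # placeholder host names

for h in $CANARY_HOSTS; do
    echo "canary: deploying $CONF to $h"
    # scp "$CONF" "$h:$CONF"
    # ssh "$h" sysctl --system   # reloads every file under /etc/sysctl.d
done
```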
3. Persist Configuration
# Apply all sysctl settings
sysctl -p
# Verify critical parameters
sysctl net.ipv4.tcp_congestion_control
sysctl net.core.somaxconn
# Ensure settings survive reboot (systemd already applies /etc/sysctl.conf
# at boot; the rc.local entry below is only a fallback)
echo 'sysctl -p' >> /etc/rc.local
chmod +x /etc/rc.local

Precautions and Common Pitfalls
1. Parameter Tuning Misconceptions
Blindly enlarging buffers: may exhaust system memory.
Over‑optimizing TIME_WAIT: can lead to port exhaustion.
Ignoring business characteristics: different workloads require tailored settings.
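The first misconception is easy to quantify: tcp_mem and udp_mem are measured in pages (normally 4 KiB), not bytes. Converting the tcp_mem ceiling used earlier in this article shows how easily copied-and-pasted values overshoot real hardware (the helper name is ours):

```shell
#!/bin/bash
pages_to_gib() {
    # $1 = page count; assumes 4 KiB pages
    awk -v p="$1" 'BEGIN { printf "%.1f\n", p * 4096 / 1024^3 }'
}

pages_to_gib 927000000   # the tcp_mem max above -> 3536.2 GiB(!)
```

Unless the host really has terabytes of RAM, scale tcp_mem down; the kernel's boot-time default, derived from installed memory, is usually a sane starting point.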
2. Rollback Plan
# Backup current configuration
cp /etc/sysctl.conf /etc/sysctl.conf.backup.$(date +%Y%m%d)
# Quick rollback script
cat > /root/network_rollback.sh <<'EOF'
#!/bin/bash
cp "$(ls -t /etc/sysctl.conf.backup.* | head -n 1)" /etc/sysctl.conf
sysctl -p
echo "Network config rollback completed!"
EOF
chmod +x /root/network_rollback.sh

Conclusion
Understand business characteristics: select parameters that match workload needs.
Iterative tuning: modify a few settings at a time to simplify troubleshooting.
Continuous monitoring: maintain a robust observability stack to catch regressions early.
Test verification: run performance benchmarks after each change to confirm gains.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, and tools, plus Git, databases, and Raspberry Pi.
