
Optimizing Nginx Load Balancing for Ultra‑High Concurrency: Kernel, NIC, and Parameter Tuning

This article details how to diagnose and resolve Nginx load‑balancing performance problems under extreme concurrency: analyzing NIC packet loss, kernel SYN drops, and backlog limits, then applying systematic kernel, network‑card, and Nginx configuration changes that raised QPS from over 20,000 to more than 40,000.

58 Tech

Background – An internal API service for 58 Group uses Nginx as a load balancer. During peak traffic the PHP curl client reports frequent “Connection timed out after xx” errors, while the Nginx server shows low CPU load, indicating network‑level timeouts rather than server overload.

Problem Analysis – The timeout originates from TCP three‑way‑handshake failures caused by packet loss. The article focuses on two main loss sources: (A) NIC overruns (including SYN and data packets) and (B) kernel SYN packet drops.

Solution 1: Fix NIC Overruns – Use ifconfig to inspect NIC statistics (overruns, dropped, errors). By default, all queue interrupts may be serviced by core 0; bind each NIC queue interrupt to a dedicated CPU core so no single core becomes an interrupt bottleneck. Example commands to view and bind the interrupts:

$ cat /proc/interrupts | grep eth | awk '{print $1,$NF}'
77: eth0-0
78: eth0-1
79: eth0-2
80: eth0-3
81: eth0-4
82: eth0-5
83: eth0-6
84: eth0-7

$ echo 0 > /proc/irq/77/smp_affinity_list
$ echo 1 > /proc/irq/78/smp_affinity_list
...
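The per-queue binding above can be scripted. Below is a minimal sketch, under my own assumptions: the helper name plan_irq_affinity and the round-robin policy are illustrative, not from the article. It reads "irq: queue" pairs in the format printed above and emits one smp_affinity_list assignment per queue, cycling through the given number of cores.

```shell
# Sketch: turn "irq: queue" pairs (as printed by the awk command above)
# into round-robin smp_affinity_list assignments. plan_irq_affinity is a
# hypothetical helper; review its output, then pipe it to sh as root.
plan_irq_affinity() {
    ncpu=$1     # how many CPU cores to spread the queues across
    cpu=0
    while read -r line; do
        irq=${line%%:*}                 # "77: eth0-0" -> "77"
        echo "echo $cpu > /proc/irq/$irq/smp_affinity_list"
        cpu=$(( (cpu + 1) % ncpu ))
    done
}

# Dry run with two of the queues shown above:
printf '77: eth0-0\n78: eth0-1\n' | plan_irq_affinity 8
# prints: echo 0 > /proc/irq/77/smp_affinity_list
#         echo 1 > /proc/irq/78/smp_affinity_list
```

On a live machine the same function can consume `grep eth /proc/interrupts | awk '{print $1,$NF}'` directly.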

Solution 2: Resolve Kernel SYN Drops

2.1 – When TcpExtListenOverflows increases, the accept queue is full: completed connections are dropped because the application is not accepting them fast enough. Increase net.core.somaxconn and the Nginx listen backlog (e.g., set both to 2048), and raise net.ipv4.tcp_max_syn_backlog to 4096.

# linux kernel
sysctl -w net.core.somaxconn=2048
sysctl -w net.ipv4.tcp_max_syn_backlog=4096

# nginx
server {
    listen 80 backlog=2048;
    ...
}
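One way to confirm the new backlog is in effect: for LISTEN sockets, `ss -lnt` reports the current accept-queue length in the Recv-Q column and the configured backlog limit in Send-Q. The sketch below is illustrative (the helper name check_accept_queue and the 80% alert threshold are my choices, not from the article); it flags listeners whose queue is close to its limit.

```shell
# Sketch: flag LISTEN sockets whose accept queue (Recv-Q, field 2) has
# reached 80%+ of its backlog limit (Send-Q, field 3) in `ss -lnt` output.
check_accept_queue() {
    awk 'NR > 1 && $3 > 0 && $2 >= 0.8 * $3 { print $4, $2 "/" $3 }'
}

# On a live box: ss -lnt | check_accept_queue
# Example with captured output:
printf 'State Recv-Q Send-Q Local:Port Peer:Port\nLISTEN 2000 2048 0.0.0.0:80 0.0.0.0:*\nLISTEN 0 128 127.0.0.1:22 0.0.0.0:*\n' | check_accept_queue
# prints: 0.0.0.0:80 2000/2048
```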

2.2 – When TcpExtListenDrops rises while TcpExtListenOverflows stays constant, the SYN loss comes from memory‑allocation failures in the kernel's slab/buddy allocator rather than queue overflow. Raise the free‑memory watermarks so allocations under pressure still succeed:

# enable zone reclaim
sysctl -w vm.zone_reclaim_mode=1
# raise free‑memory thresholds
sysctl -w vm.min_free_kbytes=512000
# note: extra_free_kbytes is a non‑mainline sysctl, available only on some kernels
sysctl -w vm.extra_free_kbytes=1048576
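Telling case 2.1 apart from case 2.2 requires watching both counters. They live in /proc/net/netstat, where each TcpExt header line of names is paired with a line of values; the sketch below extracts ListenOverflows and ListenDrops from that format (tcpext_listen_counters is an illustrative helper name, not from the article).

```shell
# Sketch: extract ListenOverflows / ListenDrops from /proc/net/netstat,
# whose TcpExt section is a header line of names followed by a value line.
tcpext_listen_counters() {
    awk '/^TcpExt:/ {
        if (!n) { for (i = 2; i <= NF; i++) name[i] = $i; n = 1 }    # header row
        else    { for (i = 2; i <= NF; i++)                          # value row
                      if (name[i] ~ /^Listen(Overflows|Drops)$/) print name[i], $i }
    }'
}

# On a live box: tcpext_listen_counters < /proc/net/netstat
# Example with a captured (abbreviated) snapshot:
printf 'TcpExt: SyncookiesSent ListenOverflows ListenDrops\nTcpExt: 0 12 15\n' | tcpext_listen_counters
# prints: ListenOverflows 12
#         ListenDrops 15
```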

2.3 – When TcpExtTCPBacklogDrop grows, multiple CPU cores are contending for the lock on the same listen socket, so incoming SYN packets are queued to the socket's backlog queue and dropped when it overflows. Enable reuseport (requires kernel support, Linux 3.9+) so each worker gets its own listening socket, eliminating the lock contention; enlarging rcvbuf/sndbuf gives each socket more headroom.

# nginx
server {
    listen 80 backlog=2048 rcvbuf=131072 sndbuf=131072 reuseport;
    ...
}
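With reuseport enabled, each Nginx worker owns its own listening socket, so a quick sanity check is counting the LISTEN entries for the port: the count should match worker_processes. The helper below (count_listeners, an illustrative name) does this over `ss -lnt`-style output.

```shell
# Sketch: count LISTEN sockets bound to a given port in `ss -lnt`-style
# output; with reuseport there should be one per Nginx worker.
count_listeners() {
    port=$1
    grep LISTEN | grep -c ":$port "
}

# On a live box: ss -lnt | count_listeners 80
# Example with captured output (two workers on port 80):
printf 'LISTEN 0 2048 0.0.0.0:80 0.0.0.0:*\nLISTEN 0 2048 0.0.0.0:80 0.0.0.0:*\nLISTEN 0 128 0.0.0.0:8080 0.0.0.0:*\n' | count_listeners 80
# prints: 2
```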

Optimization Summary

Bind NIC interrupts to separate CPU cores to eliminate overruns.

Increase backlog and net.core.somaxconn to handle more pending connections.

Raise vm.extra_free_kbytes and vm.min_free_kbytes to avoid buddy‑allocation failures.

Enable reuseport and enlarge rcvbuf / sndbuf to mitigate TcpExtTCPBacklogDrop.

After applying these kernel, NIC, and Nginx parameter adjustments, the observed QPS increased from over 20,000 to more than 40,000, demonstrating the effectiveness of systematic performance tuning.

Written by 58 Tech, the official tech channel of 58, a platform for tech innovation, sharing, and communication.
