Mastering Nginx for High‑Traffic: Proven Tuning Steps for 10k+ QPS
This guide explains why the default Nginx configuration becomes a bottleneck under thousands of requests per second and provides a prioritized, production‑tested checklist of kernel, process, buffer, upstream, HTTP, and HA settings to dramatically improve throughput and latency.
Background and Problem
Nginx is a high‑performance HTTP server and reverse proxy, but its default settings are tuned for low‑traffic workloads. When QPS reaches several thousand or more, CPU usage stays low while requests queue, response times explode, and connections are refused.
1. Parameter Adjustment Overview and Priority
Adjustments follow a clear priority order: operating‑system kernel limits → Nginx worker process model → per‑connection memory buffers → upstream keep‑alive settings → HTTP‑level optimizations such as gzip. Change one layer, observe the effect, then move to the next.
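"Observe the effect" is only possible against a baseline: snapshot the key limit of each layer before changing anything, then re-run the same commands after each step. A minimal sketch using standard Linux procfs paths:

```shell
# Snapshot the pre-tuning baseline of each layer's key limit
echo "open-file soft limit: $(ulimit -n)"
echo "system-wide FD max:   $(cat /proc/sys/fs/file-max)"
echo "listen backlog cap:   $(cat /proc/sys/net/core/somaxconn)"
echo "CPU cores:            $(nproc)"
```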
2. Kernel Parameter Tuning
2.1 File‑Descriptor Limits
Each Nginx connection consumes a file descriptor (FD). The default 1024 is far too low for high concurrency.
# View system‑wide max FD
cat /proc/sys/fs/file-max
# View current FD usage
cat /proc/sys/fs/file-nr
# View Nginx soft limit
cat /proc/$(cat /var/run/nginx.pid)/limits | grep "Max open files"
Increase the limits:
# /etc/sysctl.conf
fs.file-max = 1000000
# Apply
sysctl -p
# /etc/security/limits.conf
nginx soft nofile 100000
nginx hard nofile 100000
# Nginx config
worker_rlimit_nofile 100000;
2.2 Network Kernel Parameters
# /etc/sysctl.conf additions
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.ip_local_port_range = 1024 65535
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
Apply with sysctl -p. The most important values are net.core.somaxconn (the cap on every socket's listen backlog) and net.ipv4.tcp_max_syn_backlog (the SYN queue).
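Two follow-ups are worth doing after sysctl -p: read the values back from procfs, and remember that on Linux Nginx requests a backlog of only 511 unless the listen directive says otherwise, so a raised somaxconn only helps a socket if you also add something like listen 80 backlog=65535; (port and value are examples):

```shell
# Read the applied values back from procfs
cat /proc/sys/net/core/somaxconn
cat /proc/sys/net/ipv4/tcp_max_syn_backlog
# With nginx running, the Send-Q column of `ss -lnt` shows the backlog
# actually granted to each listening socket.
```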
3. Nginx Process Model
3.1 Worker Processes
Set the number of workers to match CPU cores:
worker_processes auto;
‘auto’ lets Nginx detect the core count. Over-provisioning adds context-switch overhead.
3.2 Worker Connections
events {
worker_connections 65535;
use epoll;
multi_accept on;
}
use epoll is the most efficient I/O event model on Linux. multi_accept on allows a worker to accept all pending connections at once. The theoretical maximum connections equal worker_connections * worker_processes; note that worker_connections is enforced per worker and must stay below worker_rlimit_nofile, and that a proxied request can hold two descriptors (one to the client, one to the upstream).
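The arithmetic is worth sanity-checking against your own values; a quick sketch (the numbers mirror the configuration above):

```shell
# Connection ceiling implied by the events block vs. the per-worker FD budget
workers=$(nproc)                # worker_processes auto: one worker per core
worker_connections=65535        # per-worker limit from the events block
worker_rlimit_nofile=100000     # per-worker FD budget
total=$((workers * worker_connections))
echo "theoretical max connections: $total"
# A proxied request can hold two FDs (client + upstream); a negative number
# here means the events block is over-provisioned relative to the FD budget.
echo "per-worker FD headroom: $((worker_rlimit_nofile - 2 * worker_connections))"
```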
3.3 CPU Affinity
worker_cpu_affinity auto;
‘auto’ binds each worker to a distinct core; manual masks can be used on NUMA servers.
3.4 Worker Priority
worker_priority -10;
Negative nice values give Nginx higher CPU priority when other services share the host.
4. Buffer Configuration
4.1 Client Buffers
http {
client_header_buffer_size 4k;
large_client_header_buffers 4 32k;
client_body_buffer_size 128k;
client_max_body_size 100m;
}
Adjust sizes based on typical header size, cookie volume, and upload requirements.
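These directives can also be scoped per location, which keeps the global limit strict while allowing large bodies only where they are expected; a sketch (the /upload path is an example):

```nginx
server {
    client_max_body_size 10m;           # strict default for ordinary requests
    location /upload {
        client_max_body_size 100m;      # relaxed only for the upload endpoint
        client_body_buffer_size 256k;   # buffer more in memory before spilling to disk
    }
}
```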
4.2 Upstream Buffers
upstream backend {
server 127.0.0.1:8080;
keepalive 32;
}
proxy_buffer_size 128k;
proxy_buffers 4 128k;
proxy_buffering on;
proxy_busy_buffers_size 256k;
Enabling proxy_buffering lets Nginx absorb the upstream response at full speed and release the backend connection early, instead of holding it open for the duration of a slow client's download.
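When a response does not fit into proxy_buffers, Nginx spills the remainder to a temporary file; on busy proxies it is worth capping that spill (the values below are examples, not recommendations):

```nginx
proxy_max_temp_file_size 512m;      # cap disk spill per response; 0 disables temp files
proxy_temp_file_write_size 256k;    # chunk size for writes to the temp file
```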
4.3 FastCGI Buffers (PHP‑FPM)
fastcgi_buffer_size 64k;
fastcgi_buffers 4 64k;
fastcgi_busy_buffers_size 128k;
fastcgi_temp_file_write_size 256k;
fastcgi_connect_timeout 60s;
fastcgi_send_timeout 60s;
fastcgi_read_timeout 60s;
Increase timeouts if PHP scripts run long.
5. Timeout Settings
http {
keepalive_timeout 65;
client_header_timeout 15s;
client_body_timeout 15s;
send_timeout 30s;
}
Balance keep-alive duration against resource consumption; short client timeouts protect against slowloris attacks.
6. Compression and Transfer Optimizations
6.1 Gzip
http {
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_proxied any;
gzip_comp_level 4;
gzip_types text/plain text/css text/xml application/json application/javascript application/xml application/xml+rss;
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_disable "MSIE [1-6]\.";
}
Level 4 offers a good CPU-to-compression ratio.
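The level trade-off can be illustrated with the gzip CLI, which uses the same zlib levels as Nginx; the repetitive sample text below is synthetic:

```shell
# Compare compressed size at levels 1, 4, and 9 on ~80 kB of repetitive text
sample=$(for i in $(seq 1 2000); do echo "GET /api/users HTTP/1.1 application/json"; done)
for level in 1 4 9; do
  size=$(printf '%s' "$sample" | gzip -"$level" | wc -c)
  echo "level $level -> $size bytes"
done
```

On typical JSON/HTML payloads the jump from level 1 to 4 buys most of the ratio; level 9 spends noticeably more CPU for a few percent more.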
6.2 Static Resource Caching
location ~* \.(jpg|jpeg|png|gif|ico|css|js|woff|woff2|ttf|eot)$ {
expires 30d;
add_header Cache-Control "public, no-transform";
access_log off;
}
6.3 Sendfile Zero-Copy
http {
sendfile on;
tcp_nopush on;
tcp_nodelay on;
}
Reduces kernel-user-space copies for static files.
7. Connection Handling
7.1 HTTP/2
http {
ssl_protocols TLSv1.2 TLSv1.3;
# ssl_ciphers controls TLS 1.2 suites; TLS 1.3 suites are enabled by default and are not set via this directive
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers on;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 1d;
}
server {
listen 443 ssl http2;
server_name example.com;
}
7.2 Upstream Keep-Alive
upstream backend {
server 127.0.0.1:8080;
keepalive 64;
keepalive_requests 10000;
keepalive_timeout 60s;
}
location / {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
7.3 Rate Limiting and Connection Limits
limit_req_zone $binary_remote_addr zone=req_limit:10m rate=1000r/s;
limit_conn_zone $binary_remote_addr zone=conn_limit:10m;
server {
location / {
limit_req zone=req_limit burst=2000 nodelay;
limit_conn conn_limit 100;
proxy_pass http://backend;
}
}
8. Caching
8.1 Proxy Cache
proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=api_cache:100m max_size=10g inactive=60m use_temp_path=off;
server {
location /api/ {
proxy_pass http://backend;
proxy_cache api_cache;
proxy_cache_valid 200 60s;
proxy_cache_valid 404 10s;
proxy_cache_use_stale error timeout updating;
add_header X-Cache-Status $upstream_cache_status;
}
}
8.2 Cache Purge
# Requires ngx_cache_purge module
location ~ /purge(/.*) {
proxy_cache_purge api_cache $1;
}
# Example purge call
# curl -X PURGE http://example.com/purge/api/users/123
9. Load-Balancing Strategies
9.1 Round‑Robin & Weighted
upstream backend {
server 127.0.0.1:8080 weight=5;
server 127.0.0.1:8081 weight=3;
server 127.0.0.1:8082 weight=2;
}
9.2 Least Connections
upstream backend {
least_conn;
server 127.0.0.1:8080;
server 127.0.0.1:8081;
}
9.3 IP Hash
upstream backend {
ip_hash;
server 127.0.0.1:8080;
server 127.0.0.1:8081;
server 127.0.0.1:8082;
}
10. High Availability & Disaster Recovery
10.1 Passive Health Checks
upstream backend {
server 127.0.0.1:8080 max_fails=3 fail_timeout=30s;
server 127.0.0.1:8081 max_fails=3 fail_timeout=30s;
server 127.0.0.1:8082 max_fails=3 fail_timeout=30s;
}
10.2 Backup & Down
upstream backend {
server 127.0.0.1:8080 weight=5;
server 127.0.0.1:8081 weight=3;
server 127.0.0.1:8082 backup;
}
10.3 Active-Passive VIP with Keepalived
# Keepalived configuration (simplified). lb_kind must be NAT, DR, or TUN; DR is shown here.
# A separate vrrp_instance block (not shown) floats the VIP 192.168.1.100 between the active and standby nodes.
virtual_server 192.168.1.100 443 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    protocol TCP
    real_server 192.168.1.101 443 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.1.102 443 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
11. Production Verification Checklist
Validate Nginx syntax: nginx -t
Confirm the FD limit: ulimit -n shows ≥ 100000
Check worker count: ps aux | grep nginx shows multiple workers
Verify listening ports: ss -tlnp | grep nginx
Ensure the upstream connection pool is active (netstat/ss)
Test rate limiting with ab or wrk
Confirm cache hits via the X-Cache-Status header
Compare pre‑ and post‑tuning QPS and P99 latency
12. Common Pitfalls
12.1 Worker Processes
Setting more workers than CPU cores adds context‑switch overhead without performance gain.
12.2 Worker Connections
Do not set worker_connections equal to the system FD limit; keep it around 60‑70 % of worker_rlimit_nofile to leave room for logs, upstream sockets, and cache files.
12.3 Gzip Level
Level 4–5 balances compression ratio and CPU cost; higher levels give diminishing returns.
12.4 Proxy Buffering
For real‑time streams (SSE, long‑polling) disable proxy_buffering to avoid latency.
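A sketch of such a location (the /events path and the timeout value are examples):

```nginx
location /events {
    proxy_pass http://backend;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_buffering off;        # forward bytes to the client as they arrive
    proxy_read_timeout 1h;      # long-lived streams need a long read timeout
    gzip off;                   # compression also buffers output; disable for SSE
}
```

Alternatively, an upstream can send the X-Accel-Buffering: no response header to disable buffering for a single response.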
13. Conclusion
High‑traffic Nginx optimization follows a layered approach: raise kernel limits first, then tune the worker model, buffers, protocol features, and finally caching/compression. Regularly review metrics (QPS, P99 latency, error rate) because traffic patterns and upstream performance evolve, making continuous re‑tuning essential.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
MaGe Linux Operations