Unlock Nginx Performance: Proven Strategies to Boost Throughput and Concurrency
This article presents a comprehensive guide to Nginx performance tuning, covering methodology, request lifecycle, application‑level tweaks, and system‑level optimizations to achieve higher concurrency, lower latency, and better resource utilization.
Speaker: Tao Hui, former Huawei and Tencent data‑infrastructure engineer, author of *Deep Understanding of Nginx: Module Development and Architecture Analysis*, now CTO and co‑founder at Zhilianda, focusing on applying internet technology to transform the construction industry.
Today's talk focuses on systematic thinking about Nginx performance to help engineers improve efficiency.
1. Optimization Methodology
The presentation addresses two key problems:
Maintaining a high number of concurrent connections while using memory efficiently.
Ensuring high throughput under high concurrency.
Implementation focuses on three layers: application, framework, and kernel.
Hardware considerations include NIC speed (10G/40G), storage type (SSD vs HDD), and especially CPU performance.
Techniques such as reuseport, fastsocket, and coroutine‑based OpenResty reduce context‑switch costs and improve CPU utilization.
2. The Life Cycle of a Request
Understanding the request flow clarifies where to optimize.
Nginx modules form a processing pipeline; each request passes through a sequence of modules organized by the core (event, HTTP, stream, and mail modules).
2.1 Request Arrival
When a new connection is accepted, the kernel places it in a queue, epoll waits for events, and Nginx allocates a connection memory pool that is released only when the connection closes.
A 60‑second timer (client_header_timeout) is set to close connections that never send a complete request; a 1 KB read buffer (client_header_buffer_size) is allocated for reading the request line and headers.
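The connection-level pools, buffers, and timers described above map to a handful of core directives; a minimal sketch using nginx's default values:

```nginx
http {
    # Per-connection memory pool, released only when the connection closes
    connection_pool_size 512;

    # Buffer for reading the request line and headers (default 1k)
    client_header_buffer_size 1k;

    # Close the connection if no complete header arrives in time (default 60s)
    client_header_timeout 60s;
}
```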
2.2 Request Parsing
The URI and headers are read into a request memory pool (request_pool_size, default 4 KB). If the headers do not fit in the initial buffer, Nginx falls back to larger buffers (large_client_header_buffers, by default four 8 KB buffers), keeping pointers to parsed data without freeing them until the request ends.
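The parsing-stage memory settings can be tuned directly; a sketch showing the defaults:

```nginx
http {
    # Per-request memory pool; parsed data lives here until the request ends
    request_pool_size 4k;

    # Fallback buffers for oversized request lines or headers:
    # up to 4 buffers of 8k each (the defaults)
    large_client_header_buffers 4 8k;
}
```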
After header parsing, Nginx proceeds through 11 processing phases (post‑read, rewrite, access, etc.).
2.3 Reverse Proxy
For slow client connections, Nginx buffers the entire request (default 8 KB) before opening an upstream connection, reducing upstream load but increasing memory usage.
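This buffering behavior is controlled per location; a minimal sketch, where the `backend` upstream name is a placeholder for illustration:

```nginx
location /app/ {
    proxy_pass http://backend;

    # Read the whole client request body before opening the upstream
    # connection (default on) — shields slow clients from the upstream
    proxy_request_buffering on;

    # In-memory buffer for the request body (default 8k|16k, two pages)
    client_body_buffer_size 16k;
}
```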
2.4 Response Generation
The response passes through header, write, postpone, and copy filters. OpenResty directives and SDK hooks can be inserted at appropriate stages.
3. Application‑Layer Optimizations
3.1 Protocol
Enabling HTTP/2 with its multiplexing and HPACK header compression can significantly boost performance, though compressing sensitive header data carries known security trade‑offs.
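Enabling HTTP/2 in nginx is a one-line change on the listen socket; a minimal sketch (browsers require TLS for HTTP/2, and the certificate paths below are placeholders):

```nginx
server {
    listen 443 ssl http2;
    ssl_certificate     /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
}
```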
3.2 Compression
Choosing between dynamic compression (gzip on the fly, costing CPU per request) and static compression (serving pre‑compressed files, costing disk space) shifts where the load falls.
3.3 Keepalive
Reusing connections with keepalive avoids repeated TCP handshakes and slow‑start ramp‑up, which directly improves throughput.
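Both trade-offs can be expressed in a few directives; a sketch with illustrative values:

```nginx
http {
    # Dynamic compression: trades CPU for bandwidth
    gzip on;
    gzip_comp_level 5;
    gzip_types text/plain text/css application/json application/javascript;

    # Static compression: serve pre-compressed .gz files with no CPU cost
    # at request time (requires ngx_http_gzip_static_module)
    gzip_static on;

    # Reuse client connections to avoid repeated handshakes and TCP slow start
    keepalive_timeout 65s;
    keepalive_requests 1000;
}
```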
3.4 Rate Limiting
Limiting the response rate to clients (not upstream) helps control bandwidth consumption and smooths traffic spikes.
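Client-side bandwidth limiting is done with the limit_rate family of directives; a sketch with illustrative values:

```nginx
location /download/ {
    # Let the first 5 MB go at full speed (e.g. for fast page loads),
    # then cap at 1 MB/s per connection
    limit_rate_after 5m;
    limit_rate 1m;
}
```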
3.5 Worker Load Balancing
Disabling the inter‑process accept lock (accept_mutex) can increase throughput but may cause uneven worker utilization; the reuseport listen parameter instead lets the kernel balance new connections across workers more evenly.
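Both knobs are single directives; a sketch (reuseport requires nginx 1.9.1+ and a kernel with SO_REUSEPORT support):

```nginx
events {
    # Disable the accept lock (this is the default since 1.11.3)
    accept_mutex off;
}

server {
    # SO_REUSEPORT: one listening socket per worker, balanced by the kernel
    listen 80 reuseport;
}
```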
3.6 Timeouts
Nginx uses a red‑black tree to manage timers; proper timeout settings improve TCP resource reuse and reduce half‑open connections.
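The timeout settings discussed here are ordinary directives; a sketch with illustrative (deliberately aggressive) values:

```nginx
http {
    # On timeout, send RST instead of a graceful close, freeing
    # TCP resources held by half-open connections immediately
    reset_timedout_connection on;

    client_body_timeout 10s;   # max gap between successive body reads
    send_timeout 10s;          # max gap between successive writes to the client
    keepalive_timeout 30s;     # how long idle keepalive connections are kept
}
```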
3.7 Caching
Spatial and temporal caching strategies (e.g., pre‑fetching adjacent data blocks) can reduce disk I/O and improve hit rates.
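Two common caching layers in nginx are the open-file cache and the proxy content cache; a sketch where the cache path, zone name, and sizes are illustrative:

```nginx
http {
    # Cache open file descriptors and metadata to cut repeated disk lookups
    open_file_cache max=10000 inactive=60s;
    open_file_cache_valid 80s;

    # Content cache for proxied responses
    proxy_cache_path /var/cache/nginx keys_zone=app_cache:64m
                     max_size=10g inactive=60m;
}
```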
3.8 Reducing Disk I/O
Techniques such as sendfile zero‑copy, AIO, SSD usage, and thread‑pooled file reads can yield up to 9× performance gains.
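The sendfile and thread-pool techniques above combine in a few directives; a sketch (aio threads requires nginx built with --with-threads, and the pool sizes are illustrative):

```nginx
# Thread pool for offloading blocking file reads from worker processes
thread_pool default threads=32 max_queue=65536;

http {
    # Zero-copy transmission for small or page-cached files
    sendfile on;

    # Read large files via the thread pool instead of blocking the worker;
    # files over 8 MB bypass the page cache entirely
    aio threads;
    directio 8m;
}
```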
4. System‑Level Optimizations
Key areas include increasing capacity limits, CPU cache affinity, NUMA‑aware memory placement, fast TCP recovery, and tuning kernel parameters such as TCP_DEFER_ACCEPT.
Optimizing TCP parameters (initial window size, retransmission timers) and leveraging multi‑queue NICs further improve throughput.
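Many of these kernel-level settings live in sysctl; a sketch of commonly tuned parameters, with values that are illustrative rather than prescriptive:

```conf
# /etc/sysctl.conf — tune for your workload before applying
net.core.somaxconn = 65535                 # accept queue length (pair with listen ... backlog=)
net.ipv4.tcp_max_syn_backlog = 65535       # SYN (half-open) queue length
net.ipv4.tcp_fin_timeout = 15              # reclaim FIN-WAIT-2 sockets faster
net.ipv4.tcp_tw_reuse = 1                  # reuse TIME-WAIT sockets for outgoing connections
net.ipv4.ip_local_port_range = 1024 65000  # more ephemeral ports for upstream connections
net.ipv4.tcp_slow_start_after_idle = 0     # keep the congestion window on idle keepalives
```

TCP_DEFER_ACCEPT itself is enabled from the nginx side with the `deferred` parameter on the `listen` directive.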
Memory allocation speed, PCRE version, and kernel‑level tweaks complete the performance checklist.
This article is compiled from Tao Hui’s presentation at GOPS 2018 Shanghai.
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes widely read original technical articles. We focus on operations transformation and aim to accompany you throughout your operations career.