Why Nginx OOMs Under Million-Connection Load and How to Fix It
During a million‑connection WebSocket stress test, four 32‑core, 128 GB Nginx servers repeatedly ran out of memory. The investigation traced the problem to oversized proxy buffers and showed that disabling buffering and reducing buffer sizes stabilizes memory usage.
Phenomenon Description
This is a WebSocket stress‑test environment with millions of long‑lived connections. Clients (JMeter) run on hundreds of machines, and traffic passes through four Nginx instances to the backend services. While the connections are idle, memory is stable; once heavy sending and receiving starts, each of the 32 worker processes consumes nearly 4 GB, and the kernel repeatedly OOM‑kills them.
[Fri Mar 13 18:46:44 2020] Out of memory: Kill process 28258 (nginx) score 30 or sacrifice child
[Fri Mar 13 18:46:44 2020] Killed process 28258 (nginx) total-vm:1092198764kB, anon-rss:3943668kB, file-rss:736kB, shmem-rss:4kB
Investigation Process
Running ss -nt on both the Nginx and client sides showed a large number of ESTABLISHED connections with very large Send‑Q and Recv‑Q backlogs. Example output:
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 792024 1.1.1.1:80 2.2.2.2:50664
...
Packet captures on the JMeter client occasionally displayed many zero‑window events, suggesting the client could not keep up.
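The Send‑Q check described above can be automated when there are too many connections to read by eye. The following is a minimal sketch, assuming the five-column `ss -nt` layout shown in the example output; the helper names (`parseSSLine`, `backlogged`), the 64 KiB threshold, and the second sample connection are hypothetical and only serve as illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// connQueue holds the queue sizes that ss reports for one connection.
type connQueue struct {
	State string
	RecvQ int
	SendQ int
	Local string
	Peer  string
}

// parseSSLine parses one data line of `ss -nt` output.
// It returns false for the header line or any malformed line.
func parseSSLine(line string) (connQueue, bool) {
	f := strings.Fields(line)
	if len(f) < 5 {
		return connQueue{}, false
	}
	recvQ, err1 := strconv.Atoi(f[1])
	sendQ, err2 := strconv.Atoi(f[2])
	if err1 != nil || err2 != nil {
		return connQueue{}, false
	}
	return connQueue{State: f[0], RecvQ: recvQ, SendQ: sendQ, Local: f[3], Peer: f[4]}, true
}

// backlogged returns the connections whose Send-Q is at least threshold bytes.
func backlogged(output string, threshold int) []connQueue {
	var out []connQueue
	for _, line := range strings.Split(output, "\n") {
		if c, ok := parseSSLine(line); ok && c.SendQ >= threshold {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	// First data line is taken from the article; the second is a made-up idle connection.
	sample := "State Recv-Q Send-Q Local Address:Port Peer Address:Port\n" +
		"ESTAB 0 792024 1.1.1.1:80 2.2.2.2:50664\n" +
		"ESTAB 0 0 1.1.1.1:80 2.2.2.2:50665\n"
	for _, c := range backlogged(sample, 64*1024) {
		fmt.Printf("%s -> %s Send-Q=%d bytes\n", c.Local, c.Peer, c.SendQ)
	}
}
```

In practice the input would be the captured output of `ss -nt` rather than a hard-coded sample. A long list of connections with a large Send‑Q on the Nginx side points at receivers that are not draining data fast enough.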
Memory dumps were taken early in the rise using pmap -x 4199, cat /proc/4199/smaps, and gdb to dump the relevant region. The dump contained a massive amount of request/response data.
pmap -x 4199 | sort -k 3 -n -r
00007f2340539000 475240 461696 461696 rw--- [ anon ]
...
Inspecting the Nginx configuration revealed an unusually large proxy_buffers setting:
location / {
    proxy_pass http://xxx;
    ...
    proxy_buffer_size 64M;
    proxy_buffers 64 16M;
    proxy_busy_buffers_size 256M;
    proxy_temp_file_write_size 512M;
}
Simulating Nginx Memory Rise
A slow‑receiving client was written in Go to mimic a bottleneck:
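package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	conn, err := net.Dial("tcp", "10.211.55.10:80")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// HTTP requires CRLF line endings and a blank line after the headers.
	req := "GET /demo.mp4 HTTP/1.1\r\nHost: ya.test.me\r\n\r\n"
	if _, err := fmt.Fprint(conn, req); err != nil {
		panic(err)
	}

	// Read straight from the socket, one byte every three seconds, so that
	// Nginx has to hold the rest of the upstream response. (A fresh
	// bufio.Reader per iteration would pull up to 4 KB each time.)
	buf := make([]byte, 1)
	for {
		if _, err := conn.Read(buf); err != nil {
			panic(err) // includes io.EOF when the server closes the connection
		}
		time.Sleep(3 * time.Second)
		println("read one byte")
	}
}
Running this program while monitoring Nginx with pidstat -p pid -r 1 1000 showed memory jumping to ~450 MB within seconds and staying high. Launching two such clients pushed memory past 900 MB.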
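Where pidstat is not installed, a similar reading can be taken from the VmRSS field of /proc/<pid>/status on Linux. The sketch below is an illustration under that assumption, not the tool used in the test; `parseVmRSS` is a hypothetical helper, and the program defaults to inspecting itself when no PID is given.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseVmRSS extracts the VmRSS value (in kB) from the text of /proc/<pid>/status.
// It returns false if the field is missing or not a number.
func parseVmRSS(status string) (int64, bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		// Expected shape of the line: "VmRSS:   461696 kB"
		if len(f) >= 2 && f[0] == "VmRSS:" {
			kb, err := strconv.ParseInt(f[1], 10, 64)
			return kb, err == nil
		}
	}
	return 0, false
}

func main() {
	pid := "self" // default: inspect this process; pass an Nginx worker PID instead
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	data, err := os.ReadFile("/proc/" + pid + "/status")
	if err != nil {
		fmt.Println("cannot read status file (Linux only):", err)
		return
	}
	if kb, ok := parseVmRSS(string(data)); ok {
		fmt.Printf("pid %s RSS: %d kB\n", pid, kb)
	}
}
```

Running it once per second against a worker PID while the slow client is connected should show the same kind of jump that pidstat reported.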
Solution
Because each connection could be allocated huge buffers, total memory grew with the number of connections. The quickest fix is to turn off buffering and reduce the buffer sizes:
proxy_buffering off;
After applying this change and lowering proxy_buffer_size, memory stabilized around 20 GB in the stress test and only grew by about 64 MB when the test was repeated.
When buffering is enabled, Nginx stores the upstream response in the buffers configured by proxy_buffer_size and proxy_buffers. If the response does not fit, part of it is written to a temporary file. When buffering is disabled, Nginx passes the response to the client synchronously as it is received, and proxy_buffer_size caps how much it reads from the upstream at a time.
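To see why the original settings were dangerous, it helps to work out the per-connection ceiling. The sketch below assumes a rough upper bound of one proxy_buffer_size buffer plus all proxy_buffers for each buffered connection; that is a simplification, and the `proxyBuffers` type is a hypothetical model rather than anything from Nginx itself.

```go
package main

import "fmt"

const MiB = 1 << 20

// proxyBuffers models the two directives that size per-connection buffers.
type proxyBuffers struct {
	BufferSize int64 // proxy_buffer_size, in bytes
	Num        int64 // first argument of proxy_buffers (count)
	Size       int64 // second argument of proxy_buffers, in bytes
}

// worstCasePerConn returns a rough upper bound on buffer memory for one
// buffered connection: the header buffer plus every proxy buffer.
// This is an assumed simplification, not an exact account of Nginx internals.
func (c proxyBuffers) worstCasePerConn() int64 {
	return c.BufferSize + c.Num*c.Size
}

func main() {
	// Values from the configuration found during the investigation.
	cfg := proxyBuffers{BufferSize: 64 * MiB, Num: 64, Size: 16 * MiB}
	perConn := cfg.worstCasePerConn()
	fmt.Printf("worst case per connection: %d MiB\n", perConn/MiB)

	// anon-rss of the killed worker in the OOM log: 3943668 kB.
	rss := int64(3943668) * 1024
	fmt.Printf("stalled connections needed to reach that RSS: %.1f\n",
		float64(rss)/float64(perConn))
}
```

Under this assumption the ceiling is 1088 MiB per connection, so fewer than four fully stalled connections would account for the roughly 4 GB seen in a killed worker. This is a ceiling rather than a prediction: the simulated slow client plateaued at around 450 MB.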
Nginx Source Analysis
The upstream read routine resides in src/event/ngx_event_pipe.c in the function ngx_event_pipe_read_upstream. It creates temporary buffers based on p->bufs.num and p->bufs.size, which correspond to the proxy_buffers directive.
static ngx_int_t
ngx_event_pipe_read_upstream(ngx_event_pipe_t *p)
{
    for ( ;; ) {
        if (p->free_raw_bufs) {
            // ...
        } else if (p->allocated < p->bufs.num) {
            b = ngx_create_temp_buf(p->pool, p->bufs.size);
            if (b == NULL) {
                return NGX_ABORT;
            }
            p->allocated++;
        }
    }
}
Postscript
Additional diagnostics such as strace and systemtap can reveal allocation paths in black‑box programs. The test also uncovered that an unreasonable worker_connections setting can cause Nginx to consume 14 GB of memory even without massive traffic. Understanding low‑level mechanisms and proper tuning are essential for high‑concurrency deployments.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.