How Nginx Solves the Thundering Herd Problem with epoll and Advanced Techniques

This article explains the thundering herd effect, walks through Nginx's master‑worker architecture and its use of epoll, and compares three practical solutions—accept_mutex, EPOLLEXCLUSIVE, and SO_REUSEPORT—to eliminate wasted wake‑ups in high‑concurrency servers.

Open Source Linux

What is the thundering herd effect?

The thundering herd effect occurs when multiple processes or threads are blocked waiting for the same event. When the event fires, all of them are woken, but only one can acquire the resource; the rest go back to sleep, and the CPU cycles spent waking them are wasted.

Put simply, it is like a thunderclap that wakes everyone in the house, yet only one person needs to get up and bring in the laundry.

Causes & Issues

When a server spawns many workers to listen for requests, a single incoming request can wake all workers, but only one can actually accept and handle it, leading to repeated wake‑sleep cycles and costly context switches.

Nginx Architecture

Nginx separates processes into a master and multiple workers (a classic master‑worker strategy). The master handles configuration, signal processing, and listening socket creation, while workers handle the actual request processing.

Requests bypass the master and are directly handled by workers, raising the question of which worker should accept a given request.

Nginx Uses epoll

Each worker creates its own epoll instance to monitor the shared listening socket, allowing efficient event‑driven I/O.

Master’s Work

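ngx_int_t
ngx_open_listening_sockets(ngx_cycle_t *cycle)
{
    ...
    for (i = 0; i < cycle->listening.nelts; i++) {
        ...
        /* bind the socket to the configured address */
        if (bind(s, ls[i].sockaddr, ls[i].socklen) == -1) {
            ...   /* error handling */
        }
        ...
        /* listen() is a separate step that runs after a successful bind() */
        if (listen(s, ls[i].backlog) == -1) {
            ...   /* error handling */
        }
        ...
    }
    ...
}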

The master binds and listens on the configured ports, then forks the worker processes. fork() duplicates the master's file descriptor table, so every worker inherits the same listening sockets.
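The inheritance step can be tried outside Nginx. The following is a minimal sketch, not Nginx code: open_listener and demo_inherited_listener are hypothetical names, the listener uses a loopback address with a kernel-chosen port, and the parent doubles as the client only to keep the demo in one function.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1 at a kernel-chosen port and
 * store the resulting address in *addr so a client can connect to it. */
static int open_listener(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == -1) return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* sin_port stays 0 */
    if (bind(s, (struct sockaddr *) addr, sizeof(*addr)) == -1
        || listen(s, 16) == -1
        || getsockname(s, (struct sockaddr *) addr, &len) == -1) {
        close(s);
        return -1;
    }
    return s;
}

/* Returns 0 if a forked child accepted a connection on a socket that
 * only the parent opened. */
int demo_inherited_listener(void)
{
    struct sockaddr_in addr;
    char byte = 0;
    int status = -1;
    int ls = open_listener(&addr);               /* "master": bind + listen */
    int c = socket(AF_INET, SOCK_STREAM, 0);
    if (ls == -1 || c == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {                              /* "worker" */
        close(c);
        int conn = accept(ls, NULL, NULL);       /* fd inherited via fork() */
        _exit((conn != -1 && write(conn, "k", 1) == 1) ? 0 : 1);
    }

    /* The parent acts as the client, purely for this demo. */
    int ok = connect(c, (struct sockaddr *) &addr, sizeof(addr)) == 0
             && read(c, &byte, 1) == 1 && byte == 'k';
    if (!ok) kill(pid, SIGKILL);                 /* never leave the child blocked */
    close(c);
    close(ls);
    waitpid(pid, &status, 0);
    return (ok && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The child never calls socket(), bind(), or listen(); it accepts on the descriptor it received from the parent, which is the same relationship Nginx workers have with the master's listening sockets.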

Worker’s Work

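static ngx_int_t
ngx_epoll_init(ngx_cycle_t *cycle, ngx_msec_t timer)
{
    ngx_epoll_conf_t  *epcf;

    epcf = ngx_event_get_conf(cycle->conf_ctx, ngx_epoll_module);

    /* ep is a file-scope static holding this worker's epoll descriptor */
    if (ep == -1) {
        /* the size argument is only a hint; Linux ignores it since 2.6.8 */
        ep = epoll_create(cycle->connection_n / 2);
    }
    ...
}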

Because ngx_epoll_init runs in every worker after the fork, one listening socket ends up monitored by several independent epoll instances.

Key Problem

When a request arrives, all workers could be awakened, but only one should actually accept it; otherwise, unnecessary wake‑ups degrade performance.
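The root cause can be reproduced in a few lines. The sketch below is not Nginx code: two epoll instances in one process stand in for two workers, and the helper names (open_listener, watch_listener, count_notified_watchers) are made up for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1 at a kernel-chosen port and
 * store the resulting address in *addr so a client can connect to it. */
static int open_listener(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == -1) return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* sin_port stays 0 */
    if (bind(s, (struct sockaddr *) addr, sizeof(*addr)) == -1
        || listen(s, 16) == -1
        || getsockname(s, (struct sockaddr *) addr, &len) == -1) {
        close(s);
        return -1;
    }
    return s;
}

/* Create an epoll instance that watches ls for readability, the way each
 * worker does with the shared listening socket. */
static int watch_listener(int ls)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = ls };
    int ep = epoll_create1(0);
    if (ep == -1) return -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, ls, &ev) == -1) {
        close(ep);
        return -1;
    }
    return ep;
}

/* Returns how many of the two epoll instances report the one pending
 * connection, or -1 if the setup failed. */
int count_notified_watchers(void)
{
    struct sockaddr_in addr;
    struct epoll_event ev;
    int n = -1;
    int ls = open_listener(&addr);
    int ep1 = watch_listener(ls);
    int ep2 = watch_listener(ls);
    int c = socket(AF_INET, SOCK_STREAM, 0);

    if (ls != -1 && ep1 != -1 && ep2 != -1 && c != -1
        && connect(c, (struct sockaddr *) &addr, sizeof(addr)) == 0) {
        n = 0;
        if (epoll_wait(ep1, &ev, 1, 1000) == 1) n++;   /* "worker 1" */
        if (epoll_wait(ep2, &ev, 1, 1000) == 1) n++;   /* "worker 2" */
    }
    if (c != -1) close(c);
    if (ep1 != -1) close(ep1);
    if (ep2 != -1) close(ep2);
    if (ls != -1) close(ls);
    return n;
}
```

On a typical Linux system both instances should report the listening socket as readable, even though only one accept() call can take the single queued connection. Any other watcher that tried would either block or, on a non-blocking socket, fail with EAGAIN.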

Solutions

accept_mutex (application‑level lock)

EPOLLEXCLUSIVE (kernel‑level flag)

SO_REUSEPORT (kernel‑level socket option)

accept_mutex

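Workers compete for a lock held in shared memory before they wait for events. Only the worker that holds the lock registers the listening socket with its epoll instance, so only that worker is woken by a new connection; the others keep serving the connections they already have. The scheme is simple and lets workers accept in turn, but a worker that fails to take the lock retries only after up to accept_mutex_delay (500 ms by default), which can add latency. The accept_mutex directive has been off by default since Nginx 1.11.3.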
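The idea behind the lock can be sketched as follows. This is an illustration under stated assumptions, not Nginx's implementation: worker_t, accept_lock_create, try_accept_lock, release_accept_lock, and demo_accept_lock are hypothetical names, the lock is a single C11 atomic in anonymous shared memory, and both "workers" live in one process so the demo stays deterministic.

```c
#define _DEFAULT_SOURCE   /* MAP_ANONYMOUS under strict -std= modes */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical per-worker state for this sketch. */
typedef struct { int ep; int holds_lock; } worker_t;

/* Put the lock word in shared memory so forked workers would share it. */
atomic_int *accept_lock_create(void)
{
    void *p = mmap(NULL, sizeof(atomic_int), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    atomic_init((atomic_int *) p, 0);
    return p;
}

/* Try to become the acceptor: 1 = lock taken and ls registered,
 * 0 = another worker holds the lock, -1 = error. */
int try_accept_lock(worker_t *w, atomic_int *lock, int ls)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = ls };
    int expected = 0;
    if (!atomic_compare_exchange_strong(lock, &expected, 1)) return 0;
    if (epoll_ctl(w->ep, EPOLL_CTL_ADD, ls, &ev) == -1) {
        atomic_store(lock, 0);
        return -1;
    }
    w->holds_lock = 1;
    return 1;
}

/* Stop watching the listening socket, then hand the lock back. */
void release_accept_lock(worker_t *w, atomic_int *lock, int ls)
{
    if (!w->holds_lock) return;
    epoll_ctl(w->ep, EPOLL_CTL_DEL, ls, NULL);
    w->holds_lock = 0;
    atomic_store(lock, 0);
}

/* Hypothetical helper: loopback listener on a kernel-chosen port. */
static int open_listener(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == -1) return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(s, (struct sockaddr *) addr, sizeof(*addr)) == -1
        || listen(s, 16) == -1
        || getsockname(s, (struct sockaddr *) addr, &len) == -1) {
        close(s);
        return -1;
    }
    return s;
}

/* Returns 0 if only the lock holder is notified of a pending connection. */
int demo_accept_lock(void)
{
    struct sockaddr_in addr;
    struct epoll_event ev;
    worker_t a = { epoll_create1(0), 0 }, b = { epoll_create1(0), 0 };
    atomic_int *lock = accept_lock_create();
    int ls = open_listener(&addr);
    int c = socket(AF_INET, SOCK_STREAM, 0);
    /* Error paths leak descriptors; acceptable for a short demo only. */
    if (a.ep == -1 || b.ep == -1 || !lock || ls == -1 || c == -1) return -1;

    int got_a = try_accept_lock(&a, lock, ls);      /* a becomes the acceptor */
    int got_b = try_accept_lock(&b, lock, ls);      /* b finds the lock taken */
    int conn = connect(c, (struct sockaddr *) &addr, sizeof(addr));
    int woke_a = epoll_wait(a.ep, &ev, 1, 1000);    /* a watches ls */
    int woke_b = epoll_wait(b.ep, &ev, 1, 0);       /* b watches nothing */
    release_accept_lock(&a, lock, ls);
    int next_b = try_accept_lock(&b, lock, ls);     /* b can take over now */

    close(c); close(ls); close(a.ep); close(b.ep);
    munmap(lock, sizeof(atomic_int));
    return (got_a == 1 && got_b == 0 && conn == 0 && woke_a == 1
            && woke_b == 0 && next_b == 1) ? 0 : -1;
}
```

In a real worker loop the lock would be tried before each wait for events and released once the pending accepts have been handled. The cost is visible in the sketch: every hand-over needs an extra pair of epoll_ctl calls in addition to the lock operation itself.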

EPOLLEXCLUSIVE

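EPOLLEXCLUSIVE, added in Linux 4.5, is a flag set when a file descriptor is registered with an epoll instance. When several epoll instances have registered the same descriptor with this flag (here, each worker's epoll instance and the shared listening socket), an event wakes one or more of them rather than all of them.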

It lowers the probability of multiple workers being awakened but does not eliminate it: the kernel guarantees that at least one waiter is woken, not exactly one, and the listening socket remains shared.
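At the system-call level the change is a single flag on registration. The following is a minimal sketch, assuming a loopback listener; add_listener_exclusive, open_listener, and demo_exclusive_listener are hypothetical names, and the fallback to plain EPOLLIN on EINVAL is a design choice of this sketch for kernels without the flag, not something the article describes.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Register the shared listening socket ls with this worker's epoll
 * instance ep, asking for exclusive wake-ups where available.
 * Returns 1 if EPOLLEXCLUSIVE is in effect, 0 if the plain fallback
 * was used, -1 on error. */
int add_listener_exclusive(int ep, int ls)
{
    struct epoll_event ev;
    memset(&ev, 0, sizeof(ev));
    ev.data.fd = ls;
#ifdef EPOLLEXCLUSIVE
    ev.events = EPOLLIN | EPOLLEXCLUSIVE;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, ls, &ev) == 0) return 1;
    if (errno != EINVAL) return -1;
    /* EINVAL: assume the running kernel does not know the flag */
#endif
    ev.events = EPOLLIN;
    return epoll_ctl(ep, EPOLL_CTL_ADD, ls, &ev) == 0 ? 0 : -1;
}

/* Hypothetical helper: loopback listener on a kernel-chosen port. */
static int open_listener(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == -1) return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(s, (struct sockaddr *) addr, sizeof(*addr)) == -1
        || listen(s, 16) == -1
        || getsockname(s, (struct sockaddr *) addr, &len) == -1) {
        close(s);
        return -1;
    }
    return s;
}

/* Returns 0 if a connection is still reported to an epoll instance that
 * registered the listener through add_listener_exclusive(). */
int demo_exclusive_listener(void)
{
    struct sockaddr_in addr;
    struct epoll_event ev;
    int ls = open_listener(&addr);
    int ep = epoll_create1(0);
    int c = socket(AF_INET, SOCK_STREAM, 0);
    int ok = ls != -1 && ep != -1 && c != -1
             && add_listener_exclusive(ep, ls) >= 0
             && connect(c, (struct sockaddr *) &addr, sizeof(addr)) == 0
             && epoll_wait(ep, &ev, 1, 1000) == 1
             && ev.data.fd == ls;
    if (c != -1) close(c);
    if (ep != -1) close(ep);
    if (ls != -1) close(ls);
    return ok ? 0 : -1;
}
```

The flag is only accepted with EPOLL_CTL_ADD; a later EPOLL_CTL_MOD on a descriptor registered this way fails, so the event mask has to be chosen at registration time.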

SO_REUSEPORT

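Since Nginx 1.9.1, the reuseport parameter of the listen directive makes Nginx create a separate listening socket for each worker, all bound to the same address and port through the SO_REUSEPORT socket option (available since Linux 3.9). The kernel distributes incoming connections among these sockets, so only the worker that owns the selected socket is woken.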

http {
    server {
        listen 80 reuseport;
        server_name localhost;
        # ...
    }
}

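Benchmarks show significant performance gains. The approach has its own weakness, though: the kernel chooses the target socket without knowing how busy each worker is, so a new connection can be queued on a worker that is still processing an earlier one and has to wait there even if other workers are idle.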

Summary

The article introduces the thundering herd effect, explains Nginx’s master‑worker model and its epoll‑based event handling, and evaluates three mitigation strategies—accept_mutex, EPOLLEXCLUSIVE, and SO_REUSEPORT—highlighting their trade‑offs in real‑world high‑concurrency scenarios.
