Why Does the Thundering Herd Problem Slow Your Linux Server and How to Fix It
The article explains the thundering herd problem in Linux—how multiple processes or threads wake up simultaneously for a single event, causing wasted CPU cycles and lock overhead—then examines its impact on accept, poll, and epoll, and presents practical solutions such as EPOLLEXCLUSIVE, SO_REUSEPORT, thread pools, and spin‑locks.
What Is the Thundering Herd Problem?
In Linux, the thundering herd problem occurs when many processes or threads block on the same event (e.g., a new network connection). When the event arrives, the kernel wakes all waiters, but only one can actually handle the event; the rest go back to sleep, wasting CPU time on unnecessary context switches.
Impact of the Thundering Herd
System Performance Loss
Each spurious wake‑up generates a context switch, which saves and restores registers and pollutes CPU caches. In a multi‑process web server, all children wake on a new connection, yet only one succeeds, while the others incur scheduling overhead.
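One way to see this overhead in practice is to count how often a worker blocks and gets rescheduled. The sketch below is a measuring aid under stated assumptions, not something taken from the original article: `voluntary_switches()` is a hypothetical helper name, and it reads the `ru_nvcsw` field that `getrusage()` fills in on Linux.

```c
#include <sys/resource.h>

/* Hypothetical helper: number of voluntary context switches the calling
 * process has made so far (each time it blocked and gave up the CPU),
 * or -1 if getrusage() fails. */
long voluntary_switches(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_nvcsw;
}
```

Sampling this counter in each worker before and after a load test, and comparing it with the number of connections that worker actually served, gives a rough indication of how many wake-ups did no useful work.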
Resource Competition and Lock Overhead
Developers often protect the shared resource with a mutex. The lock itself adds latency and can become a bottleneck under high concurrency. For example, Nginx's accept_mutex option serializes accept() among workers to avoid the herd, but every connection then pays for acquiring and releasing the lock, and workers that lose the race must wait or retry.
Typical Thundering Herd Scenarios
accept() Herd
When a listening socket is inherited by multiple child processes (via fork()), all children block on accept(). Historically, a new connection woke every child, yet only one could accept it; the others found the queue empty and went back to sleep (or, on a non-blocking socket, got EAGAIN).
Since Linux 2.6 the kernel marks the waiting‑queue entries with WQ_FLAG_EXCLUSIVE, so only the first waiter is woken, eliminating the classic accept herd.
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>
#define PROCESS_NUM 10
int main(void) {
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    int connfd;
    pid_t pid;
    char sendbuff[1024];
    struct sockaddr_in serveraddr;
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    serveraddr.sin_port = htons(1234);
    if (fd < 0 || bind(fd, (struct sockaddr *)&serveraddr, sizeof(serveraddr)) < 0 || listen(fd, 1024) < 0) {
        perror("socket/bind/listen");
        return 1;
    }
    for (int i = 0; i < PROCESS_NUM; ++i) {
        pid = fork();
        if (pid == 0) {
            /* Child: every worker blocks in accept() on the inherited socket. */
            while (1) {
                connfd = accept(fd, NULL, NULL);
                if (connfd < 0)
                    continue;
                snprintf(sendbuff, sizeof(sendbuff), "process PID = %d\n", getpid());
                send(connfd, sendbuff, strlen(sendbuff), 0);
                printf("process %d accept success\n", getpid());
                close(connfd);
            }
        }
    }
    /* Parent: the workers loop forever, so this blocks until one of them exits. */
    wait(NULL);
    return 0;
}
epoll Herd
If multiple processes share the same epoll instance (or each creates its own after fork()), the kernel may wake all waiters. Modern kernels use exclusive wake‑ups for the first waiter, but the behaviour differs between edge‑triggered (ET) and level‑triggered (LT) modes.
In ET mode the event is delivered only once, so only the first woken process handles it. In LT mode the event remains pending, causing the kernel to wake additional processes until the event is cleared.
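The difference between the two modes can be observed without any networking. The following sketch is an assumption-based illustration, not the article's test program: `count_ready_reports` is a hypothetical helper that registers the read end of a pipe, writes one byte that is never read, and counts how many consecutive epoll_wait() calls report it.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns how many of `rounds` non-blocking
 * epoll_wait() calls report the same unread byte, or -1 on setup failure.
 * Pass 0 for level-triggered or EPOLLET for edge-triggered behaviour. */
int count_ready_reports(uint32_t extra_flags, int rounds) {
    int p[2];
    if (pipe(p) < 0)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    struct epoll_event ev = { .events = EPOLLIN | extra_flags, .data.fd = p[0] };
    int reports = -1;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 && write(p[1], "x", 1) == 1) {
        reports = 0;
        for (int i = 0; i < rounds; ++i) {
            struct epoll_event out;
            /* Timeout 0: check readiness without blocking; the byte stays unread. */
            if (epoll_wait(epfd, &out, 1, 0) == 1)
                ++reports;
        }
    }
    close(epfd);
    close(p[0]);
    close(p[1]);
    return reports;
}
```

With these assumptions, `count_ready_reports(0, 3)` reports the pending byte on every call, while `count_ready_reports(EPOLLET, 3)` reports it only once. That repeated reporting in LT mode is what lets additional waiters be woken for an event that another process is already handling.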
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define PORT 8080
#define MAX_EVENTS 10
int main(void) {
    /* Non-blocking listener: edge-triggered mode requires draining accept() until it fails with EAGAIN. */
    int server_fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_addr.s_addr = htonl(INADDR_ANY), .sin_port = htons(PORT) };
    if (server_fd < 0 || bind(server_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(server_fd, 10) < 0) {
        perror("socket/bind/listen");
        return 1;
    }
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = server_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, server_fd, &ev);
    struct epoll_event events[MAX_EVENTS];
    while (1) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; ++i) {
            if (events[i].data.fd == server_fd) {
                /* One edge may cover several queued connections, so accept them all. */
                int client;
                while ((client = accept(server_fd, NULL, NULL)) >= 0) {
                    // handle client …
                    close(client);
                }
            }
        }
    }
}
poll()/select() Herd
Both poll() and select() suffer from the same issue: all blocked processes are awakened when a descriptor becomes ready, but only one can successfully accept the connection.
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>
#include <errno.h>
#include <poll.h>
#define PROCESS_NUM 10
int main(void) {
    int fd = socket(PF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    struct sockaddr_in serveraddr = { .sin_family = AF_INET, .sin_addr.s_addr = htonl(INADDR_ANY), .sin_port = htons(2222) };
    bind(fd, (struct sockaddr *)&serveraddr, sizeof(serveraddr));
    listen(fd, 1024);
    for (int i = 0; i < PROCESS_NUM; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            struct pollfd pfd = { .fd = fd, .events = POLLIN };
            while (1) {
                int ret = poll(&pfd, 1, -1);
                if (ret > 0 && (pfd.revents & POLLIN)) {
                    /* Non-blocking accept(): processes that lose the race get -1 and poll again. */
                    int cfd = accept(fd, NULL, NULL);
                    if (cfd >= 0) {
                        // handle client …
                        close(cfd);
                    }
                }
            }
        }
    }
    wait(0);
    return 0;
}
How to Eliminate the Thundering Herd
epoll + EPOLLEXCLUSIVE (Linux 4.5+)
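Adding the EPOLLEXCLUSIVE flag when registering the listening socket tells the kernel to wake one or more, rather than all, of the epoll instances waiting on that socket, which removes most of the herd effect. The flag only matters when several epoll instances (typically one per worker) watch the same descriptor, and it can be set only with EPOLL_CTL_ADD.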
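A per-worker setup might look like the sketch below. It assumes each worker calls the hypothetical helper `make_exclusive_epoll` after fork(), so that every worker owns a private epoll instance watching the same inherited listening socket; the helper name and the error handling are illustrative choices rather than part of the original example.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create a private epoll instance for this worker and
 * register the shared descriptor with EPOLLEXCLUSIVE.
 * Returns the epoll fd, or -1 on failure. */
int make_exclusive_epoll(int shared_fd) {
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0)
        return -1;
    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE, .data.fd = shared_fd };
    /* EPOLLEXCLUSIVE is only accepted together with EPOLL_CTL_ADD. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, shared_fd, &ev) < 0) {
        close(epfd);
        return -1;
    }
    return epfd;
}
```

Each worker would then loop on epoll_wait() with its own epoll fd. Because the kernel may still wake more than one instance, the listening socket should be non-blocking and a failed accept() should simply be retried on the next wake-up.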
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define PORT 8080
#define MAX_EVENTS 10
int main(void) {
    int server_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_addr.s_addr = htonl(INADDR_ANY), .sin_port = htons(PORT) };
    if (server_fd < 0 || bind(server_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(server_fd, 10) < 0) {
        perror("socket/bind/listen");
        return 1;
    }
    /* In a multi-process server, each worker runs the steps below after fork(). */
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE, .data.fd = server_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, server_fd, &ev);
    struct epoll_event events[MAX_EVENTS];
    while (1) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; ++i) {
            if (events[i].data.fd == server_fd) {
                int client = accept(server_fd, NULL, NULL);
                if (client < 0)
                    continue;
                // handle client …
                close(client);
            }
        }
    }
}
Load‑Balancing with SO_REUSEPORT (Linux 3.9+)
The SO_REUSEPORT socket option lets several processes bind to the same port. The kernel then distributes incoming connections among them, effectively avoiding the herd.
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string.h>
#define PORT 8888
#define WORKER 4
void worker(int id) {
    /* Each worker opens its own listening socket on the same address and port. */
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    int opt = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &opt, sizeof(opt));
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_addr.s_addr = inet_addr("127.0.0.1"), .sin_port = htons(PORT) };
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(fd, 5) < 0) {
        perror("bind/listen");
        exit(1);
    }
    while (1) {
        int cfd = accept(fd, NULL, NULL);
        if (cfd < 0)
            continue;
        printf("worker %d accepted a connection\n", id);
        // handle client …
        close(cfd);
    }
}
int main(void) {
    for (int i = 0; i < WORKER; ++i) {
        if (fork() == 0) {
            worker(i);
            exit(0);
        }
    }
    for (int i = 0; i < WORKER; ++i)
        wait(NULL);
    return 0;
}
Thread‑Pool Model
A thread pool creates a fixed number of worker threads that pull connection descriptors from a shared queue. Only the thread that dequeues the descriptor processes it, eliminating the need for multiple processes to race on accept().
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#define PORT 8080
#define THREAD_NUM 4
#define QUEUE_SIZE 100
typedef struct { int fd; } Task;
typedef struct {
    Task q[QUEUE_SIZE];
    int front, rear;
    pthread_mutex_t m;
    pthread_cond_t c;
} TaskQueue;
TaskQueue tq;
void initQueue(void) {
    tq.front = tq.rear = 0;
    pthread_mutex_init(&tq.m, NULL);
    pthread_cond_init(&tq.c, NULL);
}
/* Producer side: blocks while the ring buffer is full. */
void enqueue(int fd) {
    pthread_mutex_lock(&tq.m);
    while ((tq.rear + 1) % QUEUE_SIZE == tq.front)
        pthread_cond_wait(&tq.c, &tq.m);
    tq.q[tq.rear].fd = fd;
    tq.rear = (tq.rear + 1) % QUEUE_SIZE;
    pthread_cond_signal(&tq.c);
    pthread_mutex_unlock(&tq.m);
}
/* Consumer side: blocks while the ring buffer is empty. */
int dequeue(void) {
    pthread_mutex_lock(&tq.m);
    while (tq.front == tq.rear)
        pthread_cond_wait(&tq.c, &tq.m);
    int fd = tq.q[tq.front].fd;
    tq.front = (tq.front + 1) % QUEUE_SIZE;
    pthread_cond_signal(&tq.c);
    pthread_mutex_unlock(&tq.m);
    return fd;
}
void *worker(void *arg) {
    (void)arg;
    while (1) {
        int fd = dequeue();
        char buf[1024];
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) { write(fd, buf, n); }
        close(fd);
    }
    return NULL;
}
int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_addr.s_addr = INADDR_ANY, .sin_port = htons(PORT) };
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 10);
    initQueue();
    pthread_t th[THREAD_NUM];
    for (int i = 0; i < THREAD_NUM; ++i)
        pthread_create(&th[i], NULL, worker, NULL);
    /* Only the main thread calls accept(), so workers never race on the listening socket. */
    while (1) {
        int cfd = accept(listen_fd, NULL, NULL);
        if (cfd >= 0)
            enqueue(cfd);
    }
    return 0;
}
Spin‑Lock + Optimized Wake‑Up
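In short, highly contended critical sections a spin‑lock avoids the cost of putting a thread to sleep. A spin‑lock cannot be passed to pthread_cond_wait(), which requires a pthread_mutex_t, so the example below uses the spin‑lock only for a brief counter update and pairs the condition variable with a mutex. Waking waiters with pthread_cond_signal(), once per available item, instead of pthread_cond_broadcast() reduces unnecessary wake‑ups.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define THREAD_NUM 10
pthread_spinlock_t lock;                          /* guards the short update of `processed` */
pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;  /* pthread_cond_wait() needs a mutex */
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
int resource = 0;
int processed = 0;
void *thr(void *arg) {
    int id = *(int *)arg;
    pthread_mutex_lock(&mtx);
    while (resource == 0)
        pthread_cond_wait(&cond, &mtx);
    resource--;
    pthread_mutex_unlock(&mtx);
    pthread_spin_lock(&lock);
    processed++;
    pthread_spin_unlock(&lock);
    printf("Thread %d processing\n", id);
    return NULL;
}
int main(void) {
    pthread_t t[THREAD_NUM];
    int ids[THREAD_NUM];
    pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
    for (int i = 0; i < THREAD_NUM; ++i) {
        ids[i] = i;
        pthread_create(&t[i], NULL, thr, &ids[i]);
    }
    /* Publish one item per thread; each signal asks for a single waiter to be woken. */
    for (int i = 0; i < THREAD_NUM; ++i) {
        pthread_mutex_lock(&mtx);
        resource++;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&mtx);
    }
    for (int i = 0; i < THREAD_NUM; ++i)
        pthread_join(t[i], NULL);
    printf("processed = %d\n", processed);
    pthread_spin_destroy(&lock);
    pthread_mutex_destroy(&mtx);
    pthread_cond_destroy(&cond);
    return 0;
}
epoll Thundering Herd Case Study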
ET mode does not exhibit the herd problem, while LT mode does.
A test program creates ten processes that share a single epoll instance. In ET mode only one process ever receives the new connection because the kernel adds the wait node with the WQ_FLAG_EXCLUSIVE flag and places it at the head of the wait queue. In LT mode the event remains on the ready list, so after the first process handles the connection the kernel wakes additional waiters, reproducing the classic herd effect.
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <string.h>
#include <netinet/in.h>
#include <unistd.h>
#include <errno.h>
#include <sys/epoll.h>
#define MAXEVENTS 64
#define PROCESS_NUM 10
int main(void) {
    int fd = socket(PF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_addr.s_addr = htonl(INADDR_ANY), .sin_port = htons(2222) };
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(fd, 1024);
    /* The epoll instance is created before fork(), so all children share it. */
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd }; // change to EPOLLIN for LT
    epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
    struct epoll_event *events = calloc(MAXEVENTS, sizeof(*events));
    for (int i = 0; i < PROCESS_NUM; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            while (1) {
                printf("pid %d waiting on epoll\n", getpid());
                int n = epoll_wait(epfd, events, MAXEVENTS, -1);
                for (int j = 0; j < n; ++j) {
                    if (events[j].data.fd == fd) {
                        /* A single accept() keeps the experiment simple; production ET code should loop until EAGAIN. */
                        int cfd = accept(fd, NULL, NULL);
                        if (cfd >= 0) {
                            printf("pid %d accepted fd %d\n", getpid(), cfd);
                            close(cfd);
                        } else if (errno == EAGAIN) {
                            printf("pid %d woke up but found no connection\n", getpid());
                        }
                    }
                }
            }
        }
    }
    wait(0);
    return 0;
}
Running the program with the ET mask shows a single process repeatedly handling connections, while the LT mask (obtained by removing EPOLLET) results in many processes being woken, confirming the theoretical analysis.
The kernel’s wait‑queue and ready‑list mechanics explain why the thundering herd persists in LT mode and how flags like EPOLLEXCLUSIVE, SO_REUSEPORT, or a thread‑pool design effectively eliminate it.