Why Does the Thundering Herd Problem Still Appear with epoll? Deep Dive and Fixes
This article explains the thundering herd phenomenon in multi‑process/thread servers, details its performance costs, demonstrates the issue with accept() and epoll_wait() through C examples, explores thread‑level herd, and presents mitigation techniques such as accept mutexes, SO_REUSEPORT, and Nginx's handling.
What Is the Thundering Herd Effect?
The thundering herd effect occurs when many processes or threads block waiting for the same event. When the event fires, the kernel wakes all of them, but only one can acquire the resource; the rest go back to sleep, and the CPU cycles spent waking them are wasted.
A common analogy: toss a single grain to a flock of pigeons and they all rush for it, but only one actually gets it.
System Costs of the Thundering Herd
Excessive context switches: the scheduler repeatedly saves and restores registers and run‑queue data, reducing useful CPU time.
Lock contention: protecting the shared resource adds further overhead.
Cache‑line bouncing and other indirect costs on multi‑core systems.
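One way to observe the first cost is to read the process's own context-switch counters before and after a burst of connections. The sketch below is only an illustration: context_switches() is a hypothetical helper name, and it relies on the standard getrusage() call, whose ru_nvcsw and ru_nivcsw fields count voluntary and involuntary switches.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper: total context switches (voluntary + involuntary)
 * of the calling process so far, or -1 if getrusage() fails. */
long context_switches(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_nvcsw + ru.ru_nivcsw;
}
```

A worker that is repeatedly woken only to find nothing to do would show this counter growing while the number of connections it actually served stays flat.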
Thundering Herd in accept()
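Historically, Linux kernels suffered from a thundering herd when multiple processes called accept() on the same listening socket. Modern kernels wake only one of the waiting processes, which eliminates the problem. The following program forks 10 children that all block in accept() on the same socket: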
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <netinet/in.h>
#include <unistd.h>

#define PROCESS_NUM 10

int main(void)
{
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    int connfd;
    pid_t pid;
    char sendbuff[1024];
    struct sockaddr_in serveraddr;

    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    serveraddr.sin_port = htons(1234);
    if (fd < 0 || bind(fd, (struct sockaddr *)&serveraddr, sizeof(serveraddr)) < 0
        || listen(fd, 1024) < 0) {
        perror("socket/bind/listen");
        exit(1);
    }

    /* Every child blocks in accept() on the same listening socket. */
    for (int i = 0; i < PROCESS_NUM; ++i) {
        pid = fork();
        if (pid == 0) {
            while (1) {
                connfd = accept(fd, (struct sockaddr *)NULL, NULL);
                if (connfd < 0)
                    continue;
                snprintf(sendbuff, sizeof(sendbuff),
                         "PID of the process that received the accept event = %d\n", getpid());
                send(connfd, sendbuff, strlen(sendbuff), 0);
                printf("process %d accept success\n", getpid());
                close(connfd);
            }
        }
    }
    /* The children loop forever, so the parent blocks here until one of them exits. */
    wait(0);
    return 0;
}

Running the program and connecting with telnet shows that only one process reports a successful accept, confirming that modern kernels no longer exhibit the thundering herd for accept().
Thundering Herd in epoll_wait()
When many processes block in epoll_wait() on the same listening socket, a new connection can wake several of them. The following program demonstrates this; the sleep(2) after epoll_wait() is not a fix but a way to delay accept() so that every process is woken and the herd becomes visible.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <netinet/in.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <sys/wait.h>

#define PROCESS_NUM 10
#define MAXEVENTS 64

/* Create a TCP socket bound to the given port on all interfaces; return -1 on failure. */
int sock_creat_bind(char *port)
{
    int sock_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in serveraddr;

    if (sock_fd < 0)
        return -1;
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_port = htons(atoi(port));
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sock_fd, (struct sockaddr *)&serveraddr, sizeof(serveraddr)) < 0)
        return -1;
    return sock_fd;
}

/* Switch a descriptor to non-blocking mode; return -1 on failure. */
int make_nonblocking(int fd)
{
    int val = fcntl(fd, F_GETFL);

    if (val < 0) {
        perror("fcntl get");
        return -1;
    }
    val |= O_NONBLOCK;
    if (fcntl(fd, F_SETFL, val) < 0) {
        perror("fcntl set");
        return -1;
    }
    return 0;
}

int main(int argc, char *argv[])
{
    int sock_fd, epoll_fd;
    struct epoll_event event;
    struct epoll_event *events;

    if (argc < 2) {
        printf("usage: %s [port]\n", argv[0]);
        exit(1);
    }
    if ((sock_fd = sock_creat_bind(argv[1])) < 0) { perror("socket and bind"); exit(1); }
    if (make_nonblocking(sock_fd) < 0) { perror("make non blocking"); exit(1); }
    if (listen(sock_fd, SOMAXCONN) < 0) { perror("listen"); exit(1); }
    if ((epoll_fd = epoll_create(MAXEVENTS)) < 0) { perror("epoll_create"); exit(1); }

    event.data.fd = sock_fd;
    event.events = EPOLLIN;
    if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, sock_fd, &event) < 0) { perror("epoll_ctl"); exit(1); }
    events = calloc(MAXEVENTS, sizeof(event));

    /* All children share the epoll instance created above and wait on it. */
    for (int i = 0; i < PROCESS_NUM; ++i) {
        int pid = fork();
        if (pid == 0) {
            while (1) {
                int num = epoll_wait(epoll_fd, events, MAXEVENTS, -1);
                printf("process %d returned from epoll_wait\n", getpid());
                sleep(2); /* delay accept() so the connection stays pending and every process is woken */
                for (int j = 0; j < num; ++j) {
                    if ((events[j].events & EPOLLERR) || (events[j].events & EPOLLHUP) || !(events[j].events & EPOLLIN)) {
                        fprintf(stderr, "epoll error\n");
                        close(events[j].data.fd);
                        continue;
                    } else if (events[j].data.fd == sock_fd) {
                        struct sockaddr in_addr;
                        socklen_t in_len = sizeof(in_addr);
                        if (accept(sock_fd, &in_addr, &in_len) < 0)
                            printf("process %d accept failed!\n", getpid());
                        else
                            printf("process %d accept successful!\n", getpid());
                    }
                }
            }
        }
    }
    wait(0);
    free(events);
    close(sock_fd);
    return 0;
}

Without the sleep(2), a run typically shows only some of the processes returning from epoll_wait(): one accepts the connection and the others that woke fail with EAGAIN. With the sleep(2) in place, the connection stays pending long enough that every process is woken, which exposes the thundering herd in epoll.
Thread‑Level Thundering Herd
The same effect appears between threads: pthread_cond_broadcast() wakes every thread waiting on a condition variable, but only one of them obtains the resource.
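The effect can be reproduced in isolation with a small sketch that counts wake-ups which found nothing to claim. Everything here is illustrative: herd_state, herd_worker() and herd_wasted_wakeups() are hypothetical names, the workers deliberately count every return from pthread_cond_wait() instead of hiding it behind a predicate loop, and the 200 ms grace period is an arbitrary assumption to let woken threads run.

```c
#include <pthread.h>
#include <string.h>
#include <unistd.h>

#define HERD_MAX_THREADS 64

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int items;     /* resources available to claim */
    int waiting;   /* threads currently blocked in pthread_cond_wait() */
    int wasted;    /* wake-ups that found nothing to claim */
    int claimed;   /* wake-ups that claimed an item */
    int shutdown;  /* set by the main thread to stop the workers */
} herd_state;

static void *herd_worker(void *arg)
{
    herd_state *s = arg;

    pthread_mutex_lock(&s->mutex);
    while (!s->shutdown) {
        s->waiting++;
        pthread_cond_wait(&s->cond, &s->mutex);
        s->waiting--;
        if (s->shutdown)
            break;
        if (s->items > 0) {   /* the one thread that gets the resource */
            s->items--;
            s->claimed++;
        } else {              /* woken for nothing: part of the herd */
            s->wasted++;
        }
    }
    pthread_mutex_unlock(&s->mutex);
    return NULL;
}

/* Publish one item to nthreads waiters using broadcast or signal and
 * return how many wake-ups were wasted, or -1 on bad arguments. */
int herd_wasted_wakeups(int nthreads, int use_broadcast)
{
    herd_state s;
    pthread_t tid[HERD_MAX_THREADS];
    int started = 0, parked, wasted;

    if (nthreads < 1 || nthreads > HERD_MAX_THREADS)
        return -1;
    memset(&s, 0, sizeof(s));
    pthread_mutex_init(&s.mutex, NULL);
    pthread_cond_init(&s.cond, NULL);

    while (started < nthreads &&
           pthread_create(&tid[started], NULL, herd_worker, &s) == 0)
        started++;

    /* Wait until every started worker is parked on the condition variable. */
    do {
        usleep(1000);
        pthread_mutex_lock(&s.mutex);
        parked = s.waiting;
        pthread_mutex_unlock(&s.mutex);
    } while (parked != started);

    pthread_mutex_lock(&s.mutex);
    s.items = 1;
    if (use_broadcast)
        pthread_cond_broadcast(&s.cond);  /* wakes every waiter */
    else
        pthread_cond_signal(&s.cond);     /* wakes at least one waiter */
    pthread_mutex_unlock(&s.mutex);

    usleep(200 * 1000);  /* grace period so woken threads can run */

    pthread_mutex_lock(&s.mutex);
    wasted = s.wasted;
    s.shutdown = 1;
    pthread_cond_broadcast(&s.cond);
    pthread_mutex_unlock(&s.mutex);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&s.mutex);
    pthread_cond_destroy(&s.cond);
    return wasted;
}
```

Comparing herd_wasted_wakeups(8, 1) with herd_wasted_wakeups(8, 0) shows the difference: when only one item is available, waking a single waiter with pthread_cond_signal() avoids most of the useless wake-ups that the broadcast causes.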
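The excerpt below, which appears to come from a red-envelope grabbing demo, shows the broadcasting side:
printf("Initial red envelope state: <count: %d, amount: %d.%02d>\n", item.number, item.total/100, item.total%100);
pthread_cond_broadcast(&temp.cond); // wake all waiting threads
pthread_mutex_unlock(&temp.mutex);  // release the lock
sleep(1);
Mitigation Techniques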
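Accept mutex (locking): Nginx uses an accept mutex so that only one worker at a time listens for and accepts new connections. The logic lives in ngx_process_events_and_timers(): a worker calls ngx_trylock_accept_mutex(), and while it holds the lock its events are posted to queues, so that accept events are handled first and the mutex is released before the remaining events are processed.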
SO_REUSEPORT: Since Linux 3.9, multiple sockets can bind to the same IP address and port, so each process can own its own listening socket. The kernel distributes incoming connections among these sockets using a hash of the connection's addresses and ports, which provides load balancing without user-space locking. Two modes have existed: hot-standby (only the first socket handles traffic) and load-balancing (hash-based distribution).
Liangxu Linux