Mastering Linux I/O Multiplexing: select, poll, and epoll Explained with Real Code
This article explains the concepts, advantages, limitations, and practical usage of Linux I/O multiplexing mechanisms—select, poll, and epoll—through analogies, detailed explanations, code examples, and common interview questions, helping developers choose the right tool for high‑concurrency network programming.
I/O multiplexing allows a single thread to monitor multiple file descriptors (FDs) simultaneously. The application is notified when any FD becomes ready for reading or writing, or has an exceptional condition pending, which avoids the overhead of creating a thread per FD.
1. Life Example
Imagine a convenience‑store owner watching customers (FDs). The owner needs to know when a new customer enters, wants to pay, or attempts theft. Different monitoring techniques correspond to different assistants.
select: a guard who watches all customers but tells you only that "something happened" without specifying who; you must ask each customer individually. Limited to 1024 customers.
poll: a guard similar to select but without a hard limit on the number of customers; you still need to ask each one.
epoll: a smart system that directly tells you which customer triggered which event.
2. select – Basic but Limited Monitoring
2.1 How select works
select is the earliest widely used I/O multiplexing API. It requires three FD sets (read, write, exception) and a timeout. Before calling select, you add the FDs you want to monitor to the appropriate sets.
#include <sys/select.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdio.h>

#define PORT 8888
#define MAX_CLIENTS 100

int main(void) {
    int server_fd, new_socket, client_fds[MAX_CLIENTS];
    struct sockaddr_in address = {0};
    socklen_t addrlen = sizeof(address);
    fd_set read_fds, temp_fds;
    int max_sd;

    // create socket
    server_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (server_fd < 0) {
        perror("socket");
        return 1;
    }
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY;
    address.sin_port = htons(PORT);

    // bind and listen
    if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0 ||
        listen(server_fd, 3) < 0) {
        perror("bind/listen");
        close(server_fd);
        return 1;
    }

    // initialize client array
    for (int i = 0; i < MAX_CLIENTS; i++) {
        client_fds[i] = -1;
    }

    FD_ZERO(&read_fds);
    FD_SET(server_fd, &read_fds);
    max_sd = server_fd;

    while (1) {
        // select modifies its sets in place, so pass a copy each round
        temp_fds = read_fds;
        int activity = select(max_sd + 1, &temp_fds, NULL, NULL, NULL);
        if (activity < 0) {
            perror("select error");
            break;
        }

        // new connection on the listening socket
        if (FD_ISSET(server_fd, &temp_fds)) {
            addrlen = sizeof(address);
            new_socket = accept(server_fd, (struct sockaddr *)&address, &addrlen);
            int slot = -1;
            for (int i = 0; new_socket >= 0 && i < MAX_CLIENTS; i++) {
                if (client_fds[i] == -1) {
                    slot = i;
                    break;
                }
            }
            if (new_socket >= 0 && new_socket < FD_SETSIZE && slot != -1) {
                client_fds[slot] = new_socket;
                FD_SET(new_socket, &read_fds);
                if (new_socket > max_sd) max_sd = new_socket;
            } else if (new_socket >= 0) {
                close(new_socket);  // no free slot, or FD too large for fd_set
            }
        }

        // data (or a close) on an existing client
        for (int i = 0; i < MAX_CLIENTS; i++) {
            int sd = client_fds[i];
            if (sd != -1 && FD_ISSET(sd, &temp_fds)) {
                char buffer[1024];
                ssize_t valread = read(sd, buffer, sizeof(buffer) - 1);
                if (valread <= 0) {
                    // 0 means the peer closed; negative means a read error
                    close(sd);
                    FD_CLR(sd, &read_fds);
                    client_fds[i] = -1;
                } else {
                    buffer[valread] = '\0';
                    printf("Received from client: %s\n", buffer);
                }
            }
        }
    }
    close(server_fd);
    return 0;
}
When select returns, the kernel has overwritten each FD set in place so that only the ready FDs remain set (which is why the loop passes a copy, temp_fds). The application must iterate over its FDs with FD_ISSET to find which ones are ready.
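The example above passes NULL as the timeout, so select blocks indefinitely. The timeout parameter can be sketched as follows, under some assumptions: the helper name wait_readable and the pipe-based demo are illustrative and not part of the original article.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms) {
    if (fd < 0 || fd >= FD_SETSIZE) return -1;  /* fd_set cannot hold it */

    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    /* Rebuild the timeval on every call: Linux select may modify it. */
    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0) return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}

/* Demo on a pipe: stores the result before and after one byte is written.
 * Returns 0 on success, -1 if the pipe or the write failed. */
int demo_wait_readable(int *before, int *after) {
    int p[2];
    if (pipe(p) != 0) return -1;

    *before = wait_readable(p[0], 50);          /* empty pipe: times out */
    int ok = (write(p[1], "x", 1) == 1);
    *after = ok ? wait_readable(p[0], 50) : -1; /* one byte queued */

    close(p[0]);
    close(p[1]);
    return ok ? 0 : -1;
}
```

With an empty pipe the first call should time out, and after the write the second call should report the read end as readable.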
2.2 Characteristics and Limitations
An fd_set can hold only descriptors numbered below FD_SETSIZE (typically 1024), which is insufficient for high-concurrency servers; passing a larger FD to FD_SET is undefined behavior.
Every call copies the entire FD set between user and kernel space, causing noticeable overhead.
After select returns, the application must scan all FDs (O(n) time), which becomes costly when n is large.
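Because the FD_SETSIZE limit applies to the descriptor's numeric value, a guard before FD_SET is a common precaution. A minimal sketch, assuming a hypothetical helper named fd_set_checked (not from the original article):

```c
#include <sys/select.h>

/* Hypothetical helper: add fd to set only if an fd_set can represent it.
 * Returns 0 on success, -1 if fd is negative or >= FD_SETSIZE. */
int fd_set_checked(int fd, fd_set *set) {
    if (fd < 0 || fd >= FD_SETSIZE) return -1;
    FD_SET(fd, set);
    return 0;
}
```

A server that receives -1 here would typically close the connection or switch to poll or epoll, neither of which has this restriction.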
2.3 Suitable Scenarios
select is appropriate for small-scale connections and simple concurrency where performance is not critical, such as a basic TCP server handling a few dozen clients.
3. poll – Improved but Still Limited
3.1 How poll works
poll replaces the three FD sets with an array of struct pollfd, each containing an FD, the events to monitor, and the events that occurred.
#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>

#define PORT 8888
#define MAX_CLIENTS 100

int main(void) {
    int server_fd, new_socket;
    struct sockaddr_in address = {0};
    socklen_t addrlen = sizeof(address);
    // slot 0 is the listening socket; client FDs live only in this array
    struct pollfd fds[MAX_CLIENTS + 1];
    int num_fds = 1;

    server_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (server_fd < 0) {
        perror("socket");
        return 1;
    }
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY;
    address.sin_port = htons(PORT);
    if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0 ||
        listen(server_fd, 3) < 0) {
        perror("bind/listen");
        close(server_fd);
        return 1;
    }

    fds[0].fd = server_fd;
    fds[0].events = POLLIN;

    while (1) {
        int activity = poll(fds, num_fds, -1);
        if (activity < 0) {
            perror("poll error");
            break;
        }

        // new connection on the listening socket
        if (fds[0].revents & POLLIN) {
            addrlen = sizeof(address);
            new_socket = accept(server_fd, (struct sockaddr *)&address, &addrlen);
            if (new_socket >= 0 && num_fds <= MAX_CLIENTS) {
                fds[num_fds].fd = new_socket;
                fds[num_fds].events = POLLIN;
                fds[num_fds].revents = 0;  // not polled yet this round
                num_fds++;
            } else if (new_socket >= 0) {
                close(new_socket);  // array is full
            }
        }

        // data, hang-up, or error on an existing client
        for (int i = 1; i < num_fds; i++) {
            if (!(fds[i].revents & (POLLIN | POLLHUP | POLLERR))) continue;
            int sd = fds[i].fd;
            char buffer[1024];
            ssize_t valread = read(sd, buffer, sizeof(buffer) - 1);
            if (valread <= 0) {
                // peer closed or read failed: drop the entry by moving
                // the last one into its place, then re-check this index
                close(sd);
                fds[i] = fds[num_fds - 1];
                num_fds--;
                i--;
            } else {
                buffer[valread] = '\0';
                printf("Received from client: %s\n", buffer);
            }
        }
    }
    close(server_fd);
    return 0;
}
poll copies the pollfd array to kernel space; the kernel checks each FD, sets the revents field for ready FDs, and copies the results back.
3.2 Improvements Over select
No hard limit on the number of FDs; limited only by system resources.
Single array makes adding or modifying monitored events simpler.
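Changing what is monitored for one FD amounts to editing the events field of its pollfd entry. The sketch below illustrates this on the write end of a pipe; the helper set_write_interest and the demo function are hypothetical names introduced for illustration.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: turn write interest on or off for one entry. */
void set_write_interest(struct pollfd *p, int on) {
    if (on) {
        p->events |= POLLOUT;
    } else {
        p->events &= (short)~POLLOUT;
    }
}

/* Demo on the write end of an empty pipe, polled with a zero timeout.
 * Stores poll's return value before and after enabling POLLOUT.
 * Returns 0 on success, -1 if the pipe could not be created. */
int demo_toggle_pollout(int *before, int *after) {
    int p[2];
    if (pipe(p) != 0) return -1;

    struct pollfd pfd = { .fd = p[1], .events = 0, .revents = 0 };
    *before = poll(&pfd, 1, 0);   /* no events requested: nothing reported */

    set_write_interest(&pfd, 1);
    *after = poll(&pfd, 1, 0);    /* an empty pipe is writable */

    close(p[0]);
    close(p[1]);
    return 0;
}
```

A typical use is to enable POLLOUT only while unsent data is queued for a connection and to clear it again once the queue is empty, since a writable socket would otherwise be reported on every call.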
3.3 Characteristics and Limitations
Still requires O(n) scanning of the entire pollfd array on each call.
Each call copies the whole array between user and kernel space, incurring overhead.
Only supports level‑triggered (LT) notifications.
3.4 Suitable Scenarios
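poll works well when the number of connections is moderate, when FDs may exceed FD_SETSIZE, or when the code must run on systems without epoll. When many clients are idle most of the time, as in an instant-messaging server, every call still scans the entire array, and that is the cost epoll is designed to avoid.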
4. epoll – High‑Performance Monitoring
4.1 How epoll works
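epoll splits monitoring into three system calls: epoll_create1 creates an epoll instance, epoll_ctl registers, modifies, or removes FDs, and epoll_wait blocks until registered FDs become ready. It supports two notification modes: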
Level‑triggered (LT) : the default mode; as long as an FD remains ready, epoll_wait keeps returning it.
Edge-triggered (ET) : notifies only when the FD state changes (e.g., from not ready to ready). The application must use non-blocking FDs and keep reading or writing until the call fails with EAGAIN/EWOULDBLOCK; data left unread produces no further notification.
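The difference between the two modes can be observed directly. The following is a minimal sketch rather than a server pattern: it writes one byte into a pipe and then calls epoll_wait twice without reading. The function name count_reports is an assumption made for this example.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Writes one byte into a pipe, then calls epoll_wait twice WITHOUT reading.
 * mode is 0 for level-triggered or EPOLLET for edge-triggered.
 * Returns how many of the two calls reported the FD, or -1 on failure. */
int count_reports(uint32_t mode) {
    int p[2];
    if (pipe(p) != 0) return -1;
    int epfd = epoll_create1(0);
    if (epfd < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    struct epoll_event ev = { .events = EPOLLIN | mode, .data.fd = p[0] };
    int reports = -1;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        struct epoll_event out;
        reports = 0;
        for (int i = 0; i < 2; i++) {
            /* timeout 0: return immediately instead of blocking */
            if (epoll_wait(epfd, &out, 1, 0) > 0) reports++;
        }
    }

    close(epfd);
    close(p[0]);
    close(p[1]);
    return reports;
}
```

In level-triggered mode both calls report the FD because the byte is still unread. In edge-triggered mode only the first call does, which is why ET code has to drain the FD before waiting again.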
4.2 Core Data Structures
Red‑black tree : stores all registered FDs, providing O(log n) insert/delete/search.
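Ready list : a doubly linked list of FDs that have become ready; epoll_wait reads only this list, so its cost depends on the number of ready FDs rather than the number registered.
Event copy : epoll_wait copies only the ready events into the caller's array. Mainline Linux epoll does not share memory with user space through mmap, although this claim is widely repeated; the saving comes from not passing the full interest set on every call.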
4.3 Core APIs
Creating an epoll instance:
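// the snippets in this section need these headers
#include <sys/epoll.h>
#include <stdio.h>
#include <stdlib.h>

int epfd = epoll_create1(0);
if (epfd == -1) {
    perror("epoll_create1");
    exit(EXIT_FAILURE);
}
Adding, deleting, or modifying an FD (EPOLL_CTL_ADD is shown; EPOLL_CTL_MOD and EPOLL_CTL_DEL use the same call):
// listen_sock is a listening socket created elsewhere
struct epoll_event event;
event.events = EPOLLIN;
event.data.fd = listen_sock;
if (epoll_ctl(epfd, EPOLL_CTL_ADD, listen_sock, &event) == -1) {
    perror("epoll_ctl: listen_sock");
    exit(EXIT_FAILURE);
}
Waiting for events:
// MAX_EVENTS and handle_incoming_data are defined by the application
struct epoll_event events[MAX_EVENTS];
int nfds = epoll_wait(epfd, events, MAX_EVENTS, -1);
if (nfds == -1) {
    perror("epoll_wait");
    exit(EXIT_FAILURE);
}
// only the first nfds entries are valid, and every one of them is ready
for (int n = 0; n < nfds; ++n) {
    if (events[n].events & EPOLLIN) {
        handle_incoming_data(events[n].data.fd);
    }
}
4.4 Implementation Principle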
When an epoll instance is created, the kernel allocates an eventpoll structure containing a red-black tree (all registered FDs) and a ready list. Each registered FD gets a callback hooked into the wait queue of the underlying file. When an event occurs, the driver or protocol stack wakes that queue, which invokes ep_poll_callback to place the FD's entry on the ready list. epoll_wait only has to check the ready list, so its cost scales with the number of ready FDs rather than the number registered.
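From user space, the effect of the ready list is that epoll_wait hands back only the FDs that have something to report. A small sketch of this, assuming an arbitrary demo size NPIPES and a hypothetical function ready_index; the index of each pipe is stored in the data.u32 field so it can be recovered from the returned event.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#define NPIPES 8  /* arbitrary number of registered FDs for the demo */

/* Registers NPIPES pipe read ends, makes only pipe k readable, and returns
 * the index that epoll_wait reports. Returns -1 on failure or if the
 * number of reported events is not exactly one. */
int ready_index(int k) {
    if (k < 0 || k >= NPIPES) return -1;
    int epfd = epoll_create1(0);
    if (epfd < 0) return -1;

    int pipes[NPIPES][2];
    int made = 0;
    int ok = 1;
    while (ok && made < NPIPES) {
        if (pipe(pipes[made]) != 0) {
            ok = 0;
            break;
        }
        made++;  /* pipes[made - 1] now has to be closed at the end */
        struct epoll_event ev = {
            .events = EPOLLIN,
            .data.u32 = (uint32_t)(made - 1)
        };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipes[made - 1][0], &ev) != 0) {
            ok = 0;
        }
    }

    int result = -1;
    if (ok && write(pipes[k][1], "x", 1) == 1) {
        struct epoll_event events[NPIPES];
        int n = epoll_wait(epfd, events, NPIPES, 0);
        if (n == 1) result = (int)events[0].data.u32;
    }

    for (int i = 0; i < made; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    close(epfd);
    return result;
}
```

Although eight FDs are registered, the application receives a single event and never has to scan the other seven, in contrast to the FD_ISSET and revents loops in the select and poll examples.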
5. Why epoll Is Efficient
5.1 Event‑Driven Mechanism
Instead of polling all FDs, the kernel notifies the application only when an FD becomes ready, eliminating unnecessary checks.
5.2 Data‑Structure Advantage
The red-black tree provides O(log n) management of registered FDs, while the ready list lets epoll_wait return active events in time proportional to the number of ready FDs, independent of how many are registered.
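The registration operations that the tree serves are the three epoll_ctl commands. A minimal sketch of add, modify, and delete on one end of a socket pair follows; the function names pending and demo_add_mod_del are assumptions made for this example.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns how many events are pending right now (timeout 0, max 1). */
static int pending(int epfd) {
    struct epoll_event out;
    return epoll_wait(epfd, &out, 1, 0);
}

/* Fills counts[0..2] with the pending-event count after ADD, MOD, and DEL.
 * Returns 0 on success, -1 on any failure. */
int demo_add_mod_del(int counts[3]) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    int epfd = epoll_create1(0);
    if (epfd < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    int rc = -1;
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = sv[0] };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
        counts[0] = pending(epfd);      /* no data yet: not readable */

        ev.events = EPOLLIN | EPOLLOUT; /* also watch for writability */
        if (epoll_ctl(epfd, EPOLL_CTL_MOD, sv[0], &ev) == 0) {
            counts[1] = pending(epfd);  /* empty send buffer: writable */

            if (epoll_ctl(epfd, EPOLL_CTL_DEL, sv[0], NULL) == 0) {
                counts[2] = pending(epfd);  /* no longer registered */
                rc = 0;
            }
        }
    }

    close(epfd);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Each command is a single system call on one FD, so the interest set is maintained incrementally instead of being rebuilt and passed in on every wait.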
5.3 Data‑Transfer Optimization
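FDs are registered once with epoll_ctl, so epoll_wait does not pass the whole interest set to the kernel on every call the way select and poll do; only the ready events are copied into the caller's array. Mainline Linux epoll does not use mmap for this, although the claim is widely repeated.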
6. Common I/O Multiplexing Interview Questions
What is I/O multiplexing? It is a synchronous I/O model that lets a single thread monitor multiple file descriptors and get notified when any becomes ready.
Why do we need it? It avoids the thread‑per‑connection overhead of blocking I/O and the CPU waste of busy‑polling non‑blocking I/O.
Common implementations? select, poll, and epoll on Linux; kqueue on BSD and macOS.
Drawbacks of select? FD limit (usually 1024), per‑call copying of FD sets, linear scanning, and only level‑triggered mode.
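Difference between poll and select? poll uses a caller-supplied array of struct pollfd without the FD_SETSIZE limit, but the kernel still copies the array on each call and the application still scans it linearly.
Why is epoll efficient? Event-driven notification, a persistent interest set registered once through epoll_ctl (so only ready events are copied back on each call), and high FD limits.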
Difference between LT and ET? LT repeatedly reports a ready FD until the condition is cleared; ET reports only on state changes, requiring the application to drain the FD.
When to use I/O multiplexing? In servers handling many concurrent connections with relatively low per‑connection activity (e.g., Redis, Nginx).
Does epoll have drawbacks? It is Linux‑specific, limiting cross‑platform portability.
Are select/poll/epoll asynchronous? No. They only report readiness; the application then performs the read or write itself, synchronously, in the calling thread.
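For the LT versus ET question above, the phrase "drain the FD" can be made concrete. The sketch below is one possible shape of a drain loop on a non-blocking FD; set_nonblocking, drain_fd, and demo_drain are hypothetical names, and the 4096-byte cap in the demo is an arbitrary choice that keeps the write from blocking.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Reads until the kernel reports EAGAIN/EWOULDBLOCK (nothing left for now)
 * or end-of-file. Returns total bytes read, or -1 on a real error. */
long drain_fd(int fd) {
    char buf[1024];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0) return total;          /* peer closed */
        if (errno == EINTR) continue;      /* interrupted: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK) return total;
        return -1;
    }
}

/* Demo: queue len bytes (at most 4096) in a pipe, then drain them. */
long demo_drain(size_t len) {
    char data[4096] = {0};
    if (len > sizeof(data)) return -1;

    int p[2];
    if (pipe(p) != 0) return -1;

    long result = -1;
    if (write(p[1], data, len) == (ssize_t)len && set_nonblocking(p[0]) == 0) {
        result = drain_fd(p[0]);
    }
    close(p[0]);
    close(p[1]);
    return result;
}
```

In an edge-triggered server, a loop like drain_fd would run once per EPOLLIN notification, and the application would return to epoll_wait only after it sees EAGAIN.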
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.