How Redis Implements Efficient I/O Multiplexing: A Deep Dive into select, epoll, and kqueue
This article explains why Redis, a single‑threaded server, adopts I/O multiplexing to avoid blocking, compares blocking I/O with multiplexed models, and details the internal implementation of select, epoll, and kqueue wrappers that power Redis's high‑performance event loop.
Various I/O Models
Redis executes commands in a single thread, one after another. Because read and write calls block by default, a single stalled I/O call would hold up the whole server, which is why I/O multiplexing is essential.
Blocking I/O
When read or write is called on a file descriptor (FD) that is not ready, the entire Redis process becomes unresponsive to other clients.
The traditional blocking model is simple but unsuitable for handling many concurrent clients.
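To make the stall concrete, here is a minimal sketch (not Redis code) that uses a pipe as a stand-in for a client socket. In the default blocking mode, read on an empty pipe would not return until data arrives. With O_NONBLOCK set, the same call returns immediately and reports EAGAIN. The helper names set_nonblocking and try_read are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode.
 * Returns 0 on success, -1 on failure. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: attempt a read without stalling the caller.
 * Returns the number of bytes read, 0 on EOF, -1 on error (errno set),
 * or -2 when no data is available yet. */
static long try_read(int fd, char *buf, size_t len) {
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) return -2;
    return (long)n;
}
```

A non-blocking descriptor alone is not enough, because the caller still needs to know when to retry. That is the gap multiplexing fills.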
I/O Multiplexing
To serve multiple Redis clients efficiently, Redis uses an I/O multiplexing model that can monitor many FDs simultaneously.
The core function in this model is select, which watches multiple FDs for readability or writability and returns the number of ready descriptors.
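The usage of select is widely documented online, so it is not covered in detail here. Other multiplexing mechanisms such as epoll, kqueue, and evport generally perform better with many connections, because they report only the descriptors that are ready instead of requiring a scan of the whole set.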
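For comparison with select, the following is a minimal, Linux-only sketch of the epoll call sequence: create an instance, register interest in a descriptor, then wait. The function name epoll_wait_readable is hypothetical, and creating a throwaway epoll instance per call is done only to keep the sketch short. A real server creates the instance once and reuses it.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Linux-only sketch: wait up to timeout_ms for fd to be reported ready.
 * Returns 1 if an event fired (readable, error, or hang-up),
 * 0 on timeout, -1 on error. */
static int epoll_wait_readable(int fd, int timeout_ms) {
    int epfd = epoll_create1(0);
    if (epfd == -1) return -1;

    struct epoll_event ee = {0};
    ee.events = EPOLLIN;          /* interested in readability */
    ee.data.fd = fd;              /* echoed back by epoll_wait */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ee) == -1) {
        close(epfd);
        return -1;
    }

    struct epoll_event ready[1];
    int n = epoll_wait(epfd, ready, 1, timeout_ms);
    close(epfd);
    return n;                     /* -1, 0, or 1 with a one-slot buffer */
}
```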
Reactor Design Pattern
Redis implements a Reactor pattern: a single file‑event handler monitors all network connections (each represented by an FD). When events like accept, read, write, or close occur, the handler dispatches them to the appropriate callbacks.
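The dispatch idea can be sketched in a few dozen lines. This is an illustrative reactor built on select, not the Redis implementation: every name here (sketch_reactor, reactor_register, reactor_poll_once, drain_handler) and the table size of 64 are assumptions made for the example.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

#define SKETCH_MAX_FDS 64   /* arbitrary small table size for the sketch */

typedef void (*sketch_handler)(int fd, void *ctx);

typedef struct {
    sketch_handler on_readable[SKETCH_MAX_FDS]; /* NULL = fd not registered */
    void *ctx[SKETCH_MAX_FDS];
    int maxfd;                                  /* -1 when nothing is registered */
} sketch_reactor;

static void reactor_init(sketch_reactor *r) {
    for (int i = 0; i < SKETCH_MAX_FDS; i++) {
        r->on_readable[i] = NULL;
        r->ctx[i] = NULL;
    }
    r->maxfd = -1;
}

/* Register a read callback for fd. Returns 0, or -1 if fd is out of range. */
static int reactor_register(sketch_reactor *r, int fd, sketch_handler h, void *ctx) {
    if (fd < 0 || fd >= SKETCH_MAX_FDS) return -1;
    r->on_readable[fd] = h;
    r->ctx[fd] = ctx;
    if (fd > r->maxfd) r->maxfd = fd;
    return 0;
}

/* One loop iteration: wait for readiness, then dispatch to callbacks.
 * Returns the number of callbacks invoked, 0 on timeout, -1 on select error. */
static int reactor_poll_once(sketch_reactor *r, struct timeval *timeout) {
    fd_set rfds;
    FD_ZERO(&rfds);
    for (int fd = 0; fd <= r->maxfd; fd++)
        if (r->on_readable[fd]) FD_SET(fd, &rfds);

    int ready = select(r->maxfd + 1, &rfds, NULL, NULL, timeout);
    if (ready <= 0) return ready;

    int dispatched = 0;
    for (int fd = 0; fd <= r->maxfd; fd++) {
        if (r->on_readable[fd] && FD_ISSET(fd, &rfds)) {
            r->on_readable[fd](fd, r->ctx[fd]);
            dispatched++;
        }
    }
    return dispatched;
}

/* Example handler: consumes pending bytes and adds the count to *ctx. */
static void drain_handler(int fd, void *ctx) {
    char buf[64];
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0) *(long *)ctx += n;
}
```

A caller would register one handler per connection and call reactor_poll_once in a loop. The handlers never block, so one slow client cannot hold up the others.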
I/O Multiplexing Module
The module abstracts the underlying system calls (select, epoll, evport, kqueue) and presents a uniform API to the upper layers.
static int aeApiCreate(aeEventLoop *eventLoop);
static int aeApiResize(aeEventLoop *eventLoop, int setsize);
static void aeApiFree(aeEventLoop *eventLoop);
static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask);
static void aeApiDelEvent(aeEventLoop *eventLoop, int fd, int mask);
static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp);
Each sub-module stores its context in an aeApiState structure, which is kept inside eventLoop->apidata and never exposed to higher layers.
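The opaque-pointer arrangement can be illustrated with a standalone sketch. The types and functions below (sketch_loop, sketch_select_state, sketch_api_create, sketch_api_resize, sketch_api_free) are hypothetical stand-ins that mirror the shape of the API above. They use plain malloc rather than Redis's allocator and are not the actual Redis definitions.

```c
#include <stdlib.h>
#include <sys/select.h>

/* Generic layer: knows only that the backend keeps "something" in apidata. */
typedef struct sketch_loop {
    int setsize;      /* maximum number of tracked descriptors */
    void *apidata;    /* backend-private state, opaque to callers */
} sketch_loop;

/* Backend layer (select flavour): the only code that knows this layout. */
typedef struct sketch_select_state {
    fd_set rfds, wfds;
} sketch_select_state;

static int sketch_api_create(sketch_loop *loop) {
    sketch_select_state *state = malloc(sizeof(*state));
    if (!state) return -1;
    FD_ZERO(&state->rfds);
    FD_ZERO(&state->wfds);
    loop->apidata = state;
    return 0;
}

static int sketch_api_resize(sketch_loop *loop, int setsize) {
    /* fd_set has a fixed capacity, so the backend can only check the bound. */
    if (setsize >= FD_SETSIZE) return -1;
    loop->setsize = setsize;
    return 0;
}

static void sketch_api_free(sketch_loop *loop) {
    free(loop->apidata);
    loop->apidata = NULL;
}
```

Because callers only ever see a void pointer, swapping the select state for an epoll or kqueue state changes nothing above this layer.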
Wrapping the select Function
Plain select usage, which the wrapper builds on, looks like this:
int fd = /* file descriptor */;
fd_set rfds;
for (;;) {
    /* select overwrites the set on return, so rebuild it on every iteration */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    select(fd + 1, &rfds, NULL, NULL, NULL);
    if (FD_ISSET(fd, &rfds)) {
        /* fd is readable */
    }
}
Initialize an fd_set for readable descriptors.
Add the target fd to the set with FD_SET.
Call select to monitor the set.
When select returns, check which fds are ready and handle them.
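One detail behind the select call is worth spelling out: select overwrites the sets it is given, leaving only the ready descriptors marked. A set that is reused across calls must therefore be rebuilt or restored from a copy. The sketch below shows the copy approach with a hypothetical helper, poll_with_copy, using a zero timeout so that it never blocks.

```c
#include <string.h>
#include <sys/select.h>

/* Hypothetical helper: check fd for readability without blocking.
 * select runs on a scratch copy, so the caller's master set is untouched.
 * Returns select's result: number of ready fds, 0 if none, -1 on error. */
static int poll_with_copy(int fd, const fd_set *master, fd_set *scratch) {
    struct timeval zero = {0, 0};
    memcpy(scratch, master, sizeof(fd_set));
    return select(fd + 1, scratch, NULL, NULL, &zero);
}
```

After a call that finds nothing ready, the scratch set no longer contains the descriptor while the master set still does, so the next call starts from the full interest list again.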
The Redis implementation creates the sets in aeApiCreate:
static int aeApiCreate(aeEventLoop *eventLoop) {
    aeApiState *state = zmalloc(sizeof(aeApiState));
    if (!state) return -1;
    FD_ZERO(&state->rfds);
    FD_ZERO(&state->wfds);
    eventLoop->apidata = state;
    return 0;
}
Adding or removing events updates the sets with FD_SET / FD_CLR in aeApiAddEvent and aeApiDelEvent:
static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
    aeApiState *state = eventLoop->apidata;
    if (mask & AE_READABLE) FD_SET(fd, &state->rfds);
    if (mask & AE_WRITABLE) FD_SET(fd, &state->wfds);
    return 0;
}
The polling function copies the sets, calls select, and fills the eventLoop->fired array with ready fds:
static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
    aeApiState *state = eventLoop->apidata;
    int retval, j, numevents = 0;
    /* select modifies the sets it is given, so work on copies
     * and keep rfds/wfds intact for the next call */
    memcpy(&state->_rfds, &state->rfds, sizeof(fd_set));
    memcpy(&state->_wfds, &state->wfds, sizeof(fd_set));
    retval = select(eventLoop->maxfd+1, &state->_rfds, &state->_wfds, NULL, tvp);
    if (retval > 0) {
        /* scan every registered fd up to maxfd */
        for (j = 0; j <= eventLoop->maxfd; j++) {
            int mask = 0;
            aeFileEvent *fe = &eventLoop->events[j];
            if (fe->mask == AE_NONE) continue;
            if (fe->mask & AE_READABLE && FD_ISSET(j, &state->_rfds)) mask |= AE_READABLE;
            if (fe->mask & AE_WRITABLE && FD_ISSET(j, &state->_wfds)) mask |= AE_WRITABLE;
            eventLoop->fired[numevents].fd = j;
            eventLoop->fired[numevents].mask = mask;
            numevents++;
        }
    }
    return numevents;
}
Wrapping the epoll Function
For platforms that support epoll, Redis creates an epoll instance and stores it in aeApiState:
static int aeApiCreate(aeEventLoop *eventLoop) {
    aeApiState *state = zmalloc(sizeof(aeApiState));
    if (!state) return -1;
    state->events = zmalloc(sizeof(struct epoll_event) * eventLoop->setsize);
    if (!state->events) { zfree(state); return -1; }
    state->epfd = epoll_create(1024); /* 1024 is only a size hint for the kernel */
    if (state->epfd == -1) { zfree(state->events); zfree(state); return -1; }
    eventLoop->apidata = state;
    return 0;
}
Adding an event uses epoll_ctl with either EPOLL_CTL_ADD or EPOLL_CTL_MOD depending on whether the fd was already monitored:
static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
    aeApiState *state = eventLoop->apidata;
    struct epoll_event ee = {0};
    /* ADD for a new fd, MOD for one that is already monitored */
    int op = eventLoop->events[fd].mask == AE_NONE ? EPOLL_CTL_ADD : EPOLL_CTL_MOD;
    /* merge with the events already registered, because MOD replaces the whole mask */
    mask |= eventLoop->events[fd].mask;
    if (mask & AE_READABLE) ee.events |= EPOLLIN;
    if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
    ee.data.fd = fd;
    if (epoll_ctl(state->epfd, op, fd, &ee) == -1) return -1;
    return 0;
}
The poll function waits for events and translates them back to the generic mask:
static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
    aeApiState *state = eventLoop->apidata;
    int retval, j, numevents = 0;
    /* epoll_wait takes milliseconds; -1 means wait indefinitely */
    retval = epoll_wait(state->epfd, state->events, eventLoop->setsize,
                        tvp ? (tvp->tv_sec*1000 + tvp->tv_usec/1000) : -1);
    if (retval > 0) {
        /* only the ready descriptors are returned, so no full scan is needed */
        for (j = 0; j < retval; j++) {
            int mask = 0;
            struct epoll_event *e = &state->events[j];
            if (e->events & EPOLLIN) mask |= AE_READABLE;
            if (e->events & EPOLLOUT) mask |= AE_WRITABLE;
            /* errors and hang-ups are reported as writable */
            if (e->events & EPOLLERR) mask |= AE_WRITABLE;
            if (e->events & EPOLLHUP) mask |= AE_WRITABLE;
            eventLoop->fired[j].fd = e->data.fd;
            eventLoop->fired[j].mask = mask;
        }
        numevents = retval;
    }
    return numevents;
}
Choosing the Sub‑module
Redis selects the most efficient multiplexing implementation available on the target platform using compile‑time macros:
#ifdef HAVE_EVPORT
#include "ae_evport.c"
#else
    #ifdef HAVE_EPOLL
    #include "ae_epoll.c"
    #else
        #ifdef HAVE_KQUEUE
        #include "ae_kqueue.c"
        #else
        #include "ae_select.c"
        #endif
    #endif
#endif
If none of the advanced mechanisms is available, select is used as a fallback, despite its linear scan over the monitored descriptors on every call and its FD_SETSIZE limit (typically 1024 descriptors).
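The same priority order can be written as a flat #elif chain. The sketch below is illustrative only: the platform macros it tests (__sun, __linux__, __APPLE__ and the BSD macros) are assumptions standing in for however the HAVE_* flags are actually derived, and SKETCH_BACKEND and sketch_backend_name are hypothetical names.

```c
/* Pick a backend name at compile time, best option first.
 * The platform tests are assumptions made for this sketch. */
#if defined(__sun)
#define SKETCH_BACKEND "evport"
#elif defined(__linux__)
#define SKETCH_BACKEND "epoll"
#elif defined(__APPLE__) || defined(__FreeBSD__) || \
      defined(__OpenBSD__) || defined(__NetBSD__)
#define SKETCH_BACKEND "kqueue"
#else
#define SKETCH_BACKEND "select"
#endif

/* Report which backend this build would use. */
static const char *sketch_backend_name(void) {
    return SKETCH_BACKEND;
}
```

Since the choice is made by the preprocessor, exactly one backend is compiled into the binary and there is no runtime dispatch cost.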
Conclusion
Redis’s I/O multiplexing module is concise and portable: macro guards ensure the best available mechanism (evport, epoll, kqueue, or select) is used, while a uniform API hides platform differences. This design lets a single‑process Redis serve tens of thousands of connections efficiently without the complexity of multi‑process architectures.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
