How Redis Implements Efficient I/O Multiplexing: A Deep Dive into select, epoll, and kqueue

This article explains why Redis, a single‑threaded server, adopts I/O multiplexing to avoid blocking, compares blocking I/O with multiplexed models, and details the internal implementation of select, epoll, and kqueue wrappers that power Redis's high‑performance event loop.

Programmer DD

Various I/O Models

Redis executes commands sequentially on a single thread. If that thread issued plain blocking read/write calls, one client whose socket was not ready would stall every other client, which is why I/O multiplexing is essential.

Blocking I/O

When read or write is called on a file descriptor (FD) that is not ready, the calling thread sleeps until it is. In a single-threaded server, this means the entire Redis process becomes unresponsive to other clients in the meantime.

The traditional blocking model is simple but unsuitable for handling many concurrent clients.
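To make the "not ready" condition concrete, here is a minimal sketch (not Redis code; the helper name read_would_block is made up for illustration). It creates an empty pipe, switches the read end to non-blocking mode, and attempts a read. The EAGAIN result marks exactly the situation in which a blocking read would have suspended the thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if reading from an empty pipe reports
 * "would block", 0 if it does not, and -1 if the setup fails. */
int read_would_block(void) {
    int p[2];
    char buf[16];
    if (pipe(p) == -1) return -1;

    /* Non-blocking mode makes read() return immediately instead of
     * suspending the calling thread. */
    int flags = fcntl(p[0], F_GETFL, 0);
    if (flags == -1 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* Nothing has been written, so there is no data to return. */
    ssize_t n = read(p[0], buf, sizeof(buf));
    int would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(p[0]);
    close(p[1]);
    return would_block;
}
```

Without the O_NONBLOCK flag, the same read call would not return until another process or thread wrote to the pipe.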

I/O Multiplexing

To serve multiple Redis clients efficiently, Redis uses an I/O multiplexing model that can monitor many FDs simultaneously.

The core function in this model is select, which watches multiple FDs for readability or writability and returns the number of ready descriptors.

The usage of select is widely documented, so only a brief example appears later in this article. Other multiplexing functions such as epoll, kqueue, and evport scale better as the number of monitored FDs grows.

Reactor Design Pattern

Redis implements a Reactor pattern: a single file‑event handler monitors all network connections (each represented by an FD). When events like accept, read, write, or close occur, the handler dispatches them to the appropriate callbacks.
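The dispatch idea can be sketched in a few dozen lines. The code below is an illustration only, not the Redis implementation: the reactor type, MAX_FDS, and every function name are assumptions made for this example, and it handles readable events only, using select as the demultiplexer.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#define MAX_FDS 64  /* illustrative limit, not a Redis constant */

typedef void (*handler_fn)(int fd, void *ctx);

typedef struct {
    handler_fn on_readable[MAX_FDS]; /* callback per fd, NULL if unused */
    void *ctx[MAX_FDS];              /* opaque argument for each callback */
    int maxfd;                       /* highest registered fd, -1 if none */
} reactor;

void reactor_init(reactor *r) {
    for (int i = 0; i < MAX_FDS; i++) {
        r->on_readable[i] = NULL;
        r->ctx[i] = NULL;
    }
    r->maxfd = -1;
}

int reactor_register(reactor *r, int fd, handler_fn fn, void *ctx) {
    if (fd < 0 || fd >= MAX_FDS) return -1;
    r->on_readable[fd] = fn;
    r->ctx[fd] = ctx;
    if (fd > r->maxfd) r->maxfd = fd;
    return 0;
}

/* One loop iteration: wait for readiness, then dispatch to callbacks.
 * Returns the number of callbacks invoked, 0 on timeout, -1 on error. */
int reactor_run_once(reactor *r, struct timeval *timeout) {
    fd_set rfds;
    FD_ZERO(&rfds);
    for (int fd = 0; fd <= r->maxfd; fd++)
        if (r->on_readable[fd]) FD_SET(fd, &rfds);

    int ready = select(r->maxfd + 1, &rfds, NULL, NULL, timeout);
    if (ready <= 0) return ready;

    int dispatched = 0;
    for (int fd = 0; fd <= r->maxfd; fd++) {
        if (r->on_readable[fd] && FD_ISSET(fd, &rfds)) {
            r->on_readable[fd](fd, r->ctx[fd]);
            dispatched++;
        }
    }
    return dispatched;
}

/* Example callback: consume one byte and count the invocation. */
void count_read(int fd, void *ctx) {
    char c;
    if (read(fd, &c, 1) == 1) (*(int *)ctx)++;
}
```

A caller registers one callback per connection and then calls reactor_run_once in a loop; the callbacks never block because they run only after the demultiplexer reports the fd as ready.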

I/O Multiplexing Module

The module abstracts the underlying system calls (select, epoll, evport, kqueue) and presents a uniform API to the upper layers.

static int aeApiCreate(aeEventLoop *eventLoop);
static int aeApiResize(aeEventLoop *eventLoop, int setsize);
static void aeApiFree(aeEventLoop *eventLoop);
static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask);
static void aeApiDelEvent(aeEventLoop *eventLoop, int fd, int mask);
static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp);

Each sub‑module stores its context in an aeApiState structure, which is kept inside eventLoop->apidata and never exposed to higher layers.
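The opaque-pointer arrangement can be illustrated with simplified stand-in types. Everything named below (event_loop, select_state, the backend_* functions) is hypothetical and exists only for this sketch; the point is that the generic loop holds a void pointer while only the backend knows the concrete layout behind it.

```c
#include <stdlib.h>
#include <sys/select.h>

/* Generic loop: knows nothing about the backend's private state. */
typedef struct {
    int maxfd;
    void *apidata;   /* opaque pointer owned by the backend */
} event_loop;

/* Backend-private state. In a real layout this would live in the
 * backend's own .c file so that callers cannot see its fields. */
typedef struct {
    fd_set rfds;
} select_state;

int backend_create(event_loop *loop) {
    select_state *st = malloc(sizeof(*st));
    if (!st) return -1;
    FD_ZERO(&st->rfds);
    loop->apidata = st;
    return 0;
}

int backend_watch_read(event_loop *loop, int fd) {
    select_state *st = loop->apidata;
    if (fd < 0 || fd >= FD_SETSIZE) return -1;
    FD_SET(fd, &st->rfds);
    if (fd > loop->maxfd) loop->maxfd = fd;
    return 0;
}

int backend_is_watched(event_loop *loop, int fd) {
    select_state *st = loop->apidata;
    if (fd < 0 || fd >= FD_SETSIZE) return 0;
    return FD_ISSET(fd, &st->rfds) ? 1 : 0;
}

void backend_free(event_loop *loop) {
    free(loop->apidata);
    loop->apidata = NULL;
}
```

Swapping in a different backend would mean replacing select_state and the four functions while event_loop and its callers stay unchanged.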

Wrapping the select Function

The select wrapper works as follows:

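#include <sys/select.h>

/* Waits repeatedly for fd to become readable. */
void watch_fd(int fd) {
    fd_set rfds;
    for (;;) {
        /* select() overwrites the set, so rebuild it before every call. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        if (select(fd + 1, &rfds, NULL, NULL, NULL) == -1) break; /* error */
        if (FD_ISSET(fd, &rfds)) {
            /* fd is readable: read from it here */
        }
    }
}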

Initialize an fd_set for readable descriptors.

Add the target fd to the set with FD_SET.

Call select to monitor the set.

When select returns, check which fds are ready and handle them.
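The four steps can also be exercised in a bounded form that is easy to test. The helper below is a sketch, not part of Redis: wait_readable and its millisecond timeout parameter are assumptions for this example.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms) {
    if (fd < 0 || fd >= FD_SETSIZE) return -1;

    fd_set rfds;
    struct timeval tv;

    FD_ZERO(&rfds);                       /* step 1: start from an empty set */
    FD_SET(fd, &rfds);                    /* step 2: add the descriptor */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* step 3: let the kernel watch the set */
    int ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready == -1) return -1;

    /* step 4: check whether our descriptor is among the ready ones */
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

Called on the read end of an empty pipe, the helper times out; after a byte is written to the other end, it reports the descriptor as readable.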

The Redis implementation creates the sets in aeApiCreate:

static int aeApiCreate(aeEventLoop *eventLoop) {
    aeApiState *state = zmalloc(sizeof(aeApiState));
    if (!state) return -1;
    FD_ZERO(&state->rfds);
    FD_ZERO(&state->wfds);
    eventLoop->apidata = state;
    return 0;
}

aeApiAddEvent adds an fd to the sets with FD_SET, and aeApiDelEvent removes it with FD_CLR. The add path looks like this:

static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
    aeApiState *state = eventLoop->apidata;
    if (mask & AE_READABLE) FD_SET(fd, &state->rfds);
    if (mask & AE_WRITABLE) FD_SET(fd, &state->wfds);
    return 0;
}
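The removal path is the mirror image of the add path. The following sketch uses stand-in names (sketch_state, sketch_add_event, sketch_del_event) rather than the Redis types so that it compiles on its own; the AE_* values are defined locally for the same reason.

```c
#include <sys/select.h>

#define AE_NONE     0
#define AE_READABLE 1
#define AE_WRITABLE 2

/* Stand-in for the select backend's private state. */
typedef struct {
    fd_set rfds, wfds;
} sketch_state;

void sketch_add_event(sketch_state *state, int fd, int mask) {
    if (mask & AE_READABLE) FD_SET(fd, &state->rfds);
    if (mask & AE_WRITABLE) FD_SET(fd, &state->wfds);
}

/* Clears only the interests named in mask, leaving the others intact. */
void sketch_del_event(sketch_state *state, int fd, int mask) {
    if (mask & AE_READABLE) FD_CLR(fd, &state->rfds);
    if (mask & AE_WRITABLE) FD_CLR(fd, &state->wfds);
}
```

Because each interest lives in its own set, dropping the writable interest for an fd does not disturb its readable registration.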

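The polling function copies the sets into scratch copies (_rfds, _wfds), calls select on the copies, and then walks every registered fd up to maxfd. Each registered fd is recorded in the eventLoop->fired array together with a mask of the events select reported for it; the mask stays empty for an fd that is registered but not ready: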

static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
    aeApiState *state = eventLoop->apidata;
    int retval, j, numevents = 0;
    memcpy(&state->_rfds, &state->rfds, sizeof(fd_set));
    memcpy(&state->_wfds, &state->wfds, sizeof(fd_set));
    retval = select(eventLoop->maxfd+1, &state->_rfds, &state->_wfds, NULL, tvp);
    if (retval > 0) {
        for (j = 0; j <= eventLoop->maxfd; j++) {
            int mask = 0;
            aeFileEvent *fe = &eventLoop->events[j];
            if (fe->mask == AE_NONE) continue;
            if (fe->mask & AE_READABLE && FD_ISSET(j, &state->_rfds)) mask |= AE_READABLE;
            if (fe->mask & AE_WRITABLE && FD_ISSET(j, &state->_wfds)) mask |= AE_WRITABLE;
            eventLoop->fired[numevents].fd = j;
            eventLoop->fired[numevents].mask = mask;
            numevents++;
        }
    }
    return numevents;
}
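The memcpy calls matter because select overwrites the sets it is given, leaving only the ready descriptors. The sketch below (the function name is made up for this example) shows an idle fd disappearing from the working copy while the long-lived set keeps it.

```c
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if select() removed an idle fd from the working copy while
 * the master set stayed intact, 0 otherwise, -1 on error. */
int select_clears_idle_fd(void) {
    int p[2];
    if (pipe(p) == -1) return -1;
    if (p[0] >= FD_SETSIZE) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    fd_set master, scratch;
    FD_ZERO(&master);
    FD_SET(p[0], &master);                      /* long-lived registration */
    memcpy(&scratch, &master, sizeof(fd_set));  /* per-call working copy */

    struct timeval tv = {0, 0};                 /* poll without waiting */
    int ready = select(p[0] + 1, &scratch, NULL, NULL, &tv);

    int result = (ready == 0 &&
                  !FD_ISSET(p[0], &scratch) &&  /* cleared by select */
                  FD_ISSET(p[0], &master));     /* registration preserved */

    close(p[0]);
    close(p[1]);
    return ready == -1 ? -1 : result;
}
```

If the wrapper passed the long-lived sets directly, every fd that happened to be idle during one call would silently stop being monitored on the next.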

Wrapping the epoll Function

For platforms that support epoll, Redis creates an epoll instance and stores it in aeApiState:

static int aeApiCreate(aeEventLoop *eventLoop) {
    aeApiState *state = zmalloc(sizeof(aeApiState));
    if (!state) return -1;
    state->events = zmalloc(sizeof(struct epoll_event) * eventLoop->setsize);
    if (!state->events) { zfree(state); return -1; }
    state->epfd = epoll_create(1024);
    if (state->epfd == -1) { zfree(state->events); zfree(state); return -1; }
    eventLoop->apidata = state;
    return 0;
}

Adding an event uses epoll_ctl with either EPOLL_CTL_ADD or EPOLL_CTL_MOD depending on whether the fd was already monitored:

static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
    aeApiState *state = eventLoop->apidata;
    struct epoll_event ee = {0};
    int op = eventLoop->events[fd].mask == AE_NONE ? EPOLL_CTL_ADD : EPOLL_CTL_MOD;
    mask |= eventLoop->events[fd].mask;
    if (mask & AE_READABLE) ee.events |= EPOLLIN;
    if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
    ee.data.fd = fd;
    if (epoll_ctl(state->epfd, op, fd, &ee) == -1) return -1;
    return 0;
}
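The ADD-versus-MOD choice exists because epoll rejects a second EPOLL_CTL_ADD for an fd that is already registered. The Linux-only sketch below demonstrates this with a pipe; the function names and the already_registered flag are assumptions for this example (Redis derives the same information from the stored event mask).

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Registers fd for the given epoll events, choosing ADD or MOD from the
 * caller's record of whether the fd is already registered. */
int sketch_epoll_set(int epfd, int fd, uint32_t events, int already_registered) {
    struct epoll_event ee = {0};
    ee.events = events;
    ee.data.fd = fd;
    int op = already_registered ? EPOLL_CTL_MOD : EPOLL_CTL_ADD;
    return epoll_ctl(epfd, op, fd, &ee);
}

/* Returns 1 when a repeated ADD fails with EEXIST and MOD succeeds,
 * 0 if the behavior differs, -1 if the setup fails. */
int epoll_add_then_mod_demo(void) {
    int p[2];
    if (pipe(p) == -1) return -1;
    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    int first_add = sketch_epoll_set(epfd, p[0], EPOLLIN, 0);
    int second_add = sketch_epoll_set(epfd, p[0], EPOLLIN, 0);
    int second_errno = errno;   /* save before other calls overwrite it */
    int mod = sketch_epoll_set(epfd, p[0], EPOLLIN | EPOLLOUT, 1);

    close(epfd);
    close(p[0]);
    close(p[1]);
    return first_add == 0 && second_add == -1 &&
           second_errno == EEXIST && mod == 0;
}
```

Merging the old mask into the new one before the call, as the wrapper does, is what keeps an existing interest alive when a second one is added with EPOLL_CTL_MOD.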

The poll function waits for events and translates them back to the generic mask. Note that the error and hang-up conditions (EPOLLERR, EPOLLHUP) are reported as AE_WRITABLE:

static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
    aeApiState *state = eventLoop->apidata;
    int retval, j, numevents = 0;
    retval = epoll_wait(state->epfd, state->events, eventLoop->setsize,
                        tvp ? (tvp->tv_sec*1000 + tvp->tv_usec/1000) : -1);
    if (retval > 0) {
        for (j = 0; j < retval; j++) {
            int mask = 0;
            struct epoll_event *e = &state->events[j];
            if (e->events & EPOLLIN) mask |= AE_READABLE;
            if (e->events & EPOLLOUT) mask |= AE_WRITABLE;
            if (e->events & EPOLLERR) mask |= AE_WRITABLE;
            if (e->events & EPOLLHUP) mask |= AE_WRITABLE;
            eventLoop->fired[j].fd = e->data.fd;
            eventLoop->fired[j].mask = mask;
        }
        numevents = retval;
    }
    return numevents;
}
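The timeout expression passed to epoll_wait is worth isolating, since epoll_wait takes milliseconds while the wrapper receives a struct timeval. The helper below is a sketch with a made-up name that reproduces the same arithmetic.

```c
#include <stddef.h>
#include <sys/time.h>

/* Converts an optional timeval to a millisecond timeout using the same
 * expression as the wrapper: NULL means "wait indefinitely" (-1). */
int timeval_to_epoll_ms(const struct timeval *tvp) {
    return tvp ? (int)(tvp->tv_sec * 1000 + tvp->tv_usec / 1000) : -1;
}
```

The integer division truncates, so a timeout of 2.5 seconds becomes 2500 ms, while anything shorter than one millisecond becomes 0 and makes epoll_wait return immediately.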

Choosing the Sub‑module

Redis selects the most efficient multiplexing implementation available on the target platform using compile‑time macros:

#ifdef HAVE_EVPORT
#include "ae_evport.c"
#else
    #ifdef HAVE_EPOLL
    #include "ae_epoll.c"
    #else
        #ifdef HAVE_KQUEUE
        #include "ae_kqueue.c"
        #else
        #include "ae_select.c"
        #endif
    #endif
#endif
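The same priority order can be reproduced in a self-contained form. In the sketch below, the platform tests that set the SKETCH_HAVE_* macros are simplified assumptions for illustration and are not the detection rules Redis uses; only the evport, epoll, kqueue, select ordering is taken from the listing above.

```c
/* Assumed, simplified platform detection (illustration only). */
#if defined(__sun)
#define SKETCH_HAVE_EVPORT 1
#elif defined(__linux__)
#define SKETCH_HAVE_EPOLL 1
#elif defined(__APPLE__) || defined(__FreeBSD__) || \
      defined(__OpenBSD__) || defined(__NetBSD__)
#define SKETCH_HAVE_KQUEUE 1
#endif

/* Reports which backend the preprocessor selected at compile time. */
const char *sketch_backend_name(void) {
#if defined(SKETCH_HAVE_EVPORT)
    return "evport";
#elif defined(SKETCH_HAVE_EPOLL)
    return "epoll";
#elif defined(SKETCH_HAVE_KQUEUE)
    return "kqueue";
#else
    return "select";
#endif
}
```

Because the choice is made by the preprocessor, exactly one backend is compiled into the binary and there is no runtime dispatch cost.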

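If none of the advanced mechanisms are present, select is used as a fallback, even though every call scans all descriptors up to the highest one (O(n) in the number of monitored FDs) and an fd_set can hold only FD_SETSIZE descriptors, typically 1024.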
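The descriptor limit comes from the fixed size of fd_set. A guard like the following sketch (the function name is hypothetical) is a common way to avoid passing an out-of-range fd to FD_SET, which is undefined behavior.

```c
#include <sys/select.h>

/* Returns 1 if fd can be stored in an fd_set, 0 otherwise. */
int fits_in_fd_set(int fd) {
    return fd >= 0 && fd < FD_SETSIZE;
}
```

The limit applies to the numeric value of the descriptor, not to how many descriptors are being watched, so a process with many open files can hit it even when monitoring only a few of them.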

Conclusion

Redis’s I/O multiplexing module is concise and portable: macro guards ensure the best available mechanism (evport, epoll, kqueue, or select) is used, while a uniform API hides platform differences. This design lets a single‑threaded Redis serve tens of thousands of connections efficiently without the complexity of multi‑threaded or multi‑process architectures.
