A Comprehensive Guide to epoll in Linux: Principles, Design, and Practical Usage

This article explains how epoll improves Linux I/O multiplexing by using red‑black trees and ready lists, compares level‑triggered and edge‑triggered modes, details the epoll_create/epoll_ctl/epoll_wait system calls, discusses common pitfalls, and provides a complete TCP server example for handling many concurrent connections.

Deepin Linux

Feb 6, 2025

Servers often have to handle a very large number of concurrent connections, and epoll is the Linux I/O multiplexing mechanism designed for that workload.

epoll improves upon select/poll by maintaining a red‑black tree of monitored file descriptors and a ready list, allowing the kernel to return only active descriptors without scanning the entire set.

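It supports both level‑triggered (LT, the default) and edge‑triggered (ET, enabled with EPOLLET) modes. LT keeps reporting a descriptor for as long as it remains ready, whereas ET reports only when readiness changes, so the application must use non‑blocking descriptors and keep reading or writing until the call fails with EAGAIN.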
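The drain requirement for ET mode can be sketched as follows. This is a minimal illustration, not code from the original article: the helper names set_nonblocking and drain_fd are hypothetical, and the 4096-byte buffer size is an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: read from a non-blocking fd until the kernel
 * buffer is empty (EAGAIN) or the peer closes. Returns the number of
 * bytes consumed, or -1 on a real error. *eof is set to 1 on end of file.
 * The data is discarded here; a real handler would process it. */
long drain_fd(int fd, int *eof)
{
    char buf[4096];
    long total = 0;

    assert(eof != NULL);
    *eof = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;               /* more data may be pending */
        }
        if (n == 0) {
            *eof = 1;               /* peer closed the connection */
            return total;
        }
        if (errno == EINTR)
            continue;               /* interrupted, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;           /* drained: safe to wait in epoll again */
        return -1;                  /* genuine error */
    }
}
```

With ET, stopping before EAGAIN risks leaving unread bytes that will not trigger another notification until new data arrives.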

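The core workflow consists of three system calls:

int epoll_create(int size) (or int epoll_create1(int flags)) creates an epoll instance. Since Linux 2.6.8 the size argument is ignored, but it must still be greater than zero.

int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event) adds, modifies, or removes a file descriptor (EPOLL_CTL_ADD, EPOLL_CTL_MOD, EPOLL_CTL_DEL).

int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout) blocks until at least one event occurs, the timeout (in milliseconds, -1 for no limit) expires, or a signal interrupts the call.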
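A minimal sketch of the three calls working together, using a pipe instead of a socket so the example is self-contained. The function names watch_fd and epoll_basic_demo, the array size of 4, and the 1000 ms timeout are illustrative assumptions, not part of the original article.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd with the epoll instance for `events`. */
static int watch_fd(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev = {0};

    assert(epfd >= 0 && fd >= 0);   /* caller must pass valid descriptors */
    ev.events = events;
    ev.data.fd = fd;                /* handed back unchanged by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Create an epoll instance, watch the read end of a pipe, make it
 * readable, and wait. Returns the number of ready descriptors reported
 * (1 when everything works) or -1 on any failure. */
int epoll_basic_demo(void)
{
    int p[2];
    int result = -1;

    if (pipe(p) == -1)
        return -1;

    int epfd = epoll_create1(0);                       /* step 1: create */
    if (epfd != -1 &&
        watch_fd(epfd, p[0], EPOLLIN) == 0 &&          /* step 2: register */
        write(p[1], "x", 1) == 1) {
        struct epoll_event ready[4];
        int n = epoll_wait(epfd, ready, 4, 1000);      /* step 3: wait */
        if (n == 1 && ready[0].data.fd == p[0] &&
            (ready[0].events & EPOLLIN))
            result = n;
        epoll_ctl(epfd, EPOLL_CTL_DEL, p[0], NULL);    /* unregister */
    }
    if (epfd != -1)
        close(epfd);
    close(p[0]);
    close(p[1]);
    return result;
}
```

Only the descriptors that are actually ready come back in the ready array, which is why the caller never has to scan the full set.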

Internally, epoll creates an eventpoll structure containing a mutex, a wait queue, a red‑black tree (rbr) for all watched descriptors, and a doubly linked list (rdllist) for ready descriptors. Adding a descriptor allocates an epitem, registers a poll callback via poll_wait, and inserts the item into the tree.

When a monitored file becomes ready, the kernel invokes ep_poll_callback, which places the corresponding epitem onto the ready list and wakes any threads waiting in epoll_wait. epoll_wait then copies the events to user space. In LT mode an item that is still ready is put back on the ready list, so the next epoll_wait reports it again; in ET mode it is not re‑queued and reappears only when ep_poll_callback fires again.
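The visible effect of this re‑queueing can be observed from user space. The sketch below is an illustration under stated assumptions (the name count_reports is hypothetical, and a pipe stands in for a socket): it makes a descriptor readable once, never reads from it, and counts how many consecutive epoll_wait calls report it.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper. extra_flags is 0 for level-triggered or EPOLLET
 * for edge-triggered. One byte is written to a pipe and never read.
 * Returns how many of `rounds` non-blocking epoll_wait calls reported
 * the read end as ready, or -1 on error. */
int count_reports(uint32_t extra_flags, int rounds)
{
    int p[2];
    int reports = 0;

    assert(rounds > 0);
    if (pipe(p) == -1)
        return -1;

    int epfd = epoll_create1(0);
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = p[0];

    if (epfd == -1 ||
        epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1 ||
        write(p[1], "x", 1) != 1) {
        reports = -1;
    } else {
        for (int i = 0; i < rounds; i++) {
            struct epoll_event out;
            int n = epoll_wait(epfd, &out, 1, 0);   /* timeout 0: just poll */
            if (n == -1) {
                reports = -1;
                break;
            }
            reports += n;   /* LT: reported every round; ET: first round only */
        }
    }
    if (epfd != -1)
        close(epfd);
    close(p[0]);
    close(p[1]);
    return reports;
}
```

Calling count_reports(0, 3) should see the descriptor reported in every round, while count_reports(EPOLLET, 3) should see it only once, because no new data arrives to fire the callback again.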

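Common pitfalls include the epoll “thundering‑herd” problem (several threads or processes waiting on the same descriptor all wake up for a single event), limits on nesting epoll instances (epoll_ctl fails with ELOOP on a circular chain or a nesting depth greater than 5), and registering EPOLLOUT permanently instead of only after a write returns EAGAIN. Proper use of non‑blocking sockets and handling EPOLLONESHOT are also discussed.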
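The EPOLLOUT advice can be sketched like this: write directly first, and ask epoll for writability only when the kernel buffer is full. The helper names set_nonblocking and send_or_arm and the base_events parameter are hypothetical, and the sketch assumes fd is already registered with epfd and is non-blocking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: try to write len bytes. If the kernel buffer fills
 * up (EAGAIN), add EPOLLOUT to the events already watched for fd
 * (base_events) and stop. Returns the number of bytes written so far, or
 * -1 on a hard error. *armed is set to 1 when EPOLLOUT was enabled. */
ssize_t send_or_arm(int epfd, int fd, uint32_t base_events,
                    const char *buf, size_t len, int *armed)
{
    size_t off = 0;

    assert(buf != NULL && armed != NULL);
    *armed = 0;
    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n > 0) {
            off += (size_t)n;
            continue;
        }
        if (n == 0)
            break;                          /* nothing accepted, give up */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            struct epoll_event ev = {0};
            ev.events = base_events | EPOLLOUT;
            ev.data.fd = fd;
            if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == -1)
                return -1;
            *armed = 1;                     /* caller keeps the unsent tail */
            break;
        }
        return -1;
    }
    return (ssize_t)off;
}
```

Once EPOLLOUT fires and the remaining bytes are flushed, the caller would normally switch the descriptor back to base_events with another EPOLL_CTL_MOD; in LT mode a writable socket left registered for EPOLLOUT is reported on every epoll_wait.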

A complete example demonstrates a TCP server that creates an epoll instance, registers the listening socket, accepts new connections, and processes client data using only epoll_wait, illustrating how a single thread can efficiently serve many concurrent clients.
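The article's full server is not reproduced in this summary, so the following is only a rough sketch of the shape such a server could take, not the author's code. It is a level‑triggered echo server bound to the loopback address; the names set_nonblocking, add_listener, and serve_once, the 64-event batch, and the 4096-byte buffer are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Create a non-blocking listening socket on 127.0.0.1:port (0 lets the
 * kernel choose) and register it with epfd. Returns the fd or -1. */
int add_listener(int epfd, uint16_t port)
{
    int one = 1;
    struct sockaddr_in addr = {0};
    struct epoll_event ev = {0};
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    assert(epfd >= 0);
    if (fd == -1)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, SOMAXCONN) == -1 ||
        set_nonblocking(fd) == -1 ||
        epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One iteration of the event loop. Returns the number of events handled
 * (0 on timeout or EINTR) or -1 on error. */
int serve_once(int epfd, int listen_fd, int timeout_ms)
{
    struct epoll_event events[64];
    int n = epoll_wait(epfd, events, 64, timeout_ms);

    if (n == -1)
        return errno == EINTR ? 0 : -1;
    for (int i = 0; i < n; i++) {
        int fd = events[i].data.fd;
        if (fd == listen_fd) {
            for (;;) {                      /* accept everything pending */
                struct epoll_event ev = {0};
                int c = accept(listen_fd, NULL, NULL);
                if (c == -1)
                    break;                  /* EAGAIN: backlog is empty */
                ev.events = EPOLLIN;
                ev.data.fd = c;
                if (set_nonblocking(c) == -1 ||
                    epoll_ctl(epfd, EPOLL_CTL_ADD, c, &ev) == -1)
                    close(c);
            }
        } else {
            char buf[4096];
            ssize_t r = read(fd, buf, sizeof buf);
            if (r > 0) {
                /* Sketch only: a short or failed write is not retried. */
                ssize_t w = write(fd, buf, (size_t)r);
                (void)w;
            } else if (r == 0 || (errno != EAGAIN && errno != EINTR)) {
                close(fd);  /* no duplicates exist, so this also unregisters it */
            }
        }
    }
    return n;
}
```

A caller would create the instance with epoll_create1(0), call add_listener once, and then call serve_once in a loop. A production server would additionally buffer unsent data per connection and handle EPOLLERR and EPOLLHUP explicitly.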
