Mastering Files, File Descriptors, and I/O Multiplexing in Linux
This article explains Linux file concepts, how file descriptors work, and why simple blocking I/O fails at scale. It then introduces the I/O multiplexing techniques select, poll, and epoll, covering their mechanisms and limitations with code examples for high‑concurrency servers.
Before diving into I/O multiplexing, we review the notion of a file and a file descriptor in Linux.
What is a file?
In Linux a file is simply a sequence of N bytes (b1, b2, …, bN). All I/O devices—disk, network sockets, terminals, pipes—are abstracted as files, so the same read/write interface works for any device.
Common file‑related system calls are:
open – open a file
lseek – change the read/write offset
read / write – transfer data
close – close the file
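As a minimal sketch of how these four calls fit together, the helper below opens a file, moves the offset, reads some bytes, and closes the descriptor. The name read_at and its parameters are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `len` bytes starting at byte `offset`.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_at(const char *path, off_t offset, char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);            /* open: obtain a descriptor */
    if (fd < 0)
        return -1;

    if (lseek(fd, offset, SEEK_SET) < 0) {    /* lseek: move the file offset */
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, buf, len);           /* read: transfer bytes into buf */
    close(fd);                                /* close: release the descriptor */
    return n;
}
```

Calling read_at(path, 6, buf, 5) on a file containing "hello world" would fill buf with the five bytes that start at offset 6.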
File descriptor
A file descriptor is just an integer that the kernel uses to refer to an opened file. The process does not need to know where the file resides on disk or how it is buffered; the kernel handles those details.
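int fd = open(file_name, O_RDONLY);         // obtain a file descriptor
ssize_t n = read(fd, buff, sizeof(buff));   // read up to sizeof(buff) bytes (buff is assumed to be an array)
When a network connection is accepted, accept() returns a new descriptor that can be used to read from and write to the client.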
// accept a client connection
int conn_fd = accept(...);
if (read(conn_fd, request_buff, sizeof(request_buff)) > 0) {
    do_something(request_buff);
}
Why simple blocking I/O does not scale
If a process blocks on a read from one descriptor, it cannot handle other ready descriptors, which is unacceptable for servers that must serve thousands of clients simultaneously.
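Creating one thread per client keeps a slow connection from stalling the others, but every thread needs its own stack and adds scheduling and context‑switch cost, so the overhead becomes prohibitive as the number of connections grows.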
I/O multiplexing concept
Instead of actively polling each descriptor, the process hands a set of descriptors to the kernel and asks to be notified when any become readable or writable. This “passive” approach lets the kernel monitor many descriptors efficiently.
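For contrast, the "active polling" that multiplexing replaces might look like the sketch below. It assumes the descriptors have been switched to non-blocking mode, so read() returns immediately with EAGAIN when nothing is available. The helper names set_nonblocking and try_read_each are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* One pass of active polling: try read() on every descriptor in turn.
 * Returns the byte count from the first descriptor that had data,
 * 0 if none had data, or -1 on a real error. */
ssize_t try_read_each(const int *fds, int count, char *buf, size_t len)
{
    assert(fds != NULL && count > 0);

    for (int i = 0; i < count; i++) {
        ssize_t n = read(fds[i], buf, len);
        if (n > 0)
            return n;
        /* EAGAIN/EWOULDBLOCK just means "nothing to read yet". */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
    }
    return 0;
}
```

A server built this way has to call try_read_each in a loop, issuing one system call per descriptor per pass even when every connection is idle. That wasted work is what handing the set to the kernel avoids.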
Linux I/O multiplexing mechanisms
Linux provides three mechanisms for this purpose:
select
poll
epoll
select
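select copies the descriptor sets from user space into the kernel on every call and returns when at least one descriptor is ready. An fd_set can hold only FD_SETSIZE descriptors (typically 1024), and the call reports only how many descriptors are ready, so the application must scan the whole set to find which ones.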
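A minimal sketch of the select pattern is shown below: build the set, wait, then scan it. The function name read_ready_select and its parameters are hypothetical, and the sketch only watches for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait up to timeout_ms for any descriptor in fds[] to become readable,
 * then read from the first ready one.
 * Returns the byte count from read(), or -1 on timeout or error. */
ssize_t read_ready_select(const int *fds, int count, char *buf, size_t len,
                          int timeout_ms)
{
    assert(fds != NULL && count > 0);

    fd_set readable;
    FD_ZERO(&readable);
    int max_fd = -1;
    for (int i = 0; i < count; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;                  /* an fd_set cannot hold this descriptor */
        FD_SET(fds[i], &readable);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* select() overwrites `readable`, leaving only the ready descriptors set. */
    if (select(max_fd + 1, &readable, NULL, NULL, &tv) <= 0)
        return -1;

    /* The linear scan: test every descriptor to find one that is ready. */
    for (int i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &readable))
            return read(fds[i], buf, len);
    return -1;
}
```

Because select() modifies the set in place, the set has to be rebuilt before every call, which is part of the per-call cost.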
poll
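poll is similar to select but takes an array of pollfd structures instead of a fixed‑size bit set, which removes the 1024‑descriptor limit. The array is still copied into the kernel on every call, and both the kernel and the application still scan it linearly.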
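The same wait-then-scan pattern with poll might look like the sketch below. The function name read_ready_poll is hypothetical, and MAX_WATCHED is an arbitrary cap for this example, not a limit imposed by poll itself.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_WATCHED 64   /* arbitrary cap for this sketch only */

/* Wait up to timeout_ms for any descriptor in fds[] to become readable,
 * then read from the first ready one.
 * Returns the byte count from read(), or -1 on timeout or error. */
ssize_t read_ready_poll(const int *fds, int count, char *buf, size_t len,
                        int timeout_ms)
{
    assert(fds != NULL && count > 0 && count <= MAX_WATCHED);

    struct pollfd watched[MAX_WATCHED];
    for (int i = 0; i < count; i++) {
        watched[i].fd = fds[i];
        watched[i].events = POLLIN;   /* interested in readability only */
        watched[i].revents = 0;
    }

    if (poll(watched, (nfds_t)count, timeout_ms) <= 0)
        return -1;

    /* Still a linear scan: check revents on every entry. */
    for (int i = 0; i < count; i++)
        if (watched[i].revents & POLLIN)
            return read(watched[i].fd, buf, len);
    return -1;
}
```

Unlike an fd_set, the pollfd array keeps the requested events and the returned events in separate fields, so it can be reused across calls without being rebuilt.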
epoll
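epoll uses an event‑driven model. The process creates an epoll instance and registers each descriptor once with epoll_ctl; the kernel keeps that interest list itself, together with a list of descriptors that have become ready. epoll_wait then returns only the ready descriptors, so the application never scans the full set and the interest list is not copied again on every call. Ready events are copied into the caller's buffer rather than passed through shared memory.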
Because of its efficiency, epoll has become the de‑facto standard for high‑concurrency servers on Linux.
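A minimal sketch of the epoll workflow follows, split into a one-time registration step and a wait step. The names watch_fds and read_ready_epoll are hypothetical, and the sketch uses the default level-triggered mode and watches for readability only.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Create an epoll instance and register every descriptor in fds[] once.
 * Returns the epoll descriptor, or -1 on error. */
int watch_fds(const int *fds, int count)
{
    assert(fds != NULL && count > 0);

    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    for (int i = 0; i < count; i++) {
        struct epoll_event ev = {0};
        ev.events = EPOLLIN;          /* notify when readable */
        ev.data.fd = fds[i];          /* handed back to us by epoll_wait */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
            close(epfd);
            return -1;
        }
    }
    return epfd;
}

/* Wait up to timeout_ms and read from the first descriptor reported ready.
 * Returns the byte count from read(), or -1 on timeout or error. */
ssize_t read_ready_epoll(int epfd, char *buf, size_t len, int timeout_ms)
{
    struct epoll_event ready[16];
    int n = epoll_wait(epfd, ready, 16, timeout_ms);
    if (n <= 0)
        return -1;
    /* ready[0..n-1] holds only descriptors with pending events: no scan. */
    return read(ready[0].data.fd, buf, len);
}
```

A server would call watch_fds once, then call read_ready_epoll in its event loop. The cost of each wait depends on how many descriptors are ready, not on how many are registered.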
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)