Traditional System Call I/O and High‑Performance Optimizations in Linux
This article explains how Linux traditional read/write system calls work, detailing the CPU, DMA copies and context switches involved, and then explores high‑performance I/O techniques such as zero‑copy, multiplexing and PageCache, together with the Linux I/O stack and buffering layers.
Traditional System Call I/O
In Linux, traditional I/O is performed via the read() and write() system calls, which together involve two CPU copies, two DMA copies and a total of four context switches.
Read Operation
When an application calls read(), if the requested data is already present in the kernel's page cache it is copied directly from memory; otherwise the kernel loads the data from disk into a kernel-space read buffer and then copies it to the user-space buffer, incurring one CPU copy, one DMA copy and two context switches.
read(file_fd, tmp_buf, len);
The detailed read flow is:
User process invokes read(), causing a switch from user space to kernel space.
The DMA controller moves data from the disk into the kernel read buffer; the CPU does not participate in this copy.
CPU copies data from the read buffer to the user‑space buffer.
Context switches back to user space and read() returns.
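The read path above can be exercised directly. The sketch below uses Python's os.open/os.read, which are thin wrappers around the same system calls; the file name and contents are illustrative:

```python
import os
import tempfile

# Illustrative setup: create a small file to read back.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"hello, kernel")

# read(): switch to kernel space, DMA disk -> kernel read buffer,
# CPU copy kernel buffer -> user buffer, switch back to user space.
fd = os.open(path, os.O_RDONLY)
buf = os.read(fd, 13)   # one CPU copy + one DMA copy + two context switches
os.close(fd)

print(buf)              # b'hello, kernel'
```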
Write Operation
When an application calls write(), data is first copied from the user-space buffer to the kernel socket buffer and then from the socket buffer to the NIC for transmission, resulting in two context switches, one CPU copy and one DMA copy.
The write flow mirrors the read flow:
User process invokes write(), switching to kernel space.
CPU copies data from the user buffer to the kernel socket buffer.
The DMA controller moves data from the socket buffer to the NIC; again, the CPU is not involved in this copy.
Context switches back to user space and write() returns.
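The write path can be sketched the same way. Here a temporary file stands in for the socket, but the copy sequence (user buffer to kernel buffer, then DMA to the device) has the same shape; paths are illustrative:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.bin")

# write(): switch to kernel space, CPU copy user buffer -> kernel buffer,
# then DMA kernel buffer -> device, then switch back to user space.
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
written = os.write(fd, b"hello, kernel")
os.close(fd)

# Verify the data reached the file.
with open(path, "rb") as f:
    data = f.read()
```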
Network I/O
Network I/O follows the same pattern, with data moving through the kernel socket buffer before reaching the network device via DMA.
Disk I/O
Disk I/O involves the kernel reading or writing data to block devices, again using DMA to transfer between memory and the device.
High‑Performance Optimizations of I/O
Zero‑copy techniques – e.g. mmap() + write(), sendfile() and splice(), which eliminate the CPU copy between kernel space and user space.
Multiplexing (I/O multiplexing) – e.g. select(), poll() and epoll, which let a single thread monitor many file descriptors at once.
PageCache usage – described below.
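Of these, I/O multiplexing lets one thread block until any of many descriptors becomes ready, instead of dedicating a thread per connection. A minimal sketch using Linux epoll via Python's select module; a pipe stands in for network sockets:

```python
import os
import select

# A pipe stands in for a network connection.
r, w = os.pipe()

ep = select.epoll()
ep.register(r, select.EPOLLIN)   # watch the read end for readability

os.write(w, b"ping")             # make the descriptor readable

# One epoll_wait() call reports every ready descriptor.
events = ep.poll(timeout=1)      # list of (fd, event-mask) pairs
ready = [fd for fd, _ in events]
payload = os.read(r, 4)          # b'ping'

ep.close()
os.close(r)
os.close(w)
```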
PageCache is the OS‑level cache for file contents; it reduces disk I/O by keeping file data in memory. When a read request arrives, the kernel first checks PageCache. If the data is present, it is returned directly; otherwise the kernel reads the required data from disk into PageCache, typically a few pages at a time with read-ahead, and then serves the request.
For writes, data is first written to PageCache and marked as dirty. Background flusher threads write dirty pages back to disk when free memory runs low, when dirty pages have been resident too long, or when the application calls sync()/fsync().
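The zero-copy technique listed earlier combines naturally with PageCache: sendfile() transfers data from the page cache straight to the destination descriptor inside the kernel, with no copy into a user-space buffer. Python's os.sendfile wraps the syscall; file names are illustrative, and normally the destination is a socket (a regular-file destination works on Linux since 2.6.33):

```python
import os
import tempfile

d = tempfile.mkdtemp()
src = os.path.join(d, "src.bin")
dst = os.path.join(d, "dst.bin")
with open(src, "wb") as f:
    f.write(b"x" * 4096)

# sendfile(): data moves kernel-to-kernel (page cache -> destination fd),
# skipping the user-space buffer entirely.
src_fd = os.open(src, os.O_RDONLY)
dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT, 0o644)
sent = os.sendfile(dst_fd, src_fd, 0, 4096)   # out_fd, in_fd, offset, count
os.close(src_fd)
os.close(dst_fd)
```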
Storage Device I/O Stack
The Linux I/O stack can be viewed as three layers:
File‑system layer – manages files, copies user data into the file‑system cache, and eventually flushes it down to the block layer.
Block layer – manages block‑device I/O queues, performs request merging and scheduling.
Device layer – interacts with hardware via DMA to move data between memory and the device.
Mechanisms such as Buffered I/O, mmap, and Direct I/O map onto different points of this stack.
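Of these, mmap maps page-cache pages directly into the process's address space, so reads touch the cached data without a separate CPU copy into a private user buffer. A minimal sketch, with an illustrative file name:

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mapped.bin")
with open(path, "wb") as f:
    f.write(b"page cache bytes")

fd = os.open(path, os.O_RDONLY)
# The mapping is backed by the same PageCache pages the kernel uses,
# so no read() copy into a private user buffer is needed.
mm = mmap.mmap(fd, 0, prot=mmap.PROT_READ)
data = mm[:4]            # b'page'
mm.close()
os.close(fd)
```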
I/O Buffering
At the user‑space level, the C stdio library provides its own buffers to reduce the number of system calls. In the kernel, the PageCache holds file data, while the lower‑level BufferCache holds raw block‑device data: PageCache is tied to files and file systems, whereas BufferCache deals with device blocks (in modern kernels the two are unified).
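User-space buffering is easy to observe: Python's buffered file objects play the role of C stdio, accumulating small writes in a user-space buffer, while flush() hands the batch to the kernel and fsync() asks the kernel to push PageCache out to disk. A sketch with an illustrative path:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "buffered.txt")

# Buffered layer (like C stdio): many small writes, few write() syscalls.
f = open(path, "wb", buffering=8192)
for _ in range(100):
    f.write(b"x")          # accumulates in the user-space buffer
f.flush()                  # pushes the 100 bytes to the kernel (PageCache)
os.fsync(f.fileno())       # asks the kernel to flush dirty pages to disk
f.close()

size = os.path.getsize(path)   # 100
```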
Understanding these layers and their interactions is essential for designing high‑performance I/O in Linux.
IT Architects Alliance