How Zero‑Copy Techniques Supercharge Network I/O Performance
This article explains why traditional I/O interfaces rely on data copying, demonstrates the hidden overhead of read/write in a network server, and introduces zero‑copy methods such as mmap, sendfile, DMA Gather Copy, and splice to dramatically reduce copies and context switches for faster I/O.
Why Traditional I/O Involves Data Copies
Most modern Internet services are I/O‑bound. The classic POSIX I/O calls (read, write) move data between user space and kernel space. The operating system acts as a router: the application hands data to the OS, the OS copies it into a kernel buffer, and the hardware then transfers it from there. Every system call costs two user‑kernel mode switches (one on entry, one on return), and every CPU‑driven copy consumes cycles and memory bandwidth.
Network Server Example
A minimal server that reads a file and sends it over a socket typically uses:
read(fileDesc, buf, len);
write(socket, buf, len);
Under the hood this triggers four copies and four context switches:
read causes a transition to kernel mode; the disk DMA writes data into a kernel buffer.
The kernel copies the data from its buffer to the user‑space buffer buf and returns to user mode.
write causes another transition; the user buffer is copied into the network stack’s kernel buffer.
The kernel finally DMA‑copies the socket buffer to the NIC before returning to user mode.
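The two-call pattern above can be written out as a complete loop. This is a minimal sketch, not the article's own code: the helper name copy_read_write and the 64 KiB buffer size are assumptions made for illustration. Every pass through the loop pays for one kernel-to-user copy and one user-to-kernel copy.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd through a
 * user-space buffer. Returns the number of bytes copied, or -1 on error. */
ssize_t copy_read_write(int in_fd, int out_fd)
{
    char buf[64 * 1024];            /* buffer size is an arbitrary choice */
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);   /* kernel -> user copy */
        if (n == 0)
            return total;                           /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;                           /* interrupted, retry */
            return -1;
        }

        /* write may accept fewer bytes than requested, so loop until done */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off)); /* user -> kernel copy */
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

The buffer contents are never inspected here, which is exactly the situation where the two CPU copies are wasted work.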
Zero‑Copy Concept
If the user buffer is never inspected or modified, those copies are unnecessary. Zero‑copy I/O aims to keep data inside the kernel or move it directly between devices, reducing CPU involvement.
Zero‑Copy Strategies
Keep the data entirely in kernel space; the user process never sees it.
Bypass the kernel so user space talks directly to hardware.
When kernel‑user interaction is unavoidable, optimise the data‑exchange path.
Using mmap
mmap maps a file directly into the process address space, eliminating the kernel‑to‑user copy. The code becomes:
buf = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fileDesc, 0);
write(socket, buf, len);
Copies drop from four to three, but context switches remain four, and the mapping introduces management overhead. mmap is beneficial only when the saved copy cost exceeds the cost of maintaining the mapping.
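A fuller version of the mmap approach might look like the following. It is a sketch under stated assumptions: the helper name send_file_mmap is hypothetical, the whole file is mapped at once, and the caller supplies a path rather than an open descriptor. Error handling is kept to the minimum needed for the function to be safe to call.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map the file at `path` and write it to out_fd.
 * Returns the number of bytes written, or -1 on error. */
ssize_t send_file_mmap(const char *path, int out_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;
    if (len == 0) {                 /* mmap rejects zero-length mappings */
        close(fd);
        return 0;
    }

    char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    size_t off = 0;
    while (off < len) {
        /* the kernel copies straight from the mapped pages to out_fd */
        ssize_t w = write(out_fd, p + off, len - off);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            munmap(p, len);
            return -1;
        }
        off += (size_t)w;
    }

    munmap(p, len);
    return (ssize_t)len;
}
```

One caveat worth knowing: if another process truncates the file while it is mapped, touching the missing pages raises SIGBUS, so production code usually needs a strategy for that case.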
Using sendfile
The Linux sendfile system call copies data between two file descriptors completely inside the kernel, removing the user‑space buffer. Its prototype is:
#include <sys/sendfile.h>
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
Data flow:
DMA moves data from disk to a kernel buffer.
The kernel copies the data from that buffer to the socket buffer.
DMA transfers the socket buffer to the NIC.
This reduces context switches to two and data copies to three: two DMA transfers plus a single CPU copy inside the kernel.
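Because sendfile may transfer fewer bytes than requested, real code normally calls it in a loop. The sketch below shows one way to do that; the helper name send_file_sendfile is hypothetical, and it assumes a blocking output descriptor on Linux.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the whole file at `path` to out_fd with
 * sendfile. Returns the number of bytes sent, or -1 on error. */
ssize_t send_file_sendfile(const char *path, int out_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    off_t offset = 0;               /* sendfile advances this for us */
    while (offset < st.st_size) {
        ssize_t n = sendfile(out_fd, fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            close(fd);
            return -1;
        }
        if (n == 0)
            break;                  /* file shrank while we were sending */
    }

    close(fd);
    return (ssize_t)offset;
}
```

In a server, out_fd would be the connected socket. On Linux 2.6.33 and later it can be any file descriptor, which makes the helper easy to exercise against a regular file.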
DMA Gather Copy
Traditional DMA requires a single contiguous source. DMA Gather Copy lets the NIC gather data from multiple non‑contiguous kernel buffers (e.g., a file buffer and a protocol header) and transmit them in one operation. Only descriptors holding each buffer's address and length are appended to the socket buffer, which eliminates the remaining CPU copy from the kernel buffer to the socket buffer and leaves just the two DMA transfers.
Using splice
splice moves data between two file descriptors, one of which must be a pipe, without copying the payload through user space. To send a file to a socket, the data goes from the file into a pipe and then from the pipe to the socket. The pipe only carries metadata about the buffers.
Typical usage for a network server:
int pipefd[2];
pipe(pipefd);
splice(fileFd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
splice(pipefd[0], NULL, socketFd, NULL, len, SPLICE_F_MOVE);
Because the data never touches user space, splice achieves zero‑copy without special hardware. Modern Linux implements sendfile on top of splice, preserving backward compatibility.
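The two splice calls above can each move fewer bytes than requested (a pipe holds a limited amount of data), so a robust version loops over both. The following is a sketch with a hypothetical helper name, splice_fd_to_fd. It keeps the SPLICE_F_MOVE flag from the snippet above, although the splice(2) man page documents that flag as a no-op since Linux 2.6.21.

```c
#define _GNU_SOURCE                 /* must precede every #include; exposes splice() */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: move up to len bytes from in_fd to out_fd through
 * an intermediate pipe. Returns bytes moved, or -1 on error. */
ssize_t splice_fd_to_fd(int in_fd, int out_fd, size_t len)
{
    int pipefd[2];
    ssize_t total = 0;

    if (pipe(pipefd) < 0)
        return -1;

    while ((size_t)total < len) {
        /* stage 1: source -> pipe (at most one pipe's worth per call) */
        ssize_t n = splice(in_fd, NULL, pipefd[1], NULL,
                           len - (size_t)total, SPLICE_F_MOVE);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0) {
            total = -1;
            break;
        }
        if (n == 0)
            break;                  /* end of input */

        /* stage 2: drain the pipe into the destination */
        ssize_t left = n;
        while (left > 0) {
            ssize_t m = splice(pipefd[0], NULL, out_fd, NULL,
                               (size_t)left, SPLICE_F_MOVE);
            if (m < 0 && errno == EINTR)
                continue;
            if (m <= 0) {
                total = -1;         /* error, or destination stopped accepting */
                break;
            }
            left -= m;
        }
        if (total < 0)
            break;
        total += n;
    }

    close(pipefd[0]);
    close(pipefd[1]);
    return total;
}
```

A long-lived server would probably create the pipe once per connection or per worker rather than on every call, since pipe creation is itself two descriptors and a system call.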
Takeaway
High‑performance I/O is achieved by minimising CPU participation and data copies. Depending on the scenario, developers can choose:
mmap – removes one kernel‑to‑user copy.
sendfile – eliminates both user‑space copies, leaving only kernel‑internal copies.
DMA Gather Copy – hardware‑assisted gathering of multiple buffers, removing the final copy.
splice – pipe‑based zero‑copy between arbitrary file descriptors.
These techniques dramatically increase throughput for I/O‑intensive applications such as web servers, file servers, and message brokers.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)