How Zero‑Copy Techniques Slash Data Transfer Overhead in Linux
This article explains why serving files with traditional read/write incurs multiple data copies and context switches. It then introduces mmap, sendfile, and splice as zero‑copy methods that reduce CPU load, and discusses their usage, limitations, and practical pitfalls for high‑performance Linux servers.
When a server program sends a file to a client using the classic loop
while ((n = read(diskfd, buf, BUF_SIZE)) > 0) { write(sockfd, buf, n); }
the data is copied four times: from disk into a kernel buffer, from the kernel buffer into the user‑space buffer, from the user‑space buffer back into the kernel socket buffer, and finally from the socket buffer to the NIC. Each read and write call also switches from user mode to kernel mode and back, so a large file costs many context switches.
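For reference, the baseline loop can be written out with its error handling. This is a minimal sketch, not the original author's code: the function name copy_read_write and the 64 KiB buffer size are assumptions chosen for illustration. It also retries short writes, which the one-line loop ignores.

```c
#include <errno.h>
#include <unistd.h>

#define BUF_SIZE 65536  /* assumed buffer size, not from the article */

/* Hypothetical helper: copy everything from in_fd to out_fd through a
 * user-space buffer. Returns the number of bytes copied, or -1 on error. */
ssize_t copy_read_write(int in_fd, int out_fd)
{
    char buf[BUF_SIZE];
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);   /* kernel -> user copy */
        if (n == 0)
            break;                                  /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;                           /* interrupted, retry */
            return -1;
        }

        ssize_t off = 0;
        while (off < n) {                           /* handle short writes */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off)); /* user -> kernel copy */
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

Every pass through the loop makes two system calls and moves each byte across the user/kernel boundary twice, which is the overhead the zero‑copy techniques target.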
What is Zero‑Copy?
Zero‑copy aims to eliminate unnecessary copies between user space and kernel space, allowing the CPU to focus on other work and improving overall throughput.
Using mmap
Replacing read with mmap maps the file directly into the process address space:
buf = mmap(NULL, len, PROT_READ, MAP_PRIVATE, diskfd, 0);
write(sockfd, buf, len);
This removes one copy because the kernel shares its buffer with the process instead of copying the data into a separate user‑space buffer. However, mmap introduces a pitfall: if another process truncates the file while it is mapped, a later access to the truncated region raises SIGBUS, whose default action terminates the program. Common mitigations are installing a SIGBUS handler or taking a file lease (fcntl with F_SETLEASE) so that the kernel notifies the process before the truncation takes effect.
Using sendfile
Introduced during the 2.1 kernel series and documented in the man page as a feature of Linux 2.2, sendfile transfers data directly between a file descriptor and a socket descriptor without copying it to user space:
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
The call reduces both copies and context switches because the data never leaves kernel space and a single system call replaces the read/write pair. If the source file is truncated during the call, sendfile does not kill the process; it returns the number of bytes transferred before the interruption. If the process holds a lease on the file, the kernel sends it a lease‑break signal (SIGIO by default) before the truncation proceeds.
Using splice
Linux 2.6.17 added splice, which moves data between two file descriptors via a pipe without touching user space:
ssize_t splice(int fd_in, loff_t *off_in, int fd_out, loff_t *off_out, size_t len, unsigned int flags);
At least one of the two descriptors must refer to a pipe. Flags such as SPLICE_F_MOVE, SPLICE_F_NONBLOCK, and SPLICE_F_MORE control its behavior. Like sendfile, splice keeps the data inside the kernel, but the pipe requirement means that a file‑to‑socket transfer needs an intermediate pipe and two splice calls.
Other Zero‑Copy Approaches
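When the NIC supports scatter‑gather DMA, the kernel can go one step further with sendfile: instead of copying the data into the socket buffer, it appends only buffer descriptors (location and length), and the DMA engine gathers the data directly from the kernel buffer, eliminating the last CPU copy. Additional techniques include O_DIRECT for direct I/O that bypasses the kernel's cache, copy‑on‑write (COW) to defer copying until data is actually modified, and the experimental fbufs framework. These are mentioned for completeness but not detailed here.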
Overall, mmap, sendfile, and splice each reduce the number of data copies and context switches in different scenarios, helping high‑performance services achieve lower CPU usage and higher throughput.
Source: 简书, author 卡巴拉的树 https://www.jianshu.com/p/fad3339e3448
