
How Zero‑Copy Techniques Supercharge Network I/O Performance

This article explains why traditional I/O interfaces rely on data copying, demonstrates the hidden overhead of read/write in a network server, and introduces zero‑copy methods such as mmap, sendfile, DMA Gather Copy, and splice to dramatically reduce copies and context switches for faster I/O.

Liangxu Linux

Why Traditional I/O Involves Data Copies

Most modern Internet services are I/O‑bound. Classic POSIX I/O APIs (read, write) move data between user space and kernel space. The operating system acts as an intermediary: the application hands data to the OS, the OS copies it into a kernel buffer, and the data is then copied again to the hardware. Each system call costs a switch into kernel mode and back, and each copy consumes CPU cycles and memory bandwidth.

Network Server Example

A minimal server that reads a file and sends it over a socket typically uses:

read(fileDesc, buf, len);
write(socket, buf, len);

Under the hood this triggers four copies (two by DMA, two by the CPU) and four context switches:

read causes a transition to kernel mode; the disk DMA writes data into a kernel buffer.

The kernel copies the data from its buffer to the user‑space buffer buf and returns to user mode.

write causes another transition to kernel mode; the user buffer is copied into the network stack’s kernel buffer (the socket buffer).

The socket buffer is DMA‑copied to the NIC, typically asynchronously, and write returns to user mode.
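The read/write path described above can be sketched as a simple copy loop. This is a minimal illustration rather than a production server: the function names copy_read_write and write_all and the 64 KiB buffer size are assumptions made for the example, and the descriptors are expected to be opened by the caller.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy everything from in_fd to out_fd through a user-space buffer.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_read_write(int in_fd, int out_fd)
{
    char buf[64 * 1024];    /* user-space bounce buffer (size is arbitrary) */
    ssize_t total = 0;

    for (;;) {
        /* CPU copy: kernel buffer -> buf */
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return total;   /* end of input */

        /* CPU copy: buf -> kernel socket buffer */
        if (write_all(out_fd, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
}
```

Every iteration pays for two system calls and two CPU copies, even though the program never looks at the bytes in buf.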

Zero‑Copy Concept

If the user buffer is never inspected or modified, those copies are unnecessary. Zero‑copy I/O aims to keep data inside the kernel or move it directly between devices, reducing CPU involvement.

Zero‑Copy Strategies

Keep the data entirely in kernel space; the user process never sees it.

Bypass the kernel so user space talks directly to hardware.

When kernel‑user interaction is unavoidable, optimise the data‑exchange path.

Using mmap

mmap maps a file directly into the process address space, eliminating the kernel‑to‑user copy. The code becomes:

buf = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fileDesc, 0);
write(socket, buf, len);

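Data copies drop to three, but context switches remain four because mmap and write are still two separate system calls. The mapping also introduces management overhead such as page‑table setup and page faults on first access. mmap is beneficial only when the saved copy cost exceeds the cost of maintaining the mapping.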
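A fuller version of the mmap approach might look like the sketch below. The function name send_file_mmap and its path-based interface are assumptions for illustration; the system calls themselves (open, fstat, mmap, write, munmap) are standard POSIX.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Send the file at path to out_fd using mmap + write.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_file_mmap(const char *path, int out_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    size_t sent = 0;
    while (sent < len) {
        /* The kernel copies from the page cache into the socket buffer;
         * no user-space bounce buffer is involved. */
        ssize_t n = write(out_fd, map + sent, len - sent);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            munmap(map, len);
            return -1;
        }
        sent += (size_t)n;
    }

    munmap(map, len);
    return (ssize_t)sent;
}
```

One caveat worth keeping in mind with this pattern: if another process truncates the file while it is mapped, touching the missing pages raises SIGBUS, so real servers usually need a signal handler or a file lease around it.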

Using sendfile

The Linux sendfile system call copies data between two file descriptors completely inside the kernel, removing the user‑space buffer. Its prototype is:

#include <sys/sendfile.h>
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

Data flow:

DMA moves data from disk to a kernel buffer.

The kernel copies the data from that buffer to the socket buffer.

DMA transfers the socket buffer to the NIC.

This reduces the data copies to three (two DMA transfers and one CPU copy) and the context switches to two.
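The sendfile call can be wrapped in a small loop, since it may transfer fewer bytes than requested. The sketch below is Linux-specific and uses an assumed helper name, send_file_sendfile, that takes a path and an already connected output descriptor.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Send the file at path to out_fd with sendfile.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_file_sendfile(const char *path, int out_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    off_t offset = 0;           /* sendfile advances this for us */
    while (offset < st.st_size) {
        ssize_t n = sendfile(out_fd, fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        if (n == 0)
            break;              /* file shrank underneath us */
    }

    close(fd);
    return (ssize_t)offset;
}
```

The whole transfer stays inside the kernel: there is no user buffer at all, and each loop iteration is a single system call.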

DMA Gather Copy

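Traditional DMA requires a single contiguous source buffer. With DMA Gather Copy (scatter‑gather DMA, which the NIC must support), the kernel appends only descriptors, each holding the address and length of a kernel buffer, to the socket buffer. The NIC then gathers data from multiple non‑contiguous kernel buffers (e.g., a file buffer and a protocol header) and transmits them in one operation. This eliminates the remaining CPU copy from the kernel buffer to the socket buffer, leaving only the two DMA transfers and further reducing CPU work.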

Using splice

splice moves data between a file descriptor and a pipe, and then from the pipe to another descriptor (e.g., a socket) without copying the payload. The pipe only carries metadata about the buffers.

Typical usage for a network server:

int pipefd[2];
pipe(pipefd);
splice(fileFd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
splice(pipefd[0], NULL, socketFd, NULL, len, SPLICE_F_MOVE);

Because the data never touches user space, splice achieves zero‑copy without special hardware. Modern Linux implements sendfile on top of splice, preserving backward compatibility.
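The four-line snippet above ignores return values, and a single splice call moves at most one pipe's worth of data. A more defensive version, sketched under the assumption of a helper named splice_copy that receives two open descriptors, loops until len bytes have passed through the pipe:

```c
#define _GNU_SOURCE             /* exposes splice() in <fcntl.h> */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Move up to len bytes from in_fd to out_fd through a pipe.
 * Returns the number of bytes moved, or -1 on error. */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    size_t total = 0;
    ssize_t result = -1;

    while (total < len) {
        /* Stage 1: source -> pipe (bounded by the pipe capacity). */
        ssize_t in = splice(in_fd, NULL, p[1], NULL,
                            len - total, SPLICE_F_MOVE);
        if (in < 0) {
            if (errno == EINTR)
                continue;
            goto done;
        }
        if (in == 0)
            break;              /* end of input */

        /* Stage 2: drain the pipe into the destination. */
        ssize_t left = in;
        while (left > 0) {
            ssize_t out = splice(p[0], NULL, out_fd, NULL,
                                 (size_t)left, SPLICE_F_MOVE);
            if (out < 0) {
                if (errno == EINTR)
                    continue;
                goto done;
            }
            if (out == 0)
                goto done;      /* destination accepted nothing */
            left -= out;
        }
        total += (size_t)in;
    }
    result = (ssize_t)total;

done:
    close(p[0]);
    close(p[1]);
    return result;
}
```

A long-lived server would normally create the pipe once per connection or worker rather than once per transfer, since pipe creation is itself a system call.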

Takeaway

High‑performance I/O is achieved by minimising CPU participation and data copies. Depending on the scenario, developers can choose:

mmap: removes one kernel‑to‑user copy.

sendfile: eliminates both user‑space copies, leaving only kernel‑internal copies.

DMA Gather Copy: hardware‑assisted gathering of multiple buffers, removing the final copy.

splice: pipe‑based zero‑copy between arbitrary file descriptors.

These techniques can substantially increase throughput for I/O‑intensive applications such as web servers, file servers, and message brokers, particularly when the application forwards data without inspecting it.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
