Understanding Zero‑Copy Techniques: DMA, sendfile, mmap and Direct I/O
This article explains how zero‑copy techniques such as DMA, sendfile, mmap and Direct I/O reduce data copies and context switches in Linux, compares their mechanisms, advantages and drawbacks, and shows typical use cases like Kafka for high‑performance I/O.
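When a server handles a client request by reading a file and writing it to a socket with read() and write(), the traditional path performs four data copies (two by DMA, two by the CPU) and four context switches between user and kernel space, so much of the CPU's time goes to moving bytes rather than processing them.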
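As a baseline for the techniques discussed in this article, the traditional path can be sketched as follows. This is a minimal illustration, not code from the original article; the function name copy_with_read_write and the 64 KiB chunk size are assumptions chosen for the example.

```python
import os
import socket


def copy_with_read_write(path, sock, chunk_size=64 * 1024):
    """Send a file through a user-space buffer (hypothetical helper)."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            # read(): the kernel copies page-cache data into a user buffer
            # (one CPU copy, one switch into the kernel and one back).
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            # send(): the kernel copies the user buffer into the socket
            # buffer (a second CPU copy and two more context switches).
            sock.sendall(chunk)
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

Every chunk crosses the user/kernel boundary twice even though the application never looks at the bytes, which is the overhead the zero-copy techniques aim to remove.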
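DMA (Direct Memory Access) relies on a dedicated controller that moves data between main memory and I/O devices while the CPU does other work; the CPU only sets up the transfer and handles the completion interrupt. DMA by itself does not reduce the number of copies or context switches, but it takes the CPU out of the device-to-memory copies and is the building block for the techniques below.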
Zero-copy does not mean that no copy ever takes place. It means the CPU does not copy the data between kernel and user buffers: the data may still be moved, for example by DMA, while the CPU only orchestrates the transfer.
Common Linux zero‑copy implementations include:
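sendfile: replaces the separate read/write calls with a single system call, so the data never enters user space. Context switches drop from four to two and copies from four to three; when the NIC supports scatter-gather DMA, only buffer descriptors are passed to the socket buffer, which leaves two copies, both performed by DMA.
mmap: maps the kernel page cache into the process address space, eliminating the kernel-to-user copy on read; writes operate directly on the mapped region. Sending mapped data with write still costs four context switches and three copies.
Direct I/O: bypasses the page cache, transferring data directly between user buffers and the storage device via DMA. A completed write has been handed to the device rather than left in the page cache, but this requires pinning the user pages (and typically aligned buffers and sizes), and an explicit fsync or O_SYNC is still needed to make metadata, and any data held in a volatile device cache, durable.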
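A sendfile-based transfer can be sketched with Python's os.sendfile, which wraps the sendfile(2) system call on Linux. The function name send_file_zero_copy is hypothetical, and the sketch assumes a blocking socket.

```python
import os
import socket


def send_file_zero_copy(path, sock):
    """Send a whole file with sendfile(2) (hypothetical helper)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        remaining = os.fstat(fd).st_size
        offset = 0
        while remaining > 0:
            # One system call per iteration: the kernel moves data from the
            # page cache to the socket without passing through user space.
            sent = os.sendfile(sock.fileno(), fd, offset, remaining)
            if sent == 0:
                break  # the file shrank while we were sending
            offset += sent
            remaining -= sent
    finally:
        os.close(fd)
    return offset
```

The loop is needed because sendfile may transfer fewer bytes than requested. Note that the application never holds the file contents, which is why this approach only fits data that is sent unmodified.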
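Each technique has trade-offs. sendfile cannot be used when the application must inspect or transform the data before sending, because the data never reaches user space. Direct I/O pays for page pinning and gives up the read-ahead and write-back benefits of the page cache, so it mainly suits applications that manage their own cache. mmap requires careful management of mapped regions; for example, touching a mapping beyond the end of a file that has been truncated raises SIGBUS.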
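The cost of Direct I/O shows up even in a small example. The sketch below writes a payload with O_DIRECT from a page-aligned buffer and then calls fsync. It is an illustration under assumptions: the helper name direct_write is hypothetical, the 4096-byte alignment is assumed (the real requirement depends on the device and filesystem), and some filesystems reject O_DIRECT altogether.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; check the actual device/filesystem requirement


def direct_write(path, payload):
    """Write payload with O_DIRECT and fsync it (hypothetical helper, Linux only)."""
    if not payload:
        return 0
    padded = -(-len(payload) // BLOCK) * BLOCK  # round length up to a block multiple
    buf = mmap.mmap(-1, padded)  # anonymous mappings are page-aligned and zero-filled
    try:
        buf[:len(payload)] = payload
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DIRECT
        fd = os.open(path, flags, 0o644)
        try:
            written = os.write(fd, buf)  # transferred without going through the page cache
            if written != padded:
                raise OSError("short O_DIRECT write")
            os.ftruncate(fd, len(payload))  # drop the zero padding
            os.fsync(fd)  # still needed for metadata and device write caches
        finally:
            os.close(fd)
    finally:
        buf.close()
    return len(payload)
```

The aligned buffer, the padding, and the explicit fsync are all work that the page cache would otherwise hide, which is the overhead the trade-off above refers to.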
Kafka is a typical real-world example: it uses memory-mapped files on the persistence side (notably for its log index files) and sendfile, through Java's FileChannel.transferTo, to deliver messages to consumers, achieving high throughput by avoiding unnecessary copies.
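The mmap side of this pattern can be sketched as an append into a memory-mapped file. This is not Kafka's code (Kafka is written in Java); the helper name append_via_mmap and the idea of remapping on every call are simplifications made for the example.

```python
import mmap
import os


def append_via_mmap(path, payload, offset):
    """Store payload at offset through a shared mapping; return the new end offset."""
    if not payload:
        return offset
    end = offset + len(payload)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        if os.fstat(fd).st_size < end:
            os.ftruncate(fd, end)  # a mapping must not extend past end of file
        with mmap.mmap(fd, end, access=mmap.ACCESS_WRITE) as mm:
            mm[offset:end] = payload  # plain memory store into page-cache pages
            mm.flush()  # msync: ask the kernel to write the dirty pages back
    finally:
        os.close(fd)
    return end
```

A caller would keep the returned offset and pass it to the next call, for example pos = append_via_mmap("segment.log", b"record", pos). A production design would more likely keep one large mapping open and flush periodically rather than remap per record.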
In summary, Linux provides several zero-copy strategies: reducing or eliminating copies between user and kernel space (sendfile, mmap), bypassing kernel buffers (Direct I/O), and handing buffer descriptors to the hardware instead of copying the data (scatter-gather DMA). Each suits different workloads and hardware capabilities.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn