How DMA and Zero‑Copy Boost Linux I/O Performance
This article explains how DMA and zero‑copy techniques reduce the four memory copies and context switches typical of Linux I/O, detailing their mechanisms, implementations such as sendfile, mmap and Direct I/O, and real‑world usage in Kafka and MySQL to boost performance.
DMA and Zero‑Copy Technology
Memory copying is a time‑consuming operation; zero‑copy is a common optimization. This article discusses Linux zero‑copy techniques, which are used by Kafka and MySQL.
Note: Except for Direct I/O, all file operations that touch the disk go through the page cache.
1. Four Copies and Four Context Switches
Serving a typical client request for a file involves the following two system calls:
File.read(file, buf, len);
Socket.send(socket, buf, len);
For example, Kafka reads a batch of messages from disk and writes them directly to the NIC.
Without optimization, the OS performs four data copies and four context switches:
The four copies are:
CPU moves data from disk to kernel‑space page cache.
CPU moves data from kernel page cache to user‑space buffer.
CPU moves data from user‑space buffer to kernel socket buffer.
CPU moves data from kernel socket buffer to the network.
The four context switches are:
Read system call: user → kernel.
Read return: kernel → user.
Write system call: user → kernel.
Write return: kernel → user.
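The traditional path described above can be sketched in a few lines. This is a minimal illustration, not the code of any particular server: the function names send_with_read_write and demo_read_write are hypothetical, and a local socket pair stands in for a real network connection.

```python
import os
import socket
import tempfile


def send_with_read_write(path, sock, chunk_size=64 * 1024):
    """Send a file with read() + send(); every chunk passes through user space."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            # read(2): kernel page cache -> user-space buffer
            buf = os.read(fd, chunk_size)
            if not buf:
                break
            # send(2): user-space buffer -> kernel socket buffer
            sock.sendall(buf)
            total += len(buf)
    finally:
        os.close(fd)
    return total


def demo_read_write(payload):
    """Round-trip a small payload (it must fit in the socket buffer,
    because this demo sends everything before it starts receiving)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
        path = f.name
    sender, receiver = socket.socketpair()
    try:
        sent = send_with_read_write(path, sender)
        sender.close()  # signal end of stream to the receiver
        received = b""
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent, received
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)
```

Each loop iteration issues two system calls and moves the chunk across the user/kernel boundary twice, which is the overhead the following sections try to remove.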
2. DMA’s Role in Reducing Copies
DMA (Direct Memory Access) uses a dedicated controller (DMAC) to transfer data between memory and I/O devices without CPU involvement, acting as a co‑processor.
A DMAC pays off when the data volume is large or the transfer rate is high, and also when the device is very slow: the controller waits for the data to be ready and notifies the CPU only afterwards.
With DMA, the controller takes over the disk‑to‑memory and memory‑to‑NIC transfers, and the CPU only programs the controller.
DMA does not help with copies inside main memory: the CPU still performs the copies between kernel space and user space.
3. Zero‑Copy Techniques
3.1 What Is Zero‑Copy?
Zero‑copy means the CPU does not itself copy data from one memory region to another; it only sets up and supervises the transfer. The data may still be copied (e.g., from disk to memory by DMA), but the CPU does not perform that copy. Common Linux zero‑copy techniques are:
sendfile
mmap
splice
Direct I/O
Different techniques suit different scenarios.
DMA review: DMA handles memory‑to‑device copies, CPU only issues control signals.
Zero‑copy using page cache:
sendfile: replaces the read/write pair with a single system call and, with suitable DMA support, passes buffer descriptors instead of copying the data.
mmap: maps page‑cache pages into the process's address space, so the application reads and writes the kernel's copy of the file data directly.
Zero‑copy without page cache: Direct I/O transfers data directly between user‑space buffers and the storage device via DMA.
3.2 sendfile
sendfile is ideal when data read from disk is sent over the network without processing (e.g., message queues).
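It reduces the four copies and four context switches to two copies and two context switches, and both remaining copies (disk to page cache, page cache to NIC) are performed by DMA rather than the CPU.
Instead of copying the data from the page cache into the socket buffer, the kernel appends only buffer descriptors (memory address and length) to the socket buffer. This avoids the extra kernel‑space copy, but it requires a NIC that supports scatter‑gather DMA (SG‑DMA); without it, one CPU copy from page cache to socket buffer remains.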
sendfile performs only one system call, cutting context switches from four to two.
Limitation: if the application needs to modify the data (e.g., encryption), sendfile cannot be used because the data never reaches user space.
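A minimal sketch of the same transfer using sendfile, via Python's os.sendfile wrapper around the Linux system call. The helper names send_with_sendfile and demo_sendfile are hypothetical, and a local socket pair again stands in for a network connection; whether the kernel avoids the CPU copy internally depends on the hardware, as noted above.

```python
import os
import socket
import tempfile


def send_with_sendfile(path, sock):
    """Send a whole file with sendfile(2); the payload never enters user space."""
    fd = os.open(path, os.O_RDONLY)
    try:
        remaining = os.fstat(fd).st_size
        offset = 0
        while remaining > 0:
            # os.sendfile(out_fd, in_fd, offset, count) returns the bytes sent
            sent = os.sendfile(sock.fileno(), fd, offset, remaining)
            if sent == 0:
                break
            offset += sent
            remaining -= sent
        return offset
    finally:
        os.close(fd)


def demo_sendfile(payload):
    """Round-trip a small payload (it must fit in the socket buffer,
    because this demo sends everything before it starts receiving)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
        path = f.name
    sender, receiver = socket.socketpair()
    try:
        sent = send_with_sendfile(path, sender)
        sender.close()  # signal end of stream to the receiver
        received = b""
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent, received
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)
```

The application only ever handles file descriptors, offsets and byte counts, which is also why it cannot transform the data on the way through.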
3.3 mmap
See a dedicated article for a detailed discussion of mmap.
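As a brief illustration only, the sketch below modifies a file through a shared mapping using Python's mmap module. The helper names patch_file_via_mmap and demo_mmap and the sample file contents are assumptions made for this example.

```python
import mmap
import os
import tempfile


def patch_file_via_mmap(path, offset, new_bytes):
    """Overwrite part of a file through a shared memory mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; ACCESS_WRITE gives a shared,
        # write-through mapping backed by the file's page-cache pages.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
            # Plain memory store: no read() or write() call is involved.
            m[offset:offset + len(new_bytes)] = new_bytes
            m.flush()  # ask the kernel to write the dirty pages back


def demo_mmap():
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"offset=0000;payload")
        path = f.name
    try:
        patch_file_via_mmap(path, 7, b"0042")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

The slice assignment must keep the length unchanged, because a mapping cannot grow or shrink the file it is backed by.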
3.4 Direct I/O
Direct I/O bypasses the page cache, transferring data directly between user space and the storage device.
Cached I/O: data passes through page cache.
Direct I/O: data goes straight to disk, avoiding kernel buffers.
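Advantages include reduced kernel overhead and higher throughput for large transfers. Disadvantages: the user pages must be pinned in memory while the DMA transfer runs, every read that misses the application's own cache goes to the disk because no page cache absorbs it, and the application takes on added complexity.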
Typical users are self‑caching applications such as database management systems (e.g., MySQL) that manage their own caches.
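Self‑caching applications keep data in their own address space and may share that data across hosts, which requires a mechanism to invalidate stale cached copies.
To use Direct I/O, applications must allocate their own user‑space buffers (on Linux, O_DIRECT typically requires the buffer address, file offset, and transfer size to be aligned to the device's block size) and should avoid frequent small synchronous reads and writes on performance‑critical paths, since each one waits for the device.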
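A hedged sketch of a Direct I/O read on Linux follows. The 4096-byte BLOCK value is an assumed alignment (the real requirement depends on the device and filesystem), the names read_block and demo_direct are hypothetical, and the code falls back to cached I/O where the filesystem rejects O_DIRECT.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; the real value depends on device and filesystem


def _read_block(path, extra_flags, block_index):
    fd = os.open(path, os.O_RDONLY | extra_flags)
    try:
        # An anonymous mmap is page-aligned, which satisfies the usual
        # O_DIRECT buffer alignment requirement.
        buf = mmap.mmap(-1, BLOCK)
        try:
            n = os.preadv(fd, [buf], block_index * BLOCK)
            return bytes(buf[:n])
        finally:
            buf.close()
    finally:
        os.close(fd)


def read_block(path, block_index=0):
    """Return (data, used_direct) for one aligned block of the file."""
    try:
        return _read_block(path, os.O_DIRECT, block_index), True
    except OSError:
        # Some filesystems reject O_DIRECT; fall back to cached I/O.
        return _read_block(path, 0, block_index), False


def demo_direct():
    payload = b"A" * BLOCK + b"B" * BLOCK
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # make sure the data is on the device
        path = f.name
    try:
        return read_block(path, block_index=1)
    finally:
        os.unlink(path)
```

A real self-caching application would keep a pool of such aligned buffers and reuse them, rather than allocating one per read as this sketch does.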
4. Typical Cases
4.1 Kafka
Persistence: Kafka memory‑maps its index files with mmap (java.nio.MappedByteBuffer) and appends log data sequentially, letting the page cache absorb the writes.
Message delivery: Kafka uses sendfile (through java.nio.channels.FileChannel.transferTo), which avoids kernel‑user copies and benefits from page cache when multiple consumers read the same data.
4.2 MySQL
MySQL’s zero‑copy implementation is more complex; see the author’s other article for details.
5. Summary
DMA allows the CPU to issue control signals while the DMA controller performs the actual data movement, reducing CPU involvement.
Linux zero‑copy strategies can be grouped into:
Eliminate or reduce user‑kernel copies: use system calls like mmap, sendfile, splice.
Bypass the kernel page cache with Direct I/O: user‑space buffers exchange data with the hardware directly via DMA.
Optimize kernel‑user buffer transfers: improve traditional copy paths.
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
