How DMA and Zero‑Copy Revolutionize Linux I/O Performance

This article explains the principles of Direct Memory Access (DMA), compares it with traditional I/O, details the DMA data‑transfer workflow, and explores zero‑copy techniques such as mmap + write and sendfile, showing how they cut context switches and data copies to improve overall Linux I/O efficiency.

Liangxu Linux

Direct Memory Access (DMA) offloads data movement from the CPU to a dedicated controller, allowing the CPU to perform other tasks while the controller transfers data between memory and devices.

Traditional I/O vs. DMA

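Without DMA, the CPU itself has to move every word of data between the device and memory, so it stays busy for the whole transfer and cannot run other work. With DMA, the CPU only sets up the transfer and handles the completion interrupt, while the controller moves the data. DMA on its own does not change how many system calls or user‑kernel transitions an I/O operation needs; cutting those is the job of the zero‑copy techniques described later.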

DMA Transfer Process

The typical DMA‑based I/O flow is:

The user process calls read, causing the kernel to issue an I/O request and block the process.

The kernel forwards the request to the DMA controller.

The DMA controller commands the disk to read data into its internal buffer.

When its internal buffer is full, the disk signals the DMA controller.

The DMA controller copies the data from the disk buffer to the kernel buffer without CPU involvement.

After sufficient data is transferred, the DMA controller interrupts the CPU.

The CPU copies the data from the kernel buffer to user space, completing the system call and returning to user mode.

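A single read therefore costs two data copies (one by DMA, one by the CPU) and two user‑kernel context switches. Sending a file over the network with read followed by write doubles this: four data copies (two by DMA, two by the CPU), two system calls, and four context switches.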
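For reference, the traditional way to send a file over a socket pairs each read with a write. The following is a minimal sketch of that path, not code from the article; the function name `send_file_read_write` and the chunk size are assumptions made for illustration.

```python
import os
import socket

CHUNK = 64 * 1024  # hypothetical chunk size, chosen only for illustration


def send_file_read_write(path: str, sock: socket.socket) -> int:
    """Send a file with the classic read + write loop; return bytes sent."""
    total = 0
    # buffering=0 keeps Python from adding its own buffer on top of the kernel's.
    with open(path, "rb", buffering=0) as f:
        while True:
            # read: the CPU copies data from the kernel into a user-space buffer.
            chunk = f.read(CHUNK)
            if not chunk:
                break
            # send: the CPU copies the same data back into the kernel socket buffer.
            sock.sendall(chunk)
            total += len(chunk)
    return total
```

Every loop iteration issues two system calls and crosses the user‑kernel boundary four times, which is the overhead the zero‑copy techniques aim to remove.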

Optimisation Strategies

To improve performance, the article suggests reducing the number of context switches and data copies. Since user space cannot directly access disks or NICs, system calls are required; minimizing their frequency is key.

Zero‑Copy Techniques

Two common zero‑copy approaches are:

mmap + write

sendfile

mmap + write

mmap maps the file's page‑cache pages directly into the process address space, so the data no longer has to be copied from the kernel into a user‑space buffer as it is with read. When the process then calls write, the CPU copies the data from the page cache to the socket buffer inside the kernel, and DMA moves it from the socket buffer to the NIC.

Performance impact: three data copies instead of four (one fewer CPU copy than the traditional path), but still two system calls and four context switches.
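The mmap + write approach might look like the sketch below. It is an illustration under assumptions, not the article's code: the function name `send_file_mmap` is hypothetical, and a plain blocking socket is assumed.

```python
import mmap
import os
import socket


def send_file_mmap(path: str, sock: socket.socket) -> int:
    """Send a file by mapping it into memory and writing the mapping to a socket."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be mapped
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the pages are shared with the page cache.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # A memoryview lets sendall read the mapping without an extra copy in Python.
            view = memoryview(mapped)
            try:
                sock.sendall(view)
            finally:
                # The view must be released before the mapping can be closed.
                view.release()
    return size
```

The mapping replaces the read call, but the send is still a separate system call, so the number of user‑kernel transitions does not drop.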

sendfile

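The sendfile system call (available since Linux 2.2) replaces the read and write pair with a single call, saving one system call and two context switches. In its basic form the CPU still copies the data from the page cache to the socket buffer. When the NIC supports scatter‑gather DMA (SG‑DMA), which sendfile can use since Linux 2.4, only a descriptor holding the data's address and length is placed in the socket buffer, and the NIC reads the data straight from the page cache without any CPU copy.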

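Resulting workflow with SG‑DMA: one system call, two context switches, and two data copies (disk → page cache → NIC), both performed by DMA. The article reports this as at least a 2× improvement over the traditional read + write path; the actual gain depends on workload and hardware.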

2 data copies, none performed by the CPU
1 system call
2 context switches between user mode and kernel mode
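A sendfile‑based transfer can be sketched as follows. The function name `send_file_sendfile` is hypothetical, and the sketch assumes Linux and a blocking socket; whether the kernel actually avoids the CPU copy depends on the NIC and driver, which user code cannot see.

```python
import os
import socket


def send_file_sendfile(path: str, sock: socket.socket) -> int:
    """Send a file with os.sendfile; return the number of bytes sent."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            # The kernel moves data from the file to the socket;
            # nothing passes through a user-space buffer.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break  # unexpected end of file, for example after truncation
            offset += sent
    return offset
```

The loop is needed because a single call may transfer fewer bytes than requested. Python's socket objects also offer a higher‑level `sendfile` method that wraps the same system call where it is available.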

Kernel Page Cache Role

The page cache stores recently accessed data, provides read‑ahead (pre‑fetch) based on spatial locality, and allows the kernel I/O scheduler to merge requests, reducing disk seek time. However, large files can fill the cache, displacing hot small files and degrading performance.

Large‑File Transfer Recommendation

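For large files, zero‑copy through the page cache is a poor fit: the data is rarely read again, so caching it brings little benefit, yet it still costs a DMA copy into the cache and evicts hot data. The article recommends combining asynchronous I/O with direct I/O instead. Direct I/O bypasses the page cache entirely, and asynchronous I/O lets the process continue working instead of blocking while the disk transfer completes.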
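The direct I/O half of that recommendation can be sketched as below. Python's standard library has no binding for Linux native asynchronous I/O, so the asynchronous half is left out. The function name `read_direct` and the 4096‑byte alignment are assumptions; the real alignment requirement depends on the device and filesystem. Because some filesystems reject O_DIRECT, the sketch falls back to a normal buffered read.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device and filesystem


def read_direct(path: str, length: int = BLOCK) -> bytes:
    """Read up to `length` bytes from the start of a file, bypassing the page cache if possible."""
    if length <= 0:
        return b""
    # Direct I/O needs an aligned buffer and size; an anonymous mmap is page-aligned.
    size = -(-length // BLOCK) * BLOCK  # round up to a multiple of BLOCK
    buf = mmap.mmap(-1, size)
    try:
        # First try O_DIRECT, then fall back to an ordinary buffered open.
        for flags in (os.O_RDONLY | getattr(os, "O_DIRECT", 0), os.O_RDONLY):
            try:
                fd = os.open(path, flags)
            except OSError:
                continue  # this filesystem refused the flags
            try:
                n = os.preadv(fd, [buf], 0)
                return buf[:min(n, length)]
            except OSError:
                continue  # the read itself was rejected, try the next mode
            finally:
                os.close(fd)
        raise OSError("could not read " + path)
    finally:
        buf.close()
```

A production version would report whether the fallback was taken, since a silent buffered read defeats the purpose of bypassing the cache.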

Conclusion

DMA and zero‑copy techniques reduce CPU involvement, context switches, and data copies in Linux I/O, improving throughput. Zero‑copy has limits, however: sendfile cannot be used when the application needs to process the data in user space, and for very large files asynchronous I/O combined with direct I/O is preferable.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
