
How DMA and Zero‑Copy Boost I/O Performance: A Deep Dive

This article explains Direct Memory Access (DMA), compares it with traditional I/O, details the complete DMA‑based I/O workflow, and explores zero‑copy techniques such as mmap, sendfile, and asynchronous I/O, highlighting their impact on context switches, data copies, and overall transfer performance.

Open Source Linux

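Direct Memory Access (DMA) lets a dedicated controller move data between a device and main memory. The CPU only sets up the transfer and handles the completion interrupt, so it is free to perform other tasks while the data moves.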

Traditional I/O requires the CPU to copy data, consuming significant CPU resources.

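DMA takes the device‑to‑memory copy off the CPU, but it does not by itself change the number of system calls. Reducing data copies and user‑kernel context switches is the job of the zero‑copy techniques covered later in this article.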

DMA controller workflow:

User process calls read, entering blocked state and switching to kernel mode.

OS forwards the I/O request to DMA, freeing the CPU.

DMA sends the request to the disk.

Disk reads data into its controller buffer and signals DMA when full.

DMA copies data from the disk controller buffer to the kernel buffer without CPU involvement.

After sufficient data is read, DMA interrupts the CPU.

CPU copies data from the kernel buffer to user space, returning from the system call.

Optimizations focus on reducing context switches and data copies.

How to reduce the number of user‑kernel context switches?

System calls are needed because user space cannot directly access disks or network cards; the kernel must handle these operations.

Reducing context switches means reducing system call frequency.
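As a rough illustration of how call frequency depends on buffer size, the sketch below reads the same file with a small and a large buffer and counts the read() calls. The 1 MiB scratch file and the names count_reads and demo_buffer_sizes are assumptions made for this example only.

```python
import os
import tempfile


def count_reads(path, buf_size):
    """Read the whole file with os.read and return how many read() calls it took."""
    calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, buf_size)  # one system call per iteration
            calls += 1
            if not chunk:  # an empty result means end of file
                break
    finally:
        os.close(fd)
    return calls


def demo_buffer_sizes():
    """Compare 4 KiB and 64 KiB buffers on a hypothetical 1 MiB scratch file."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * (1 << 20))
        path = tmp.name
    try:
        return count_reads(path, 4096), count_reads(path, 65536)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    small, large = demo_buffer_sizes()
    print("4 KiB buffer:", small, "calls; 64 KiB buffer:", large, "calls")
```

Each call costs two context switches (into the kernel and back), so the larger buffer cuts the switching overhead by the same factor as the call count. The amount of data copied stays the same.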

How to reduce the number of data copies?

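In a traditional file transfer, read() followed by write() moves data from the kernel read buffer to the user buffer and then to the socket buffer. That costs four context switches and four data copies (two by DMA, two by the CPU). The detour through user space is unnecessary when the process does not modify the data; zero‑copy techniques aim to eliminate these extra copies.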
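The read() + write() path can be sketched as follows. This is a minimal illustration, not a benchmark: socket.socketpair() stands in for a real network connection, and the scratch file and the function names are assumptions made for the example.

```python
import os
import socket
import tempfile


def send_with_read_write(path, sock, buf_size=65536):
    """Send a file with read() + send(); returns (bytes_sent, calls)."""
    sent = 0
    calls = 0  # one per os.read() and one per sendall(); sendall may loop internally
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, buf_size)  # CPU copy: kernel buffer -> user buffer
            calls += 1
            if not data:  # end of file
                break
            sock.sendall(data)  # CPU copy: user buffer -> socket buffer
            calls += 1
            sent += len(data)
    finally:
        os.close(fd)
    return sent, calls


def demo_read_write(size=32 * 1024, buf_size=8192):
    """Send a hypothetical scratch file through a local socket pair."""
    payload = os.urandom(size)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    tx, rx = socket.socketpair()  # stand-in for a real network connection
    try:
        sent, calls = send_with_read_write(path, tx, buf_size)
        tx.close()  # signal end of stream to the receiver
        received = bytearray()
        while True:
            chunk = rx.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent, calls, bytes(received) == payload
    finally:
        tx.close()
        rx.close()
        os.unlink(path)
```

With a 32 KiB file and an 8 KiB buffer the loop issues four reads, four sends, and one final read that returns end of file. Every chunk passes through the Python bytes object in user space, which is exactly the copy that zero‑copy techniques remove.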

Zero‑copy methods include:

mmap + write

sendfile

mmap maps a file directly into the process address space, allowing the kernel and user space to share the same memory without additional copies.

The process calls mmap, the kernel shares the file’s pages with the process, and subsequent write copies data from the kernel buffer directly to the socket buffer, all in kernel mode.

Performance impact:

Eliminates one data copy (kernel buffer to user buffer), leaving three.

Still requires the CPU to copy data from the kernel buffer to the socket buffer.

Still makes two system calls (mmap and write), so four context switches remain.
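A minimal sketch of the mmap + write approach, assuming a local socket pair in place of a real connection and a scratch file created for the example. The function names are hypothetical. Passing the mapping itself to sendall() lets the kernel read straight from the mapped pages without first building a separate bytes object in user space.

```python
import mmap
import os
import socket
import tempfile


def send_with_mmap(path, sock):
    """Map the file into memory and send the mapped pages; returns bytes sent."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # The mapping exposes the file's cached pages; sendall() copies
            # from them into the socket buffer (still a CPU copy).
            sock.sendall(mapped)
            return size


def demo_mmap(size=32 * 1024):
    """Send a hypothetical scratch file through a local socket pair."""
    payload = os.urandom(size)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    tx, rx = socket.socketpair()  # stand-in for a real network connection
    try:
        sent = send_with_mmap(path, tx)
        tx.close()  # signal end of stream to the receiver
        received = bytearray()
        while True:
            chunk = rx.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent, bytes(received) == payload
    finally:
        tx.close()
        rx.close()
        os.unlink(path)
```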

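sendfile replaces read() and write() with a single system call, saving one system call and two context switches. Data is copied from the kernel buffer to the socket buffer entirely inside the kernel, so two context switches and three data copies remain. This is not yet true zero‑copy, because the CPU still performs the copy from the kernel buffer to the socket buffer.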

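From Linux 2.4 onward, if the NIC supports scatter‑gather DMA (SG‑DMA), sendfile goes one step further. DMA moves data from disk to the kernel buffer, only a descriptor (buffer address and length) is placed in the socket buffer, and the NIC's SG‑DMA controller reads the data straight from the kernel buffer into the NIC buffer. This reduces the data copies to two, both performed by DMA.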

This is true zero‑copy: the CPU never copies the payload. The transfer needs only two context switches and two data copies, compared with four and four for read() + write(), which can roughly double file transfer performance.
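The sendfile path can be sketched with os.sendfile, which wraps the Linux system call. As in the other sketches, the socket pair, the scratch file, and the function names are assumptions for illustration. Whether the kernel uses SG‑DMA underneath depends on the NIC and driver and is not visible from this code.

```python
import os
import socket
import tempfile


def send_with_sendfile(path, sock):
    """Send a file with sendfile(); the data never enters user space."""
    sent = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent < size:
            # Arguments: out_fd, in_fd, offset, count. May send fewer bytes than asked.
            n = os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
            if n == 0:  # nothing more could be sent
                break
            sent += n
    return sent


def demo_sendfile(size=32 * 1024):
    """Send a hypothetical scratch file through a local socket pair."""
    payload = os.urandom(size)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    tx, rx = socket.socketpair()  # stand-in for a real network connection
    try:
        sent = send_with_sendfile(path, tx)
        tx.close()  # signal end of stream to the receiver
        received = bytearray()
        while True:
            chunk = rx.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent, bytes(received) == payload
    finally:
        tx.close()
        rx.close()
        os.unlink(path)
```

In application code, socket.sendfile() offers a higher-level wrapper that uses os.sendfile where available and falls back to plain send() otherwise.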

Kernel cache (page cache) stores recently accessed data, provides read‑ahead, and enables I/O scheduling and merging to improve disk I/O performance.

However, kernel cache is unsuitable for large file transfers because it can be filled with large files, preventing hot small files from benefiting.

For large files, use asynchronous I/O combined with direct I/O, bypassing the kernel cache.

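Direct I/O bypasses the kernel cache and transfers data directly between disk and user‑space buffers. Asynchronous I/O lets the process submit a request and keep working, with the kernel notifying it when the data is ready. The two are combined because Linux native asynchronous I/O is designed around direct I/O: buffered requests can still block on submission.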
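The direct I/O half can be sketched as below; the asynchronous half is omitted because the Python standard library has no binding for Linux native asynchronous I/O. O_DIRECT requires aligned buffers, offsets, and lengths, so the sketch reads into an anonymous mmap, which is page-aligned. The 4096-byte BLOCK value is an assumption (the real alignment depends on the device and filesystem), the function names are hypothetical, and the code falls back to buffered I/O where O_DIRECT is unavailable or rejected.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device and filesystem


def _read_aligned(path, flags, length):
    """Open `path` with `flags` and read up to `length` bytes into an aligned buffer."""
    fd = os.open(path, flags)
    try:
        with mmap.mmap(-1, length) as buf:  # anonymous mapping is page-aligned
            n = os.readv(fd, [buf])  # the kernel fills our buffer in place
            return buf[:n]
    finally:
        os.close(fd)


def read_direct(path, length=BLOCK):
    """Read from the start of `path`, bypassing the page cache when possible.

    Returns (data, used_direct). `length` should be a multiple of BLOCK.
    """
    o_direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    if o_direct:
        try:
            return _read_aligned(path, os.O_RDONLY | o_direct, length), True
        except OSError:
            pass  # for example EINVAL on filesystems without O_DIRECT support
    return _read_aligned(path, os.O_RDONLY, length), False
```

Because the page cache is skipped, the application loses read‑ahead and caching for this file, which is acceptable for large files that are read once and would otherwise push hot small files out of the cache.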

In summary, DMA and zero‑copy techniques reduce CPU load, context switches, and data copies, significantly improving I/O performance, while large‑file transfers may require async + direct I/O instead of zero‑copy.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
