Why CPU‑Bound I/O Slows You Down: DMA, Zero‑Copy, and PageCache Explained

This article explains how storage media performance, kernel‑user mode transitions, DMA, zero‑copy, and PageCache interact to affect I/O latency, why multiple data copies and context switches hurt throughput, and how techniques like mmap, sendfile, and asynchronous I/O can reduce overhead.

Liangxu Linux

Storage Media Performance

The left side of the diagram shows storage media from disk to memory, while the right side visualizes their read/write speeds; the closer a medium is to the CPU, the faster its performance. Understanding these speeds highlights the appeal of zero‑copy for high‑performance I/O systems.

Kernel Mode vs. User Mode

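Kernel mode (kernel space) can access all memory and control devices such as disks, NICs, and other peripherals. Process scheduling, that is, switching the CPU from one program to another, also happens in this mode.

User mode (user space) can access only a restricted region of memory and cannot touch devices directly; a program must ask the kernel to do so through a system call. CPU time is shared among user programs.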

These two privilege levels exist to isolate programs, prevent unauthorized memory access, and protect peripheral devices.

How Data Transfer Works Between Two Machines

When process a on computer A sends a file to process b on computer B, the I/O path involves four steps:

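The process calls read, switching to kernel mode; data is copied from disk into the kernel’s page cache.

The kernel copies the data from the page cache into the process’s user‑space buffer, and read returns to user mode.

The process calls write (or send), switching back to kernel mode; the kernel copies the data from the user buffer into the socket buffer.

The data is copied from the socket buffer to the NIC for transmission, still in kernel mode, and write then returns to user mode.

This results in four context switches (one on entry and one on return for each of the two system calls) and four data copies, which become a bottleneck for large transfers.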
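The read‑then‑write path above can be sketched in a few lines. This is a minimal illustration, not a benchmark: the function names, the 64 KiB chunk size, and the use of a local socket pair in place of a real network connection are all assumptions made for the example.

```python
import os
import socket
import tempfile


def send_file_read_write(path, sock, chunk_size=64 * 1024):
    """Send a file with the classic read()/send() loop; returns bytes sent."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            # read(): enter the kernel, copy page-cache data into a
            # user-space bytes object, return to user mode.
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            # sendall(): enter the kernel again, copy the user buffer
            # into the socket buffer, return to user mode.
            sock.sendall(chunk)
            total += len(chunk)
    finally:
        os.close(fd)
    return total


def demo_read_write(size=16 * 1024):
    """Round-trip a temporary file over a local socket pair (illustrative)."""
    payload = os.urandom(size)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    sender, receiver = socket.socketpair()
    try:
        sent = send_file_read_write(tmp.name, sender)
        sender.close()  # signals end-of-stream to the receiver
        received = bytearray()
        while chunk := receiver.recv(64 * 1024):
            received += chunk
    finally:
        sender.close()
        receiver.close()
        os.unlink(tmp.name)
    return sent, bytes(received) == payload
```

Every loop iteration crosses the user/kernel boundary twice and moves the same bytes through a user‑space buffer that the program never inspects, which is the overhead the techniques below try to remove.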

Direct Memory Access (DMA)

Without DMA, the CPU must handle every byte transferred between disk and memory, blocking other work. DMA introduces a dedicated hardware controller that moves data between the device and memory without CPU involvement, reducing CPU load but not necessarily the number of copies.

Zero‑Copy Technique

Zero‑copy combines DMA with the kernel’s page cache to eliminate CPU‑mediated copies. The typical flow is:

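The process issues a single system call such as sendfile, and the CPU switches to kernel mode.

The kernel schedules a DMA transfer to move data from the disk controller to the page cache, freeing the CPU for other work.

When the DMA finishes, it notifies the kernel. The data stays in the page cache; it is not copied into a user‑space buffer.

For sending, the kernel hands the data from the page cache to the socket. If the NIC supports scatter‑gather DMA, only buffer descriptors (address and length) are placed in the socket buffer, and the NIC reads the payload directly from the page cache.

In this best case, zero‑copy reduces the operation to two context switches and two data copies, both performed by DMA. This can roughly double transfer performance, although the actual gain depends on file size, hardware, and workload.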

Zero‑Copy Implementations

Two common Linux mechanisms:

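mmap + write : Replace read with mmap to map the kernel’s page‑cache pages directly into user space, then use write to send data from the mapped region to the socket. This removes the copy into a user buffer, but it still costs four context switches and three copies: disk to page cache (DMA), page cache to socket buffer (CPU), and socket buffer to NIC (DMA).

sendfile : A system call that transfers data from a file descriptor directly to a socket descriptor inside the kernel, bypassing user space. It needs only two context switches. Without hardware support it still performs three copies; when the network card supports SG‑DMA, the kernel can hand the page‑cache buffers to the NIC, leaving only the two DMA copies.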
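Both mechanisms are reachable from Python on Linux through the standard mmap module and os.sendfile. The sketch below is illustrative only: the function names and the socket‑pair demo are assumptions for this example, and it has no way to show whether the NIC uses SG‑DMA.

```python
import mmap
import os
import socket
import tempfile


def send_with_mmap(path, sock):
    """mmap + write: map the file and send straight from the mapping."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # A memoryview exposes the mapped pages without first
            # copying them into a separate bytes object.
            with memoryview(mapped) as view:
                sock.sendall(view)
    return size


def send_with_sendfile(path, sock):
    """sendfile: the kernel moves data from the file to the socket."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            # os.sendfile(out_fd, in_fd, offset, count) returns bytes sent.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break
            offset += sent
    return offset


def demo_zero_copy(send, size=16 * 1024):
    """Round-trip a temporary file through `send` over a socket pair."""
    payload = os.urandom(size)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    out_sock, in_sock = socket.socketpair()
    try:
        sent = send(tmp.name, out_sock)
        out_sock.close()  # signals end-of-stream to the receiver
        received = bytearray()
        while chunk := in_sock.recv(64 * 1024):
            received += chunk
    finally:
        out_sock.close()
        in_sock.close()
        os.unlink(tmp.name)
    return sent == size and bytes(received) == payload
```

Calling demo_zero_copy(send_with_mmap) or demo_zero_copy(send_with_sendfile) exercises either path. In the sendfile version the file contents never appear in a Python object at all, which is the practical meaning of bypassing user space.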

PageCache Basics

PageCache is the kernel’s disk cache residing in memory. Reads first check the cache (cache hit) to avoid disk access; misses trigger a disk read and populate the cache. Writes mark pages as dirty; the kernel periodically flushes dirty pages to disk based on parameters such as dirty_expire_centisecs and dirty_background_ratio.
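On Linux, the current size of the cache and the amount of dirty data can be read from /proc/meminfo. The helper below is a small sketch; the function names are made up for this example, while Cached, Dirty, and Writeback are fields the kernel reports in that file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most values are in kB; a few fields (such as HugePages_Total)
    are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_cache_summary(path="/proc/meminfo"):
    """Return the fields most relevant to PageCache, or None if absent."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    # Cached: roughly the page cache size; Dirty: modified pages not yet
    # written; Writeback: pages currently being written to disk.
    return {key: info.get(key) for key in ("Cached", "Dirty", "Writeback")}
```

Sampling page_cache_summary() before and after writing a large file should show Dirty rise and then fall as the kernel flushes pages in the background.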

Advantages:

Faster data access by keeping hot data in RAM.

Reduced disk I/O frequency, extending disk lifespan.

Higher overall I/O throughput thanks to read‑ahead (pre‑read) mechanisms.

Disadvantages:

Consumes additional RAM, potentially causing swap pressure.

Lacks a clean API for applications, leading some programs (e.g., MySQL InnoDB) to implement their own page management.

Large files can evict hot small files from the cache, degrading performance for those files.

PageCache Tuning

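Key sysctl parameters:

vm.dirty_background_ratio : Percentage of available memory that may be dirty before the kernel starts background write‑back.

vm.dirty_ratio : Percentage of available memory that may be dirty before a process that is writing is blocked and has to flush dirty pages itself.

vm.dirty_expire_centisecs and vm.dirty_writeback_centisecs : Control time‑based flushing: how old dirty data must be before it is eligible for write‑back, and how often the flusher threads wake up. Both are in hundredths of a second.

vm.swappiness : Controls how readily the kernel swaps out anonymous memory instead of reclaiming page cache. A value of 0 makes the kernel avoid swapping for as long as it can; it does not disable swap.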

Adjust these values based on CPU count, memory size, storage type, and network bandwidth.
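Before changing anything, it helps to record the current values. On Linux each parameter is exposed as a file under /proc/sys/vm, so a short script can collect them. The function name and the idea of returning None for unreadable entries are assumptions of this sketch.

```python
import os

VM_KEYS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
    "swappiness",
)


def read_vm_settings(base="/proc/sys/vm", keys=VM_KEYS):
    """Return {'vm.<key>': int or None} for the PageCache-related sysctls."""
    settings = {}
    for key in keys:
        try:
            with open(os.path.join(base, key)) as f:
                settings["vm." + key] = int(f.read().strip())
        except (OSError, ValueError):
            # Missing file, no permission, or a non-integer value.
            settings["vm." + key] = None
    return settings
```

The base argument exists so the function can be pointed at a directory of sample files for testing. Changing a value on a live system is typically done as root with sysctl -w, and any new setting is worth validating under a realistic workload first.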

Large‑File Transfer Strategies

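For massive files, asynchronous I/O combined with direct I/O is preferred over zero‑copy. Zero‑copy relies on PageCache, but a large file is rarely read again soon, so caching it brings little benefit, costs an extra copy into the cache, and evicts hot data. Direct I/O bypasses PageCache entirely, and asynchronous I/O keeps the process from blocking while the disk works.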

Typical flow for async I/O:

The process submits a read request; the kernel passes it to the disk and returns immediately, so the process and the CPU can continue other work.

When the disk completes, the kernel notifies the process, which then processes the data.
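The submit‑then‑notify shape of this flow can be imitated in Python with asyncio. Note the assumption clearly: this sketch hands a blocking read to a worker thread and is not kernel‑level asynchronous I/O, so it only illustrates the control flow. The function names are invented for the example.

```python
import asyncio


def _blocking_read(path):
    """Ordinary blocking read, executed on a worker thread."""
    with open(path, "rb") as f:
        return f.read()


async def read_in_background(path):
    """Submit a read, keep working, then collect the result."""
    loop = asyncio.get_running_loop()
    # Step 1: submit the request. run_in_executor returns a future
    # immediately instead of waiting for the data.
    pending = loop.run_in_executor(None, _blocking_read, path)

    # The caller is free to do other work while the read is in flight.
    other_work = 0
    while not pending.done():
        other_work += 1
        await asyncio.sleep(0)  # yield to the event loop

    # Step 2: the future is complete; collect and process the data.
    data = await pending
    return data, other_work
```

A caller would run it with asyncio.run(read_in_background(path)). The second value in the returned tuple counts how many loop iterations ran while the read was pending.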

Direct I/O is suitable when the application already caches data or when transferring large files.
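On Linux, direct I/O is requested by opening a file with the O_DIRECT flag, which requires the buffer, offset, and length to be suitably aligned. The sketch below gets an aligned buffer from an anonymous mmap. Its function names, the 1 MiB block size, and the fallback to buffered reads when O_DIRECT is rejected are choices made for this example.

```python
import errno
import mmap
import os


def _read_blocks(fd, block_size):
    """Read the whole file through one page-aligned buffer."""
    buf = mmap.mmap(-1, block_size)  # anonymous mapping is page-aligned
    chunks = []
    try:
        while True:
            n = os.readv(fd, [buf])
            chunks.append(buf[:n])
            if n < block_size:  # short read: end of file reached
                return b"".join(chunks)
    finally:
        buf.close()


def read_direct(path, block_size=1024 * 1024):
    """Read a file with O_DIRECT, falling back to buffered I/O."""
    direct = getattr(os, "O_DIRECT", 0)  # 0 where the flag is unavailable
    for flags in (os.O_RDONLY | direct, os.O_RDONLY):
        using_direct = bool(flags & direct)
        try:
            fd = os.open(path, flags)
        except OSError as exc:
            # Some filesystems reject O_DIRECT with EINVAL.
            if using_direct and exc.errno == errno.EINVAL:
                continue
            raise
        try:
            return _read_blocks(fd, block_size)
        except OSError as exc:
            if not (using_direct and exc.errno == errno.EINVAL):
                raise
        finally:
            os.close(fd)
```

For brevity the function joins all blocks in memory. A real large‑file transfer would hand each block to the network as it arrives instead of accumulating the whole file.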

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
