
Storage Media Performance, Kernel/User Mode, DMA, Zero‑Copy, and PageCache

The article explains how different storage media affect I/O speed, describes kernel and user mode separation, introduces DMA and zero‑copy techniques such as mmap + write and sendfile, and discusses PageCache behavior, advantages, drawbacks, and tuning for high‑performance file transfers.

Architects' Tech Alliance

This article starts with a diagram of storage media performance, showing that the closer a medium is to the CPU (register → cache → memory → disk), the higher its read/write speed, and emphasizes the importance of zero‑copy techniques for high‑performance I/O.

Kernel mode and user mode are explained: kernel mode has full access to hardware and memory, while user mode has restricted access. The separation protects resources and prevents programs from interfering with each other.

The data transfer process between two processes on different machines is illustrated, highlighting four context switches (user→kernel for read, kernel→user after read, user→kernel for write, kernel→user after write) and multiple data copies, which degrade performance.

To reduce CPU involvement, Direct Memory Access (DMA) is introduced. DMA allows the disk controller to move data directly between its buffer and main memory without CPU intervention, freeing the CPU for other tasks.

The DMA workflow is described step‑by‑step: the process issues a read system call, the kernel forwards the request to DMA, DMA contacts the disk controller, the controller signals completion via interrupt, DMA copies data to kernel buffers, and finally the CPU copies data to user space.

Zero‑copy techniques are then presented:

mmap + write: replaces the read system call with mmap, which maps the kernel buffer directly into the process's address space, eliminating the kernel-to-user data copy.

sendfile(): a system call that copies data from one file descriptor to a socket descriptor entirely inside the kernel, without passing through user space, cutting one system call and two context switches compared with read + write.

When the network card supports scatter‑gather DMA, sendfile can further reduce copies, achieving only two memory copies performed entirely by DMA.

The article then discusses PageCache, the kernel's disk cache:

Read cache: data is served from memory if present (cache hit), otherwise read from disk and cached.

Write cache: writes go to the cache first; dirty pages are flushed to disk based on time or memory‑usage thresholds.

Advantages: faster data access, reduced disk I/O, higher throughput.

Disadvantages: consumes physical memory, can cause cache eviction of hot small files, and lacks a clean API for applications.

Typical PageCache tuning parameters (e.g., vm.dirty_background_ratio, vm.dirty_ratio, vm.dirty_expire_centisecs, vm.dirty_writeback_centisecs, vm.swappiness) are listed with brief guidance.
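These parameters are usually set in /etc/sysctl.conf and applied with `sysctl -p`. The values below are illustrative starting points only, not recommendations; tune them against real workloads.

```ini
# Illustrative values -- adjust for your workload and memory size.
vm.dirty_background_ratio = 10    # background flusher starts at 10% of memory dirty
vm.dirty_ratio = 20               # writers block on synchronous flush at 20%
vm.dirty_expire_centisecs = 3000  # a page dirty for ~30 s becomes eligible for flush
vm.dirty_writeback_centisecs = 500  # flusher threads wake roughly every 5 s
vm.swappiness = 10                # prefer reclaiming cache over swapping anonymous pages
```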

For large‑file transfers, the article recommends asynchronous I/O combined with direct I/O: zero‑copy techniques depend on the PageCache, but caching a huge file offers little reuse while evicting hotter data, so bypassing the cache and minimizing context switches serves large transfers better.

Overall, the piece provides a comprehensive overview of storage‑level performance, kernel‑user interactions, DMA, zero‑copy, and PageCache, offering practical advice for optimizing I/O in Linux systems.

Tags: performance, kernel, I/O, DMA, zero-copy, storage, page cache
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
