Understanding Linux I/O: Storage Hierarchy, Page Cache, and System Call Mechanisms

This article explains Linux storage hierarchy, the roles of user‑space and kernel caches, the three‑layer I/O stack, page‑cache synchronization policies, file‑operation atomicity, locking mechanisms, and performance testing techniques for HDD and SSD devices.

Architects' Tech Alliance

Introduction: The article opens with several questions: how HDDs and SSDs differ, how multiple threads should write to the same file, whether a successful write means the data has reached the disk, whether write is atomic, and whether mmap is really faster than a conventional read.

Memory hierarchy: It describes the storage pyramid where faster, more expensive memory (registers, CPU caches) sits above slower, cheaper layers (DRAM, local disk), and explains how locality of reference makes this hierarchy efficient.

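Ubiquitous caching: The piece outlines user-space stdio buffering and the kernel's Page Cache and Buffer Cache, and explains how they cut cost by aggregating I/O: stdio buffering batches many small reads and writes into fewer system calls, while the kernel caches absorb and batch accesses to the underlying device.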
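The aggregation effect of a user-space buffer can be sketched in Python. This is an illustrative sketch rather than code from the article: `CountingRaw` is a hypothetical sink that stands in for a file descriptor and counts each call that would otherwise be a write system call, and the 4096-byte buffer size is an arbitrary assumption.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw sink: counts each call that would be a write() syscall."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.data += bytes(b)
        return len(b)


def count_raw_writes(chunks, buffer_size):
    """Write chunks through a user-space buffer; return (raw calls, bytes stored)."""
    raw = CountingRaw()
    buffered = io.BufferedWriter(raw, buffer_size=buffer_size)
    for chunk in chunks:
        buffered.write(chunk)
    buffered.flush()  # push whatever is still sitting in the user-space buffer
    return raw.calls, bytes(raw.data)


chunks = [b"x" * 16] * 1000  # 1000 small application-level writes
calls, stored = count_raw_writes(chunks, buffer_size=4096)
print(f"{len(chunks)} buffered writes became {calls} raw writes")
```

The same trade-off applies to C stdio: fewer system calls, at the price of data that sits in process memory until the buffer is flushed.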

Linux I/O stack: A three-layer kernel I/O stack is presented: the filesystem layer, the block layer (which performs I/O scheduling), and the device layer (which transfers data using DMA). The article maps Buffered I/O, mmap, and Direct I/O onto these layers.
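Two of these access paths can be compared side by side. The sketch below, which assumes a Linux or other POSIX system, reads the same temporary file once with read calls (Buffered I/O, where the kernel copies from the page cache into a user buffer) and once through a memory mapping. The function names and the 1 MiB file size are hypothetical choices for illustration. Direct I/O is left out because O_DIRECT requires aligned buffers and is not supported by every filesystem.

```python
import mmap
import os
import tempfile


def read_buffered(path):
    """Buffered I/O: each read() copies data from the page cache to a user buffer."""
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        while True:
            block = os.read(fd, 65536)
            if not block:  # empty result means end of file
                break
            parts.append(block)
        return b"".join(parts)
    finally:
        os.close(fd)


def read_mmap(path):
    """mmap: map the file's cached pages into the process address space."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # The slice copies only so the bytes outlive the mapping.
            return m[:]


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(os.urandom(1 << 20))
    same = read_buffered(demo_path) == read_mmap(demo_path)
    print("identical contents:", same)
```

Both paths go through the page cache; the mapping only avoids the extra copy and the per-read system call, so whether it is faster depends on the access pattern.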

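Page cache synchronization: The article contrasts write-through and write-back policies, defines dirty pages (pages modified in the cache but not yet written to disk), explains the role of the pdflush kernel thread in writing them back (kernels since 2.6.32 use per-device flusher threads instead), and shows how sync, fsync, fdatasync, or opening a file with O_SYNC enforces durability.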
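A write that must survive a crash therefore needs an explicit flush. The following is a minimal sketch, assuming Linux, of one common pattern: write, fsync the file, then fsync the parent directory so the new directory entry is durable as well. `write_durably` is a hypothetical helper name, not something from the article.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path; return only after the kernel reports it flushed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:  # write() may accept fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush file data and metadata out of the page cache
    finally:
        os.close(fd)
    # fsync the parent directory so the directory entry itself is durable
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "record.dat")
    write_durably(target, b"committed\n")
    with open(target, "rb") as f:
        readback = f.read()
    print(readback)
```

Where only the file contents matter, `os.fdatasync` could replace `os.fsync` to skip metadata that is not needed to read the data back; opening with `os.O_SYNC` instead makes every write behave as if it were followed by a flush.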

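File operations and locking: It clarifies that write is not guaranteed to be atomic when several writers target the same file, notes that the kernel does guarantee atomicity for file creation with O_CREAT (combined with O_EXCL for exclusive creation) and for writes through a descriptor opened with O_APPEND (the seek to end-of-file and the write happen as one step), and discusses the file-locking mechanisms flock and fcntl along with typical design choices.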
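These guarantees can be combined in a small sketch, assuming Linux. `create_exclusive` relies on O_CREAT with O_EXCL so that exactly one caller creates the file, and `append_record` appends through an O_APPEND descriptor while holding an advisory flock. Both helper names, the record format, and the thread counts are hypothetical.

```python
import fcntl
import os
import tempfile
import threading


def create_exclusive(path):
    """Return True if this call created path, False if it already existed."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def append_record(path, record):
    """Append one record while holding an exclusive advisory lock."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            # O_APPEND: the kernel seeks to end-of-file and writes in one step
            written = os.write(fd, record)
            if written != len(record):
                raise OSError("short write")
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def worker(path, worker_id, count):
    for i in range(count):
        append_record(path, f"worker={worker_id} seq={i}\n".encode())


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "shared.log")
    first = create_exclusive(log_path)   # creates the file
    second = create_exclusive(log_path)  # finds it already present
    threads = [
        threading.Thread(target=worker, args=(log_path, w, 50)) for w in range(4)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with open(log_path, "rb") as f:
        lines = f.read().splitlines()
    print(first, second, len(lines))
```

flock is advisory: it only coordinates writers that also take the lock. Each call opens its own descriptor because a flock lock belongs to the open file description, so threads sharing one descriptor would not exclude each other.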

Disk performance testing: The article compares the characteristics of HDDs (sequential throughput limited by the drive's mechanics, high seek latency on random access) with those of SSDs (strong random I/O performance that benefits from a deeper I/O queue), and recommends the fio tool for benchmarking.
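As a rough illustration of how such a comparison might be set up, the sketch below only builds fio command lines and prints them; it does not run fio. The job names, file path, sizes, and queue depths are assumptions chosen for illustration, not values taken from the article.

```python
import shlex


def fio_command(name, filename, rw, block_size, iodepth, size="1G", runtime_s=60):
    """Build an fio argv list; every default here is an illustrative assumption."""
    return [
        "fio",
        f"--name={name}",
        f"--filename={filename}",
        f"--rw={rw}",
        f"--bs={block_size}",
        f"--iodepth={iodepth}",
        "--ioengine=libaio",  # asynchronous engine, so iodepth > 1 has an effect
        "--direct=1",         # bypass the page cache to measure the device
        f"--size={size}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
    ]


TEST_FILE = "/data/fio-testfile"  # hypothetical path on the device under test

# Large sequential reads: the access pattern that suits an HDD.
seq_read = fio_command("seq-read", TEST_FILE, "read", "1M", iodepth=1)
# Small random reads with a deep queue: the pattern where an SSD stands out.
rand_read = fio_command("rand-read", TEST_FILE, "randread", "4k", iodepth=32)

for cmd in (seq_read, rand_read):
    print(shlex.join(cmd))
```

On a machine with fio installed, each list could be passed to `subprocess.run(cmd, check=True)`. Write tests overwrite their target, so the file name should never point at a device or file that holds data.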

Conclusion: Readers are encouraged to deepen their understanding of Linux I/O mechanisms, consider storage media characteristics when designing software, and explore the referenced materials for further study.

References: The article cites sources such as "Computer Systems: A Programmer's Perspective", "The Linux Programming Interface", Linux storage stack diagrams, and various online articles on O_DIRECT, O_SYNC, SSD coding, and fio benchmarking.
