What Really Happens When You Read One Byte? Inside the Linux I/O Stack
This article demystifies the Linux I/O stack by tracing a single‑byte read through system calls, VFS, page cache, filesystem layers, block management and I/O scheduling, revealing when real disk I/O occurs and how much data is actually transferred.
1. Overview of the Linux I/O Stack
The article begins with a simple code snippet that reads one byte from a file and asks whether this triggers disk I/O and how large that I/O might be. To answer, it dissects the Linux I/O stack, showing that a seemingly trivial read traverses many kernel components.
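The original snippet is not reproduced in this summary. A minimal sketch of such a one-byte read might look like the following; the helper name read_one_byte is hypothetical, and only the standard open, read, and close calls are used.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Read exactly one byte from the start of `path` into *out.
 * Returns 1 on success, 0 if the file is empty, -1 on error.
 * (read_one_byte is an illustrative name, not part of any library.) */
int read_one_byte(const char *path, char *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* Ask for a single byte; everything below this call is the kernel's job. */
    ssize_t n = read(fd, out, 1);
    if (n < 0)
        perror("read");

    close(fd);
    return (int)n;
}
```

Calling read_one_byte("some_file", &c) from main is enough to exercise the whole path described in the rest of the article.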
1.1 I/O Engine
At the user level, the functions a program calls determine which Linux I/O engine handles the request; read, write, pread, and pwrite belong to the synchronous engine. The example uses this synchronous engine, which relies on lower-level kernel services such as system calls, the VFS, and the generic block layer.
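As a small illustration of how these synchronous calls differ, the sketch below reads one byte with pread, which takes an explicit offset and leaves the file position alone, unlike read. The helper name byte_at is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/* Read one byte at `offset` using pread().
 * Unlike read(), pread() does not move the descriptor's file position.
 * Returns 1 on success, 0 at end of file, -1 on error.
 * (byte_at is an illustrative name.) */
int byte_at(int fd, off_t offset, char *out)
{
    ssize_t n = pread(fd, out, 1, offset);
    return (int)n;
}
```

After byte_at(fd, 2, &c), a plain read(fd, &c, 1) still returns the byte at the descriptor's current position, because pread never advanced it.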
1.2 System Call
Entering a system call switches execution from user space into the kernel. For read, the kernel resolves the file descriptor to its struct file and passes the request to the VFS layer through vfs_read.
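The libc read function is a thin wrapper around this transition. As a sketch, assuming Linux with glibc, the same system call can be invoked directly through syscall with the SYS_read number; the wrapper name raw_read is hypothetical.

```c
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke the read system call directly instead of going through
 * the libc read() wrapper. Returns the number of bytes read,
 * or -1 with errno set on error.
 * (raw_read is an illustrative name.) */
long raw_read(int fd, void *buf, size_t count)
{
    return syscall(SYS_read, fd, buf, count);
}
```

Both paths end up in the same kernel entry point; the direct form only makes the user-to-kernel boundary visible in the source.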
1.3 VFS (Virtual File System)
VFS abstracts different concrete filesystems behind a uniform API. Its core data structures (superblock, inode, dentry, and file) are defined in include/linux/fs.h and include/linux/dcache.h. Operations on these structures are exposed via inode_operations and file_operations function pointers.
// include/linux/fs.h
struct file {
    ...
    const struct file_operations *f_op;
};
struct file_operations {
    ...
    ssize_t (*read)(struct file *, char __user *, size_t, loff_t *);
    ssize_t (*write)(struct file *, const char __user *, size_t, loff_t *);
    int (*mmap)(struct file *, struct vm_area_struct *);
    int (*open)(struct inode *, struct file *);
    int (*flush)(struct file *, fl_owner_t id);
    ...
};
1.4 Page Cache
The page cache is an in-memory cache that stores recently accessed file data. If the requested page is already in the cache, the kernel copies the data directly to user space without touching the disk. On a miss in the read path, the kernel allocates a new page, adds it to the cache, and asks the filesystem to fill it from disk before copying the requested bytes out. No page fault is involved here; page faults populate the cache only when a file is accessed through mmap.
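Whether a given page is currently cached can be probed from user space. The sketch below, assuming Linux, maps the first page of a file and asks mincore whether it is resident; the helper name first_page_cached is hypothetical, and results may vary with kernel version and file permissions.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Report whether the first page of `path` is resident in memory.
 * Returns 1 if resident, 0 if not, -1 on error or for an empty file.
 * (first_page_cached is an illustrative name.) */
int first_page_cached(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* Creating the mapping does not read the file; pages are only
     * brought in when they are actually touched. */
    void *addr = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (addr == MAP_FAILED)
        return -1;

    unsigned char vec[1];
    int rc = mincore(addr, page, vec);
    munmap(addr, page);
    if (rc < 0)
        return -1;

    /* The least significant bit of each entry is the residency flag. */
    return vec[0] & 1;
}
```

Running this before and after a read of the same file is one way to see the cache fill described above.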
1.5 Filesystem Implementation
Linux supports many filesystems (ext2/3/4, XFS, ZFS, …). Each filesystem provides concrete implementations of VFS operations. For example, ext4 defines ext4_file_operations where read maps to do_sync_read and write maps to do_sync_write.
const struct file_operations ext4_file_operations = {
    .llseek = ext4_llseek,
    .read = do_sync_read,
    .write = do_sync_write,
    .aio_read = generic_file_aio_read,
    .aio_write = ext4_file_write,
    ...
};
1.6 Generic Block Layer
The generic block layer offers a uniform interface for filesystems to interact with block devices, abstracting away hardware differences. I/O requests are represented by bio structures; in kernel 3.10, struct bio itself is declared in include/linux/blk_types.h, with its helper functions and macros in include/linux/bio.h.
1.7 I/O Scheduler
Before reaching the device, I/O requests pass through the I/O scheduler, which merges and reorders them to improve throughput. Traditional HDDs benefit from elevator-style algorithms (e.g., deadline), while SSDs often use the simple noop scheduler. The scheduler dispatches requests to the device driver, which programs DMA transfers of N sectors (typically 512 bytes each) between the device and memory.
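The scheduler in use for a device is exposed through sysfs at /sys/block/<device>/queue/scheduler, where the active entry appears in square brackets, for example "noop [deadline] cfq". The sketch below reads and parses that line; the function names are hypothetical, and the device name passed in (such as "sda") is an assumption about the local machine.

```c
#include <stdio.h>
#include <string.h>

/* Read the scheduler line for block device `dev` (e.g. "sda") into `line`.
 * Returns 0 on success, -1 if the sysfs file cannot be read.
 * (scheduler_line is an illustrative name.) */
int scheduler_line(const char *dev, char *line, size_t linelen)
{
    char path[256];
    snprintf(path, sizeof path, "/sys/block/%s/queue/scheduler", dev);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char *ok = fgets(line, (int)linelen, f);
    fclose(f);
    return ok ? 0 : -1;
}

/* Copy the bracketed (active) scheduler name from `line` into `out`.
 * Returns 0 on success, -1 if there is no bracketed entry or it
 * does not fit in `out`.
 * (active_scheduler is an illustrative name.) */
int active_scheduler(const char *line, char *out, size_t outlen)
{
    const char *lb = strchr(line, '[');
    if (!lb)
        return -1;
    const char *rb = strchr(lb, ']');
    if (!rb)
        return -1;

    size_t len = (size_t)(rb - lb - 1);
    if (len + 1 > outlen)
        return -1;

    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}
```

Passing the output of scheduler_line to active_scheduler yields just the active name, such as "deadline".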
2. The File‑Read Process in Linux (Kernel 3.10)
A long diagram (omitted here) walks through the entire path from user‑space read to the disk, showing each component’s role. The process demonstrates that even a request for a single byte triggers a cascade of operations across the stack.
3. Revisiting the Opening Questions
3.1 Does reading one byte cause disk I/O?
If the data is already in the page cache, no disk I/O occurs; the kernel simply copies from memory to user space. Only when the cache misses (or when the file is opened with O_DIRECT, which bypasses the page cache) does the kernel issue a real disk read.
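To see the cache-bypassing case, a file can be opened with O_DIRECT. The following is a minimal sketch assuming Linux with glibc; the helper name and the 4096-byte alignment are assumptions, since the real alignment requirement depends on the device and filesystem, and some filesystems reject O_DIRECT altogether.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed alignment for buffer and length; the actual requirement
 * depends on the device and filesystem. */
#define DIRECT_ALIGN 4096

/* Read the first DIRECT_ALIGN bytes of `path` with O_DIRECT and store
 * the first byte in *out. Returns the number of bytes read, or -1 on
 * error (including filesystems that do not support O_DIRECT).
 * (read_first_byte_direct is an illustrative name.) */
ssize_t read_first_byte_direct(const char *path, char *out)
{
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    /* O_DIRECT generally needs an aligned buffer and an aligned length,
     * so a literal one-byte read is not possible here. */
    void *buf = NULL;
    if (posix_memalign(&buf, DIRECT_ALIGN, DIRECT_ALIGN) != 0) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, buf, DIRECT_ALIGN);
    if (n > 0)
        *out = ((char *)buf)[0];

    free(buf);
    close(fd);
    return n;
}
```

Even though the caller wants one byte, the request itself has to be a whole aligned block, which already hints at the answer to the next question.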
3.2 If disk I/O occurs, how much data is transferred?
The kernel works with larger units than a single byte:
Page cache operates in pages (typically 4 KB).
Filesystems manage blocks, usually 4 KB as reported by dumpe2fs.
The generic block layer handles segments that are a page or part of a page.
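I/O schedulers and device drivers address requests in sectors (commonly 512 bytes), which are transferred via DMA.
Physical disks traditionally use 512-byte sectors, although many newer drives use 4 KB physical sectors.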
Consequently, a one-byte request that misses the cache can never transfer less than one sector (512 bytes). With buffered I/O the kernel fills at least one whole page (4 KB, or eight 512-byte sectors), and readahead may fetch more.
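The arithmetic behind these units can be sketched as follows, assuming the typical 4 KB page and 512-byte sector sizes quoted above. The constant and function names are illustrative, and the sector numbers are relative to the start of the file, not on-disk addresses, which the filesystem maps separately.

```c
#include <stdint.h>

#define SECTOR_BYTES 512u    /* assumed sector size */
#define PAGE_BYTES   4096u   /* assumed page size */

/* Index of the page-cache page that holds byte `offset` of a file. */
uint64_t page_index(uint64_t offset)
{
    return offset / PAGE_BYTES;
}

/* Number of sectors transferred when one whole page is read. */
unsigned sectors_per_page(void)
{
    return PAGE_BYTES / SECTOR_BYTES;
}

/* First file-relative sector covered by the page that holds `offset`. */
uint64_t first_sector_of_page(uint64_t offset)
{
    return page_index(offset) * sectors_per_page();
}
```

For example, a one-byte read at offset 5000 falls in page 1, so a miss fills that page by reading 8 sectors starting at file-relative sector 8.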
4. Final Thoughts
The operating system abstracts these complexities, presenting a simple API while performing extensive behind-the-scenes work to optimize performance. Understanding this stack helps developers diagnose performance bottlenecks and appreciate why a seemingly cheap read can be fast, and why a cache miss may still involve substantial I/O.
ITPUB