Understanding Linux I/O: From Buffers to Disk Writes
This article provides a comprehensive overview of Linux I/O fundamentals, covering the layered I/O stack, buffer interactions, system call flow, scheduler algorithms, consistency and safety considerations, and performance characteristics, supplemented with code examples.
1. Introduction
Linux I/O is the foundation of file storage. This article summarizes basic Linux I/O concepts.
2. Linux I/O Stack
Linux file I/O uses a layered design, which keeps the architecture clear and the layers decoupled.
When data is written, it passes through several buffers in turn: the application buffer, the libc (standard I/O) buffer, the kernel page cache, and finally the disk's own write cache before reaching the platter.
Example code:
<code>void foo(const char *src, FILE *fp) {
    char *buf = malloc(MAX_SIZE);
    if (buf == NULL)
        return;
    strncpy(buf, src, MAX_SIZE);
    fwrite(buf, MAX_SIZE, 1, fp);
    fclose(fp);
    free(buf);
}</code>After fwrite, the C library copies the data from the application buffer into the libc (stdio) buffer. fclose flushes the libc buffer into the page cache, but the data can still be lost if the system crashes or loses power before the kernel writes it to disk. To guarantee persistence, sync or fsync must be called. fflush only moves data from the libc buffer into the page cache; it does not force the data onto the disk.
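To make the layers concrete, the following sketch pushes one line through every buffer explicitly, flushing each stage by hand. The function name and file path are illustrative, not part of any standard API:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write one line and force it through every buffer layer to the disk.
   Returns 0 on success, -1 on any failure. */
int persist_line(const char *path, const char *line) {
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    /* fwrite: application data -> libc (stdio) buffer */
    if (fwrite(line, 1, strlen(line), fp) != strlen(line)) { fclose(fp); return -1; }
    /* fflush: libc buffer -> kernel page cache (issues write(2)) */
    if (fflush(fp) != 0) { fclose(fp); return -1; }
    /* fsync: page cache -> disk, so the data survives a kernel crash */
    if (fsync(fileno(fp)) != 0) { fclose(fp); return -1; }
    return fclose(fp) == 0 ? 0 : -1;
}
```

Note that fsync must come before fclose here: fclose only flushes the stdio buffer, never the page cache.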
3. I/O Call Chain
fwrite is the highest‑level interface: it buffers data in user space and eventually invokes the write system call, which triggers a user‑to‑kernel transition. The data lands in the page cache, after which the kernel's writeback (flusher) threads, called pdflush in older kernels, submit it to the I/O queue.
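Calling write(2) directly skips the stdio layer entirely, so on return the data is already in the page cache. A minimal sketch, with an illustrative function name and return convention; fdatasync is used here to force writeback immediately instead of waiting for the flusher threads:

```c
#include <fcntl.h>
#include <unistd.h>

/* Write through the syscall layer directly: no stdio buffer is involved,
   so write(2) copies the data straight into the page cache.
   fdatasync then forces writeback rather than waiting for the kernel's
   flusher threads. Returns 0 on success, -1 on failure. */
int write_now(const char *path, const char *buf, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, buf, len);   /* user -> kernel transition */
    if (n != (ssize_t)len || fdatasync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```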
4. I/O Scheduler Layer
Tasks in the I/O queue are reordered to maximize overall disk throughput. Traditional algorithms such as the elevator and deadline schedulers aim to reduce head movement on mechanical drives. SSDs have no moving parts, so a pass-through scheduler (noop, or "none" under the newer blk-mq framework) is often preferred.
5. Consistency and Safety
5.1 Safety
If a process exits or crashes, data still in the application or libc buffer is lost, but data already in the page cache survives. A kernel crash loses data that has not yet reached the disk cache, and a power loss additionally loses whatever sits in the disk's volatile write cache.
5.2 Consistency
Opening the same file several times in one process without O_APPEND causes writes to overwrite one another, because each open creates an independent file description with its own offset, and every write starts from that descriptor's own position.
<code>fd1 = open("file", O_RDWR | O_CREAT | O_TRUNC, 0644);
fd2 = open("file", O_RDWR);
while (1) {
    write(fd1, "hello\n", 6);  /* each offset advances independently, */
    write(fd2, "world\n", 6);  /* so the two lines keep overwriting each other */
}</code>Opening with O_APPEND instead makes every write atomically reposition to the current end of file before writing, so each descriptor appends its data and nothing is overwritten.
5.3 Read Process
The read path proceeds as: library read → sys_read → VFS (vfs_read / generic_file_read) → page cache lookup. On a cache hit, the data is copied straight to the user buffer; on a miss, the request continues through the block layer → I/O scheduler → driver → DMA from the disk into the page cache → copy to the user buffer.
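The cache-hit half of that path can be exercised directly: a read issued right after a write is served from the page cache, so no block-layer request or DMA is involved at all. A minimal round-trip sketch (function name and path are illustrative):

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write a few bytes, then read them back. The pread is satisfied from
   the page cache just populated by the write, so only the kernel->user
   copy happens; no disk I/O is issued. Returns 0 if data matches. */
int roundtrip(const char *path) {
    char buf[16] = {0};
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, "cached", 6) != 6) { close(fd); return -1; }
    if (pread(fd, buf, 6, 0) != 6)   { close(fd); return -1; }
    close(fd);
    return strcmp(buf, "cached") == 0 ? 0 : -1;
}
```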
6. Performance Issues
An average disk seek takes roughly 10 ms, which limits a mechanical drive to about 100‑200 seeks per second. Rotational speed bounds throughput: a typical 15,000 rpm drive sustains up to roughly 50 MB/s of sequential reads, while SSDs can reach up to about 400 MB/s.
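The gap between random and sequential access follows from simple arithmetic: at ~10 ms per seek, random 4 KiB reads move only about 0.4 MB/s, two orders of magnitude below the sequential figure. A small helper to check the numbers (the 4 KiB block size is an assumed workload parameter):

```c
/* Back-of-the-envelope random-read throughput for a mechanical disk:
   each read costs one seek, so throughput = (seeks/sec) * block size.
   seek_ms: average seek time; block_kib: read size in KiB.
   Result is in MiB per second. */
double random_read_mbps(double seek_ms, double block_kib) {
    double iops = 1000.0 / seek_ms;   /* seeks (reads) per second */
    return iops * block_kib / 1024.0; /* MiB/s */
}
```

With seek_ms = 10 and block_kib = 4 this gives about 0.39 MiB/s, versus the ~50 MB/s the same drive sustains sequentially.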
360 Zhihui Cloud Developer
360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.