Why Linux Treats Everything as a File: A Deep Dive into Kernel File System Architecture
This article explains the core philosophy of Linux’s “everything is a file” design, walks through the kernel’s VFS layer, inode, dentry, superblock, logical blocks, and specific file systems, and provides hands‑on examples—including procfs initialization and read/write code—to help readers master Linux kernel file system internals.
1. Linux Kernel File System Overview
The Linux kernel exposes most resources through the file abstraction: regular files, device nodes under /dev, virtual files under /proc (including per-process information), and network sockets, which user programs reach through file descriptors. This uniform abstraction simplifies programming and system management.
1.1 Kernel Architecture
Linux is a monolithic kernel: process scheduling, memory management, file systems, networking, and device drivers all run in a single kernel address space, so subsystems call each other directly instead of exchanging messages, which keeps overhead low.
1.2 File System Role
A file system organizes data on storage devices, decides placement (contiguous or fragmented), builds a hierarchical directory tree, and enforces permission checks.
1.3 "Everything is a file" Stack
User‑space layer: commands (ls, cat) and system calls (open, read) interact with the kernel.
VFS abstraction layer: hides differences between concrete file systems and presents a unified API.
Physical file‑system layer: concrete implementations such as ext4, XFS, Btrfs, or virtual file systems like /proc.
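As a minimal user-space sketch of what the uniform interface means in practice, the helper below reads the first bytes of any path with the same open/read/close calls. The name read_head is illustrative and not part of any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to cap-1 bytes from path, NUL-terminate the result, and return
 * the byte count (-1 on error). The same calls serve regular files,
 * device nodes, and procfs entries. */
ssize_t read_head(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, cap - 1);
    close(fd);
    if (n < 0)
        return -1;

    buf[n] = '\0';
    return n;
}
```

Calling read_head on /etc/hostname, /dev/zero, or /proc/version goes through the identical code path in the caller; only the kernel-side implementation behind the descriptor differs.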
2. Core Data Structures
2.1 Inode
An inode stores a file's metadata (size, permissions, timestamps, block pointers), and its inode number uniquely identifies the file within one file system. The file name is not stored in the inode; a name is merely a reference to an inode.
2.2 Dentry (Directory Entry)
Dentries map file names to inodes and are cached in the dcache to accelerate path lookups. Hard links are multiple dentries pointing to the same inode.
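The relationship between names and inodes can be observed from user space. The following is a sketch under the assumption that the target directory supports hard links; hard_link_shares_inode is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a temporary file in dir, add a second name with link(), and report
 * whether both names resolve to one inode whose link count is 2.
 * Returns 1 if so, 0 if not, -1 if a system call fails. */
int hard_link_shares_inode(const char *dir)
{
    char first[4096], second[4096 + 8];
    struct stat sa, sb;
    int result = -1;

    snprintf(first, sizeof first, "%s/inode-demo-XXXXXX", dir);
    int fd = mkstemp(first);            /* creates the file, fills in the name */
    if (fd < 0)
        return -1;
    close(fd);

    snprintf(second, sizeof second, "%s.link", first);
    if (link(first, second) == 0) {     /* second directory entry, same inode */
        if (stat(first, &sa) == 0 && stat(second, &sb) == 0)
            result = sa.st_dev == sb.st_dev &&
                     sa.st_ino == sb.st_ino &&
                     sa.st_nlink == 2;
        unlink(second);
    }
    unlink(first);
    return result;
}
```

Comparing both st_dev and st_ino matters because inode numbers are only unique within a single file system.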
2.3 Superblock
The superblock holds global file-system information: type, total/free inode counts, total/free block counts, and timestamps. The on-disk layout is specific to each file-system type, while the VFS keeps a common in-memory superblock object for every mounted file system.
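Some of these superblock-level counters are visible to user programs through the POSIX statvfs call. The sketch below copies a few of them into a small struct; the struct fs_usage type and the get_fs_usage name are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/statvfs.h>

/* A few file-system-wide counters, as reported by statvfs(). */
struct fs_usage {
    unsigned long block_size;          /* fragment size in bytes */
    unsigned long long total_blocks;
    unsigned long long free_blocks;
    unsigned long long total_inodes;
    unsigned long long free_inodes;
};

/* Fill *out for the file system that contains path.
 * Returns 0 on success, -1 on failure. */
int get_fs_usage(const char *path, struct fs_usage *out)
{
    struct statvfs sv;

    if (statvfs(path, &sv) != 0)
        return -1;

    out->block_size   = sv.f_frsize;
    out->total_blocks = sv.f_blocks;
    out->free_blocks  = sv.f_bfree;
    out->total_inodes = sv.f_files;
    out->free_inodes  = sv.f_ffree;
    return 0;
}
```

Some file systems report zero for the inode counters, so callers should treat those fields as informational.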
2.4 Logical Block
Logical blocks (commonly 4 KB) group physical sectors to improve I/O efficiency. For a 10 KB file, three logical blocks are used instead of twenty 512‑byte sectors.
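The arithmetic behind the 10 KB example is a ceiling division, sketched here with a hypothetical helper named units_needed.

```c
#include <stddef.h>

/* Number of fixed-size allocation units needed to hold file_bytes,
 * rounding up. Returns 0 if unit_bytes is 0. */
size_t units_needed(size_t file_bytes, size_t unit_bytes)
{
    if (unit_bytes == 0)
        return 0;
    return (file_bytes + unit_bytes - 1) / unit_bytes;
}
```

For a 10 KB file (10240 bytes), units_needed(10240, 4096) gives 3 blocks and units_needed(10240, 512) gives 20 sectors. The last 4 KB block is only half used, which is the space cost of the larger unit.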
3. Virtual File System (VFS) Layer
VFS provides a single API for user programs regardless of the underlying file system (ext4, XFS, NFS, procfs, etc.). The core VFS objects are the superblock, inode, dentry, and file object.
3.1 Common Linux File Systems
ext4 – successor of ext3 with larger capacity, delayed allocation, and extents.
XFS – high‑performance journaling file system optimized for large files and parallel I/O.
Btrfs – copy‑on‑write file system offering snapshots, checksums, and compression.
Each implements the VFS operation set (read, write, open, etc.) to translate generic calls into device‑specific actions.
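The dispatch idea can be illustrated with a user-space analogy: a table of function pointers chosen per file, and a generic entry point that calls through it. This is a toy model, not kernel code; every name in it (toy_file, toy_file_ops, stored_ops, generated_ops, toy_read) is invented for the example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct toy_file;

/* Operation table: each "file system" supplies its own read function. */
struct toy_file_ops {
    size_t (*read)(struct toy_file *f, char *buf, size_t cap);
};

struct toy_file {
    const struct toy_file_ops *ops; /* chosen when the file is opened */
    const char *data;               /* stored contents (disk-like backend only) */
    size_t pos;                     /* read offset, or read counter */
};

/* Disk-like backend: copy stored bytes starting at the current offset. */
static size_t stored_read(struct toy_file *f, char *buf, size_t cap)
{
    size_t left = strlen(f->data) - f->pos;
    size_t n = left < cap ? left : cap;

    memcpy(buf, f->data + f->pos, n);
    f->pos += n;
    return n;
}

/* procfs-like backend: generate the text at read time. */
static size_t generated_read(struct toy_file *f, char *buf, size_t cap)
{
    if (cap == 0)
        return 0;

    int len = snprintf(buf, cap, "reads=%zu\n", ++f->pos);
    if (len < 0)
        return 0;
    return (size_t)len < cap ? (size_t)len : cap - 1;
}

const struct toy_file_ops stored_ops = { stored_read };
const struct toy_file_ops generated_ops = { generated_read };

/* Generic entry point: the caller never names a backend. */
size_t toy_read(struct toy_file *f, char *buf, size_t cap)
{
    return f->ops->read(f, buf, cap);
}
```

A caller only ever invokes toy_read; which backend runs is decided by the ops pointer stored in the file object, which mirrors how a generic read call reaches ext4, XFS, or procfs code.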
4. Implementation Details
4.1 File‑system Registration (ext4 example)
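ext4 registers its file-system type with register_filesystem when the driver initializes (at boot if it is built in, or when the module is loaded). At mount time it reads the superblock from a fixed location, 1024 bytes from the start of the device, and then loads the block-group descriptors.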
4.2 File Creation and Deletion
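Creation allocates an inode and adds a directory entry linking the name to the inode; data blocks are allocated as the file is written. Deletion removes the directory entry and decrements the inode's link count; the inode and its blocks are freed once the link count reaches zero and no process still has the file open.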
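A small user-space experiment makes the deletion path concrete: unlink() removes the name, but the data stays readable through an already-open descriptor until that descriptor is closed. The helper name read_after_unlink is hypothetical, and the sketch assumes a local Linux file system.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a temporary file in dir, write msg, unlink the name while the
 * descriptor is still open, then read the data back through the descriptor.
 * Stores the link count seen after unlink in *links_after (if non-NULL).
 * Returns the number of bytes read back, or -1 on failure. */
ssize_t read_after_unlink(const char *dir, const char *msg, size_t len,
                          char *out, long *links_after)
{
    char path[4096];
    struct stat st;
    ssize_t n = -1;

    snprintf(path, sizeof path, "%s/unlink-demo-XXXXXX", dir);
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    if (write(fd, msg, len) == (ssize_t)len &&
        unlink(path) == 0 &&            /* the name leaves the directory */
        fstat(fd, &st) == 0) {
        if (links_after)
            *links_after = (long)st.st_nlink;
        n = pread(fd, out, len, 0);     /* inode and blocks still exist */
    } else {
        unlink(path);                   /* best-effort cleanup on failure */
    }

    close(fd);                          /* last reference is dropped here */
    return n;
}
```

On typical local file systems the link count reported after unlink is 0, yet the read still succeeds; the storage is reclaimed only after the final close.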
4.3 Read/Write Operations
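Read: the VFS resolves the path to an inode, the file system maps the requested offsets to disk blocks (through extents on ext4), and the data is loaded into the page cache and then copied to the user buffer. Write: data is first placed in the page cache and marked dirty (delayed write); the kernel flushes dirty pages to disk later, or immediately when the program calls fsync.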
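Because writes normally complete once the data is in the page cache, a program that needs the data on stable storage has to ask for it. The following sketch, with the illustrative name write_durably, writes a buffer and then calls fsync before reporting success.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes to path (created or truncated) and force them to the
 * storage device before returning. Returns 0 on success, -1 on failure. */
int write_durably(const char *path, const void *data, size_t len)
{
    const char *p = data;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);  /* normally lands in the page cache */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: retry */
            close(fd);
            return -1;
        }
        p += n;                         /* handle partial writes */
        len -= (size_t)n;
    }

    int rc = fsync(fd);                 /* wait until dirty pages are flushed */
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Skipping the fsync call makes the function faster but leaves the timing of the flush to the kernel's writeback policy.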
4.4 Directory Operations
Creating a directory allocates an inode and data blocks to store child entries. Deletion requires the directory to be empty before its inode and blocks are reclaimed.
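The empty-directory rule is easy to observe from user space. In this sketch, rmdir_errno_when_nonempty (a hypothetical helper) creates a directory with one child and returns the error code that rmdir reports; on Linux this is expected to be ENOTEMPTY.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Make a temporary directory under parent, put one file in it, and try to
 * remove the directory while it is non-empty. Returns the errno from that
 * rmdir() attempt (0 if it unexpectedly succeeded), or -1 on setup failure.
 * Everything created is cleaned up before returning. */
int rmdir_errno_when_nonempty(const char *parent)
{
    char dir[4096], file[4096 + 16];
    int err = 0;

    snprintf(dir, sizeof dir, "%s/rmdir-demo-XXXXXX", parent);
    if (mkdtemp(dir) == NULL)
        return -1;

    snprintf(file, sizeof file, "%s/child", dir);
    int fd = open(file, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        rmdir(dir);
        return -1;
    }
    close(fd);

    if (rmdir(dir) != 0)    /* expected to fail: the directory has an entry */
        err = errno;

    unlink(file);           /* empty the directory ... */
    rmdir(dir);             /* ... after which removal succeeds */
    return err;
}
```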
5. Case Study – procfs
5.1 Initialization
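    /* Simplified illustration of procfs root setup. */
    static int __init proc_root_init(void)
    {
        proc_root = proc_mkdir_deprecated(NULL, NULL, "proc",
                                          S_IFDIR | S_IRUGO | S_IXUGO);
        if (!proc_root)
            return -ENOMEM;     /* directory entry could not be allocated */
        return 0;
    }
    fs_initcall(proc_root_init);
The function creates the /proc root directory and is registered to run during kernel initialization. The listing is a simplified illustration: in mainline kernels, proc_root_init() in fs/proc/root.c is a void function called directly from start_kernel(), and it registers procfs with register_filesystem(&proc_fs_type).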
5.2 File Operations
    /* Simplified, legacy-style dispatch through per-entry callbacks. */
    ssize_t proc_read(struct file *file, char __user *buf,
                      size_t count, loff_t *ppos)
    {
        struct inode *inode = file->f_path.dentry->d_inode;
        struct proc_dir_entry *de = PROC_I(inode)->pde;

        if (de->read_proc)
            return de->read_proc(file, buf, count, ppos);
        if (de->get_info) {
            /* handle get_info */
        } else {
            /* default read logic */
        }
        return 0;
    }

    ssize_t proc_write(struct file *file, const char __user *buf,
                       size_t count, loff_t *ppos)
    {
        struct inode *inode = file->f_path.dentry->d_inode;
        struct proc_dir_entry *de = PROC_I(inode)->pde;

        if (de->write_proc)
            return de->write_proc(file, buf, count, ppos);
        return -EINVAL; /* no write callback: the entry is read-only */
    }
The read and write functions dispatch to per-file callbacks stored in proc_dir_entry. The read_proc, get_info, and write_proc fields belong to older kernels; since Linux 5.6, procfs entries supply their callbacks through struct proc_ops. For example, /proc/cpuinfo generates CPU details on the fly, while /proc/sys/vm/swappiness accepts writes, handled by the sysctl layer, to adjust a kernel parameter.
5.3 Data Generation
Files such as /proc/cpuinfo and /proc/meminfo pull data from kernel structures (for example, struct cpuinfo_x86 on x86, and the memory counters collected by si_meminfo()) each time they are read, providing up-to-date system state.
5.4 Directory Layout
Each process has a numeric subdirectory under /proc (e.g., /proc/1234) containing cmdline, status, and fd entries that expose command‑line arguments, memory usage, and open file descriptors.
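The status file is plain text in "Key:<whitespace>value" lines, so it can be parsed with ordinary string handling. The sketch below assumes that line format; status_field and proc_status_field are illustrative names.

```c
#include <stdio.h>
#include <string.h>

/* Find the line starting with "key:" in text and copy its value (leading
 * whitespace and the newline removed) into out. Returns 0 on success,
 * -1 if the key is absent or cap is 0. */
int status_field(const char *text, const char *key, char *out, size_t cap)
{
    size_t klen = strlen(key);

    if (cap == 0)
        return -1;

    for (const char *line = text; line && *line; ) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            const char *v = line + klen + 1;
            while (*v == ' ' || *v == '\t')
                v++;
            size_t n = strcspn(v, "\n");
            if (n >= cap)
                n = cap - 1;            /* truncate to fit the buffer */
            memcpy(out, v, n);
            out[n] = '\0';
            return 0;
        }
        line = strchr(line, '\n');      /* advance to the next line */
        if (line)
            line++;
    }
    return -1;
}

/* Read /proc/<pid>/status ("self" is accepted as pid) and extract a field. */
int proc_status_field(const char *pid, const char *key, char *out, size_t cap)
{
    char path[64], text[8192];

    snprintf(path, sizeof path, "/proc/%s/status", pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    size_t n = fread(text, 1, sizeof text - 1, f);
    fclose(f);
    text[n] = '\0';
    return status_field(text, key, out, cap);
}
```

For instance, proc_status_field("self", "VmRSS", buf, sizeof buf) asks for the calling process's resident memory; the exact set of keys depends on the kernel version and configuration.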
5.5 Kernel Parameter Tuning via procfs
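Writing to /proc/sys/net/ipv4/ip_forward enables or disables IP forwarding; writing to /proc/sys/vm/swappiness changes the kernel's tendency to swap memory. Writes require root privileges and take effect immediately, but they do not persist across a reboot; persistent settings belong in /etc/sysctl.conf or a file under /etc/sysctl.d/. Incorrect values can destabilize the system.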
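Since these tunables are ordinary text files holding a number, a program can read and write them with standard I/O. The two helpers below are a sketch with hypothetical names (read_long, write_long); they work on any path and know nothing about specific parameters.

```c
#include <stdio.h>

/* Read a decimal integer from the file at path into *value.
 * Returns 0 on success, -1 on failure. */
int read_long(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int ok = fscanf(f, "%ld", value) == 1;
    fclose(f);
    return ok ? 0 : -1;
}

/* Write value as decimal text to the file at path.
 * Returns 0 on success, -1 on failure (for example, permission denied). */
int write_long(const char *path, long value)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fprintf(f, "%ld\n", value) > 0;
    if (fclose(f) != 0)     /* buffered data is written out here */
        ok = 0;
    return ok ? 0 : -1;
}
```

Reading /proc/sys/vm/swappiness with read_long works for any user, while write_long on the same path fails unless the process has root privileges. Checking the return value of fclose matters because the kernel may reject the value only when the buffered data is actually written.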
6. Practical Commands and Examples
View CPU information: cat /proc/cpuinfo
View memory information: cat /proc/meminfo
List processes: ls /proc (numeric entries are PIDs)
Inspect the current shell's process: cat /proc/$$/status
Inspect device nodes: ls -l /dev (character devices are marked c, block devices b)
Show the inode number of a file: ls -i filename
Display full inode metadata: stat filename
Mount a file system: mount -t ext4 /dev/sda1 /mnt/data
Unmount: umount /mnt/data (ensure no processes are using the mount point)
7. Key Takeaways
Understanding the “everything is a file” model, the VFS abstraction, and the core data structures (inode, dentry, superblock) enables developers to write portable code, debug kernel‑level issues, and safely tune system parameters via procfs. The case study of procfs illustrates how a virtual file system is initialized, how file operations are dispatched, and how dynamic data is generated on demand.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.
