
Unlocking Linux Kernel File Systems: From Inodes to VFS and ProcFS Explained

This comprehensive guide explores the Linux kernel file system architecture, covering fundamental concepts such as inodes, dentries, superblocks, logical blocks, the VFS layer, common on‑disk filesystems, mounting procedures, and a deep dive into the proc virtual filesystem with code examples and practical usage tips.

Deepin Linux

1. Linux Kernel File System Basics

1.1 Linux Kernel Overview

The Linux kernel is the core of the operating system. It is monolithic: process scheduling, memory management, file systems, networking, and device drivers all run in a single kernel address space, so subsystems call one another directly, which keeps inter-module communication cheap and performance high.

1.2 Definition and Role of a File System

A file system is the mechanism that organizes data on storage devices. It determines how data is stored (contiguous or fragmented), builds a hierarchical directory tree, and enforces permission checks, thereby providing reliable data storage, fast lookup, and security.

2. Core Principles of Linux Kernel File Systems

2.1 Inode

An inode is the core metadata structure for a file. When a file is created, the filesystem allocates an inode, identified by an inode number that is unique within that filesystem. The inode stores the file's size, permissions, owner IDs, timestamps, link count, and pointers to the actual data blocks. It does not store the file name, so renaming a file within the same filesystem does not change its inode.
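The independence of name and inode can be checked from user space with stat(2). The following is a minimal sketch; the helper names inode_of and rename_keeps_inode are hypothetical, and it assumes both paths are on the same filesystem and writable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Store the inode number of `path` in *out. Returns 0 on success, -1 on error. */
int inode_of(const char *path, ino_t *out)
{
    struct stat st;

    assert(path != NULL && out != NULL);
    if (stat(path, &st) != 0)
        return -1;
    *out = st.st_ino;
    return 0;
}

/* Create `old_path`, rename it to `new_path` (same filesystem), and compare
 * the inode numbers seen before and after. Returns 1 if the inode number is
 * unchanged, 0 if it changed, -1 on error. The demo file is removed. */
int rename_keeps_inode(const char *old_path, const char *new_path)
{
    ino_t before, after;
    FILE *f = fopen(old_path, "w");

    if (f == NULL)
        return -1;
    fclose(f);

    if (inode_of(old_path, &before) != 0 || rename(old_path, new_path) != 0) {
        remove(old_path);
        return -1;
    }
    if (inode_of(new_path, &after) != 0) {
        remove(new_path);
        return -1;
    }
    remove(new_path);
    return before == after;
}
```

Calling rename_keeps_inode("/tmp/a.tmp", "/tmp/b.tmp") should return 1 on ordinary local filesystems, because rename(2) only rewrites directory entries.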

2.2 Dentry (Directory Entry)

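A dentry links one path component (a name) to its inode. Dentries are in-memory objects kept in the dentry cache (dcache); they speed up path resolution by caching each component of a pathname, while the persistent name-to-inode records live in the directory's on-disk data. Several names can refer to the same inode, which is exactly what a hard link is.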
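Hard links can be observed through the link count in struct stat. This is a small sketch under the assumption that the target directory supports hard links; hard_link_count is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create `path`, give it a second name `alias` with link(2), and return the
 * link count seen through the alias (2 is expected for a fresh file).
 * Returns -1 on error or if the two names do not resolve to the same inode.
 * Both names are removed before returning. */
long hard_link_count(const char *path, const char *alias)
{
    struct stat a, b;
    long nlink = -1;
    int fd;

    assert(path != NULL && alias != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (link(path, alias) == 0) {
        if (stat(path, &a) == 0 && stat(alias, &b) == 0 &&
            a.st_dev == b.st_dev && a.st_ino == b.st_ino)
            nlink = (long)b.st_nlink;
        unlink(alias);
    }
    unlink(path);
    return nlink;
}
```

Both names resolve to one inode, so neither is the "original": removing either name leaves the file reachable through the other.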

2.3 Superblock

The superblock holds global metadata for a filesystem, such as its type (ext4, XFS, etc.), total and free inode counts, total and free block counts, and timestamps. During mount, the kernel reads the superblock to learn how to manage the filesystem.
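User space cannot read the in-kernel superblock directly, but statvfs(3) reports several counters derived from it. A minimal sketch follows; print_fs_summary is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/statvfs.h>

/* Print block and inode counters for the filesystem that contains `path`.
 * Returns 0 on success, -1 if statvfs() fails. */
int print_fs_summary(const char *path)
{
    struct statvfs v;

    assert(path != NULL);
    if (statvfs(path, &v) != 0)
        return -1;

    /* f_blocks and f_bfree are counted in units of f_frsize. */
    printf("fragment size  : %lu bytes\n", (unsigned long)v.f_frsize);
    printf("blocks (total) : %llu\n", (unsigned long long)v.f_blocks);
    printf("blocks (free)  : %llu\n", (unsigned long long)v.f_bfree);
    printf("inodes (total) : %llu\n", (unsigned long long)v.f_files);
    printf("inodes (free)  : %llu\n", (unsigned long long)v.f_ffree);
    return 0;
}
```

For example, print_fs_summary("/") reports the counters for the root filesystem, much like df and df -i do.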

2.4 Logical Block

Logical blocks are the filesystem's unit of allocation, typically 4 KB, composed of several physical sectors. Grouping sectors into larger logical blocks reduces I/O overhead and abstracts away hardware sector size differences.
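The arithmetic behind block allocation is simple enough to sketch. The helper names below are hypothetical, and the sizes in the usage note (4096-byte blocks, 512-byte sectors) are only common example values.

```c
#include <assert.h>
#include <stdint.h>

/* Number of logical blocks needed to hold `file_size` bytes (rounds up). */
uint64_t blocks_needed(uint64_t file_size, uint32_t block_size)
{
    assert(block_size > 0);
    return file_size / block_size + (file_size % block_size != 0);
}

/* Number of device sectors that make up one logical block. */
uint32_t sectors_per_block(uint32_t block_size, uint32_t sector_size)
{
    assert(sector_size > 0 && block_size % sector_size == 0);
    return block_size / sector_size;
}

/* Bytes allocated but unused in the last block (internal fragmentation). */
uint64_t slack_bytes(uint64_t file_size, uint32_t block_size)
{
    return blocks_needed(file_size, block_size) * block_size - file_size;
}
```

A 10,000-byte file with 4096-byte blocks needs 3 blocks and leaves 2,288 bytes of slack; each such block spans 8 sectors of 512 bytes.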

2.5 Virtual File System (VFS)

VFS provides a uniform API for user‑space programs, abstracting the details of underlying concrete filesystems (ext4, XFS, NFS, proc, sysfs, etc.). Calls such as open, read, and write are routed through VFS, which then dispatches them to the appropriate filesystem implementation.
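The dispatch idea can be modeled in user space with a table of function pointers. This is a toy model, not kernel code: every name here (toy_file, toy_file_ops, diskfs_ops, genfs_ops, toy_vfs_read) is invented for illustration, and the real struct file_operations has many more members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_file;

/* Operations table, loosely modeled on the kernel's struct file_operations. */
struct toy_file_ops {
    long (*read)(struct toy_file *f, char *buf, size_t count);
};

struct toy_file {
    const struct toy_file_ops *ops;  /* set by the owning "filesystem" */
    const char *stored;              /* bytes kept by the disk-like backend */
    size_t pos;                      /* current read offset */
};

/* Copy up to `count` bytes of `src` starting at f->pos and advance the offset. */
static long copy_out(struct toy_file *f, const char *src, char *buf, size_t count)
{
    size_t len = strlen(src);
    size_t left = f->pos < len ? len - f->pos : 0;
    size_t n = count < left ? count : left;

    memcpy(buf, src + f->pos, n);
    f->pos += n;
    return (long)n;
}

/* Disk-like backend: returns bytes that were stored earlier. */
static long diskfs_read(struct toy_file *f, char *buf, size_t count)
{
    return copy_out(f, f->stored, buf, count);
}

/* procfs-like backend: content is produced at read time, nothing is stored. */
static long genfs_read(struct toy_file *f, char *buf, size_t count)
{
    return copy_out(f, "generated on read\n", buf, count);
}

const struct toy_file_ops diskfs_ops = { .read = diskfs_read };
const struct toy_file_ops genfs_ops  = { .read = genfs_read };

/* The "VFS" entry point: callers never see which backend serves the file. */
long toy_vfs_read(struct toy_file *f, char *buf, size_t count)
{
    assert(f != NULL && buf != NULL);
    if (f->ops == NULL || f->ops->read == NULL)
        return -1;               /* backend does not support read */
    return f->ops->read(f, buf, count);
}
```

A caller fills in a struct toy_file with either ops table and then only ever calls toy_vfs_read, which mirrors how read(2) works the same way on ext4, NFS, or proc files.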

3. Implementation Mechanisms

3.1 Filesystem Initialization (ext4 example)

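ext4 registers itself with the VFS by calling register_filesystem from its initialization code, which runs at boot when ext4 is built in or when the module is loaded. The superblock is read later, at mount time: the kernel reads it from a fixed location on the device (byte offset 1024), populates an in-memory superblock structure, and loads the block-group descriptors, which locate each group's block bitmap, inode bitmap, and inode table.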
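The "predefined location" can be made concrete. In the ext2/3/4 on-disk layout the primary superblock starts 1024 bytes into the device, and its 16-bit magic number 0xEF53 sits 56 bytes into the superblock. The sketch below checks that magic in a buffer holding the start of a device or image file; has_ext_magic and the macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_SB_OFFSET    1024u    /* superblock starts 1024 bytes into the device */
#define EXT_MAGIC_OFFSET 56u      /* offset of s_magic inside the superblock */
#define EXT_MAGIC        0xEF53u  /* shared by ext2, ext3, and ext4 */

/* Return 1 if `image` (the first `len` bytes of a device or image file)
 * carries the ext2/3/4 magic number, 0 otherwise. */
int has_ext_magic(const unsigned char *image, size_t len)
{
    size_t at = EXT_SB_OFFSET + EXT_MAGIC_OFFSET;
    uint16_t magic;

    assert(image != NULL);
    if (len < at + 2)
        return 0;
    /* s_magic is stored little-endian on disk. */
    magic = (uint16_t)(image[at] | (image[at + 1] << 8));
    return magic == EXT_MAGIC;
}
```

The magic alone does not distinguish ext4 from ext2 or ext3; that requires inspecting the feature flags stored elsewhere in the superblock.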

3.2 File Creation and Deletion

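Creating a file allocates a free inode, initializes its metadata, and adds a directory entry (name plus inode number) to the parent directory's data; data blocks are allocated as data is written. Deleting a file removes the directory entry and decrements the inode's link count. The inode and its data blocks are returned to the free pools only when the link count reaches zero and no process still has the file open.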
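One consequence of this design is that an open file survives the removal of its name. The sketch below demonstrates that from user space; readable_after_unlink is a hypothetical helper and the path is assumed to be writable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Create `path`, write to it, unlink it, then read the data back through the
 * still-open descriptor. Returns 1 if the data is still readable after the
 * name is gone, 0 if not, -1 on error. */
int readable_after_unlink(const char *path)
{
    char buf[4];
    int result = -1;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "data", 4) != 4) {
        unlink(path);
        close(fd);
        return -1;
    }
    if (unlink(path) == 0 && access(path, F_OK) != 0) {
        /* The name is gone, but the open descriptor still pins the inode. */
        result = (pread(fd, buf, sizeof buf, 0) == 4 &&
                  memcmp(buf, "data", 4) == 0);
    }
    close(fd);   /* last reference dropped; the inode can now be freed */
    return result;
}
```

This is also why disk space held by a deleted log file is not released until the process writing to it closes the file or exits.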

3.3 File Read/Write Operations

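Read operations locate the inode, map the requested file offset to disk blocks through its block pointers or extents, bring the data into the page cache if it is not already there, and copy it to user space. Write operations copy data into the page cache, mark the affected pages dirty, and update inode metadata such as size and timestamps; the kernel flushes dirty pages to disk later (delayed write), or sooner if the application calls fsync.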
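From an application's point of view, the delayed write means a successful write(2) is not yet a durability guarantee. A minimal sketch of a write that is forced to stable storage and then read back; write_sync_readback is a hypothetical name and the 256-byte limit is an arbitrary choice for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write `len` bytes (at most 256 in this sketch) to `path`, flush them with
 * fsync(), and read them back. Returns 0 if the round trip matches, -1
 * otherwise. The file is removed before returning. */
int write_sync_readback(const char *path, const char *data, size_t len)
{
    char check[256];
    int ok = -1;
    int fd;

    assert(path != NULL && data != NULL && len <= sizeof check);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* write() normally returns once the data is in the page cache. */
    if (write(fd, data, len) == (ssize_t)len &&
        fsync(fd) == 0 &&                        /* flush dirty pages now */
        pread(fd, check, len, 0) == (ssize_t)len &&
        memcmp(check, data, len) == 0)
        ok = 0;

    close(fd);
    unlink(path);
    return ok;
}
```

The read-back is typically served straight from the page cache, so it succeeds with or without the fsync; the fsync matters only if the machine loses power before the kernel's own writeback runs.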

3.4 Directory Operations

Creating a directory allocates an inode and a data block that holds its entries, starting with . and .., and adds an entry for it to the parent directory. Deleting a directory requires it to be empty; the kernel then frees its inode and blocks and removes its entry from the parent.
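The empty-directory rule is visible through rmdir(2), which fails with ENOTEMPTY (POSIX also permits EEXIST) while entries remain. A small sketch; rmdir_requires_empty is a hypothetical helper and `dir` is assumed not to exist yet.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create `dir` with one file inside, try to remove it, then empty it and try
 * again. Returns 1 if the first rmdir() was refused and the second succeeded,
 * 0 if the behavior differed, -1 on setup error. */
int rmdir_requires_empty(const char *dir)
{
    char child[512];
    int fd, refused, removed;

    assert(dir != NULL);
    if (snprintf(child, sizeof child, "%s/child", dir) >= (int)sizeof child)
        return -1;
    if (mkdir(dir, 0700) != 0)
        return -1;
    fd = open(child, O_WRONLY | O_CREAT, 0600);
    if (fd < 0) {
        rmdir(dir);
        return -1;
    }
    close(fd);

    /* First attempt: the directory still has an entry, so rmdir must fail. */
    refused = (rmdir(dir) == -1 && (errno == ENOTEMPTY || errno == EEXIST));

    unlink(child);
    removed = (rmdir(dir) == 0);   /* second attempt, directory now empty */
    return refused && removed;
}
```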

4. Practical Filesystem Types

4.1 Common Linux Filesystems

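ext4 : Widely used and stable, and the default on many distributions. It performs well on small-file and general-purpose workloads and is checked and repaired with e2fsck, though XFS often scales better for very large files and heavily parallel I/O. It can be grown while mounted; shrinking requires unmounting first.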

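XFS : High-performance, excels with large files and concurrent I/O, and uses B+-tree indexing and delayed allocation. It can be grown online but cannot be shrunk. Repair is done with xfs_repair rather than a boot-time fsck (fsck.xfs exists but does nothing), and it is the default root filesystem on some distributions, such as RHEL.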

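Btrfs : Newer copy-on-write filesystem offering snapshots, compression, subvolume management, and checksum-based corruption detection, with self-repair when a redundant copy is available. Its core features are widely deployed, though some features (notably RAID5/6) are still considered less mature than ext4 or XFS.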

4.2 Mounting and Unmounting

Mounting attaches a filesystem to a directory (the mount point) using mount -t <type> <device> <mountpoint>. The kernel reads the superblock to obtain the filesystem's metadata. Unmounting with umount fails with a "target is busy" error while any process has open files or a working directory on the filesystem; lsof or fuser -m can identify the processes involved.
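The kernel exposes the current mount table as text in /proc/self/mounts, one line per mount in the form "device mountpoint fstype options dump pass". The sketch below parses the first three fields; mount_entry, parse_mount_line, and list_mounts are hypothetical names, and the fixed buffer sizes are assumptions of this example (glibc's getmntent is the more complete alternative).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct mount_entry {
    char device[256];
    char mount_point[256];   /* spaces appear escaped as \040 */
    char fs_type[64];
};

/* Parse the first three fields of one mounts line. Returns 0 on success,
 * -1 if the line has fewer than three fields. */
int parse_mount_line(const char *line, struct mount_entry *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%255s %255s %63s",
                  out->device, out->mount_point, out->fs_type) == 3 ? 0 : -1;
}

/* Print every mounted filesystem. Returns the number of entries printed,
 * or -1 if /proc/self/mounts cannot be opened. */
int list_mounts(void)
{
    char line[4096];
    struct mount_entry e;
    int count = 0;
    FILE *f = fopen("/proc/self/mounts", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_mount_line(line, &e) == 0) {
            printf("%-20s on %-24s type %s\n",
                   e.device, e.mount_point, e.fs_type);
            count++;
        }
        /* Discard the tail of an over-long line so it is not parsed again. */
        if (strchr(line, '\n') == NULL) {
            int c;
            while ((c = fgetc(f)) != EOF && c != '\n')
                ;
        }
    }
    fclose(f);
    return count;
}
```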

5. Case Study: The proc Virtual Filesystem

5.1 Initialization

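/* Abridged from fs/proc/root.c in recent mainline kernels; details vary by version. */
void __init proc_root_init(void)
{
    proc_self_init();                            /* /proc/self */
    proc_symlink("mounts", NULL, "self/mounts"); /* /proc/mounts */
    proc_mkdir("fs", NULL);                      /* /proc/fs */
    proc_sys_init();                             /* /proc/sys */
    register_filesystem(&proc_fs_type);          /* make "proc" mountable */
}

proc_root_init is called once from start_kernel during boot. It does not create the /proc root itself: that entry, proc_root, is a statically defined proc_dir_entry whose mode is S_IFDIR | S_IRUGO | S_IXUGO, a directory readable and searchable by all users. The function populates the fixed entries beneath it and registers the proc filesystem type so that it can be mounted on /proc.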

5.2 File Operations

/* Simplified sketch of the dispatch in fs/proc/inode.c (kernels 5.6 and later). */
static ssize_t proc_reg_read(struct file *file, char __user *buf,
                             size_t count, loff_t *ppos)
{
    struct proc_dir_entry *pde = PDE(file_inode(file));

    if (pde->proc_ops->proc_read)
        return pde->proc_ops->proc_read(file, buf, count, ppos);
    return -EIO;    /* no read handler registered */
}

static ssize_t proc_reg_write(struct file *file, const char __user *buf,
                              size_t count, loff_t *ppos)
{
    struct proc_dir_entry *pde = PDE(file_inode(file));

    if (pde->proc_ops->proc_write)
        return pde->proc_ops->proc_write(file, buf, count, ppos);
    return -EIO;    /* no write handler: the entry is effectively read-only */
}

The read and write entry points look up the proc_dir_entry attached to the inode and call the entry's handler from its struct proc_ops (proc_read, proc_write) if one is set, returning -EIO otherwise. The in-tree code also protects the entry against concurrent removal, which is omitted here. Kernels before 5.6 kept these handlers in a struct file_operations, and the older read_proc and write_proc callbacks were removed in 3.10. Most proc files are read-only; write handlers back entries that accept input, such as tunables.

5.3 Data Generation

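Proc files generate their content on the fly each time they are read. For example, on x86 /proc/cpuinfo walks the per-CPU struct cpuinfo_x86 records to format CPU model, cores, and frequency, while /proc/meminfo fills a struct sysinfo through si_meminfo() and si_swapinfo() and combines it with global page counters to report total, free, cached, and swap memory.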
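On the consuming side, these files are plain text in "Key:   value kB" form, so a reader only needs a small parser. A minimal sketch; meminfo_field and read_meminfo are hypothetical names, and the 8 KiB buffer is an assumption that is comfortably larger than typical /proc/meminfo output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Find the line "key: <number> ..." in meminfo-style `text` and store the
 * number in *kb. Returns 0 on success, -1 if the key is missing. */
int meminfo_field(const char *text, const char *key, unsigned long long *kb)
{
    size_t klen;
    const char *line = text;

    assert(text != NULL && key != NULL && kb != NULL);
    klen = strlen(key);
    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return sscanf(line + klen + 1, "%llu", kb) == 1 ? 0 : -1;
        line = strchr(line, '\n');
        if (line != NULL)
            line++;              /* step past the newline to the next line */
    }
    return -1;
}

/* Read /proc/meminfo and look up one field, e.g. "MemTotal". */
int read_meminfo(const char *key, unsigned long long *kb)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_field(buf, key, kb);
}
```

Each call to read_meminfo reopens the file, which makes the kernel regenerate the text, so successive calls can return different values for fields such as MemFree.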

5.4 Directory and File Organization

Each process has a directory under /proc named after its numeric PID. Inside, cmdline holds the command-line arguments, status reports process state, UID/GID, memory usage, and thread count, and the fd subdirectory contains one symbolic link per open file descriptor. System-wide files like /proc/cpuinfo, /proc/meminfo, and /proc/modules provide a global view of hardware and kernel modules.
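The cmdline file has one quirk worth showing: the arguments are stored back to back as NUL-terminated strings rather than separated by spaces. The sketch below joins them for display; join_cmdline and read_cmdline are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Replace the NUL separators between arguments with spaces, in place.
 * `len` is the number of bytes read from the cmdline file. Returns the
 * number of NUL-terminated arguments found. */
size_t join_cmdline(char *buf, size_t len)
{
    size_t argc = 0;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            argc++;
            if (i + 1 < len)
                buf[i] = ' ';    /* keep the final NUL as the terminator */
        }
    }
    return argc;
}

/* Read /proc/<pid>/cmdline into `buf` as a single space-joined string.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
long read_cmdline(int pid, char *buf, size_t size)
{
    char path[64];
    size_t n;
    FILE *f;

    assert(buf != NULL && size > 0);
    snprintf(path, sizeof path, "/proc/%d/cmdline", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, size - 1, f);
    fclose(f);
    buf[n] = '\0';
    join_cmdline(buf, n);
    return (long)n;
}
```

Passing the result of getpid() to read_cmdline returns the calling program's own command line; for kernel threads the file is empty.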

5.5 Kernel Parameter Tuning via /proc/sys

Writable entries under /proc/sys allow runtime adjustment of kernel parameters, normally by root only. For instance, echo 1 > /proc/sys/net/ipv4/ip_forward enables IP forwarding, while echo 10 > /proc/sys/vm/swappiness reduces the tendency to swap memory to disk. Changes take effect immediately but are not persistent: they are lost on reboot unless they are also set in /etc/sysctl.conf or a file under /etc/sysctl.d/. Improper values can destabilize the system.
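The same reads and writes can be done from a program with ordinary file I/O. This is a minimal sketch for integer-valued entries only; read_int_file and write_int_file are hypothetical helpers, and writing to a real /proc/sys entry is assumed to require root.

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from `path`, e.g. /proc/sys/vm/swappiness.
 * Returns 0 on success, -1 on error. */
int read_int_file(const char *path, long *value)
{
    FILE *f;
    int got;

    assert(path != NULL && value != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    got = fscanf(f, "%ld", value);
    fclose(f);
    return got == 1 ? 0 : -1;
}

/* Write a single integer to `path`. Returns 0 on success, -1 on error. */
int write_int_file(const char *path, long value)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;               /* typically EACCES when not root */
    ok = fprintf(f, "%ld\n", value) > 0;
    /* The kernel sees the value only when the buffer is flushed, so a
     * rejected value can surface as an fclose() failure. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Checking the return value of write_int_file matters here, because an out-of-range value is rejected by the kernel rather than silently clamped for many parameters.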
