Unlocking Linux: Inside the Kernel, VFS, and File System Mechanics

This article provides a comprehensive overview of Linux internals, covering the kernel’s core components, memory and process management, the virtual file system layer, ext4 inode structures, caching strategies, direct I/O, and kernel parameter tuning for performance optimization.

Deepin Linux

Jul 6, 2025

1. Linux Kernel

The kernel is the core of the operating system, responsible for managing processes, memory, device drivers, the file system, and networking, which together form the basic OS structure.

1.1 Memory Management

Linux uses virtual memory and manages physical memory in fixed-size pages, typically 4 KB on x86. The buddy allocator hands out page frames, the slab allocator carves pages into caches of small kernel objects, and pages can be swapped out to disk when memory is under pressure.
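
The page granularity is visible from user space. The following is a minimal sketch, assuming a POSIX system where sysconf(_SC_PAGESIZE) is available; pages_needed is a hypothetical helper introduced only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Page size reported by the running kernel (commonly 4096 on x86-64). */
static size_t page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    assert(sz > 0);
    return (size_t)sz;
}

/* Hypothetical helper: number of pages needed to back `bytes` bytes,
 * rounding up to a whole page. */
static size_t pages_needed(size_t bytes, size_t page)
{
    return (bytes + page - 1) / page;
}
```

With a 4096-byte page, a 4097-byte buffer needs two pages, which is why small overruns past a page boundary cost a full extra page.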

1.2 Process Management

Processes are running instances of programs. Linux shares CPU time among them with a preemptive scheduler that takes priorities into account, and it provides inter‑process communication mechanisms such as signals, pipes, shared memory, semaphores, and sockets.
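
As an illustration of one of these IPC mechanisms, the sketch below sends a short message from a child process to its parent through a pipe. It uses only standard POSIX calls (pipe, fork, read, write, waitpid); pipe_roundtrip is a hypothetical helper name, and error handling is kept deliberately small.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes `msg` into a pipe and the parent
 * reads it back into `out`. Returns the number of bytes read, or -1. */
static ssize_t pipe_roundtrip(const char *msg, char *out, size_t out_len)
{
    int fds[2];
    if (out_len == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: writer */
        size_t len = strlen(msg);
        close(fds[0]);
        ssize_t w = write(fds[1], msg, len);
        close(fds[1]);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    close(fds[1]);                       /* parent: reader */
    size_t total = 0;
    ssize_t n;
    while (total < out_len - 1 &&
           (n = read(fds[0], out + total, out_len - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)  /* reap the child */
        return -1;
    return (ssize_t)total;
}
```

The parent closes its write end before reading so that read returns 0 (end of file) once the child has exited.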

1.3 File System

Unlike DOS, Linux does not use drive letters; instead it builds a single hierarchical tree by mounting individual file systems at directories. The Virtual File System (VFS) abstracts the underlying file systems, offering a uniform API (open, read, write, close) to user space.

VFS separates logical file system implementations from device drivers, allowing support for dozens of file systems (ext2, ext3, ext4, FAT, VFAT, NTFS, etc.).
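
From a program's point of view, the uniform API means the same four calls work no matter which file system backs the path. A minimal sketch, assuming /tmp is writable; file_roundtrip is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `msg` to a temporary file, then read it back
 * using open/write/read/close. Returns bytes read, or -1 on error. */
static ssize_t file_roundtrip(const char *msg, char *out, size_t out_len)
{
    char path[] = "/tmp/vfs_demo_XXXXXX";
    size_t len = strlen(msg);
    ssize_t n = -1;

    if (out_len == 0)
        return -1;
    int fd = mkstemp(path);              /* creates a unique empty file */
    if (fd == -1)
        return -1;
    close(fd);

    fd = open(path, O_WRONLY);
    if (fd != -1) {
        ssize_t w = write(fd, msg, len);
        close(fd);
        if (w == (ssize_t)len && (fd = open(path, O_RDONLY)) != -1) {
            n = read(fd, out, out_len - 1);
            close(fd);
        }
    }
    if (n >= 0)
        out[n] = '\0';
    unlink(path);
    return n;
}
```

Nothing in this code names ext4, tmpfs, or any other file system; the VFS routes each call to whichever implementation is mounted at the path.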

1.4 Device Drivers

Device drivers run in kernel space with high privileges, providing abstract interfaces for hardware interaction. Errors in drivers can crash the entire system.

1.5 Network Interface (NET)

The network stack supports BSD sockets and the full TCP/IP suite, with protocol and driver layers handling communication.

2. Linux Shell

The shell is the command interpreter and primary user interface: it parses user commands and launches programs, which reach the kernel through system calls. Common shells include bash, sh, csh, ksh, and zsh.
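
At its core, running an external command comes down to three system calls: fork, exec, and wait. The sketch below shows that skeleton under the assumption that sh is on the PATH; run_command is a hypothetical helper, and real shells add parsing, redirection, job control, and much more.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments and return
 * its exit status, or -1 if it could not be started or was killed. */
static int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);   /* only returns on failure */
        _exit(127);              /* conventional "command not found" status */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing { "sh", "-c", "exit 3", NULL } yields 3, the same value a shell would expose as $? afterwards.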

3. Linux System Files

3.1 File System Concepts

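File systems organize data on storage devices using structures such as inodes, directory entries, and block groups. A device is first formatted with a file system and then mounted into the directory tree. Linux also exposes devices, pipes, and sockets through the same file interface, which is the idea behind "everything is a file".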

3.2 Virtual File System (VFS)

VFS provides a common abstraction layer, defining required interfaces and data structures so that different file systems can be accessed uniformly via system calls.
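
The mechanism behind this uniformity is a table of function pointers that each file system fills in. The following user-space toy model illustrates the dispatch idea only; the toy_ names and the two pretend file systems are invented for this sketch and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_file;

/* Toy stand-in for a per-file-system operation table. */
struct toy_file_ops {
    size_t (*read)(struct toy_file *f, char *buf, size_t len);
};

struct toy_file {
    const struct toy_file_ops *ops;   /* set by the owning "file system" */
    const char *data;
};

/* Copy a string into buf, truncating to fit; returns bytes copied. */
static size_t copy_out(const char *src, char *buf, size_t len)
{
    size_t n = strlen(src);
    if (len == 0)
        return 0;
    if (n > len - 1)
        n = len - 1;
    memcpy(buf, src, n);
    buf[n] = '\0';
    return n;
}

/* "ramfs": returns the bytes stored with the file. */
static size_t ramfs_read(struct toy_file *f, char *buf, size_t len)
{
    return copy_out(f->data, buf, len);
}

/* "constfs": every file reads as the same fixed string. */
static size_t constfs_read(struct toy_file *f, char *buf, size_t len)
{
    (void)f;
    return copy_out("constant", buf, len);
}

static const struct toy_file_ops ramfs_ops = { .read = ramfs_read };
static const struct toy_file_ops constfs_ops = { .read = constfs_read };

/* The generic entry point: callers never see which implementation runs. */
static size_t toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
    return f->ops->read(f, buf, len);
}
```

The ext4_file_operations table shown later in this article plays the same role as ramfs_ops here, with read_iter and write_iter as the filled-in slots.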

3.3 Unix File System

Key abstractions are files, directory entries, inodes, and mount points. Inodes store metadata (permissions, owner, size, timestamps) while directory entries map names to inodes.
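
Hard links make the split between names and inodes easy to observe: two directory entries can point at one inode. A minimal sketch, assuming /tmp supports hard links; hardlink_demo is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a file and a hard link to it in a fresh
 * temporary directory, stat both names, then clean up.
 * Returns 0 on success and fills in both stat buffers. */
static int hardlink_demo(struct stat *first, struct stat *second)
{
    char dir[] = "/tmp/inode_demo_XXXXXX";
    char a[64], b[64];
    int rc = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(a, sizeof a, "%s/a", dir);
    snprintf(b, sizeof b, "%s/b", dir);

    FILE *fp = fopen(a, "w");
    if (fp != NULL) {
        fclose(fp);
        /* link() adds a second directory entry for the same inode. */
        if (link(a, b) == 0 && stat(a, first) == 0 && stat(b, second) == 0)
            rc = 0;
    }
    unlink(b);
    unlink(a);
    rmdir(dir);
    return rc;
}
```

After a successful call, both stat results carry the same st_ino and an st_nlink of 2: the metadata lives in the inode, and each name is just a reference to it.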

3.4 File System Characteristics

Strict organization allowing block‑level storage.

Index areas for locating file blocks.

Cache layers for hot files.

Directory‑based organization for easy management.

Kernel‑maintained structures tracking open files.

3.5 EXT Series Formats

Ext4 introduces extents, a tree‑structured representation of contiguous blocks, reducing fragmentation and improving performance for large files.

struct ext4_inode {
    __le16  i_mode;
    __le16  i_uid;
    __le32  i_size_lo;
    __le32  i_atime;
    __le32  i_ctime;
    __le32  i_mtime;
    __le32  i_dtime;
    __le16  i_gid;
    __le16  i_links_count;
    __le32  i_blocks_lo;
    __le32  i_flags;
    ...
    __le32  i_block[EXT4_N_BLOCKS];
    __le32  i_generation;
    __le32  i_file_acl_lo;
    __le32  i_size_high;
    ...
};

Block allocation constants:

#define EXT4_NDIR_BLOCKS   12
#define EXT4_IND_BLOCK     EXT4_NDIR_BLOCKS
#define EXT4_DIND_BLOCK    (EXT4_IND_BLOCK + 1)
#define EXT4_TIND_BLOCK    (EXT4_DIND_BLOCK + 1)
#define EXT4_N_BLOCKS      (EXT4_TIND_BLOCK + 1)
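
These constants describe the legacy (non-extent) block map: 12 direct pointers followed by one single-, one double-, and one triple-indirect pointer. The sketch below works out how many blocks that scheme can address, assuming 32-bit block pointers so that each indirect block holds block_size / 4 entries. legacy_max_blocks is a hypothetical helper, and other limits on file size are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define NDIR_BLOCKS 12   /* mirrors EXT4_NDIR_BLOCKS */

/* Blocks addressable through the direct, single-indirect, double-indirect,
 * and triple-indirect pointers, for a given block size in bytes. */
static uint64_t legacy_max_blocks(uint64_t block_size)
{
    uint64_t n = block_size / 4;   /* 32-bit pointers per indirect block */
    return NDIR_BLOCKS + n + n * n + n * n * n;
}
```

For 4 KiB blocks this gives 12 + 1024 + 1024^2 + 1024^3 = 1,074,791,436 blocks, or roughly 4 TiB of mapped data, with up to three extra block reads to reach the far end of a large file.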

Extent header and extent structures define the tree nodes used by ext4:

struct ext4_extent_header {
    __le16  eh_magic;
    __le16  eh_entries;
    __le16  eh_max;
    __le16  eh_depth;
    __le32  eh_generation;
};

struct ext4_extent {
    __le32  ee_block;
    __le16  ee_len;
    __le16  ee_start_hi;
    __le32  ee_start_lo;
};

struct ext4_extent_idx {
    __le32  ei_block;
    __le32  ei_leaf_lo;
    __le16  ei_leaf_hi;
    __u16   ei_unused;
};
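
To make the leaf structure concrete, the sketch below maps a logical block number to a physical one by scanning an array of extents. It is a simplified user-space model: toy_extent mirrors the fields of struct ext4_extent but in host byte order (on disk they are little-endian), the scan is linear rather than a tree descent, and the special encoding of unwritten extents in ee_len is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-endian mirror of struct ext4_extent, for illustration only. */
struct toy_extent {
    uint32_t ee_block;      /* first logical block the extent covers */
    uint16_t ee_len;        /* number of blocks covered */
    uint16_t ee_start_hi;   /* high 16 bits of the physical block number */
    uint32_t ee_start_lo;   /* low 32 bits of the physical block number */
};

/* Combine the split start fields into one 48-bit physical block number. */
static uint64_t extent_start(const struct toy_extent *ex)
{
    return ((uint64_t)ex->ee_start_hi << 32) | ex->ee_start_lo;
}

/* Return the physical block backing logical block `lblk`, or 0 if no
 * extent covers it (a hole in this toy model). */
static uint64_t map_block(const struct toy_extent *leaf, size_t count,
                          uint32_t lblk)
{
    for (size_t i = 0; i < count; i++) {
        const struct toy_extent *ex = &leaf[i];
        if (lblk >= ex->ee_block && lblk - ex->ee_block < ex->ee_len)
            return extent_start(ex) + (lblk - ex->ee_block);
    }
    return 0;
}
```

A single 12-byte extent can describe thousands of contiguous blocks, whereas the indirect scheme needs one 4-byte pointer per block.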

3.6 Directory Storage Format

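Directories are special files whose data blocks contain variable-length directory entry records (struct ext4_dir_entry_2 in current kernels) that map file names to inode numbers. When the EXT4_INDEX_FL flag is set on the directory's inode, a hashed index tree (htree) speeds up lookups that would otherwise require a linear scan.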
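
The linear (non-indexed) layout can be sketched as a chain of records, each holding an inode number, its own record length, the name length, a file type byte, and the name. The code below builds and scans such a chain in memory. It is a simplified model: fields are stored in host byte order, put_entry and lookup are hypothetical helpers, and the real format's rule that the last record stretches to the end of the block is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 8   /* inode (4) + rec_len (2) + name_len (1) + file_type (1) */

/* Append one record at `off`; returns the offset just past it. */
static size_t put_entry(uint8_t *blk, size_t off, uint32_t ino,
                        const char *name)
{
    uint8_t name_len = (uint8_t)strlen(name);
    /* Records are padded to a multiple of 4 bytes. */
    uint16_t rec_len = (uint16_t)((HDR_LEN + name_len + 3) & ~3u);

    memcpy(blk + off, &ino, 4);
    memcpy(blk + off + 4, &rec_len, 2);
    blk[off + 6] = name_len;
    blk[off + 7] = 0;                      /* file type: unknown */
    memcpy(blk + off + HDR_LEN, name, name_len);
    return off + rec_len;
}

/* Walk the records by rec_len; returns the inode number, or 0 if absent. */
static uint32_t lookup(const uint8_t *blk, size_t used, const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + HDR_LEN <= used) {
        uint32_t ino;
        uint16_t rec_len;
        memcpy(&ino, blk + off, 4);
        memcpy(&rec_len, blk + off + 4, 2);
        uint8_t name_len = blk[off + 6];

        if (rec_len < HDR_LEN)
            break;                         /* corrupt record: stop scanning */
        if (ino != 0 && name_len == want &&
            memcmp(blk + off + HDR_LEN, name, want) == 0)
            return ino;
        off += rec_len;
    }
    return 0;
}
```

Because every lookup may touch every record, large directories benefit from the hashed index, which narrows the search to a single block.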

3.7 ext4 File System

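Ext4 raises the maximum file system size to 1 EiB and the maximum file size to 16 TiB (with 4 KiB blocks). It supports three journaling modes (journal, ordered, writeback), removes ext3's limit of 32,000 sub‑directories per directory, and adds file-level encryption and online defragmentation. Unlike Btrfs, ext4 does not provide transparent compression.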

3.8 Btrfs File System

Btrfs provides transparent compression (zlib, LZO, zstd), copy‑on‑write, snapshots, and built-in multi-device RAID support, and it scales up to 16 EiB.

3.9 XFS File System

XFS is a high‑performance journaling file system that supports volumes up to 8 EiB on 64-bit Linux, online growth, delayed allocation, and efficient handling of large files.

4. Linux Page Cache

4.1 ext4 File Operations

const struct file_operations ext4_file_operations = {
    ...
    .read_iter  = ext4_file_read_iter,
    .write_iter = ext4_file_write_iter,
    ...
};

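In the older kernel versions that these listings follow (newer kernels have restructured both paths), the read path calls generic_file_read_iter and the write path calls __generic_file_write_iter. Both check the IOCB_DIRECT flag to distinguish cached (buffered) I/O, which goes through the page cache, from direct I/O, which bypasses it.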
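
From user space, direct I/O is requested with the O_DIRECT flag to open. The sketch below writes one block with O_DIRECT and falls back to buffered I/O when the file system rejects it. It assumes glibc (where _GNU_SOURCE exposes O_DIRECT) and a 4096-byte alignment, which is an assumption: the real requirement depends on the device and file system. write_block is a hypothetical helper.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIO_ALIGN 4096           /* assumed buffer/length alignment */

/* Hypothetical helper: write one DIO_ALIGN-sized block filled with `fill`.
 * Returns 1 if O_DIRECT was used, 0 if it fell back to buffered I/O,
 * and -1 on error. */
static int write_block(const char *path, unsigned char fill)
{
    void *buf;
    if (posix_memalign(&buf, DIO_ALIGN, DIO_ALIGN) != 0)
        return -1;
    memset(buf, fill, DIO_ALIGN);

    int direct = 1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd == -1 && errno == EINVAL) {       /* O_DIRECT not supported here */
        direct = 0;
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    }
    if (fd == -1) {
        free(buf);
        return -1;
    }

    ssize_t n = write(fd, buf, DIO_ALIGN);
    if (n == -1 && errno == EINVAL && direct) {
        /* Flag accepted at open but rejected at I/O time: drop it, retry. */
        direct = 0;
        int fl = fcntl(fd, F_GETFL);
        if (fl != -1 && fcntl(fd, F_SETFL, fl & ~O_DIRECT) == 0)
            n = write(fd, buf, DIO_ALIGN);
    }
    close(fd);
    free(buf);
    return n == DIO_ALIGN ? direct : -1;
}
```

The aligned buffer is the main practical difference from buffered I/O: with O_DIRECT the data moves between the device and the caller's memory without a copy through the page cache, so the kernel imposes alignment constraints on the buffer, offset, and length.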

4.2 Cached Write Path

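/* Abbreviated: fault-in, signal checks, and short-copy retries are omitted. */
ssize_t generic_perform_write(struct file *file,
                              struct iov_iter *i, loff_t pos)
{
    struct address_space *mapping = file->f_mapping;
    const struct address_space_operations *a_ops = mapping->a_ops;
    long status = 0;
    ssize_t written = 0;
    unsigned int flags = 0;

    do {
        struct page *page;
        unsigned long offset;   /* offset within the page-cache page */
        unsigned long bytes;    /* bytes to write into this page */
        size_t copied;          /* bytes actually copied from user space */
        void *fsdata;

        offset = (pos & (PAGE_SIZE - 1));
        bytes = min_t(unsigned long, PAGE_SIZE - offset,
                      iov_iter_count(i));
        status = a_ops->write_begin(file, mapping, pos, bytes, flags,
                                    &page, &fsdata);
        if (unlikely(status < 0))
            break;
        copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
        flush_dcache_page(page);
        status = a_ops->write_end(file, mapping, pos, bytes, copied,
                                 page, fsdata);
        if (unlikely(status < 0))
            break;
        copied = status;
        iov_iter_advance(i, copied);   /* consume what was written */
        pos += copied;
        written += copied;
        balance_dirty_pages_ratelimited(mapping);
    } while (iov_iter_count(i));
    return written ? written : status;
}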

In ext4, write_begin starts the journal handle appropriate to the journaling mode (journal, ordered, writeback) and obtains the target page-cache page via grab_cache_page_write_begin. Data is copied from user space with iov_iter_copy_from_user_atomic, and write_end marks the page dirty. balance_dirty_pages_ratelimited then checks the dirty-page thresholds: once they are exceeded, it wakes background writeback and, if necessary, throttles the writing process.
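
The loop handles one page-cache page per iteration, so a write that crosses page boundaries is split into per-page pieces. The user-space sketch below reproduces just that offset and length arithmetic. It assumes a 4096-byte page; struct chunk and split_write are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size */

struct chunk {
    uint64_t index;    /* page index within the file */
    size_t   offset;   /* starting offset inside that page */
    size_t   bytes;    /* bytes written into that page */
};

/* Split a write of `len` bytes at file position `pos` into per-page
 * chunks. Returns the number of chunks stored (at most `max`). */
static size_t split_write(uint64_t pos, size_t len,
                          struct chunk *out, size_t max)
{
    size_t n = 0;

    while (len > 0 && n < max) {
        size_t offset = (size_t)(pos & (TOY_PAGE_SIZE - 1));
        size_t bytes = TOY_PAGE_SIZE - offset;   /* room left in this page */
        if (bytes > len)
            bytes = len;

        out[n].index = pos / TOY_PAGE_SIZE;
        out[n].offset = offset;
        out[n].bytes = bytes;
        n++;

        pos += bytes;
        len -= bytes;
    }
    return n;
}
```

A 5000-byte write at position 4000, for instance, becomes three chunks: 96 bytes at the tail of page 0, all 4096 bytes of page 1, and 808 bytes at the start of page 2.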

4.3 Cached Read Path

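/* Abbreviated: setup, loop exit, and the error labels are omitted. */
static ssize_t generic_file_buffered_read(struct kiocb *iocb,
                                          struct iov_iter *iter,
                                          ssize_t written)
{
    struct file *filp = iocb->ki_filp;
    struct address_space *mapping = filp->f_mapping;
    struct file_ra_state *ra = &filp->f_ra;   /* per-file readahead state */
    /* ... index, last_index, offset, nr, and ret are set up here ... */
    for (;;) {
        /* 1. Look the page up in the page cache. */
        struct page *page = find_get_page(mapping, index);
        if (!page) {
            if (iocb->ki_flags & IOCB_NOWAIT)
                goto would_block;
            /* 2. Miss: start synchronous readahead, then look again. */
            page_cache_sync_readahead(mapping, ra, filp, index,
                                      last_index - index);
            page = find_get_page(mapping, index);
            if (unlikely(page == NULL))
                goto no_cached_page;
        }
        /* 3. Page carries the readahead marker: prefetch asynchronously. */
        if (PageReadahead(page))
            page_cache_async_readahead(mapping, ra, filp, page,
                                      index, last_index - index);
        /* 4. Copy the data to the user buffer. */
        ret = copy_page_to_iter(page, offset, nr, iter);
        /* ... advance index and offset; leave the loop when iter is full ... */
    }
    /* ... would_block and no_cached_page handling ... */
    return ret;
}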

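The function first looks the page up in the page cache. On a miss it issues synchronous readahead and looks again; if the page it finds carries the readahead marker, it also starts asynchronous readahead for the pages that follow. Finally it copies the data to the user buffer with copy_page_to_iter.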
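
The effect of readahead on a sequential reader can be shown with a toy model. The sketch below is not the kernel's algorithm: it uses a fixed window, models only the synchronous step, and all toy_ names are invented for illustration. The kernel sizes its window adaptively and overlaps I/O using the asynchronous path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGES     64
#define TOY_RA_WINDOW 4   /* fixed window, an assumption of this model */

struct toy_cache {
    bool   present[TOY_PAGES];   /* which pages are cached */
    size_t hits;
    size_t misses;
};

/* On a miss, bring in the requested page plus the rest of the window. */
static void toy_readahead(struct toy_cache *c, size_t start)
{
    for (size_t i = start; i < start + TOY_RA_WINDOW && i < TOY_PAGES; i++)
        c->present[i] = true;
}

/* Read one page: count a hit if cached, otherwise a miss plus readahead. */
static void toy_read_page(struct toy_cache *c, size_t index)
{
    assert(index < TOY_PAGES);
    if (c->present[index]) {
        c->hits++;
        return;
    }
    c->misses++;
    toy_readahead(c, index);
}
```

Reading pages 0 through 7 in order with a window of 4 produces misses only at pages 0 and 4; the other six reads are served from the cache.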

5. Kernel Parameter Tuning

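Most tunable kernel parameters are exposed as files under /proc/sys and can be read or changed at runtime, either directly or with the sysctl command. Examples include the dirty‑page thresholds (vm.dirty_ratio, vm.dirty_background_ratio) and other memory management knobs. I/O scheduler settings are exposed separately, under /sys/block/<device>/queue.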
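
Because each parameter is an ordinary file, it can be read with standard file I/O. A minimal sketch for integer-valued knobs follows; read_knob is a hypothetical helper, and writing a value back would additionally require root privileges.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read a single integer from a procfs-style file.
 * Returns 0 on success and stores the value in *value, or -1 on error. */
static int read_knob(const char *path, long *value)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int ok = (fscanf(fp, "%ld", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

Calling read_knob("/proc/sys/vm/dirty_ratio", &v) on a Linux system returns the percentage of memory that may be dirty before writers are throttled, which ties back to the balance_dirty_pages_ratelimited call in the cached write path.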
