Unlocking Linux VFS: How the Virtual File System Powers All File Operations
This article provides a deep technical walkthrough of Linux's Virtual File System (VFS), explaining its design goals, core data structures, caching mechanisms, and step‑by‑step file operation flows—including mount, open, read, and write—while illustrating each concept with concrete code examples and real‑world scenarios.
Introduction to Linux VFS
The Virtual File System (VFS) is the abstraction layer that connects user‑space programs with underlying file systems such as ext4, XFS, NFS, procfs, and others. By presenting a uniform set of system calls (open, read, write, close, etc.), VFS lets applications operate on files without knowing the specifics of the underlying storage.
Core Functions of VFS
Unified Interface – VFS defines generic file‑operation functions. For example, open("/home/user/test.txt", O_RDONLY) follows the same path whether the file resides on ext4 or NFS, because VFS dispatches the request to the appropriate file‑system module.
Coordination Management – VFS tracks mounted file systems, allowing ext4, NFS, procfs, and sysfs to coexist. It handles mount/unmount operations and ensures each file system is integrated into a single namespace, much like a traffic controller directing different vehicle types on a road network.
Performance Optimisation – VFS caches metadata (inode cache) and directory entries (dentry cache). Repeated accesses to the same file can be satisfied from cache, avoiding costly disk I/O.
Four Core VFS Objects
Superblock (super_block)
A superblock represents a mounted file-system instance. When mount /dev/sda1 /mnt/data is executed, the kernel creates a super_block structure in memory that stores global information such as file-system type, block size, and mount options. Its s_op field points to a table of function pointers (super_operations) used for actions like allocating or destroying inodes. From user space, statvfs() exposes part of this information:
#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    struct statvfs fs_info;

    /* Retrieve file-system information for the current directory */
    if (statvfs(".", &fs_info) != 0) {
        perror("statvfs");
        return 1;
    }
    printf("Block size: %lu bytes\n", fs_info.f_bsize);
    printf("Total blocks: %llu\n", (unsigned long long)fs_info.f_blocks);
    printf("Free blocks: %llu\n", (unsigned long long)fs_info.f_bfree);
    return 0;
}

Inode
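An inode is the file's "identity card". It stores metadata (type, permissions, owner UID/GID, size, timestamps) and, for disk-based file systems, the information needed to find the file's data blocks. It does not store the file name; names live in directory entries. The inode number identifies the file uniquely within one file system.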
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    struct stat inode_info;

    if (stat("test.txt", &inode_info) != 0) {
        perror("stat");
        return 1;
    }
    printf("inode number: %llu\n", (unsigned long long)inode_info.st_ino);
    printf("size: %lld bytes\n", (long long)inode_info.st_size);
    printf("permissions: %o\n", (unsigned)(inode_info.st_mode & 0777));
    printf("owner UID: %u\n", (unsigned)inode_info.st_uid);
    return 0;
}

Directory Entry (dentry)
A dentry links a pathname component to its inode. During path resolution, VFS walks the dentry tree, using the dentry cache to accelerate look‑ups.
#include <stdio.h>
#include <dirent.h>

int main(void)
{
    DIR *dir = opendir(".");
    struct dirent *entry;

    if (dir == NULL) {
        perror("opendir");
        return 1;
    }
    while ((entry = readdir(dir)) != NULL) {
        printf("dentry: %-20s inode: %llu\n", entry->d_name,
               (unsigned long long)entry->d_ino);
    }
    closedir(dir);
    return 0;
}

File Object (file)
When a process opens a file, VFS creates a file object that stores the current read/write position (f_pos), open mode, and a pointer to the associated inode. Multiple processes can have separate file objects for the same inode.
#include <stdio.h>

int main(void)
{
    /* Open a file; the kernel creates a file object behind the stdio stream */
    FILE *file = fopen("test.txt", "r+");
    if (file == NULL) {
        perror("fopen");
        return 1;
    }
    /* Move the file position */
    fseek(file, 5, SEEK_SET);
    printf("Current offset: %ld\n", ftell(file));
    fclose(file);
    return 0;
}

VFS Caching Mechanisms
Inode Cache
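The inode cache stores recently accessed inodes and works together with the dentry cache. For example, ls -l calls stat() on every entry in a directory; after the first run, repeated runs on the same directory are usually served from these caches, avoiding extra disk reads for metadata.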
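Page Cache (Buffer Cache)
File data is cached in the page cache. Older kernels kept a separate buffer cache for disk blocks, but modern Linux has merged it into the page cache. When a log file is read repeatedly, the first read loads its pages from disk into the page cache; subsequent reads are served from memory as long as those pages have not been evicted, which speeds up analysis considerably.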
Complete File‑Operation Flow
Mount Process
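Executing mount /dev/sda1 /mnt triggers the mount system call (sys_mount), which copies its arguments from user space and calls do_mount. The kernel then looks up the matching file_system_type (e.g., ext4), has that driver read the on-disk superblock from the device, creates a super_block and a vfsmount object, and attaches the file system's root dentry at the mount point in the caller's mount namespace. Once a file system is mounted, its superblock information can be queried: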
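#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    struct statvfs st;

    /* Query the file system mounted at the root directory */
    if (statvfs("/", &st) != 0) {
        perror("statvfs");
        return 1;
    }
    printf("block size: %lu bytes\n", st.f_bsize);
    printf("total blocks: %llu\n", (unsigned long long)st.f_blocks);
    printf("free blocks: %llu\n", (unsigned long long)st.f_bfree);
    return 0;
}

Open Process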
The open system call enters the kernel via sys_open, which parses the pathname, walks the dentry tree, retrieves the target inode, allocates a file object, and returns a file descriptor to user space.
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>   /* close() */

int main(void)
{
    int fd = open("test.txt", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open failed"); return 1; }
    printf("file opened, fd = %d\n", fd);
    close(fd);
    return 0;
}

Read/Write Process
For read, the kernel looks up the file object from the file descriptor, checks that the file was opened for reading, and calls the file-system-specific read method through the file's file_operations table (e.g., ext4_file_read_iter in current kernels). If the data resides in the page cache, it is copied directly to user space; otherwise, the kernel first reads the pages from disk into the cache. write follows a symmetric path: data is written into the page cache, the pages are marked dirty, and the kernel flushes them to storage later.
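#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
    char buf[100] = "Hello VFS!";
    int fd = open("test.txt", O_RDWR);
    if (fd < 0) { perror("open failed"); return 1; }

    /* Write: data goes to the page cache; the kernel flushes it later */
    if (write(fd, buf, 10) != 10) { perror("write failed"); close(fd); return 1; }

    /* Reset the offset to the start of the file */
    lseek(fd, 0, SEEK_SET);

    /* Read: served from the page cache when possible, otherwise from disk */
    ssize_t n = read(fd, buf, 10);
    if (n < 0) { perror("read failed"); close(fd); return 1; }
    buf[n] = '\0';
    printf("Read content: %s\n", buf);
    close(fd);
    return 0;
}

Compatibility Across File-System Types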
When a file-system driver (e.g., ext4, XFS) is loaded, it registers a file_system_type with VFS by calling register_filesystem(). When an instance is mounted, the driver fills in the superblock's super_operations, which include functions such as alloc_inode, destroy_inode, and write_inode. It also attaches inode_operations (lookup, create, and so on) and file_operations to each inode it instantiates, allowing VFS to delegate work appropriately.
VFS also handles virtual file systems like procfs and sysfs, where data is generated on‑the‑fly rather than read from disk. For instance, reading /proc/cpuinfo triggers procfs code that assembles current CPU information and returns it to the caller.
Through this modular registration and request forwarding, VFS offers a seamless, uniform interface while letting each underlying file system exploit its own optimisations.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.
