Unlocking Linux Ext2: From Superblock Basics to Inode Data Extraction
This article walks readers through Linux file‑system fundamentals, explains the role of the Virtual File System (VFS), dives deep into ext2 structures such as superblocks, group descriptors, block and inode bitmaps, and provides complete C code for reading file contents directly by inode number.
File System
Linux newcomers often feel overwhelmed by the concept of a file system, while seasoned users take it for granted. The article begins by clarifying that a file system is a set of rules for naming, organizing, and accessing files stored on media such as hard disks, SSDs, or USB drives.
Virtual File System (VFS)
Early Linux kernels tied each file‑system implementation directly to the OS, making it impossible to support multiple formats simultaneously. VFS abstracts these differences, presenting a uniform interface to user programs regardless of whether the underlying storage is ext3, FAT32, NTFS, or another type.
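The uniform interface described above can be pictured as a table of function pointers that each file system fills in. The following is only a toy sketch of that idea: the names toy_fs_ops, toy_ext2_read, toy_fat32_read, and toy_vfs_read are hypothetical and are not the kernel's actual VFS structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical operations table: one entry per file-system type. */
struct toy_fs_ops {
    const char *name;
    int (*read_file)(const char *path, char *buf, size_t len);
};

/* Stand-ins for real implementations; each just tags the path with its type. */
static int toy_ext2_read(const char *path, char *buf, size_t len) {
    return snprintf(buf, len, "ext2:%s", path);
}

static int toy_fat32_read(const char *path, char *buf, size_t len) {
    return snprintf(buf, len, "fat32:%s", path);
}

static const struct toy_fs_ops toy_ext2  = { "ext2",  toy_ext2_read };
static const struct toy_fs_ops toy_fat32 = { "fat32", toy_fat32_read };

/* The caller uses one entry point; the table decides which code runs. */
static int toy_vfs_read(const struct toy_fs_ops *fs, const char *path,
                        char *buf, size_t len) {
    assert(fs != NULL && fs->read_file != NULL);
    return fs->read_file(path, buf, len);
}
```

Calling toy_vfs_read(&toy_ext2, "/a", buf, sizeof buf) and toy_vfs_read(&toy_fat32, "/a", buf, sizeof buf) goes through the same entry point but reaches different implementations, which is the essence of the abstraction.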
ext2 File System
Ext2 stores data in two logical parts: user data and metadata (the blocks that hold the latter are often called metadata blocks). Understanding block devices, block size, and how the mkfs.* tools (such as mkfs.ext2) create file systems is essential. The article shows how to query block size with tune2fs -l /dev/sda1 and explains the meaning of each field.
# tune2fs -l /dev/sda1
Filesystem volume name: /boot
Filesystem magic number: 0xEF53
Block size: 1024
... (other fields omitted for brevity)
Running the command on a real partition reveals the superblock’s contents, which the article later parses in detail.
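The magic number and block size printed by tune2fs can also be decoded by hand from the raw superblock. The sketch below assumes the caller has already read the 1024-byte superblock (it starts 1024 bytes into the partition) into a buffer; the helper names le16_at, le32_at, and sb_block_size are made up for illustration. It relies on the standard ext2 layout, where s_log_block_size sits at byte offset 24 and s_magic at byte offset 56.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT2_MAGIC 0xEF53u

/* On-disk fields are little-endian; decode them byte by byte. */
static uint16_t le16_at(const unsigned char *p, size_t off) {
    return (uint16_t)(p[off] | (p[off + 1] << 8));
}

static uint32_t le32_at(const unsigned char *p, size_t off) {
    return (uint32_t)p[off] | ((uint32_t)p[off + 1] << 8) |
           ((uint32_t)p[off + 2] << 16) | ((uint32_t)p[off + 3] << 24);
}

/* Returns the block size in bytes, or 0 if the buffer does not look like
   an ext2 superblock. */
static uint32_t sb_block_size(const unsigned char *sb) {
    assert(sb != NULL);
    if (le16_at(sb, 56) != EXT2_MAGIC)
        return 0;
    uint32_t log = le32_at(sb, 24);
    if (log > 6)               /* guard against implausible shift values */
        return 0;
    return 1024u << log;       /* block size = 1024 * 2^s_log_block_size */
}
```

For the partition shown above, s_log_block_size would be 0, giving the 1024-byte block size that tune2fs reports.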
Some Concepts
Superblock
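The superblock records the file‑system type (magic number), block size, total block count, inode size, total inode count, and the per‑group counts from which the number of groups is derived. Because it is critical, backup copies are stored in selected groups: group 1 and every group whose number is a power of 3, 5, or 7 (1, 3, 5, 7, 9, 25, 27, 49, …). This sparse placement applies when the sparse_super feature is enabled; without it, every group carries a copy.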
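The placement rule for superblock copies is easy to check in code. This is a minimal sketch assuming the sparse_super feature is enabled; the function names is_power_of and group_has_superblock are illustrative and do not come from the article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if n == base^k for some k >= 0 (so 1 counts as a power). */
static bool is_power_of(uint32_t n, uint32_t base) {
    assert(base > 1);
    while (n > 1 && n % base == 0)
        n /= base;
    return n == 1;
}

/* Group 0 holds the primary superblock; group 1 and powers of 3, 5, 7
   hold the backups (assuming sparse_super). */
static bool group_has_superblock(uint32_t group) {
    if (group <= 1)
        return true;
    return is_power_of(group, 3) || is_power_of(group, 5) ||
           is_power_of(group, 7);
}
```

With this rule, groups 9, 25, 27, and 49 report a copy, while groups such as 2, 4, or 15 do not.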
Group Descriptors
Each group is described by one entry in the group descriptor table. The entry layout is defined in include/linux/ext2_fs.h:
struct ext2_group_desc {
    __le32 bg_block_bitmap;       /* first block of block bitmap */
    __le32 bg_inode_bitmap;       /* first block of inode bitmap */
    __le32 bg_inode_table;        /* first block of inode table */
    __le16 bg_free_blocks_count;
    __le16 bg_free_inodes_count;
    __le16 bg_used_dirs_count;
    __le16 bg_pad;
    __le32 bg_reserved[3];
};
The article provides a small program that reads and prints every descriptor, showing free block/inode counts and the locations of the bitmaps and inode tables.
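#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <linux/ext2_fs.h> /* struct ext2_group_desc; some systems ship it as <ext2fs/ext2_fs.h> */

#define B_LEN 32 /* on-disk size of one group descriptor */

/* Reads descriptors sequentially from the start of argv[1], so the input must
   begin at the group descriptor table; on a raw partition, lseek() there first.
   Fields are little-endian on disk; printing them directly assumes a
   little-endian host. */
int main(int argc, char **argv) {
    char buf[B_LEN] = {0};
    int i = 0, fd = -1;
    struct ext2_group_desc gd;
    if (argc != 2) {
        printf("Usage: %s file\n", argv[0]);
        return 1;
    }
    if (-1 == (fd = open(argv[1], O_RDONLY))) {
        printf("open file error!\n");
        return 1;
    }
    while (i < 64) { /* prints at most 64 groups */
        ssize_t n = read(fd, buf, B_LEN);
        if (n == -1) {
            printf("read error!\n");
            close(fd);
            return 1;
        }
        if (n != B_LEN) /* end of input or short read */
            break;
        memcpy(&gd, buf, B_LEN);
        printf("========== Group %d: ==========\n", i);
        printf("Blocks bitmap block %u\n", (unsigned)gd.bg_block_bitmap);
        printf("Inodes bitmap block %u\n", (unsigned)gd.bg_inode_bitmap);
        printf("Inodes table block %u\n", (unsigned)gd.bg_inode_table);
        printf("Free blocks count %u\n", (unsigned)gd.bg_free_blocks_count);
        printf("Free inodes count %u\n", (unsigned)gd.bg_free_inodes_count);
        printf("Directories count %u\n", (unsigned)gd.bg_used_dirs_count);
        i++;
    }
    close(fd);
    return 0;
}
Block Bitmap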
A block bitmap occupies one block; each bit indicates whether the corresponding block is free (0) or used (1). By examining the bitmap of Group 0, the article shows that blocks 0‑1027 are allocated, while blocks 1028‑1031 and all remaining blocks in the group are free.
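Testing a single bit in such a bitmap takes one shift and one mask. The sketch below assumes the usual ext2 bit order, in which bit n lives in byte n / 8 at bit position n % 8 (least significant bit first); bitmap_test is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the given bit is set (block in use), 0 if it is clear (free). */
static int bitmap_test(const unsigned char *bitmap, uint32_t bit) {
    assert(bitmap != NULL);
    return (bitmap[bit / 8] >> (bit % 8)) & 1;
}
```

A bitmap matching the Group 0 example would consist of 128 bytes of 0xFF (bits 0‑1023), followed by one byte 0x0F (bits 1024‑1027 set, 1028‑1031 clear), followed by zeros.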
Inode Bitmap
Similarly, the inode bitmap marks used inodes. In Group 0 the first 11 inodes are allocated (the first 10 are reserved by ext2, the 11th is the lost+found directory). The bitmap occupies one block, but only the first 2048 bytes (16384 bits / 8) are meaningful for the 16384 inodes in the group.
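Both figures in this paragraph can be reproduced with a few lines of C. The following sketch uses the hypothetical helper names bitmap_bytes and bitmap_count_used, and assumes the same least-significant-bit-first order as the block bitmap, with bit i standing for the (i + 1)-th inode of the group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bitmap bytes that carry information for nbits entries. */
static uint32_t bitmap_bytes(uint32_t nbits) {
    return (nbits + 7) / 8;
}

/* Counts the set bits among the first nbits bits of the bitmap. */
static uint32_t bitmap_count_used(const unsigned char *bitmap, uint32_t nbits) {
    assert(bitmap != NULL);
    uint32_t used = 0;
    for (uint32_t i = 0; i < nbits; i++)
        used += (bitmap[i / 8] >> (i % 8)) & 1u;
    return used;
}
```

With 11 inodes in use, the bitmap starts with the bytes 0xFF 0x07 and is zero afterwards, so counting over 16384 bits yields 11.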
Inode Table
Each inode is 128 bytes; with 16384 inodes per group the table holds 2 MiB, which is 512 blocks at a 4096‑byte block size. The inode structure contains size, timestamps, permission bits, and 15 block pointers (12 direct, 1 single‑indirect, 1 double‑indirect, 1 triple‑indirect). The article shows the layout and calculates the maximum file size for different block sizes.
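The arithmetic behind these numbers can be sketched as follows. The function names are illustrative, and the second function only bounds what the 15 pointers can address, assuming 4-byte block numbers; other limits in the on-disk format or the kernel may cap the real maximum file size lower.

```c
#include <assert.h>
#include <stdint.h>

/* Blocks needed for one group's inode table (rounded up). */
static uint64_t inode_table_blocks(uint32_t inodes_per_group,
                                   uint32_t inode_size, uint32_t block_size) {
    assert(block_size > 0);
    uint64_t bytes = (uint64_t)inodes_per_group * inode_size;
    return (bytes + block_size - 1) / block_size;
}

/* Data blocks reachable through 12 direct pointers plus single, double,
   and triple indirection; p is the number of pointers per block. */
static uint64_t max_addressable_blocks(uint32_t block_size) {
    assert(block_size >= 4);
    uint64_t p = block_size / 4;
    return 12 + p + p * p + p * p * p;
}
```

For 1024-byte blocks, p is 256 and the pointers reach 16,843,020 blocks, or roughly 16 GiB of data; inode_table_blocks(16384, 128, 4096) gives the 512 blocks mentioned above.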
File Access by Inode
To read a file when only its inode number is known, the article outlines the steps:
1. Read the superblock to obtain block size and s_inodes_per_group.
2. Compute the group number. Inode numbers start at 1, so group = (inode_number - 1) / s_inodes_per_group.
3. Read the corresponding group descriptor to get the inode‑table start block.
4. Calculate the inode’s index inside the table, (inode_number - 1) % s_inodes_per_group, and read the 128‑byte inode structure at that position.
5. Use the block pointers (direct and indirect) to fetch the file’s data, handling up to three levels of indirection.
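The address calculations in these steps can be sketched as below. Because inode numbers start at 1, the sketch subtracts one before dividing. All names here are hypothetical, and the descriptor-table position assumes the standard layout in which the table occupies the block right after the superblock (block 2 for 1024-byte blocks, block 1 otherwise) with 32-byte descriptors.

```c
#include <assert.h>
#include <stdint.h>

struct inode_loc {
    uint32_t group;  /* block group that holds the inode */
    uint32_t index;  /* position of the inode inside that group's table */
};

static struct inode_loc locate_inode(uint32_t ino, uint32_t inodes_per_group) {
    assert(ino >= 1 && inodes_per_group > 0);
    struct inode_loc loc;
    loc.group = (ino - 1) / inodes_per_group;
    loc.index = (ino - 1) % inodes_per_group;
    return loc;
}

/* Byte offset of a group's descriptor from the start of the partition. */
static uint64_t group_desc_offset(uint32_t group, uint32_t block_size) {
    uint64_t first_gdt_block = (block_size == 1024) ? 2 : 1;
    return first_gdt_block * block_size + (uint64_t)group * 32;
}

/* Byte offset of an inode, given bg_inode_table from its group descriptor. */
static uint64_t inode_offset(uint32_t inode_table_block, uint32_t index,
                             uint32_t block_size, uint32_t inode_size) {
    return (uint64_t)inode_table_block * block_size +
           (uint64_t)index * inode_size;
}
```

For example, inode 2 (the root directory) with 16384 inodes per group lands in group 0 at index 1, and inode 16385 is the first inode of group 1.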
The full C program (rd_file_by_inode.c) implements these steps, including helper functions get_blk_size, get_grp_descriptor, get_inode, read_data, and get_data. It prints the inode number, block size, and file size, then writes the recovered data to a user‑specified output file.
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <linux/ext2_fs.h> /* ext2 on-disk structures; some systems ship them as <ext2fs/ext2_fs.h> */

#define EXT2_SB_SIZE 1024
struct fdata {
    unsigned long inode_num;    // user‑provided inode number
    unsigned long i_blk_size;   // block size from superblock
    int i_grp_num;              // group containing the inode
    unsigned long i_nt_blk_num; // first block of the inode table
    struct ext2_super_block sb;
    struct ext2_group_desc gd;
    struct ext2_inode i_data;
};
/* ... (functions get_blk_size, get_grp_descriptor, get_inode, read_data, get_data) ... */
int main(int argc, char **argv) {
    if (argc != 4) {
        printf("Usage: %s /dev/partition inode_number output_file\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { printf("open file error!\n"); return 1; }
    struct fdata mf_data = {0}; /* start from a zeroed state */
    mf_data.inode_num = strtoul(argv[2], NULL, 10);
    if (!get_blk_size(fd, EXT2_SB_SIZE, &mf_data)) { printf("get superblock failed!\n"); close(fd); return 1; }
    get_grp_descriptor(fd, mf_data.i_blk_size, &mf_data);
    get_inode(fd, 0, &mf_data);
    get_data(fd, &mf_data, argv[3]);
    printf("inode : %lu\n", mf_data.inode_num);
    printf("block size: %lu\n", mf_data.i_blk_size);
    /* i_size is a 32-bit on-disk field; printed directly, assuming a little-endian host */
    printf("file size : %u Byte(s)\n", (unsigned)mf_data.i_data.i_size);
    close(fd);
    return 0;
}
Compilation with gcc rd_file_by_inode.c -o rd_file_by_inode produces an executable that successfully extracts files such as bzImage, klinux-2.6.18.tar.gz, and VMwareTools-7.8.6-185404.i386.rpm from the ext2 partition. MD5 checksums of the original and recovered files match, confirming correctness.
Conclusion
The hands‑on exploration reinforces how superblocks, group descriptors, bitmaps, and inode tables cooperate to store and retrieve data on Linux file systems. Although real kernels use more optimized paths, reproducing the process in user space deepens understanding of file‑system internals and prepares readers for advanced storage engineering tasks.
ITPUB
