Understanding ext4 Extent: Data Structures and B+‑Tree Mechanism
This article explains the purpose, design, and internal data structures of ext4 extents, describes how the B+‑tree indexes extents for efficient mapping of logical to physical blocks, and compares ext4’s extent mechanism with older file‑system addressing methods and other modern file systems.
1. What is ext4 extent?
An extent is a single descriptor for a run of contiguous physical blocks. In the Linux kernel, ext4 uses extents in place of the per-block mapping of ext2/ext3, so a large contiguous file needs a handful of descriptors rather than one pointer per block. This shrinks the mapping metadata and speeds up read/write operations, especially for large files.
2. Background of ext4 extent
2.1 Addressing in ext2/ext3
Ext2 and ext3 rely on direct and indirect block pointers stored in the inode’s i_block array (12 direct, then single, double, and triple indirect). While sufficient for small files, this scheme inflates metadata size and I/O overhead for large files.
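To make the cost of this scheme concrete, the following is a minimal sketch that reports how many indirect blocks have to be read before the data block of a given logical block can be found. The function name and the DIRECT_BLOCKS constant are illustrative and not kernel identifiers; the sketch assumes 32-bit block pointers, so ptrs_per_block is the block size divided by 4.

```c
#include <stdint.h>

/* Number of direct pointers in the ext2/ext3 i_block array. */
#define DIRECT_BLOCKS 12u

/*
 * Return the number of indirect blocks that must be read to reach the
 * data block for logical block `lblk`: 0 (direct), 1, 2, or 3.
 * Returns -1 if `lblk` lies beyond the triple-indirect range.
 * `ptrs_per_block` is block_size / 4 (hypothetical parameter).
 */
int indirection_level(uint64_t lblk, uint64_t ptrs_per_block)
{
    uint64_t single = ptrs_per_block;
    uint64_t dbl = single * ptrs_per_block;
    uint64_t triple = dbl * ptrs_per_block;

    if (lblk < DIRECT_BLOCKS)
        return 0;
    lblk -= DIRECT_BLOCKS;
    if (lblk < single)
        return 1;
    lblk -= single;
    if (lblk < dbl)
        return 2;
    lblk -= dbl;
    if (lblk < triple)
        return 3;
    return -1;
}
```

With 4 KiB blocks (1024 pointers per block), only the first 12 blocks of a file are reachable without an extra read, and everything past roughly the first 4 MiB needs at least two.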
2.2 Drawbacks of traditional addressing
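Indirect addressing needs one pointer per data block, so the mapping metadata grows linearly with file size even when the file is perfectly contiguous. Locating a block deep in a large file can require reading up to three levels of indirect blocks first, which adds I/O and degrades performance for large files.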
3. ext4 extent data structures
3.1 struct ext4_extent
struct ext4_extent {
    __le32 ee_block;    // starting logical block
    __le16 ee_len;      // number of contiguous physical blocks
    __le16 ee_start_hi; // high 16 bits of starting physical block
    __le32 ee_start_lo; // low 32 bits of starting physical block
};
The fields map a range of logical blocks to a contiguous range of physical blocks.
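As an illustration of how these fields combine, here is a small sketch that rebuilds the 48-bit physical start from ee_start_hi and ee_start_lo and translates one logical block. The struct and function names are hypothetical. The sketch assumes the fields have already been converted from little-endian to host byte order, and that the extent is an ordinary initialized one with ee_len no larger than 32768; it ignores the encoding ext4 uses for unwritten extents.

```c
#include <stdint.h>

/* Host-endian copy of the on-disk extent (hypothetical type name). */
struct extent {
    uint32_t ee_block;    /* starting logical block */
    uint16_t ee_len;      /* number of blocks covered */
    uint16_t ee_start_hi; /* high 16 bits of physical start */
    uint32_t ee_start_lo; /* low 32 bits of physical start */
};

/* Combine the two halves into the 48-bit starting physical block. */
uint64_t extent_pblock(const struct extent *ex)
{
    return ((uint64_t)ex->ee_start_hi << 32) | ex->ee_start_lo;
}

/*
 * Translate logical block `lblk` through `ex`.
 * Returns 1 and stores the physical block in *pblk if the extent
 * covers `lblk`, otherwise returns 0 and leaves *pblk untouched.
 */
int extent_map(const struct extent *ex, uint32_t lblk, uint64_t *pblk)
{
    if (lblk < ex->ee_block || lblk - ex->ee_block >= ex->ee_len)
        return 0;
    *pblk = extent_pblock(ex) + (lblk - ex->ee_block);
    return 1;
}
```

The offset of the logical block inside the extent is simply added to the physical start, which is why a single 12-byte entry can stand in for thousands of per-block pointers.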
3.2 struct ext4_extent_header
struct ext4_extent_header {
    __le16 eh_magic;      // magic number
    __le16 eh_entries;    // number of valid entries
    __le16 eh_max;        // maximum entries
    __le16 eh_depth;      // tree depth (0 for leaf)
    __le32 eh_generation; // generation counter
};
The header stores metadata for both leaf and index nodes of the B+‑tree.
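A reader of the tree would normally sanity-check a header before trusting the entries that follow it. The sketch below shows the kind of checks involved; it is a simplified illustration, not the kernel's own validation routine. The type and function names are hypothetical, the fields are assumed to be in host byte order, and 0xF30A is the magic value ext4 uses for extent headers.

```c
#include <stdint.h>

#define EXT_MAGIC 0xF30Au /* ext4 extent header magic */

/* Host-endian copy of the on-disk header (hypothetical type name). */
struct ext_header {
    uint16_t eh_magic;
    uint16_t eh_entries;
    uint16_t eh_max;
    uint16_t eh_depth;
    uint32_t eh_generation;
};

/*
 * Return 1 if the header looks plausible for a node that the caller
 * expects at `expected_depth` (0 for a leaf), otherwise 0.
 */
int header_is_sane(const struct ext_header *eh, uint16_t expected_depth)
{
    if (eh->eh_magic != EXT_MAGIC)
        return 0;                 /* not an extent node at all */
    if (eh->eh_max == 0)
        return 0;                 /* a node must have room for entries */
    if (eh->eh_entries > eh->eh_max)
        return 0;                 /* more entries than the node can hold */
    if (eh->eh_depth != expected_depth)
        return 0;                 /* depth must drop by one per level */
    return 1;
}
```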
3.3 struct ext4_extent_idx
struct ext4_extent_idx {
    __le32 ei_block;   // starting logical block of the subtree
    __le32 ei_leaf_lo; // low 32 bits of leaf block address
    __le16 ei_leaf_hi; // high 16 bits of leaf block address
    __u16 ei_unused;   // reserved
};
Index entries point to child nodes in the B+‑tree.
4. Construction and operation of the ext4 extent B+‑tree
4.1 Basic concept of a B+‑tree
A B+‑tree keeps all leaf nodes at the same depth and links them in order, allowing fast range queries and efficient look‑ups for block mapping.
4.2 Node composition
The root node lives in the inode itself and contains one header followed by a few entries: extents while the tree has depth 0, index entries once it has grown. Index nodes contain a header and a variable number of index entries; leaf nodes contain a header and a list of ext4_extent structures that hold the actual mappings.
4.3 Tree building process
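When a file is created, the inode's i_data array is initialized as an empty root node with eh_depth set to 0. As data is written, extents are first stored directly in that array, which has room for a header and four entries. Once that space is exhausted, the extents are moved into a separately allocated leaf block and the root becomes an index node. Later insertions may split full nodes, and the tree height (eh_depth) is increased when the tree gains a level.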
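The point at which the in-inode space runs out follows from the structure sizes: the header and each entry are 12 bytes, and the inode's block array is 60 bytes. The following sketch works out node capacities and the maximum number of extents for a given tree depth. The function names and constants are illustrative only, and the calculation assumes a plain block with no trailing checksum taking space away from entries.

```c
#include <stdint.h>

#define HEADER_BYTES 12u /* sizeof(struct ext4_extent_header) */
#define ENTRY_BYTES  12u /* sizeof(struct ext4_extent) and ext4_extent_idx */
#define INODE_BYTES  60u /* size of the inode's block array */

/* Number of entries that fit in a node of `node_bytes` bytes. */
unsigned entries_per_node(unsigned node_bytes)
{
    if (node_bytes < HEADER_BYTES)
        return 0;
    return (node_bytes - HEADER_BYTES) / ENTRY_BYTES;
}

/*
 * Upper bound on the number of extents a tree of the given depth can
 * hold: the root lives in the inode, every other level is one block.
 */
uint64_t max_extents_at_depth(unsigned depth, unsigned block_bytes)
{
    uint64_t n = entries_per_node(INODE_BYTES);
    unsigned i;

    for (i = 0; i < depth; i++)
        n *= entries_per_node(block_bytes);
    return n;
}
```

Under these assumptions the inode holds 4 entries and a 4 KiB block holds 340, so a depth-1 tree can already describe up to 1360 extents.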
4.4 Searching for a physical block
To locate the physical block for a given logical block, the kernel walks the tree from root to leaf, selecting the index entry with the greatest starting logical block that does not exceed the target, then finally reads the matching ext4_extent to compute the exact physical address.
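The walk described above can be sketched with an in-memory model of the tree. This is a simplified illustration, not the kernel's code: all type and function names are hypothetical, index entries hold a pointer to the child node instead of an on-disk block address, and the physical start is kept as a single 64-bit value. At each level a binary search picks the last entry whose starting logical block does not exceed the target.

```c
#include <stddef.h>
#include <stdint.h>

struct node;

struct leaf_entry { uint32_t lblk; uint16_t len; uint64_t pblk; };
struct idx_entry  { uint32_t lblk; const struct node *child; };

struct node {
    uint16_t depth;                /* 0 for a leaf */
    uint16_t entries;              /* number of valid entries */
    const struct idx_entry *idx;   /* used when depth > 0 */
    const struct leaf_entry *ext;  /* used when depth == 0 */
};

/* Index of the last index entry with lblk <= target, or -1. */
static int search_idx(const struct idx_entry *e, int n, uint32_t target)
{
    int lo = 0, hi = n - 1, found = -1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (e[mid].lblk <= target) { found = mid; lo = mid + 1; }
        else                       { hi = mid - 1; }
    }
    return found;
}

/* Index of the last extent with lblk <= target, or -1. */
static int search_ext(const struct leaf_entry *e, int n, uint32_t target)
{
    int lo = 0, hi = n - 1, found = -1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (e[mid].lblk <= target) { found = mid; lo = mid + 1; }
        else                       { hi = mid - 1; }
    }
    return found;
}

/*
 * Walk from the root to a leaf and translate `target`.
 * Returns 1 and sets *pblk on success, 0 if the block is unmapped.
 */
int lookup(const struct node *root, uint32_t target, uint64_t *pblk)
{
    const struct node *n = root;
    const struct leaf_entry *ex;
    int i;

    while (n->depth > 0) {
        i = search_idx(n->idx, n->entries, target);
        if (i < 0)
            return 0;              /* target precedes every subtree */
        n = n->idx[i].child;
    }
    i = search_ext(n->ext, n->entries, target);
    if (i < 0)
        return 0;
    ex = &n->ext[i];
    if (target - ex->lblk >= ex->len)
        return 0;                  /* target falls in a hole */
    *pblk = ex->pblk + (target - ex->lblk);
    return 1;
}
```

The final range check matters because the closest extent to the left of the target may end before it, in which case the logical block is a hole in a sparse file.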
5. ext4 extent in file operations
5.1 File read path
ext4_readpage → mpage_readpages → ext4_get_block → _ext4_get_block → ext4_map_blocks → ext4_ext_map_blocks. The function names in this chain vary between kernel versions. The last function, ext4_ext_map_blocks, traverses the B+‑tree to find the extent covering the requested logical block and returns the corresponding physical block.
5.2 File write path
ext4_writepage calls ext4_get_block, which may allocate new blocks via ext4_mb_new_blocks, create a new ext4_extent, and insert it with ext4_ext_insert_extent, updating the tree structure as needed.
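The insertion step can be illustrated on a single leaf held as a sorted array. The sketch below is a simplified model, not the kernel's ext4_ext_insert_extent: all names are hypothetical, the leaf capacity is fixed at four entries, the new extent is assumed not to overlap existing ones, and a full leaf simply returns an error where a real file system would split the node. It also shows why sequential writes keep the tree small: a new extent that continues its neighbour both logically and physically is merged into it instead of taking a new slot.

```c
#include <stdint.h>
#include <string.h>

#define LEAF_CAP 4      /* illustrative capacity */
#define MAX_LEN  32768u /* longest initialized extent */

struct ext  { uint32_t lblk; uint32_t len; uint64_t pblk; };
struct leaf { int count; struct ext e[LEAF_CAP]; };

/* True if `b` continues `a` logically and physically. */
static int can_append(const struct ext *a, const struct ext *b)
{
    return a->lblk + a->len == b->lblk &&
           a->pblk + a->len == b->pblk &&
           a->len + b->len <= MAX_LEN;
}

/* Insert `nw` in sorted order. Returns 0 on success, -1 if full. */
int leaf_insert(struct leaf *lf, struct ext nw)
{
    int pos = 0;

    while (pos < lf->count && lf->e[pos].lblk < nw.lblk)
        pos++;

    /* Try to extend the left neighbour. */
    if (pos > 0 && can_append(&lf->e[pos - 1], &nw)) {
        lf->e[pos - 1].len += nw.len;
        /* The grown extent may now touch its right neighbour too. */
        if (pos < lf->count && can_append(&lf->e[pos - 1], &lf->e[pos])) {
            lf->e[pos - 1].len += lf->e[pos].len;
            memmove(&lf->e[pos], &lf->e[pos + 1],
                    (size_t)(lf->count - pos - 1) * sizeof lf->e[0]);
            lf->count--;
        }
        return 0;
    }

    /* Try to prepend to the right neighbour. */
    if (pos < lf->count && can_append(&nw, &lf->e[pos])) {
        lf->e[pos].lblk = nw.lblk;
        lf->e[pos].pblk = nw.pblk;
        lf->e[pos].len += nw.len;
        return 0;
    }

    if (lf->count == LEAF_CAP)
        return -1;                 /* a real tree would split here */

    memmove(&lf->e[pos + 1], &lf->e[pos],
            (size_t)(lf->count - pos) * sizeof lf->e[0]);
    lf->e[pos] = nw;
    lf->count++;
    return 0;
}
```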
5.3 File expansion and shrinking
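On expansion, new blocks are allocated and a new extent is created and inserted into the tree (or an adjacent extent is lengthened). On shrinkage, the relevant extents are trimmed or removed from the end of the file backwards. Nodes that become empty are freed, and the tree can collapse back to depth 0 once the root has no entries left.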
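Shrinking can be sketched on the same kind of sorted extent array. This is an illustrative model with hypothetical names, not the kernel's removal code: it works on one leaf only and does not free the underlying physical blocks. Extents that start at or beyond the new end are dropped, and the one that straddles the boundary is shortened.

```c
#include <stdint.h>

struct ext { uint32_t lblk; uint32_t len; uint64_t pblk; };

/*
 * Remove every mapping at or beyond logical block `new_end`.
 * `e` is sorted by lblk and holds `count` entries.
 * Returns the new number of valid entries.
 */
int leaf_truncate(struct ext *e, int count, uint32_t new_end)
{
    while (count > 0) {
        struct ext *last = &e[count - 1];

        if (last->lblk >= new_end) {
            count--;               /* entirely past the new end: drop */
            continue;
        }
        if (last->lblk + last->len > new_end)
            last->len = new_end - last->lblk; /* straddles: trim */
        break;                     /* everything earlier is kept */
    }
    return count;
}
```

Working from the last extent backwards means the loop stops at the first extent that survives, so the cost depends on how much is removed rather than on the size of the file.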
6. Comparison with other file systems
6.1 vs. ext2/ext3
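Ext2/3 use indirect block tables, which need one pointer per block and therefore carry a large metadata overhead for big files. Ext4's extents describe whole runs of blocks with a single entry, which reduces metadata, improves sequential I/O, and encourages contiguous allocation.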
6.2 vs. modern file systems (XFS, Btrfs)
Ext4 offers strong sequential performance, good space utilization, and broad compatibility. XFS also uses extents and B+‑trees, while Btrfs adds advanced features such as copy‑on‑write, snapshots, and checksumming, but can be less stable in some workloads.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.