Understanding Linux Disk Formatting: mkfs, Inodes, Block Groups, and Filesystem Layout
This article explains how Linux formats disks using mkfs, detailing block size, inode and block counts, block groups, directory structures, and how to choose appropriate parameters for different workloads to achieve optimal storage efficiency and performance.
In previous articles we introduced the basic unit of a hard disk (the sector) and the concept of disk partitioning. After partitioning, a disk must be formatted before the operating system can use it. This article discusses what Linux formatting actually does.
The Linux formatting command is mkfs. When running mkfs you specify the target device and, with -t, the filesystem type. The command divides the contiguous disk space into manageable structures. An example run on a test machine produced the following output:
# mkfs -t ext4 /dev/vdb
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
6553600 inodes, 26214400 blocks
1310720 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2174746624
800 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872
The most important items in the output are:
Block size: 4096 bytes
Number of inodes: 6,553,600
Number of blocks: 26,214,400
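These figures are tied to the rest of the output by simple arithmetic. A minimal Python sketch, using only numbers copied from the mkfs output above (the variable names are illustrative):

```python
# Figures copied from the mkfs output above.
block_size = 4096            # bytes per block
block_count = 26_214_400
blocks_per_group = 32_768
inodes_per_group = 8_192

# Total capacity covered by the filesystem.
total_bytes = block_size * block_count
print(total_bytes // 2**30, "GiB")        # 100 GiB

# Number of block groups, and the total inode count that follows from it.
group_count = block_count // blocks_per_group
inode_count = group_count * inodes_per_group
print(group_count, inode_count)           # 800 6553600

# Blocks reserved for the super user (5.00% in the output).
reserved_blocks = block_count * 5 // 100
print(reserved_blocks)                    # 1310720
```

So the device in the example is 100 GiB, and the inode count is simply the number of block groups multiplied by the inodes per group.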
A block size of 4096 bytes suits many workloads, but it is wasteful for very small files: a file smaller than 1 KB still occupies a whole 4 KB block, so more than three-quarters of that block goes unused. At the other extreme, a very large file (GB scale) spans hundreds of thousands of blocks, so its inode has to index a correspondingly large number of blocks.
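The trade-off can be made concrete with a small calculation. The sketch below assumes that every non-empty file occupies a whole number of blocks and ignores filesystem features that might store tiny files differently; the function names are made up for illustration.

```python
def blocks_needed(file_size: int, block_size: int) -> int:
    """Whole blocks required to hold file_size bytes (ceiling division)."""
    return -(-file_size // block_size)

def wasted_fraction(file_size: int, block_size: int) -> float:
    """Share of the allocated space that holds no file data."""
    allocated = blocks_needed(file_size, block_size) * block_size
    if allocated == 0:          # an empty file allocates no data block
        return 0.0
    return (allocated - file_size) / allocated

# A 1000-byte file under three candidate block sizes.
for bs in (1024, 2048, 4096):
    print(bs, f"{wasted_fraction(1000, bs):.0%}")
# 1024 2%
# 2048 51%
# 4096 76%

# A 1 GiB file under the same block sizes: smaller blocks mean more to index.
for bs in (1024, 2048, 4096):
    print(bs, blocks_needed(2**30, bs))
# 1024 1048576
# 2048 524288
# 4096 262144
```

Small blocks reduce waste for tiny files but multiply the number of blocks a large file must track, which is why the block size should follow the expected file-size distribution.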
Dividing the total block count by the inode count (26,214,400 / 6,553,600 = 4) shows that there is one inode for every four blocks, that is, for every 16 KB of space. Two extreme cases illustrate the problem:
If most files are ≤4 KB, the inode pool can be exhausted while a large portion of blocks remain free.
If most files are huge (e.g., each requires 1,000 blocks), the block pool can be exhausted while many inodes stay unused.
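The two cases above can be checked with a rough model. This is a simplified sketch, not how the filesystem accounts for space: it assumes each file uses exactly one inode and a whole number of data blocks, and it ignores directories, metadata blocks, and the reserved 5%. The function name and its defaults (taken from the mkfs output) are illustrative.

```python
def first_exhausted(avg_file_bytes: int,
                    block_size: int = 4096,
                    block_count: int = 26_214_400,
                    inode_count: int = 6_553_600):
    """Return which resource runs out first and how many files fit."""
    blocks_per_file = max(1, -(-avg_file_bytes // block_size))
    files_by_blocks = block_count // blocks_per_file
    files_by_inodes = inode_count          # one inode per file
    if files_by_inodes < files_by_blocks:
        return "inodes", files_by_inodes
    if files_by_blocks < files_by_inodes:
        return "blocks", files_by_blocks
    return "both", files_by_inodes

print(first_exhausted(4096))          # ('inodes', 6553600)
print(first_exhausted(1000 * 4096))   # ('blocks', 26214)
print(first_exhausted(4 * 4096))      # ('both', 6553600)
```

With 4 KB files the inodes run out when only a quarter of the blocks are in use; with files of 1,000 blocks each, the blocks run out after about 26 thousand files, leaving almost all of the 6.5 million inodes idle.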
When the default mkfs parameters do not match your workload, you can use the more flexible mke2fs command, which allows detailed options. For example:
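mke2fs -j -L "volume-label" -b 2048 -i 8192 /dev/sdb1
Here -j adds a journal, -L sets the volume label, -b sets the block size in bytes, and -i sets the bytes-per-inode ratio (one inode is created for every 8192 bytes of space).
2. Block groups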
The formatting result also lists information about block groups. A block group is a collection of blocks and inodes that are managed together. Sample dumpe2fs output shows:
# dumpe2fs /dev/vdb
...
Block size: 4096
Inode size: 256
Inode count: 6553600
Block count: 26214400
...
Group 16: (Blocks 524288-557055) [INODE_UNINIT, ITABLE_ZEROED]
Checksum 0xe838, unused inodes 8192
Block bitmap at 524288 (+0), Inode bitmap at 524304 (+16)
Inode table at 524320-524831 (+32)
24544 free blocks, 8192 free inodes, 0 directories, 8192 unused inodes
Free blocks: 532512-557055
Free inodes: 131073-139264
...
Group 799: (Blocks 26181632-26214399) [INODE_UNINIT, ITABLE_ZEROED]
...
From this we learn that the partition contains 800 block groups of 32,768 blocks each. A group has a block bitmap, an inode bitmap, and an inode table (512 blocks in the example, blocks 524320-524831). The remaining blocks are available for user data.
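The numbers in the dumpe2fs output can be reproduced from the per-group constants. The following Python sketch assumes the first data block is 0 (as in the mkfs output) and that inode numbers start at 1. The constant FLEX_GROUPS is an assumption inferred from the +0 / +16 / +32 offsets, which suggest that group 16 stores the bitmaps and inode tables of 16 consecutive groups; treat that part as a consistency check rather than a statement about the on-disk format.

```python
BLOCK_SIZE = 4096
INODE_SIZE = 256
BLOCKS_PER_GROUP = 32_768
INODES_PER_GROUP = 8_192
FLEX_GROUPS = 16   # assumption, inferred from the +0 / +16 / +32 offsets

def group_block_range(group: int):
    """First and last block number of a block group."""
    first = group * BLOCKS_PER_GROUP
    return first, first + BLOCKS_PER_GROUP - 1

def group_inode_range(group: int):
    """First and last inode number of a block group (inodes start at 1)."""
    first = group * INODES_PER_GROUP + 1
    return first, first + INODES_PER_GROUP - 1

# Blocks occupied by one group's inode table.
inode_table_blocks = INODES_PER_GROUP * INODE_SIZE // BLOCK_SIZE

# Metadata held in group 16 under the FLEX_GROUPS assumption:
# one block bitmap, one inode bitmap, and one inode table per group.
metadata_blocks = FLEX_GROUPS * (1 + 1 + inode_table_blocks)
first_free_block = group_block_range(16)[0] + metadata_blocks

print(group_block_range(16))     # (524288, 557055)
print(group_block_range(799))    # (26181632, 26214399)
print(group_inode_range(16))     # (131073, 139264)
print(inode_table_blocks)        # 512
print(BLOCKS_PER_GROUP - metadata_blocks, first_free_block)   # 24544 532512
```

Under that assumption the arithmetic matches the 24544 free blocks and the free range starting at block 532512 reported for group 16.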
3. Directory layout
When a directory is created, the OS allocates a free inode from the inode bitmap and a block from the block bitmap. The block stores directory entries (e.g., ext4_dir_entry_2) that contain the file name and the inode number of each entry. The following diagram (Figure 2) illustrates the relationship between a directory's inode, its data block, and the entry structures.
Each directory block contains entries for the files and sub‑directories it holds, storing their names and inode numbers. Files follow the same principle: they consume an inode, and when data is written additional blocks are allocated.
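To make the entry format concrete, here is a Python sketch that builds and parses one directory block in the simple linear layout. It follows the field order of ext4_dir_entry_2 (inode, rec_len, name_len, file_type, name, little-endian, records padded to a multiple of 4 bytes). The helper names and the inode numbers are hypothetical, and the sketch ignores checksums and the hashed index that large directories may use.

```python
import struct

HEADER = struct.Struct("<IHBB")   # inode, rec_len, name_len, file_type
FT_REG_FILE, FT_DIR = 1, 2        # file_type codes for regular file / directory
BLOCK_SIZE = 4096

def pack_entry(inode, name, file_type, rec_len=None):
    """Serialize one entry; rec_len defaults to the minimal 4-byte-aligned size."""
    raw = name.encode()
    min_len = (HEADER.size + len(raw) + 3) & ~3
    rec_len = rec_len or min_len
    data = HEADER.pack(inode, rec_len, len(raw), file_type) + raw
    return data.ljust(rec_len, b"\0")

def parse_block(block):
    """Walk the rec_len chain and return (name, inode, file_type) tuples."""
    entries, offset = [], 0
    while offset < len(block):
        inode, rec_len, name_len, file_type = HEADER.unpack_from(block, offset)
        if rec_len < HEADER.size:          # malformed record, stop walking
            break
        if inode != 0:                     # inode 0 marks an unused entry
            start = offset + HEADER.size
            entries.append((block[start:start + name_len].decode(), inode, file_type))
        offset += rec_len
    return entries

# Hypothetical directory containing ".", "..", and one file.
head = pack_entry(2, ".", FT_DIR) + pack_entry(2, "..", FT_DIR)
# The last entry's rec_len stretches to the end of the block.
block = head + pack_entry(12, "a.txt", FT_REG_FILE, rec_len=BLOCK_SIZE - len(head))

print(parse_block(block))
# [('.', 2, 2), ('..', 2, 2), ('a.txt', 12, 1)]
```

Looking up a name therefore amounts to scanning the directory's data block for a matching entry and then following its inode number.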
4. Conclusion
A hard disk is a large array of sectors that cannot be used directly; it must go through three steps: partitioning, formatting, and mounting. Partitioning divides the sectors into large chunks, formatting converts those chunks into filesystem primitives such as inodes and blocks, and mounting makes the filesystem accessible via the mount command.
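Once a filesystem is mounted, the block and inode totals fixed at format time can be read back through the standard statvfs interface. A minimal Python sketch for Unix-like systems; the function name and the choice of "/" as the path are illustrative, and some filesystems report 0 for the inode fields.

```python
import os

def fs_summary(path="/"):
    """Report block and inode totals for the filesystem containing path."""
    st = os.statvfs(path)
    return {
        "block_size": st.f_frsize,     # unit in which f_blocks is counted
        "total_blocks": st.f_blocks,
        "free_blocks": st.f_bfree,
        "total_inodes": st.f_files,
        "free_inodes": st.f_ffree,
    }

for key, value in fs_summary("/").items():
    print(f"{key}: {value}")
```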
Raw partitions ("bare devices") can also be used directly by applications such as Oracle, but doing so bypasses the Linux filesystem layer, so the application must manage the on-disk layout itself instead of relying on inodes and blocks.
Images illustrating the formatted disk layout and other concepts are included above.
Refining Core Development Skills
Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.