
Why cp Can Finish Instantly: Understanding Inodes, Block Indexing and Sparse Files

The article explains how a Linux file can appear to be 100 GB yet copy in under a second because the file system uses inodes, direct and indirect block pointers, and sparse allocation, separating logical size from actual physical storage.

Architect's Tech Stack

A colleague observed that copying a 100 GB file with cp completed in less than a second, which seemed impossible on a SATA HDD.

Measuring the file explains the result: ls -lh reports a size of 100 GB, but du -sh ./test.txt reports only 2 MB of disk usage. That gap between apparent size and allocated space is the signature of a sparse file.

The stat ./test.txt output confirms this. The logical size is Size: 107374182400 bytes (100 GiB), while Blocks: 4096 counts 512-byte units, so only 4096 × 512 bytes = 2 MiB is actually allocated, matching the du figure.
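
The same comparison can be reproduced from a script. The following is a minimal sketch, assuming a Linux file system that supports sparse files; the helper names make_sparse_file and sizes are illustrative and do not come from the original measurements.

```python
import os
import tempfile


def make_sparse_file(path, logical_size):
    """Create a file of logical_size bytes without writing any data."""
    with open(path, "wb") as f:
        # Extending a file with truncate leaves the new range as a hole.
        f.truncate(logical_size)


def sizes(path):
    """Return (logical bytes, allocated bytes) as reported by stat."""
    st = os.stat(path)
    # st_blocks counts 512-byte units, independent of the file system block size.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.bin")
        make_sparse_file(path, 64 * 1024 * 1024)  # 64 MiB logical size
        logical, allocated = sizes(path)
        print(f"logical={logical} allocated={allocated}")
```

The logical figure corresponds to what ls -l shows, and the allocated figure to what du shows.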

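File-system basics: data is stored in fixed-size blocks (typically 4 KB). An inode holds the file's metadata plus an array of 15 block pointers. The first 12 entries are direct pointers to data blocks; the 13th, 14th, and 15th entries are single, double, and triple indirect pointers. With 4 KB blocks and 4-byte block numbers (the classic ext2/ext3 layout), each indirect block holds 1024 pointers, so the triple indirect level alone addresses 1024³ blocks, which puts the addressable file size at roughly 4 TB.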
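
The 4 TB figure can be checked with a few lines of arithmetic. This is a sketch under the assumptions stated above (4 KB blocks, 4-byte block pointers, 12 direct pointers); max_file_size is a hypothetical helper, not a file-system API.

```python
def max_file_size(block_size=4096, pointer_size=4, direct=12):
    """Bytes addressable through direct plus single/double/triple indirect pointers."""
    ptrs = block_size // pointer_size  # pointers that fit in one indirect block
    blocks = direct + ptrs + ptrs ** 2 + ptrs ** 3
    return blocks * block_size


if __name__ == "__main__":
    size = max_file_size()
    print(f"{size} bytes, {size / 2 ** 40:.3f} TiB")
```

The triple indirect term dominates: it contributes exactly 4 TiB, and the direct, single, and double levels add only about 4 GiB on top.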

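In a sparse file, the logical size recorded in the inode can far exceed the space occupied by allocated blocks. Unwritten regions (holes) have no blocks behind them, consume no disk space, and read back as zeros. Copying such a file therefore only needs to copy the allocated blocks and recreate the holes in the destination, which is why cp appears extremely fast.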
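
To make the idea of copying only the allocated regions concrete, here is a hedged sketch of a hole-aware copy built on lseek with SEEK_DATA and SEEK_HOLE. It assumes Linux and Python 3.3 or later; data_extents and sparse_copy are illustrative names, and this is not a description of how cp itself is implemented.

```python
import errno
import os


def data_extents(fd):
    """Yield (start, end) byte ranges of fd that contain data, skipping holes."""
    size = os.fstat(fd).st_size
    pos = 0
    while pos < size:
        try:
            start = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # no data after pos: the rest is a hole
                return
            raise
        end = os.lseek(fd, start, os.SEEK_HOLE)  # EOF counts as a hole
        yield start, end
        pos = end


def sparse_copy(src, dst, chunk=1 << 20):
    """Copy src to dst, transferring only the data extents of src."""
    src_fd = os.open(src, os.O_RDONLY)
    dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for start, end in data_extents(src_fd):
            offset = start
            while offset < end:
                buf = os.pread(src_fd, min(chunk, end - offset), offset)
                if not buf:
                    break
                # pwrite may write fewer bytes; advance by what was written.
                offset += os.pwrite(dst_fd, buf, offset)
        # Set the logical size so trailing holes are preserved.
        os.ftruncate(dst_fd, os.fstat(src_fd).st_size)
    finally:
        os.close(src_fd)
        os.close(dst_fd)
```

On a file system that does not track holes, the kernel reports the whole file as a single data extent, so the sketch degrades to an ordinary full copy.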

Typical write flow: write data → allocate blocks → store block numbers in the inode (direct or indirect). Read flow: read inode → follow pointers to retrieve blocks.
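
The write and read flows can be modelled with a toy in-memory structure. This is a deliberately simplified sketch, not real file-system code: ToyDisk and ToyInode are hypothetical classes, only the 12 direct pointers and one single indirect block are modelled, and block number 0 is assumed to mean "not allocated".

```python
BLOCK_SIZE = 4096
N_DIRECT = 12
PTRS_PER_BLOCK = BLOCK_SIZE // 4  # assumes 4-byte block numbers


class ToyDisk:
    def __init__(self):
        self.blocks = {}    # block number -> data bytes or list of pointers
        self.next_free = 1  # 0 is reserved to mean "no block"

    def alloc(self, content):
        number = self.next_free
        self.next_free += 1
        self.blocks[number] = content
        return number


class ToyInode:
    def __init__(self, disk):
        self.disk = disk
        self.size = 0
        self.direct = [0] * N_DIRECT
        self.single_indirect = 0

    def _table_and_slot(self, index, create):
        """Return the pointer table and slot that hold logical block index."""
        if index < N_DIRECT:
            return self.direct, index
        slot = index - N_DIRECT
        if slot >= PTRS_PER_BLOCK:
            raise ValueError("double/triple indirect levels are not modelled")
        if self.single_indirect == 0:
            if not create:
                return None, slot
            self.single_indirect = self.disk.alloc([0] * PTRS_PER_BLOCK)
        return self.disk.blocks[self.single_indirect], slot

    def write_block(self, index, data):
        # write data -> allocate a block -> store its number in the inode
        table, slot = self._table_and_slot(index, create=True)
        padded = data.ljust(BLOCK_SIZE, b"\0")
        if table[slot] == 0:
            table[slot] = self.disk.alloc(padded)
        else:
            self.disk.blocks[table[slot]] = padded
        self.size = max(self.size, (index + 1) * BLOCK_SIZE)

    def read_block(self, index):
        # read inode -> follow the pointer -> fetch the block
        table, slot = self._table_and_slot(index, create=False)
        if table is None or table[slot] == 0:
            return bytes(BLOCK_SIZE)  # a hole reads back as zeros
        return self.disk.blocks[table[slot]]
```

Writing only logical blocks 0 and 500 leaves the inode with a size of 501 blocks while the toy disk holds just three blocks (two data blocks and one indirect block), which is the sparse-file effect in miniature.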

Performance impact: with nothing cached, a small file that uses only direct pointers needs two disk reads (inode + data block). A large file that requires indirect indexing may need up to five reads (inode + up to three indirect blocks + data block).
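
The read counts follow directly from which pointer level covers a given logical block. The sketch below assumes 4 KB blocks, 4-byte pointers, 12 direct pointers, and a completely cold cache; reads_needed is a hypothetical helper for illustration.

```python
def reads_needed(block_index, block_size=4096, pointer_size=4, direct=12):
    """Disk reads to fetch one logical block: inode + indirect blocks + data."""
    ptrs = block_size // pointer_size
    if block_index < direct:
        return 2  # inode + data block
    block_index -= direct
    for level in (1, 2, 3):  # single, double, triple indirect
        span = ptrs ** level  # data blocks reachable through this level
        if block_index < span:
            return 2 + level  # inode + `level` indirect blocks + data block
        block_index -= span
    raise ValueError("block index beyond the triple indirect range")


if __name__ == "__main__":
    for index in (0, 12, 12 + 1024, 12 + 1024 + 1024 ** 2):
        print(index, reads_needed(index))
```

In practice the inode and the indirect blocks are usually cached after the first access, so the extra reads mostly affect the first touch of each region.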

Conclusion: the observed speed is due to the file being sparse; the file’s size attribute reflects logical length, while actual disk usage is determined by allocated blocks.

linux, file system, inode, block indexing, sparse file, cp command
Written by

Architect's Tech Stack

Java backend, microservices, distributed systems, containerized programming, and more.
