Why cp Copies a 100GB File Instantly: Sparse Files & Inode Indexing Explained

A colleague was amazed when the cp command copied a 100 GB file in less than a second, prompting an investigation that reveals the difference between logical file size and physical block usage, the role of inodes, direct and indirect block indexing, and how sparse files make such copies appear instantaneous.

Java High-Performance Architecture

Oct 25, 2021

Why cp Appears So Fast

A colleague used cp to copy a 100 GB file and was surprised that the operation finished in under a second. ls -lh showed the file as 100 GB, but du -sh reported only 2 MB of actual disk usage.

sh-4.4# time cp ./test.txt ./test.txt.cp
real 0m0.107s
user 0m0.008s
sys 0m0.085s

A typical SATA hard drive writes at about 150 MB/s, so physically copying 100 GB should take roughly 11 to 12 minutes, not a tenth of a second. That gap called for a closer look.
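
The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it treats the 150 MB/s figure as 150 MiB/s and the 100 GB file as 100 GiB, and the variable names are illustrative.

```python
# Rough estimate of how long a full, byte-for-byte 100 GB copy would take.
LOGICAL_SIZE = 100 * 1024**3      # 100 GiB in bytes
WRITE_SPEED = 150 * 1024**2       # assumed sustained write speed: 150 MiB/s

expected_seconds = LOGICAL_SIZE / WRITE_SPEED
expected_minutes = expected_seconds / 60
observed_seconds = 0.107          # "real" time reported by the cp run above

speedup = expected_seconds / observed_seconds
print(f"expected: {expected_seconds:.0f} s (about {expected_minutes:.1f} min)")
print(f"observed: {observed_seconds} s, roughly {speedup:.0f}x faster")
```

A gap of three to four orders of magnitude cannot be explained by caching or a fast disk alone, which is why the file's metadata is the next thing to inspect.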

Analyzing the File with stat

Running stat on ./test.txt shows both numbers side by side:

File: ./test.txt
Size: 107374182400   Blocks: 4096   IO Block: 4096 regular file
Device: 78h/120d   Inode: 3148347   Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/ root)   Gid: (0/ root)
Access: 2021-03-13 12:22:00.888871000 +0000
Modify: 2021-03-13 12:22:46.562243000 +0000
Change: 2021-03-13 12:22:46.562243000 +0000
Birth: -

The Size field (107374182400 bytes) is the logical file size. The Blocks field counts 512-byte units, so 4096 × 512 B = 2 MB is the physical space actually allocated, which matches what du reported.

Key Points

Size is the logical size, the number most users see (for example in ls -l).

Blocks reflects the disk space the file really occupies, which is what du reports.
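
The same two numbers can be read programmatically. The sketch below assumes a Unix-like system where os.stat exposes st_blocks; size_report, DEMO_SIZE, and the temporary demo file are hypothetical names introduced for illustration, and the demo file is much smaller than the 100 GB file in the article.

```python
import os
import tempfile


def size_report(path):
    """Return (logical_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units, like the Blocks field of stat.
    return st.st_size, st.st_blocks * 512


# Hypothetical demo file: truncate() extends it to 64 MiB without writing data.
DEMO_SIZE = 64 * 1024**2
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(DEMO_SIZE)
    demo_path = f.name

logical, allocated = size_report(demo_path)
print(f"logical size:   {logical} bytes")
print(f"allocated size: {allocated} bytes")
os.remove(demo_path)
```

On a filesystem that supports sparse files, the allocated size should come out far smaller than the logical size. The exact value depends on the filesystem.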

File System Analogy

Think of a file system as a luggage storage service: you check in your bags under a name (the file name), receive a tag (the metadata, or index), and the storage room (the disk) holds the physical items. The tag lets staff locate the luggage, just as an inode lets the file system locate a file's data blocks. The file name itself lives in a directory entry that points to the inode.

Space Management in a File System

If every file had to occupy one contiguous region as large as its logical size, a mostly empty file would waste nearly all of that space. Instead, the disk is divided into fixed-size blocks (commonly 4 KB), and a file's inode contains pointers to the blocks that actually hold data.
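
A toy model may make the idea concrete. The ToyFile class below is purely illustrative and is not how any real file system is implemented: it keeps a dictionary from block index to block contents, so blocks that were never written take no space and read back as zeros.

```python
BLOCK_SIZE = 4096  # 4 KB blocks, as in the text


class ToyFile:
    """Toy model of a file: only blocks that were written are stored."""

    def __init__(self):
        self.blocks = {}   # block index -> bytearray of BLOCK_SIZE bytes
        self.size = 0      # logical size in bytes

    def write(self, offset, data):
        for i, byte in enumerate(data):
            index, pos = divmod(offset + i, BLOCK_SIZE)
            # Allocate a block only when something is written into it.
            block = self.blocks.setdefault(index, bytearray(BLOCK_SIZE))
            block[pos] = byte
        self.size = max(self.size, offset + len(data))

    def read(self, offset, length):
        out = bytearray()
        for i in range(offset, min(offset + length, self.size)):
            index, pos = divmod(i, BLOCK_SIZE)
            block = self.blocks.get(index)
            # Unallocated blocks (holes) read back as zero bytes.
            out.append(block[pos] if block is not None else 0)
        return bytes(out)

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK_SIZE


toy = ToyFile()
toy.write(0, b"head")
toy.write(100 * 1024**2, b"tail")   # 100 MiB further on
print(toy.size, toy.allocated_bytes())
```

Here the logical size is a little over 100 MiB, yet only two blocks are allocated, which is the same gap that ls and du revealed for the real file.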

Inode Structure and Multi‑Level Indexing

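In an ext2-style layout, an inode holds 15 block pointers. The figures below assume 4 KB blocks and 4-byte block numbers, so one block can hold 1024 block numbers:

First 12 pointers: direct indexes. Each points directly to a data block (12 × 4 KB = 48 KB).

13th pointer: single indirect. It points to a block that contains further block numbers (1024 × 4 KB adds up to 4 MB).

14th pointer: double indirect. It adds another level (1024 × 1024 × 4 KB), reaching about 4 GB.

15th pointer: triple indirect. It adds a third level (1024 × 1024 × 1024 × 4 KB), reaching roughly 4 TB.

Thus, under these assumptions, a file system that uses this ext2-style hierarchy can address files of roughly 4 TB.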
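
The arithmetic behind these figures can be checked with a short script. It is a sketch under the assumptions of 4 KB blocks and 4-byte block numbers; level_capacities and level_for_offset are hypothetical helper names, not functions from any file system API.

```python
BLOCK_SIZE = 4096                             # bytes per block (assumed)
POINTER_SIZE = 4                              # bytes per block number (assumed)
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # 1024 block numbers per index block
DIRECT_POINTERS = 12

# Number of data blocks reachable through each level of the hierarchy.
LEVELS = [
    ("direct", DIRECT_POINTERS),
    ("single indirect", PTRS_PER_BLOCK),
    ("double indirect", PTRS_PER_BLOCK ** 2),
    ("triple indirect", PTRS_PER_BLOCK ** 3),
]


def level_capacities():
    """Return the number of bytes each level can address."""
    return {name: blocks * BLOCK_SIZE for name, blocks in LEVELS}


def level_for_offset(offset):
    """Return the level that maps the block containing the given byte offset."""
    index = offset // BLOCK_SIZE
    for name, blocks in LEVELS:
        if index < blocks:
            return name
        index -= blocks
    raise ValueError("offset is beyond the reach of the triple indirect pointer")


capacities = level_capacities()
max_file_size = sum(capacities.values())
for name, size in capacities.items():
    print(f"{name:16s} {size} bytes")
print(f"total            {max_file_size} bytes")
```

For instance, level_for_offset(100 * 1024**3 - 1), the last byte of a 100 GB file, falls in the triple indirect range, because the first three levels together cover only slightly more than 4 GB.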

Why Sparse Files Copy Quickly

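A sparse file has a large logical size but only a few blocks actually allocated; the unallocated ranges are holes that read back as zeros. When cp copies such a file (GNU cp detects sparse sources by default, via --sparse=auto), it reads and writes only the allocated blocks and skips the holes, so the operation finishes rapidly despite the huge reported size.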
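
The idea of copying only the allocated ranges can be sketched in Python with lseek and the SEEK_DATA / SEEK_HOLE flags. This is not GNU cp's actual implementation, just a minimal illustration of the same technique; it assumes a platform where os.SEEK_DATA and os.SEEK_HOLE are available (such as Linux), and sparse_copy and the demo paths are hypothetical names.

```python
import errno
import os
import tempfile

CHUNK = 1024 * 1024  # copy data ranges in pieces of at most 1 MiB


def sparse_copy(src, dst):
    """Copy only the data ranges of src into dst; return the bytes copied."""
    copied = 0
    fin = os.open(src, os.O_RDONLY)
    fout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(fin).st_size
        offset = 0
        while offset < size:
            try:
                data_start = os.lseek(fin, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:   # only a hole remains before EOF
                    break
                raise
            data_end = os.lseek(fin, data_start, os.SEEK_HOLE)
            os.lseek(fin, data_start, os.SEEK_SET)
            os.lseek(fout, data_start, os.SEEK_SET)   # seeking leaves a hole in dst
            remaining = data_end - data_start
            while remaining > 0:
                chunk = os.read(fin, min(CHUNK, remaining))
                if not chunk:
                    break
                view = memoryview(chunk)
                while len(view) > 0:           # os.write may write partially
                    written = os.write(fout, view)
                    view = view[written:]
                copied += len(chunk)
                remaining -= len(chunk)
            offset = data_end
        os.ftruncate(fout, size)   # restore the logical size, trailing hole included
    finally:
        os.close(fin)
        os.close(fout)
    return copied


# Demo: a file with data at the start and again 8 MiB further on.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "sparse.bin")
dst_path = os.path.join(workdir, "sparse.bin.cp")
with open(src_path, "wb") as f:
    f.write(b"head")
    f.seek(8 * 1024**2)
    f.write(b"tail")

copied_bytes = sparse_copy(src_path, dst_path)
print(f"logical size {os.path.getsize(src_path)} bytes, copied {copied_bytes} bytes")
```

On a filesystem that reports holes, the number of bytes copied should be a small fraction of the logical size. Where holes are not reported, the whole file is treated as data and the function degrades to an ordinary full copy.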

As a further illustration, separate from the 100 GB file above, consider a file whose logical size is 1 TB + 4 KB but in which only two 4 KB blocks contain data: its physical usage is merely 8 KB.
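
A file with two data blocks separated by a large hole can be created as follows. The sketch deliberately assumes a much smaller hole (1 GiB, via the hypothetical constant HOLE_END) than the 1 TB in the text, so that it stays harmless on filesystems that cannot store holes; it also assumes a Unix-like system where os.pwrite and st_blocks are available.

```python
import os
import tempfile

BLOCK = 4096
# Scaled-down assumption: the second block sits at the 1 GiB mark.
# Setting HOLE_END to 1024**4 would mirror the 1 TB layout from the text.
HOLE_END = 1024**3

fd, two_block_path = tempfile.mkstemp()
try:
    os.pwrite(fd, b"a" * BLOCK, 0)          # first data block at offset 0
    os.pwrite(fd, b"b" * BLOCK, HOLE_END)   # second data block after the hole
    st = os.fstat(fd)
finally:
    os.close(fd)
    os.remove(two_block_path)

two_block_logical = st.st_size
two_block_allocated = st.st_blocks * 512    # st_blocks is in 512-byte units
print(f"logical size:   {two_block_logical} bytes")
print(f"allocated size: {two_block_allocated} bytes")
```

The logical size is exactly HOLE_END + 4 KB. On a filesystem with sparse file support, the allocated size is typically close to 8 KB, though the precise figure can vary with the filesystem's allocation strategy.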

Conclusion

The speed of cp on seemingly huge files is explained by the distinction between logical size and physical block allocation, the inode’s role in mapping data, and the use of sparse files that allocate space only where data exists.
