Why Does `ls` Show 7.6 GB While `du` Shows Only 2.3 MB? Understanding Sparse Files
The article explains why a large log file appears to occupy several gigabytes with `ls` yet only a few megabytes with `du`, covering file deletion semantics, open file handles, the difference between apparent size and disk usage, and how sparse files cause this discrepancy.
File deletion and space reclamation
When a large log file (e.g., contentutil.log) is removed with `rm`, the disk space is not released immediately if a running process still holds an open file descriptor for it. `rm` only removes the directory entry; the blocks are reclaimed once the last descriptor is closed, either because the process closes it explicitly or because the process exits. Such files can be found with `lsof | grep deleted`, after which the owning process can be restarted or terminated.
Apparent size vs. disk usage
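`ls -lah` shows the file's apparent size, the length stored in the inode. `du` reports the space actually allocated to the file. The two tools also work in different units: the kernel counts allocated space in 512-byte units (`st_blocks`), GNU `du` prints 1 KiB units by default, and the filesystem itself allocates in its own block size (commonly 4 KiB). As a result, even a 1-byte file usually consumes one full filesystem block.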
Sparse files
A sparse file contains “holes” – unallocated regions that read as zeros. The logical length can be huge while the physical allocation is small, causing ls to display a size of several gigabytes and du to report only a few megabytes.
Reproducing a sparse file with dd
# dd of=sparse-file bs=1k seek=5120 count=0
0+0 records in
0+0 records out
0 bytes transferred
This creates a 5 MiB file (5120 × 1 KiB) that occupies zero blocks because no data is written.
Creating a sparse file in C
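#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>   /* S_IRUSR, S_IWUSR */
#include <unistd.h>     /* write, lseek, close */

int main(void) {
    const char *msg = "hello";
    size_t len = strlen(msg) + 1;   /* include the terminating NUL */
    /* O_EXCL makes open() fail if the file already exists. */
    int fd = open("./filetest.log", O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd < 0) { perror("open"); return 1; }
    if (write(fd, msg, len) < 0) perror("write");
    /* Move 10 MiB past the current end; nothing is allocated for the gap. */
    if (lseek(fd, 1024 * 1024 * 10, SEEK_END) < 0) perror("lseek");
    if (write(fd, msg, len) < 0) perror("write");   /* data after the hole */
    close(fd);
    return 0;
}
The resulting file has an apparent size of just over 10 MiB (10,485,772 bytes, which `ls -lh` rounds up to 11M), but `du` reports only a few kilobytes, demonstrating a sparse file.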
Inspecting the file
Viewing the file in vim (`:%!xxd`) or with `od -c` shows long runs of zero bytes between the written data. Holes always read back as zeros, so together with the small `du` figure this confirms that the file is sparse.
Practical notes
Use lsof | grep deleted to locate deleted files that are still held open.
The log can be truncated without restarting the process by running `> contentutil.log` (or `echo "" > contentutil.log`, which leaves a single newline in the file). This frees the space while the process keeps its descriptor open.
Key takeaways
Deleting a file does not free space until all processes close the file descriptor.
ls reports apparent size; du reports actual disk usage.
Sparse files allow very large logical sizes with minimal physical storage.
Liangxu Linux
Liangxu, a self-taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.