Why Does a .tar.gz File Have Two Extensions?
The article explains that the .tar.gz suffix reflects two separate Unix tools—tar for archiving and gzip for compression—combined via a pipeline, tracing their historical origins, design philosophy, and why this dual‑extension format remains prevalent today.
While recompiling nginx and extracting nginx-1.27.4.tar.gz with tar -xzvf, the author wondered why the file name contains two extensions.
The tar command first appeared in January 1979 in Unix Version 7 from AT&T Bell Labs. Its name stands for Tape ARchive, and it was designed to concatenate many small files into a single byte stream suitable for writing to magnetic tape. The format uses 512‑byte blocks, matching the disk sector size of the original Unix file system, and it performs no compression.
In October 1992, Jean‑loup Gailly and Mark Adler released gzip (version 0.1). It was created as a free alternative to the patented compress utility, using the DEFLATE algorithm. gzip only compresses a single byte stream; it does not understand directories.
Because each tool does one thing, they are combined with a Unix pipe: tar cf - mydir | gzip > mydir.tar.gz Here tar cf - mydir writes the archive to standard output ( -), the pipe ( |) feeds that stream to gzip, and the result is redirected to mydir.tar.gz. The double suffix therefore records the two processing steps.
GNU tar later added the -z option so that tar -xzvf automatically invokes gzip behind the scenes, but the compression is still performed by gzip, not by tar itself.
In contrast, the Windows world introduced formats like .zip (Phil Katz, 1989) and .rar (Eugene Roshal, 1993), which combine archiving and compression in a single executable. Later tools such as 7‑Zip ( .7z), bzip2, and xz followed the same all‑in‑one approach.
“Write programs to do one thing and do it well. Let programs work together. Prefer text streams as the universal interface.” – Doug McIlroy, Bell System Technical Journal, 1978.
McIlroy, who also invented the pipe symbol ( |), advocated the Unix philosophy of small, single‑purpose tools chained together. The .tar.gz file is a direct manifestation of that philosophy.
Today, most open‑source releases on GitHub—Linux kernel, nginx, curl, Redis—are still distributed as .tar.gz archives because the two‑step pipeline remains a robust, well‑understood method.
Thus, the two extensions are not arbitrary; they faithfully describe the two tools—tar for archiving and gzip for compression—that together produce the final file.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
IT Services Circle
Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
