Linux File Compression & Backup: Complete Guide to tar, gzip, bzip2, xz, zip, rsync, and scp
This tutorial walks through essential Linux commands for archiving, compressing, and transferring files (tar, gzip, bzip2, xz, zip, rsync, and scp), explaining their options, performance trade‑offs, and practical usage scenarios such as daily backups, cross‑platform sharing, and incremental synchronization.
Introduction
The author shares a common ops nightmare: a missing backup after a power outage, and then demonstrates how a few command‑line tools can prevent data loss. The article covers the most useful Linux utilities for packing, compressing, and moving files.
tar: The Universal Archiver
tar creates archive files; it does not compress by default but can be combined with compression flags. A typical backup command is:
tar -czvf backup_$(date +%Y%m%d).tar.gz /var/www/html/
Explanation of options:
-c: create a new archive
-z: compress with gzip (use -j for bzip2, -J for xz)
-v: verbose output
-f: specify the output file name (must be the last flag before the name)
To extract, replace -c with -x. Adding -C /target/path extracts into a specific directory, which must already exist. Mastering this command handles about 80% of everyday archiving needs.
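The create and extract steps described above can be tried safely on scratch data. This is a minimal sketch: the directory names under the temporary folder are made up for illustration and stand in for a real source such as /var/www/html/.

```shell
set -eu

# Hypothetical scratch layout; nothing outside the temp directory is touched.
work=$(mktemp -d)
mkdir -p "$work/site/css" "$work/restore"
echo "<h1>hello</h1>" > "$work/site/index.html"
echo "body { margin: 0 }" > "$work/site/css/main.css"

archive="$work/backup_$(date +%Y%m%d).tar.gz"

# -C switches directory first, so the archive stores "site/..." instead of
# a long absolute path.
tar -czf "$archive" -C "$work" site

# -t lists the contents without extracting, a cheap sanity check.
tar -tzf "$archive"

# Same flags with -x instead of -c; -C selects the (existing) target directory.
tar -xzf "$archive" -C "$work/restore"
```

Listing with -t before extracting is a low-cost way to confirm that an archive is readable and to see which paths it will create.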
gzip vs bzip2 vs xz: Compression Rate Showdown
The article compares three common compressors. The figures are rough and depend heavily on the data being compressed:
gzip: fastest, lower compression; ideal for log rotation with low CPU usage.
bzip2: typically 10‑15% smaller output than gzip but roughly twice as slow; often used on older systems.
xz: best compression (the article puts it at about half the size of gzip output in favorable cases) but high CPU and memory consumption; suitable when disk space is scarce or large images need to be transferred.
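The trade-offs above are easy to measure on your own data. The sketch below builds a synthetic, log-like sample file (the log lines are invented for illustration) and prints the compressed size for each tool that is installed. Highly repetitive input like this compresses far better than typical real data, so treat the numbers as a comparison between tools, not as expected ratios.

```shell
set -eu

work=$(mktemp -d)
sample="$work/sample.log"

# Build a repetitive sample file; the content is invented test data.
i=0
while [ "$i" -lt 20000 ]; do
    echo "2024-01-01 12:00:00 INFO request $i served from /index.html"
    i=$((i + 1))
done > "$sample"

echo "original: $(wc -c < "$sample") bytes"

for tool in gzip bzip2 xz; do
    if command -v "$tool" >/dev/null 2>&1; then
        # -c writes to stdout, so the sample stays in place for the next tool.
        "$tool" -c "$sample" > "$work/sample.$tool"
        echo "$tool: $(wc -c < "$work/sample.$tool") bytes"
    else
        echo "$tool: not installed, skipped"
    fi
done
```

Prefixing each compression command with time in an interactive shell adds the speed side of the comparison.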
All three commands replace the original file (e.g., gzip file.txt creates file.txt.gz and removes file.txt). To keep the source, add -k or use a pipeline, for example:
tar cf - dir/ | xz > backup.tar.xz
zip: Cross‑Platform Compatibility
Although less common on Linux, zip is universally supported on Windows, macOS, and mobile devices. The first command below creates a recursive archive at maximum compression; the second creates a separate, password‑protected archive of a single file:
zip -r -9 secret_backup.zip ./conf/ && zip -e secure.zip sensitive_data.txt
Key options:
-r: recurse into directories
-9: maximum compression
-e: prompt for encryption password
Extraction is done with unzip file.zip.
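A round trip with zip and unzip might look like the following sketch. The file names are hypothetical, and the encrypted variant (-e) is left out because it prompts for a password interactively. The block skips itself if zip or unzip is not installed.

```shell
set -eu

work=$(mktemp -d)
mkdir -p "$work/conf/nginx" "$work/out"
echo "listen 80;" > "$work/conf/nginx/site.conf"
echo "debug=false" > "$work/conf/app.ini"

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # Run from $work so the stored paths are relative ("conf/...").
    # -q keeps the output quiet; -r and -9 are the options explained above.
    (cd "$work" && zip -r -9 -q conf_backup.zip conf/)

    # -l lists the archive contents without extracting.
    unzip -l "$work/conf_backup.zip"

    # -d chooses the extraction directory.
    unzip -q "$work/conf_backup.zip" -d "$work/out"
else
    echo "zip/unzip not installed, skipping"
fi
```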
cpio & pax: Niche Archivers
These older tools are useful in specific scenarios. cpio reads a list of file names from standard input and writes the archive to standard output, and it is still used for initramfs creation, e.g.:
find . -name "*.log" | cpio -ov > logs.cpio
pax acts as a portable archiver that can read both tar and cpio formats, helpful when migrating data from legacy AIX or HP‑UX systems or when tar reports "invalid header" errors.
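The cpio pipeline shown above can be tried on scratch files as follows. This is a sketch with made-up file names; it covers only cpio, not pax, and skips itself when cpio is not installed.

```shell
set -eu

work=$(mktemp -d)
mkdir -p "$work/logs"
echo "started" > "$work/logs/app.log"
echo "ok" > "$work/logs/cron.log"
echo "not a log" > "$work/logs/notes.txt"

if command -v cpio >/dev/null 2>&1; then
    # find supplies the file names on stdin; cpio -o writes the archive to
    # stdout. The block count that cpio prints on stderr is discarded.
    (cd "$work/logs" && find . -name "*.log" | cpio -o > "$work/logs.cpio" 2>/dev/null)

    # -i -t lists the archive contents without extracting anything.
    cpio -it < "$work/logs.cpio" > "$work/listing.txt" 2>/dev/null
    cat "$work/listing.txt"
else
    echo "cpio not installed, skipping"
fi
```

Because find selects the files, only the two .log files end up in the archive; notes.txt is never passed to cpio.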
Remote Transfer & Synchronization: scp vs rsync
Once files are compressed, they need to be moved to a backup server. scp (Secure Copy) uses SSH for straightforward file transfer. Example:
scp -P 2222 /opt/release.tar.gz [email protected]:/data/backups/
Adding -r copies directories and -C enables compression. However, scp restarts the transfer from the beginning after a disconnection, which is problematic for multi‑gigabyte files.
rsync performs incremental synchronization, sending only changed parts. A typical command:
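rsync -avz --progress -e "ssh -p 22" /data/web/ root@bak:/data/web_mirror/
Options explained:
-a: archive mode (preserves permissions, timestamps, etc.)
-v: verbose output
-z: compress data during transfer
--progress: show transfer progress for each file
-e "ssh -p 22": run the transfer over SSH on the given port
Two further options, not used in the command above, are often added:
--delete: delete files on the destination that no longer exist on the source
--partial: keep partially transferred files to allow resume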
After an initial full sync, a subsequent run in which only a small file such as config.ini has changed typically completes in seconds, even for a source directory of 100 GB.
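The incremental behavior can be observed without a remote server, since rsync also works between two local directories. In this sketch the directory and file names are hypothetical, and -i (--itemize-changes) is added so that the second run reports exactly what it touched.

```shell
set -eu

work=$(mktemp -d)
mkdir -p "$work/web" "$work/web_mirror"
echo "<h1>site</h1>" > "$work/web/index.html"
echo "mode=prod" > "$work/web/config.ini"
echo "stale" > "$work/web/old.log"

if command -v rsync >/dev/null 2>&1; then
    # Initial full sync. The trailing slash on the source means
    # "copy the contents of web/", not the directory itself.
    rsync -a "$work/web/" "$work/web_mirror/"

    # Change one file and remove another on the source side.
    echo "mode=staging, debug=on" > "$work/web/config.ini"
    rm "$work/web/old.log"

    # Second run: -i itemizes each change, --delete removes files that
    # no longer exist on the source. Unchanged files are not listed.
    rsync -ai --delete "$work/web/" "$work/web_mirror/" > "$work/second_run.txt"
    cat "$work/second_run.txt"
else
    echo "rsync not installed, skipping"
fi
```

The second run lists config.ini and the deletion of old.log but not index.html. Adding -n (--dry-run) shows what a run with --delete would remove before anything is actually deleted.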
Conclusion: Practical Tips for Reliable Backups
Key take‑aways:
Daily backups: tar -czf when speed matters most, or tar -cjf / tar -cJf when smaller archives are worth the extra CPU time.
Cross‑team file sharing: use zip for universal compatibility.
Scheduled backups: combine rsync -a with cron for automated mirroring.
Large file transfers: avoid scp; prefer rsync --partial or run the command inside tmux for resilience.
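As one way to put the tips above into practice, the sketch below wraps the dated tar command in a small function that a cron job could call. The function name run_backup, the script path, and the schedule in the comment are all hypothetical, and the -z flag can be swapped for -j or -J as discussed earlier.

```shell
set -eu

# run_backup SRC_DIR BACKUP_DIR
# Writes backup_YYYYMMDD.tar.gz of SRC_DIR into BACKUP_DIR and prints its path.
run_backup() {
    src=$1
    dest=$2
    archive="$dest/backup_$(date +%Y%m%d).tar.gz"
    mkdir -p "$dest"
    # Archive relative to the parent directory so extraction does not
    # recreate a long absolute path.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
    # Listing the archive fails if it is truncated or unreadable.
    tar -tzf "$archive" > /dev/null
    echo "$archive"
}

# Hypothetical crontab entry (daily at 02:30), assuming this file is saved as
# /usr/local/bin/run_backup.sh and ends with a real run_backup call:
#   30 2 * * * /usr/local/bin/run_backup.sh >> /var/log/backup.log 2>&1

# Demo on scratch directories.
work=$(mktemp -d)
mkdir -p "$work/html"
echo "hello" > "$work/html/index.html"
created=$(run_backup "$work/html" "$work/backups")
echo "created $created"
```

A mirroring job follows the same pattern, with an rsync -a command in place of the tar call.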
Core mantra: "Pack with tar, choose compression based on scenario, share with zip, sync incrementally with rsync, and use the right parameters for resume‑able transfers."
