Master Linux Backup with tar: Incremental, Split, and Restore Techniques
This guide explains how to use the powerful Linux tar command for full and incremental backups, including common options, exclusion patterns, splitting large archives, automated scheduling with cron, and reliable restoration of files and entire filesystems.
Overview
The tar utility, originally designed for tape archives, now serves as a versatile tool for backing up files and directories to any storage medium on Linux. It operates at the file level, supports incremental backups, and works independently of the underlying filesystem.
Common Options
-z, --gzip: Compress or decompress using gzip (suffix .gz).
-c, --create: Create a new archive (conventional suffix .tar).
-f, --file=ARCHIVE: Name the archive file. The name must follow the option immediately, so in a bundled group such as -zcpf the f comes last.
-x, --extract: Extract files from an archive (inverse of -c).
-p, --preserve-permissions: Preserve original file permissions when extracting.
-g, --listed-incremental=FILE: Use a snapshot file for incremental backups.
-C, --directory=DIR: Change to a directory before performing the operation.
--exclude=PATTERN: Exclude files or directories matching a pattern (supports wildcards).
-X, --exclude-from=FILE: Read exclusion patterns from a file.
-t, --list: List archive contents (cannot be combined with -c or -x).
-j, --bzip2: Use bzip2 compression (suffix .bz2).
-P, --absolute-names: Keep the leading / in member names instead of stripping it, both when creating and when extracting.
-v, --verbose: Verbose mode, showing processed files (useful for small archives).
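As a rough illustration of how these options combine, the sketch below creates, lists, and extracts a small archive. It assumes GNU tar; the scratch directory and the file names (docs/a.txt, cache/tmp.bin, demo.tar.gz) are invented for the example and are not part of the guide's scenario.

```shell
# Scratch area so the example touches nothing outside a temp directory.
work=$(mktemp -d)
mkdir -p "$work/src/docs" "$work/src/cache" "$work/out"
echo "hello" > "$work/src/docs/a.txt"
echo "junk"  > "$work/src/cache/tmp.bin"

# -c create, -z gzip, -p keep permissions, -f archive name (last in the bundle).
# -C switches into src first, so member names are stored relative (./docs/a.txt).
tar -zcpf "$work/demo.tar.gz" -C "$work/src" --exclude=./cache .

# -t lists the contents without extracting anything.
listing=$(tar -ztf "$work/demo.tar.gz")
echo "$listing"

# -x extracts; -C chooses the destination directory.
tar -zxpf "$work/demo.tar.gz" -C "$work/out"
```

Listing with -t before extracting is a cheap way to confirm that an exclusion pattern did what was intended.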
Incremental Backup for Websites
For sites that generate static files daily, tar can create incremental backups using the -g option. It is recommended to run the backup from the directory that contains the files to keep paths relative.
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/data01.tar.gz .
# tar -zxpf /tmp/data01.tar.gz -C .
The first command creates the backup; the second restores it into the current directory. The snapshot file records file attributes; subsequent runs archive only files that have changed since the last snapshot.
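The effect of the snapshot file can be checked on a throwaway directory. The following is a minimal sketch assuming GNU tar; the site directory, its two HTML files, and the level0/level1 archive names are hypothetical.

```shell
work=$(mktemp -d)
mkdir "$work/site"
echo "v1" > "$work/site/index.html"
cd "$work/site"

# Level 0: the snapshot file does not exist yet, so everything is archived.
tar -g "$work/site.snap" -zcpf "$work/level0.tar.gz" .

# A new file appears after the first backup.
echo "new" > "$work/site/news.html"

# Level 1: same snapshot file, so only changes since level 0 are stored.
tar -g "$work/site.snap" -zcpf "$work/level1.tar.gz" .

# The listing should contain news.html but not the unchanged index.html.
level1=$(tar -ztf "$work/level1.tar.gz")
echo "$level1"
```

Each run also updates the snapshot file, so copying it aside before a run is one way to be able to repeat an incremental level later.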
Comprehensive Example
Requirements:
Backup /tmp/data while excluding the cache directory and temporary files.
Split the archive into 1 GB chunks because the total size exceeds 4 GB.
Preserve all permissions, ownership, and attributes.
# cd /tmp/data
# rm -f /tmp/snapshot_data.snap
# tar -g /tmp/snapshot_data.snap -zcpf - --exclude=./cache ./ | \
split -b 1024M - /tmp/bak_data$(date -I).tar.gz_
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/bak_data2014-12-07.tar.gz --exclude=./cache ./
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/bak_data2014-12-08.tar.gz --exclude=./cache ./
The split command creates files named bak_data2014-12-07.tar.gz_aa, ..._ab, ..._ac, etc.
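The split-and-reassemble round trip can be rehearsed at a small scale before trusting it with gigabytes. This sketch assumes GNU tar and coreutils; the 16 KiB chunk size and the blob.bin test file are stand-ins chosen only so the example runs quickly.

```shell
work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
# 64 KiB of random (incompressible) data so the archive spans several chunks.
head -c 65536 /dev/urandom > "$work/data/blob.bin"

# Stream the archive to stdout (-f -) and cut it into 16 KiB pieces.
tar -zcpf - -C "$work/data" . | split -b 16k - "$work/bak.tar.gz_"
ls "$work"/bak.tar.gz_*

# The shell expands the glob in sorted order (_aa, _ab, ...),
# which is the order in which split wrote the pieces.
cat "$work"/bak.tar.gz_* | tar -zxpf - -C "$work/restore"

# Byte-for-byte comparison of the original and the restored copy.
cmp "$work/data/blob.bin" "$work/restore/blob.bin" && echo "restored copy is identical"
```

Because the pieces are only meaningful when concatenated in order, losing any one of them makes the rest of the archive after that point unreadable.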
Restore Process
# cat /tmp/bak_data2014-12-07.tar.gz_* | tar -zxpf - -C /tmp/data/
# tar -zxpf /tmp/bak_data2014-12-07.tar.gz -C /tmp/data/
# tar -zxpf /tmp/bak_data2014-12-08.tar.gz -C /tmp/data/
When restoring, ensure archives are applied in chronological order; otherwise the final state may be inconsistent.
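One detail the plain -x commands do not cover is files deleted between backups. GNU tar can replay those deletions if extraction also runs in incremental mode, conventionally with -g /dev/null. The sketch below assumes GNU tar; keep.txt, old.txt, and new.txt are hypothetical files used to make the behavior visible.

```shell
work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
echo "keep" > "$work/data/keep.txt"
echo "old"  > "$work/data/old.txt"

# Level 0, then a change set (one file removed, one added), then level 1.
(cd "$work/data" && tar -g "$work/data.snap" -zcpf "$work/level0.tar.gz" .)
rm "$work/data/old.txt"
echo "new" > "$work/data/new.txt"
(cd "$work/data" && tar -g "$work/data.snap" -zcpf "$work/level1.tar.gz" .)

# Restore oldest first. -g /dev/null turns on incremental extraction without
# reading a real snapshot, so files absent from the newer archive's directory
# listing are removed from the restore target.
for archive in "$work/level0.tar.gz" "$work/level1.tar.gz"; do
    (cd "$work/restore" && tar -g /dev/null -zxpf "$archive")
done
ls "$work/restore"
```

Since incremental extraction deletes files, it is safest to restore into an empty directory rather than on top of live data.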
Automating with cron
Schedule regular backups (e.g., weekly full backup and daily incremental) by adding appropriate tar commands to a crontab file.
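One possible shape for such a schedule is a single wrapper script that cron runs every day and that decides for itself whether to start a new full backup. Everything named here is an assumption made for illustration: the script path /usr/local/bin/tar_backup.sh, the run_backup function, the Sunday full-backup rule, and the output naming.

```shell
#!/bin/sh
# Hypothetical wrapper, run daily from a crontab entry such as:
#   30 2 * * * /usr/local/bin/tar_backup.sh /tmp/data /backup
# (crontab fields: minute hour day-of-month month day-of-week command)

# run_backup SRC DEST [DAY_OF_WEEK]
# DEST must be an absolute path. DAY_OF_WEEK defaults to today
# (date +%u: 1 = Monday ... 7 = Sunday) and exists mainly for testing.
run_backup() {
    src=$1
    dest=$2
    dow=${3:-$(date +%u)}
    snap="$dest/snapshot.snap"

    # Start a fresh full backup on Sunday, or whenever no snapshot exists yet.
    if [ "$dow" -eq 7 ] || [ ! -f "$snap" ]; then
        rm -f "$snap"
        level=full
    else
        level=incr
    fi

    out="$dest/bak_$(date -I)_$level.tar.gz"
    # Run from inside SRC so member names stay relative.
    (cd "$src" && tar -g "$snap" -zcpf "$out" .) && echo "$out"
}

# Only run when called with arguments, e.g. from cron.
if [ "$#" -ge 2 ]; then
    run_backup "$1" "$2"
fi
```

Keeping the full-versus-incremental decision inside the script means the crontab needs only one line, and a missing snapshot file falls back to a full backup instead of silently producing a misleading "incremental".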
Full Filesystem Backup
To back up an entire Linux system, first create an exclusion list for directories that should not be archived (e.g., /proc, /sys, /dev, /tmp, etc.). Then run tar with the --exclude-from option.
# vi /backup/backup_tar_exclude.list
/backup
/proc
/lost+found
/sys
/mnt
/media
/dev
/tmp
# tar -zcpf /backup/backup_full.tar.gz -g /backup/tar_snapshot.snap \
--exclude-from=/backup/backup_tar_exclude.list /
Important Considerations
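Incremental tar backups depend on file timestamps (modification time in particular, not atime), so results are unreliable if timestamps are altered during the backup or the system clock is set backwards.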
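Archives record both the owner name and the numeric UID/GID. When extracting as root, GNU tar matches by name if the user exists on the target system and otherwise falls back to the numeric ID; use --numeric-owner to restore strictly by numeric ID, in which case the same IDs must mean the same users on the target system.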
Avoid running other processes during backup or restore to prevent data inconsistency.
Symbolic and hard links are restored correctly.
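To see which ownership information an archive actually carries before restoring it on another machine, the verbose listing can be compared with and without --numeric-owner. This is a small sketch assuming GNU tar; file.txt and owner.tar.gz are hypothetical names.

```shell
work=$(mktemp -d)
echo "data" > "$work/file.txt"
tar -zcpf "$work/owner.tar.gz" -C "$work" file.txt

# The default verbose listing shows owner and group names when they are stored.
tar -ztvf "$work/owner.tar.gz"

# With --numeric-owner the same listing shows the numeric UID/GID instead.
numeric=$(tar --numeric-owner -ztvf "$work/owner.tar.gz")
echo "$numeric"
```

The same --numeric-owner flag can be added to an extraction command run as root when the target system's user names do not line up with the source system's.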
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.
