
How to Compress and Decompress ZIP and TAR Files in Python

This guide explains how to use Python's built‑in zipfile and tarfile modules to compress individual files or entire directories into ZIP or TAR archives, traverse archive contents, and extract them, covering both plain TAR and compressed variants such as .tar.gz.

MaGe Linux Operations

Compressing and decompressing files is routine on Windows (through the graphical interface) and on Linux (from the command line), but doing it programmatically is less familiar. This article shows how to perform both operations in Python with the standard-library zipfile and tarfile modules.

1. Compressing and decompressing ZIP files

import os
import zipfile

# Compress a list of files and a list of directories into a zip archive
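def my_zip_function(zip_file_name, zip_file_list=(), zip_dir_list=()):
    # Use a context manager so the zip file is closed automatically.
    # ZipFile stores entries uncompressed by default (ZIP_STORED), so request DEFLATE explicitly.
    with zipfile.ZipFile(zip_file_name, "w", compression=zipfile.ZIP_DEFLATED) as zip_obj: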
        # Compress files
        for tmp_file in zip_file_list:
            zip_obj.write(tmp_file)
        # Compress directories (ZipFile.write does not recurse, so walk the tree)
        for tmp_dir in zip_dir_list:
            for root, dirs, files in os.walk(tmp_dir):
                # Add the directory entry itself so empty directories are preserved
                # (remove this call to store files only)
                zip_obj.write(root)
                for tmp_file in files:
                    # os.walk yields bare file names; join with root to get a path that exists
                    tmp_file_path = os.path.join(root, tmp_file)
                    zip_obj.write(tmp_file_path)

# Traverse all entries in a zip archive
def my_traversal_zip_function(zip_file_name):
    with zipfile.ZipFile(zip_file_name, "r") as zip_obj:
        # Returns a list of ZipInfo objects
        all_file_list = zip_obj.infolist()
        for tmp_file in all_file_list:
            print(tmp_file.filename)
            # You can read file contents without extracting
            # if not tmp_file.is_dir():
            #     with zip_obj.open(tmp_file) as zip_fd:
            #         print(zip_fd.read())

# Extract a zip archive to a directory
def my_unzip_function(zip_file_name, path="."):
    with zipfile.ZipFile(zip_file_name, "r") as zip_obj:
        zip_obj.extractall(path=path)

if __name__ == "__main__":
    zip_file_name = "test_zip.zip"
    # Prepare test files and directories before running
    zip_file_list = ["test_tar_file1.txt", "test_tar_file2.txt"]
    zip_dir_list = ["test_tar_dir"]
    my_zip_function(zip_file_name, zip_file_list, zip_dir_list)
    my_traversal_zip_function(zip_file_name)
    # my_unzip_function(zip_file_name, path=".")
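The listing above records each file under the path it was given. As a minimal sketch of two related options, the example below passes compression=zipfile.ZIP_DEFLATED (ZipFile stores members uncompressed unless a method is given) and uses the arcname argument of ZipFile.write to control the path stored inside the archive. The names zip_directory and demo, and the temporary files they create, are hypothetical and exist only for illustration.

```python
import os
import tempfile
import zipfile


def zip_directory(src_dir, zip_path):
    """Zip src_dir with DEFLATE, storing paths relative to src_dir's parent."""
    base = os.path.dirname(os.path.abspath(src_dir))
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # arcname sets the path recorded inside the archive
                zf.write(full_path, arcname=os.path.relpath(full_path, base))


def demo():
    """Build a small archive in a temporary directory and inspect its one entry."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "demo_dir")
        os.makedirs(os.path.join(src, "sub"))
        with open(os.path.join(src, "sub", "a.txt"), "w") as fh:
            fh.write("hello " * 1000)  # repetitive text compresses well
        zip_path = os.path.join(tmp, "demo.zip")
        zip_directory(src, zip_path)
        with zipfile.ZipFile(zip_path) as zf:
            info = zf.infolist()[0]
            head = zf.read(info.filename)[:5]  # read without extracting
            return info.filename, info.file_size, info.compress_size, head


if __name__ == "__main__":
    print(demo())
```

Comparing file_size with compress_size on a ZipInfo object is a quick way to check whether compression was actually applied.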

2. Compressing and decompressing TAR files

In addition to plain .tar files, tarfile also supports the compressed formats .tar.gz, .tar.bz2, and .tar.xz; the format is selected by a suffix on the mode string passed to tarfile.open.
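As a rough sketch of how those formats map onto tarfile mode strings, the helper below picks a write mode from the archive's file name. The names WRITE_MODES and write_mode_for are hypothetical and not part of tarfile; only the mode strings themselves ("w", "w:gz", "w:bz2", "w:xz") come from the library.

```python
import tarfile

# Hypothetical lookup table: file name suffix -> tarfile write mode
WRITE_MODES = {
    ".tar.gz": "w:gz",
    ".tgz": "w:gz",
    ".tar.bz2": "w:bz2",
    ".tar.xz": "w:xz",
    ".tar": "w",  # plain, uncompressed tar
}


def write_mode_for(tar_file_name):
    """Return the tarfile write mode that matches the archive's suffix."""
    for suffix, mode in WRITE_MODES.items():
        if tar_file_name.endswith(suffix):
            return mode
    raise ValueError(f"unrecognized archive suffix: {tar_file_name!r}")


if __name__ == "__main__":
    for name in ("backup.tar", "backup.tar.gz", "backup.tar.xz"):
        print(name, write_mode_for(name))
```

The returned string can be passed as the second argument to tarfile.open when creating an archive.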

import os
import tarfile

# Compress a list of files and directories into a tar archive
def my_tar_function(tar_file_name, tar_file_list=(), tar_dir_list=(), model="w"):
    # tarfile.open creates the archive; "w" creates a plain tar, other modes add compression
    with tarfile.open(tar_file_name, model) as tar_obj:
        # Add files
        for tmp_file in tar_file_list:
            tar_obj.add(tmp_file)
        # Add directories (tarfile can add a directory directly)
        for tmp_dir in tar_dir_list:
            tar_obj.add(tmp_dir)

# Traverse all entries in a tar archive
def my_traversal_tar_function(tar_file_name, model="r"):
    with tarfile.open(tar_file_name, model) as tar_obj:
        # Returns a list of TarInfo objects
        all_file_list = tar_obj.getmembers()
        for tmp_file in all_file_list:
            print(tmp_file.name)
            # You can read file contents without extracting.
            # extractfile() returns None for members that are not files or links, so check isfile().
            # if tmp_file.isfile():
            #     tar_fd = tar_obj.extractfile(tmp_file)
            #     print(tar_fd.read())

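# Extract a tar archive to a directory
# Only extract archives you trust: member names may contain absolute paths or "..",
# which can place files outside the target directory.
def my_untar_function(tar_file_name, path=".", model="r"):
    with tarfile.open(tar_file_name, model) as tar_obj:
        tar_obj.extractall(path=path)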

if __name__ == "__main__":
    # Prepare test files and directories before running
    tar_file_list = ["test_tar_file1.txt", "test_tar_file2.txt"]
    tar_dir_list = ["test_tar_dir"]
    tar_file_name = "test_tar.tar"
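    # tarfile also supports gzip/bzip2/xz compression: append ":gz", ":bz2", or ":xz" to the write mode
    # (and name the archive to match, e.g. "test_tar.tar.gz").
    # For reading, both "r" and "r:*" detect the compression automatically.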
    open_model = "w"
    # open_model = "w:gz"
    my_tar_function(tar_file_name, tar_file_list, tar_dir_list, model=open_model)
    open_model = "r"
    # open_model = "r:*"
    my_traversal_tar_function(tar_file_name, model=open_model)
    # my_untar_function(tar_file_name, path=".", model=open_model)
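To illustrate the compressed variant end to end, here is a minimal sketch that writes a .tar.gz archive with "w:gz", reads it back with "r:*", and extracts it. The function name roundtrip_gz and the file names are hypothetical. The filter="data" argument is only passed when the running Python exposes tarfile.data_filter (Python 3.12 and some security backports); on older versions the sketch falls back to a plain extractall, which should only be used on trusted archives.

```python
import os
import tarfile
import tempfile


def roundtrip_gz():
    """Create a gzip-compressed tar, list it, read a member, and extract it."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "note.txt")
        with open(src, "w") as fh:
            fh.write("tar demo")

        archive = os.path.join(tmp, "demo.tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            # arcname keeps the temporary directory out of the stored path
            tar.add(src, arcname="note.txt")

        out_dir = os.path.join(tmp, "out")
        os.makedirs(out_dir)
        with tarfile.open(archive, "r:*") as tar:  # compression is detected automatically
            names = tar.getnames()
            content = tar.extractfile(tar.getmember("note.txt")).read()
            if hasattr(tarfile, "data_filter"):
                # Extraction filters reject members that would escape out_dir
                tar.extractall(path=out_dir, filter="data")
            else:
                tar.extractall(path=out_dir)  # assumption: the archive is trusted

        extracted = os.path.exists(os.path.join(out_dir, "note.txt"))
        return names, content, extracted


if __name__ == "__main__":
    print(roundtrip_gz())
```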

Link: https://www.cnblogs.com/djdjdj123/p/18124124
