Backend Development 7 min read

Build a Reusable Python Batch Tool for Excel Automation

Learn how to create a modular Python script that batch processes Excel files—merging, cleaning, renaming, and more—using pandas and openpyxl, with a clear project structure and extensible design for real-world office automation.

Python Programming Learning Circle

Nov 19, 2025

Build a Reusable Python Batch Tool for Excel Automation

In response to growing office automation needs, this guide walks you through building a generic Python batch-processing script that can dramatically reduce repetitive work such as merging Excel files, bulk replacements, data splitting, watermarking, and report generation.

Project Goal

The script should:

Read all Excel files (xls/xlsx) in a specified folder.

Apply customizable processing logic to each file.

Write the processed results to a new output folder.

Maintain a clean, extensible, and reusable code structure.

Key Libraries

We rely on pandas for data manipulation and openpyxl for advanced Excel handling.

import os
import pandas as pd
from openpyxl import load_workbook

Project Structure

excel_batch_tool/
│── main.py            # entry point
│── processors.py      # processing functions
│── input/             # source Excel files
│── output/            # processed results

This modular layout makes future extensions straightforward.

Core Functionality: Traversing Excel Files

import os
import pandas as pd
from processors import process_excel

INPUT_DIR = "input"
OUTPUT_DIR = "output"

def ensure_output_folder():
    if not os.path.exists(OUTPUT_DIR):
        os.makedirs(OUTPUT_DIR)

def main():
    ensure_output_folder()
    for filename in os.listdir(INPUT_DIR):
        if filename.endswith((".xlsx", ".xls")):
            path = os.path.join(INPUT_DIR, filename)
            print(f"Processing: {filename}")
            df = pd.read_excel(path)
            df = process_excel(df)
            output_path = os.path.join(OUTPUT_DIR, filename)
            df.to_excel(output_path, index=False)
            print(f"Saved to: {output_path}")

if __name__ == "__main__":
    main()

The template can be reused by simply updating processors.py with new logic.

Custom Processing Logic (processors.py)

def process_excel(df):
    # 1. Drop fully empty rows
    df = df.dropna(how="all")
    # 2. Trim column names
    df.columns = [str(c).strip() for c in df.columns]
    # 3. Add a timestamp column
    import datetime
    df["处理时间"] = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return df

You can extend this with batch replacements, formula insertion, multi‑sheet merging, conditional exports, statistical analysis, etc.

Example: Batch Replace Values in a Column

def process_excel(df):
    if "产品名称" in df.columns:
        df["产品名称"] = df["产品名称"].replace("旧名", "新名")
    return df

Example: Merge All Excel Files into One

all_data = []
for filename in os.listdir(INPUT_DIR):
    if filename.endswith((".xls", ".xlsx")):
        df = pd.read_excel(os.path.join(INPUT_DIR, filename))
        df["来源文件"] = filename
        all_data.append(df)
final_df = pd.concat(all_data, ignore_index=True)
final_df.to_excel("output/合并结果.xlsx", index=False)

Example: Split Files by a Category Column

def process_excel(df):
    for name, group in df.groupby("分类"):
        group.to_excel(f"output/{name}.xlsx", index=False)
    return None

Adjust main.py to write output only when a DataFrame is returned.

Conclusion

By completing this project you will master:

Flexible use of pandas and openpyxl .

Organizing batch‑processing scripts for scalability.

Practical function and module design.

Real‑world office automation implementation.

How to evolve a Python utility into a marketable product.

The same framework can be adapted into various tools such as merge utilities, splitters, replacers, watermark adders, and format converters, all ready for commercial use.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Python Batch Processing Pandas openpyxl Excel automation Project Structure backend scripting

Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.