
Automate Bulk Excel Column Extraction and Merging with Python Pandas

This tutorial shows how to use Python's pandas library to automatically locate Excel files across many folders, extract specified columns, and merge the data into a single workbook, providing a practical solution for handling hundreds of spreadsheets efficiently.

Python Crawling & Data Mining

1. Introduction

The task is to extract specific columns from many Excel files spread across multiple directories and combine them into one new workbook. Doing this by hand becomes impractical once there are hundreds of folders.

2. Import Libraries

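Only two imports are needed: the os module for file handling and pandas for data processing. pandas relies on the openpyxl package to read and write .xlsx files, so it must be installed as well.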

import pandas as pd
import os
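Because the script later passes engine='openpyxl' to pandas, it can help to confirm that both packages are importable before starting a long run. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of the original article.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Packages this tutorial's script depends on
needed = ["pandas", "openpyxl"]
print("Missing packages:", missing_packages(needed) or "none")
```

Any name that is reported as missing can be installed with pip before running the rest of the script.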

3. Code Implementation

Define source folder and columns

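# Root folder whose subfolders contain the Excel files
path = "D:/a/"
# Names of the columns to extract from every workbook
key = ['A', 'B']
# Entries (subfolders and loose files) directly under the root folder
folders = os.listdir(path)
# The merged result is written into the root folder
output_file = path + "result.xlsx"
writer = pd.ExcelWriter(output_file, engine='openpyxl')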

Collect Excel file paths

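file_names = []
for p in folders:
    folder_path = os.path.join(path, p)
    # Skip loose files in the root folder (including result.xlsx); only subfolders are scanned
    if not os.path.isdir(folder_path):
        continue
    # Keep .xlsx workbooks, ignoring the "~$" lock files Excel creates for open workbooks
    xlsx_names = [x for x in os.listdir(folder_path) if x.endswith('.xlsx') and not x.startswith('~$')]
    for f in xlsx_names:
        file_names.append(os.path.join(folder_path, f))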
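The loop above only looks one level below the root folder. If workbooks may sit in deeper subfolders, os.walk can collect them at any depth. This is a sketch under that assumption, not the article's method; the function name collect_excel_files and its skip_names parameter are hypothetical.

```python
import os


def collect_excel_files(root, suffix=".xlsx", skip_names=("result.xlsx",)):
    """Return sorted paths of every workbook below `root`, at any depth."""
    found = []
    for current_dir, _subdirs, files in os.walk(root):
        for name in files:
            # Excel creates "~$name.xlsx" lock files while a workbook is open; skip them
            if name.startswith("~$"):
                continue
            # Skip the merged output (matched by file name, at any depth)
            if name in skip_names:
                continue
            if name.lower().endswith(suffix):
                found.append(os.path.join(current_dir, name))
    return sorted(found)
```

Calling collect_excel_files("D:/a/") would then stand in for the nested loop, with the returned list used as file_names.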

Read, extract, and concatenate

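frames = []
for num, xlsx_name in enumerate(file_names, start=1):
    # Read the first sheet, treating its first row as the header
    df1 = pd.read_excel(xlsx_name, sheet_name=0, index_col=None, header=0)
    # Keep only the requested columns
    frames.append(df1.loc[:, key])
    print("%s processed successfully! %d files in total, file %d." % (xlsx_name, len(file_names), num))

# Stack all extracted columns into one table and save it through the writer created earlier
if frames:
    df = pd.concat(frames, ignore_index=True)
    df.to_excel(writer, index=False)
writer.close()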
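Selecting columns with .loc raises a KeyError as soon as one workbook lacks a requested column, which can abort a long batch. The sketch below shows one possible way to skip such inputs and report them instead. It works on DataFrames that are already loaded, and the function name merge_columns is hypothetical rather than part of the original code.

```python
import pandas as pd


def merge_columns(named_frames, columns):
    """Concatenate the requested columns from each (name, DataFrame) pair.

    Returns (merged, skipped): the combined DataFrame and the names of
    inputs that lacked at least one requested column.
    """
    kept, skipped = [], []
    for name, frame in named_frames:
        missing = [c for c in columns if c not in frame.columns]
        if missing:
            # Record the input and move on instead of raising KeyError
            skipped.append(name)
            continue
        kept.append(frame.loc[:, columns])
    if kept:
        merged = pd.concat(kept, ignore_index=True)
    else:
        # Nothing usable: return an empty table with the expected columns
        merged = pd.DataFrame(columns=columns)
    return merged, skipped
```

The pairs could be built as (f, pd.read_excel(f, sheet_name=0)) for each path f in file_names, with key passed as columns. The skipped list then shows which workbooks need a manual check.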

4. Execution Result

[Figure: folder structure of the files to be processed]

[Figure: console output of a successful run]

[Figure: the resulting Excel file]

5. Conclusion

The example shows how pandas can batch-extract and merge columns from a large number of Excel workbooks in a short script, replacing what would otherwise be hours of manual copying.
