Automate Bulk Excel Column Extraction and Merging with Python & Pandas
Learn how to use Python's pandas library to automatically scan multiple folders, extract specified columns from hundreds of Excel files, and merge them into a single workbook, complete with step-by-step code examples and visual results.
Introduction
I often need to extract specific columns from Excel files spread across many folders and merge them into a new Excel file. With a handful of folders this is easy to do by hand, but with hundreds or thousands it becomes impractical. This article shows how to automate the task with Python.
Import Libraries
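Only two libraries are needed: os for file operations and pandas for data processing. The openpyxl package must also be installed, because the code below selects it as the Excel engine.
import pandas as pd
import os
Write Code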
1. Define the root folder, columns, and output path
# Path to the root folder containing Excel files
path = "D:/a/"
# Columns to extract
key = ['A', 'B']
# Entries in the root folder (the subfolders that hold the Excel files)
subfolders = os.listdir(path)
# Path of the merged output file
output_file = path + 'result.xlsx'
writer = pd.ExcelWriter(output_file, engine='openpyxl')
2. Get a list of all Excel files to process
file_names = []
for sub in subfolders:
    # Skip Excel files that sit directly in the root folder (such as result.xlsx)
    if '.xl' in sub:
        continue
    sub_path = path + sub + "/"
    # Get all .xlsx files in the subfolder
    xlsx_names = [f for f in os.listdir(sub_path) if f.endswith('.xlsx')]
    for f in xlsx_names:
        file_names.append(sub_path + f)
3. Loop through each Excel file, extract the specified columns, and merge
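df = None
# enumerate supplies the running file number used in the progress message
for num, xlsx_name in enumerate(file_names, start=1):
    df1 = pd.read_excel(xlsx_name, sheet_name=0, index_col=None, header=0)
    _df = df1.loc[:, key]
    if df is None:
        df = _df
    else:
        df = pd.concat([df, _df], ignore_index=True)
    print(xlsx_name + " saved successfully! %d files in total, this is file %d." % (len(file_names), num))
# Write the merged columns through the writer created in step 1, then save the file
# (index=False is an assumption: it leaves the row index out of the output)
df.to_excel(writer, index=False)
writer.close()
Execution Result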
[Screenshots: the folders to be processed; console output of the successful run; the saved result file; contents of the extracted result file]
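To try the workflow on something other than real data, the steps above can be condensed into a few functions. This is a minimal sketch rather than the article's own code: the helper names find_workbooks, pick_columns, and merge_columns are hypothetical, pathlib is swapped in for os, and two behaviors are assumptions added here. Workbooks that lack a requested column are skipped instead of raising a KeyError, and Excel's temporary `~$` lock files are ignored. The extracted frames are collected in a list and concatenated once, which avoids copying the growing result on every iteration.

```python
from pathlib import Path

import pandas as pd


def find_workbooks(root):
    """Return sorted paths of .xlsx files one level below root (hypothetical helper)."""
    root = Path(root)
    return sorted(
        p
        for sub in root.iterdir()
        if sub.is_dir()  # files directly in root, such as result.xlsx, are ignored
        for p in sub.glob("*.xlsx")
        if not p.name.startswith("~$")  # assumption: skip Excel's temporary lock files
    )


def pick_columns(frame, columns):
    """Return only the requested columns, or None if any of them is missing."""
    missing = [c for c in columns if c not in frame.columns]
    if missing:
        return None
    return frame.loc[:, columns]


def merge_columns(root, columns, output_file):
    """Extract columns from every workbook under root and write one merged file."""
    parts = []
    for workbook in find_workbooks(root):
        # Read the first sheet, as the article's loop does
        picked = pick_columns(pd.read_excel(workbook, sheet_name=0), columns)
        if picked is None:
            # Assumption: skip incomplete workbooks rather than abort the whole run
            print(f"skipped {workbook}: missing one of {columns}")
            continue
        parts.append(picked)
    if not parts:
        raise ValueError(f"no workbook under {root} contains columns {columns}")
    # Concatenate once at the end instead of once per file
    merged = pd.concat(parts, ignore_index=True)
    merged.to_excel(output_file, index=False)
    return merged
```

Calling `merge_columns("D:/a/", ['A', 'B'], "D:/a/result.xlsx")` would mirror the settings used earlier in the article.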
Conclusion
This article demonstrated how to use pandas to batch‑extract and merge columns from multiple Excel files, showcasing the powerful data‑processing capabilities of Python. Happy coding!
Python Crawling & Data Mining
