Effortlessly Merge Multiple Excel Files and Sheets with Python: 4 Proven Methods
This article explains how to use Python's pandas, os, and glob libraries to automatically locate, read, and concatenate all sheets from multiple Excel workbooks across nested folders, offering four concise code solutions, visual results, and practical tips for reliable batch merging.
1. Introduction
Hello, I am a Python enthusiast. Previously I shared tutorials on splitting Excel into CSV files and merging multiple Excel files with multiple sheets. A community member needed to merge many Excel files, each containing several sheets, which prompted this guide.
2. Project Goal
Use Python to batch‑merge all sheets from every Excel file located in all sub‑folders of a given directory, a common real‑world need.
3. Preparation
Software: PyCharm
Required libraries: pandas, os, glob
4. Project Analysis
1) Selecting Excel files
Use os and glob to collect all target Excel files.
2) Selecting Sheets
Read sheet names with pandas.
3) Merging
Iterate over each sheet and concatenate data using pandas.concat().
4) Saving
Write the combined DataFrame to a new Excel file with to_excel.
5. Implementation
Method 1
Detailed script from Wang Ning (includes time measurement and error handling).
# -*- coding: utf-8 -*-
import pandas as pd
import datetime
import os
start = datetime.datetime.now()
def Set_Work_Path(x):
try:
os.chdir(x)
route = os.getcwd()
print(route)
return route
except Exception:
print("No Result")
work_path = r"E:\PythonCrawler\python_crawler-master\MergeExcelSheet\file\"
Set_Work_Path(work_path)
def Get_Dedicated_4Letter_File_List(x):
path = os.getcwd()
old_name = path + os.sep + "汇总数据" + ".xlsx"
if os.path.exists(old_name):
os.remove(old_name)
files = os.listdir(path)
current_list = []
for i in range(0, len(files), 1):
try:
if files[i][-4:] == x and files[i][:4] != "汇总数据":
current_list.append(files[i])
except Exception:
pass
return current_list
Current_Excel_list = Get_Dedicated_4Letter_File_List("xlsx")
print(Current_Excel_list)
def Get_All_Sheets_Excel(x):
file = pd.ExcelFile(x)
list_sht_name = file.sheet_names
print(list_sht_name)
list_sht_data = []
for i in range(0, len(list_sht_name), 1):
list_sht_data.append(pd.read_excel(x, header=0, sheet_name=list_sht_name[i], index_col=None))
df = pd.concat(list_sht_data)
df.dropna(axis=0, how="all", inplace=True)
print(df)
return df
data_list = []
for i in range(0, len(Current_Excel_list), 1):
data_list.append(Get_All_Sheets_Excel(Current_Excel_list[i]))
data = pd.concat(data_list)
data.dropna(axis=0, how="all", inplace=True)
print(data)
writer = pd.ExcelWriter("王宁大佬的汇总数据.xlsx")
data.to_excel(writer, encoding="utf_8_sig", sheet_name="DATA", index=False)
writer.save()
end = datetime.datetime.now()
run_time = round((end-start).total_seconds()/60, 2)
show = "程序运行消耗时间为: %s 分钟" % run_time+",搞定!"
print(show)This approach requires all source files to have identical structures and column names.
Method 2
Uses sheet_name=None to read all sheets at once and extend() to collect data.
# -*- coding: utf-8 -*-
import os
import pandas as pd
result = []
path = r"E:\PythonCrawler\python_crawler-master\MergeExcelSheet\testfile\file"
for root, dirs, files in os.walk(path, topdown=False):
for name in files:
if name.endswith(".xls") or name.endswith(".xlsx"):
df = pd.read_excel(os.path.join(root, name), sheet_name=None)
result.append(df)
data_list = []
for data in result:
data_list.extend(data.values()) # use extend, not append
df = pd.concat(data_list)
df.to_excel("testfile所有表合并.xlsx", index=False)
print("合并完成!")Method 3
Recursively walks directories with glob and appends each sheet DataFrame to a list.
# -*- coding: utf-8 -*-
import glob
import pandas as pd
path = "E:\\PythonCrawler\\python_crawler-master\\MergeExcelSheet\\file\\"
data = []
for excel_file in glob.glob(f'{path}/**/[!~]*.xls*', recursive=True):
excel = pd.ExcelFile(excel_file)
for sheet_name in excel.sheet_names:
df = excel.parse(sheet_name)
data.append(df)
df = pd.concat(data, ignore_index=True)
df.to_excel("小小明提供的代码(合并多表)--glob和pandas库列表append方法--所有表合并.xlsx", index=False)
print("合并完成!")Method 4
Similar to Method 2 but uses extend() directly on the dictionary values returned by read_excel(..., sheet_name=None).
# -*- coding: utf-8 -*-
import glob
import pandas as pd
path = r"E:\PythonCrawler\python_crawler-master\MergeExcelSheet\file"
data = []
for excel_file in glob.glob(f'{path}/[!~]*.xlsx'):
dfs = pd.read_excel(excel_file, sheet_name=None).values()
data.extend(dfs)
df = pd.concat(data, ignore_index=True)
df.to_excel("小小明提供的代码(合并多表)--glob和pandas库列表extend方法--简洁--所有表合并.xlsx", index=False)
print("合并完成!")6. Result Demonstration
1) Original Excel data:
2) Progress indicator:
3) Merged result:
7. Conclusion
Starting from a practical work scenario, this article presented four Python‑based methods to batch‑merge all sheets from Excel files in nested folders, saving time and reducing manual errors. The code is concise, and readers are encouraged to ask questions or extend the scripts, even packaging them as executable tools.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Python Crawling & Data Mining
Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
