How to Merge Multiple Excel Files and Preserve Headers with Python Pandas

This article walks through handling merged headers and formula cells in Excel using Python's pandas and openpyxl, providing step‑by‑step code to merge multiple workbooks, skip unwanted rows, assign custom headers, and format the final sheet with merged cells.

Python Crawling & Data Mining

Feb 23, 2024

1. Introduction

Hello, I am a Python enthusiast. A question came up in a recent group discussion about two problems when reading Excel files with pandas: merged header cells that are not recognized as headers, and formula cells whose values are read as zero.

The following script reads every .xlsx file in a folder, concatenates the sheets that share a name, and saves the result to a single workbook.

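import os

import pandas as pd

folder_path = r'C:/Users/mengxianqiao/merge_excel_files/测试数据'  # replace with actual folder ("测试数据" = test data)
output_path = r"C:/Users/mengxianqiao/merge_excel_files/测试数据/汇总.xlsx"  # "汇总" = summary
all_data = {}  # sheet name -> list of DataFrames, one per source file

for file_name in os.listdir(folder_path):
    # Skip non-Excel files and the summary workbook itself, so that
    # re-running the script does not merge its own output.
    if not file_name.endswith('.xlsx') or file_name == os.path.basename(output_path):
        continue
    file_path = os.path.join(folder_path, file_name)
    xls = pd.ExcelFile(file_path)
    for sheet_name in xls.sheet_names:
        # header_rows is 1 when the sheet has data below its header, else 0.
        header_rows = pd.read_excel(file_path, sheet_name=sheet_name, nrows=1).shape[0]
        # Row 0 becomes the header; the row directly below it is skipped
        # (for example, the second row of a two-row header).
        sheet_data = pd.read_excel(file_path, sheet_name=sheet_name, skiprows=range(1, header_rows + 1))
        all_data.setdefault(sheet_name, []).append(sheet_data)

with pd.ExcelWriter(output_path, engine='openpyxl') as writer:
    for sheet_name, frames in all_data.items():
        # Concatenate once per sheet instead of growing a DataFrame in the loop.
        df = pd.concat(frames, ignore_index=True)
        df.to_excel(writer, sheet_name=sheet_name, index=False)

print('Data has been successfully merged and saved to 汇总.xlsx.')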
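
To see what the `skiprows` argument in the script does, here is a small sketch, assuming a workbook whose first two rows form the header. The file name `demo.xlsx` and the column labels are made up for illustration. Passing a list to `skiprows` drops those physical rows by 0-based position, so `skiprows=[1]` keeps row 0 as the header and discards the row below it.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook

# Build a throwaway workbook: row 0 is the main header, row 1 a sub-header,
# and the remaining rows are data. All names here are hypothetical.
wb = Workbook()
ws = wb.active
ws.append(["name", "score"])        # file row 0: header
ws.append(["(text)", "(points)"])   # file row 1: sub-header to discard
ws.append(["Ann", 90])
ws.append(["Bob", 85])

demo_path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
wb.save(demo_path)

# skiprows=[1] removes only the second physical row; row 0 stays the header.
demo_df = pd.read_excel(demo_path, skiprows=[1])
print(demo_df)
```

An integer, as in `skiprows=4`, behaves differently: it skips that many rows from the top of the sheet before looking for a header.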

2. Implementation Details

Group members suggested skipping the header rows when reading the files and then assigning a unified header manually. The code below reads each Excel file, skips the first four rows, assigns a custom header, and drops rows that contain missing values.

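import pathlib

import pandas as pd

folder = r"C:\Users\Desktop\民主评议表"
excel_files = pathlib.Path(folder).glob('*.xls')  # matches .xls files only
# Unified header: '姓名' (name) followed by the four assessment items.
header = ['姓名', '以学铸魂', '以学增智', '以学正风', '以学促干']

data = []
for i in excel_files:
    # Skip the four header rows; column A becomes the index, B:F hold the data.
    df = pd.read_excel(i, skiprows=4, header=None, index_col=0, usecols='A:F')
    df.dropna(inplace=True)  # drops every row that has a missing value
    df.columns = header
    data.append(df)  # collect this file's rows

# Stack all files into one table (pd.concat raises ValueError if no file matched).
result = pd.concat(data)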
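
One detail worth checking in the loop above: `dropna()` with no arguments drops every row that has at least one missing value, not only rows that are completely empty. The sketch below, on made-up data, contrasts it with `dropna(how='all')`. Which one is right depends on whether partially filled rows should survive.

```python
import numpy as np
import pandas as pd

# Hypothetical sheet contents: one complete row, one partly filled, one blank.
raw = pd.DataFrame(
    {"name": ["Ann", "Bob", np.nan], "score": [90, np.nan, np.nan]}
)

any_missing_dropped = raw.dropna()          # default how='any'
only_blank_dropped = raw.dropna(how="all")  # keeps partly filled rows

print(len(any_missing_dropped), len(only_blank_dropped))
```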

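For the formula problem: when loading a workbook with openpyxl, passing data_only=True to load_workbook returns each formula cell's cached result instead of the formula text. The cached result exists only if the file was last saved by an application that calculated it, such as Excel.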
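
A minimal sketch of that option, assuming a workbook with a formula in `A3` (the file name and cell layout are invented for the demo). Because openpyxl stores formulas without evaluating them, a file it writes and that is never opened in Excel has no cached value, so the formula cell comes back as `None` under `data_only=True`.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Create a workbook containing a formula. openpyxl stores the formula text
# only; it does not evaluate it.
wb = Workbook()
ws = wb.active
ws["A1"] = 2
ws["A2"] = 3
ws["A3"] = "=A1+A2"

formula_path = os.path.join(tempfile.mkdtemp(), "formula_demo.xlsx")
wb.save(formula_path)

# Default load: the formula string itself.
as_formula = load_workbook(formula_path).active["A3"].value
# data_only=True: the cached result, or None if it was never calculated.
as_value = load_workbook(formula_path, data_only=True).active["A3"].value

print(as_formula, as_value)
```

If formula cells come back empty, opening the source file in Excel and saving it once is a common way to populate the cached values.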

For the header problem, another approach is to write the data first and then merge cells in the output sheet to create a combined header.

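import numpy as np
import pandas as pd

# 20 rows x 10 columns of random integers, so the data occupies columns A:J.
df = pd.DataFrame(np.random.randint(1, 10, size=(20, 10)))
with pd.ExcelWriter('写入合并表头.xlsx', engine='openpyxl') as writer:
    book = writer.book
    sheet_name = '写入合并表头'
    # startrow=1 leaves the first row free for the merged header.
    df.to_excel(writer, sheet_name=sheet_name, index=False, startrow=1)
    sh = book[sheet_name]
    sh['A1'] = '表头合并'  # "merged header"
    # Covers the first 8 of the 10 columns; use 'A1:J1' to span all of them.
    sh.merge_cells('A1:H1')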
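
To confirm that the merge took effect, the saved file can be reopened and its merged ranges inspected. This is a sketch with a hypothetical file name and a three-column frame. It sizes the merged range from `demo.shape[1]` instead of hard-coding the letters, and reads the result back through `merged_cells.ranges`, openpyxl's collection of merged ranges on a worksheet.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

out_path = os.path.join(tempfile.mkdtemp(), "merged_header_demo.xlsx")
demo = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

with pd.ExcelWriter(out_path, engine="openpyxl") as writer:
    # Leave row 1 free for the merged title; column names go on row 2.
    demo.to_excel(writer, sheet_name="demo", index=False, startrow=1)
    ws = writer.sheets["demo"]
    ws["A1"] = "Combined header"
    # Span the title across exactly as many columns as the frame has.
    ws.merge_cells(start_row=1, start_column=1,
                   end_row=1, end_column=demo.shape[1])

# Reopen the file and list the merged ranges that were saved.
reopened = load_workbook(out_path)["demo"]
merged = [r.coord for r in reopened.merged_cells.ranges]
print(merged)
```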

3. Conclusion

The article demonstrates how to solve common Excel data‑processing problems in Python by using pandas for reading and concatenating data and openpyxl for fine‑grained formatting such as merged headers.
