
Automate Excel Workflows with Python: A Step‑by‑Step Tutorial

This article walks through a Python-based solution for automating Excel tasks. It revisits a reader's original script, presents a corrected implementation built on pandas and openpyxl, and closes with an optimized version that writes values directly to worksheet cells.

Python Crawling & Data Mining

Oct 10, 2023


1. Introduction

Hello, I’m PiPi. In a recent Python community discussion I was asked about automating office work with Python. The previous solution was too obscure for many readers, so this article revisits the fan’s code, points out its issues, and provides a clear, corrected implementation.

2. Implementation Process

The expert quickly identified two problems in the original script. They were shown in a screenshot that is not reproduced here.

The following code resolves the issues using pandas and openpyxl:

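import pandas as pd
from shutil import copyfile

# 付款信息 = payment information (source records); 付款申请单 = payment request form (template)
df = pd.read_excel("付款信息.xlsx")

# Placeholder columns (占位符) become blank cells between fields
# when a record is written vertically into the form.
df.insert(1, '占位符1', pd.NA)
df.insert(len(df.columns) - 1, '占位符2', pd.NA)
# Output file number (文件): rows 0-2 -> 0, rows 3-5 -> 1, and so on
df['文件'] = df.index // 3

def to_excel(row):
    # `row` is one record (a Series); `filename` is the module-level name set in the loop below
    with pd.ExcelWriter(filename, engine='openpyxl', mode='a', if_sheet_exists='overlay') as excel:
        sh = excel.book.active
        sheet_name = sh.title
        # Each record occupies a 9-row block; the first block starts at zero-based row 2
        startrow = row.name % 3 * 9 + 2
        # Unmerge merged cells first
        sh.unmerge_cells(f'B{startrow + 1}:G{startrow + 1}')
        # Write data: a Series is written vertically, one value per row, starting in column B
        row.to_excel(excel, index=False, header=False, sheet_name=sheet_name, startcol=1, startrow=startrow)
        # Re-merge cells
        sh.merge_cells(f'B{startrow + 1}:G{startrow + 1}')

for g, data in df.groupby('文件', as_index=False):
    filenum = data.pop('文件').iloc[0]
    # copyfile returns the destination path, which to_excel then opens in append mode
    filename = copyfile("付款申请单.xlsx", f'付款申请单_{filenum}新文件.xlsx')
    data.apply(to_excel, axis=1)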
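
The row arithmetic in the script can be checked on its own, without touching any workbook. The sketch below is only an illustration: placement and the constant names are hypothetical helpers introduced here, and the values 3, 9 and 2 are copied from the expressions df.index // 3 and % 3 * 9 + 2 above.

```python
RECORDS_PER_FILE = 3   # from `df.index // 3`
BLOCK_HEIGHT = 9       # rows between the starts of two records in one form
FIRST_START_ROW = 2    # zero-based start row, as passed to pandas' `startrow`


def placement(record_index):
    """Return (file_number, startrow) for a zero-based record index."""
    file_number = record_index // RECORDS_PER_FILE
    startrow = record_index % RECORDS_PER_FILE * BLOCK_HEIGHT + FIRST_START_ROW
    return file_number, startrow


for i in range(7):
    file_number, startrow = placement(i)
    # startrow + 1 is the one-based sheet row used in the merge range B{n}:G{n}
    print(f"record {i}: file {file_number}, startrow {startrow}, sheet row {startrow + 1}")
```

Records 0 to 2 land in file 0, records 3 to 5 in file 1, and within each file the three records start at sheet rows 3, 12 and 21.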

The solution works, but the code is dense. The expert added comments to improve readability.
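
One of the denser steps is the pair of placeholder columns. A plain list can stand in for the DataFrame's column order, since DataFrame.insert takes a position in the same way list.insert does. This sketch assumes the source sheet has four data columns (as the indexing in the optimized version suggests); the names field_0 to field_3 are hypothetical, because the real headers are not shown.

```python
# Hypothetical names for the four data columns of the source sheet
fields = ["field_0", "field_1", "field_2", "field_3"]

# Repeat the two insert calls from the script on a plain list
layout = list(fields)
layout.insert(1, "占位符1")
layout.insert(len(layout) - 1, "占位符2")

# The first record is written downward starting at sheet row 3
first_sheet_row = 3
rows = {name: first_sheet_row + offset for offset, name in enumerate(layout)}
data_rows = [rows[name] for name in fields]

print(layout)
print(data_rows)
```

The placeholders occupy sheet rows 4 and 7, so the real fields fall on rows 3, 5, 6 and 8.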

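Later, the expert further optimized the script, producing a more concise version that writes each value straight to its cell. Only the core fragment is shown: ws appears to be the openpyxl worksheet of a copied template, and v the DataFrame holding up to three records for that file.

# Record 1: its four fields go to rows 3, 5, 6 and 8 of column B
ws.cell(3, 2).value = v.iloc[0, 0]
ws.cell(5, 2).value = v.iloc[0, 1]
ws.cell(6, 2).value = v.iloc[0, 2]
ws.cell(8, 2).value = v.iloc[0, 3]
if len(v) >= 2:
    # Record 2: same layout, shifted down by 9 rows
    ws.cell(12, 2).value = v.iloc[1, 0]
    ws.cell(14, 2).value = v.iloc[1, 1]
    ws.cell(15, 2).value = v.iloc[1, 2]
    ws.cell(17, 2).value = v.iloc[1, 3]
    if len(v) >= 3:
        # Record 3: shifted down by 18 rows
        ws.cell(21, 2).value = v.iloc[2, 0]
        ws.cell(23, 2).value = v.iloc[2, 1]
        ws.cell(24, 2).value = v.iloc[2, 2]
        ws.cell(26, 2).value = v.iloc[2, 3]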
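
The twelve assignments follow one pattern: four fixed rows, repeated every nine rows for up to three records. A loop can express that pattern directly. The version below is a sketch rather than the expert's code: fill_form, target_rows and the constants are hypothetical names, and ws is assumed to be an openpyxl worksheet (anything exposing cell(row, column) with a writable value attribute will do).

```python
FIELD_ROWS = (3, 5, 6, 8)  # sheet rows of the first record's four fields
BLOCK_HEIGHT = 9           # row offset between consecutive records
VALUE_COLUMN = 2           # column B
MAX_RECORDS = 3            # one form holds at most three records


def target_rows(record_index):
    """Sheet rows that receive the fields of the given zero-based record."""
    return [row + record_index * BLOCK_HEIGHT for row in FIELD_ROWS]


def fill_form(ws, records):
    """Write up to MAX_RECORDS records (sequences of four values) into ws."""
    for i, record in enumerate(list(records)[:MAX_RECORDS]):
        for row, value in zip(target_rows(i), record):
            ws.cell(row, VALUE_COLUMN).value = value
```

With the DataFrame from the fragment above, a call might look like fill_form(ws, v.values.tolist()). Shorter groups need no special casing, because the loop simply stops after the last record.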

3. Conclusion

The article presented a concrete Python automation problem, explained the issues in the original code, and delivered a fully working solution with clear explanations and improved, well‑commented code. Thanks to the community members who contributed ideas and code.



