Operations 7 min read

How to Automate Excel Archive Lookups with Python and openpyxl

This tutorial walks through a real‑world Python office‑automation task: extracting rows for a specific archive name from dozens of Excel worksheets and consolidating them into a new workbook, complete with step‑by‑step analysis and full code examples.

Python Crawling & Data Mining

Aug 24, 2020

How to Automate Excel Archive Lookups with Python and openpyxl

Hello everyone, it's time for the Python office‑automation series. This article presents a real‑world automation requirement: extracting rows for a specific archive name from multiple sheets in an Excel workbook and consolidating them into a new sheet.

1. Requirement Description

The source workbook contains a summary sheet with daily worksheets; each sheet holds many archive records, and new archives may be added over time. The task is to generate a new worksheet for a given archive name that aggregates all its rows across every date.

The expected result is a single sheet containing all rows for the specified archive (e.g., "档案x003").

2. Step Analysis

Before coding, break the problem into clear steps:

Create a new Excel workbook. Copy the header row from the source workbook (columns: 名称, 配置, 提交日期, etc.). Iterate over each worksheet in the source workbook. In each sheet, scan column A for the target archive name. When a match is found, record the row index, collect all cell values from that row, replace empty cells with blanks, and append the row to the new worksheet.

3. Python Implementation

Import the required library: from openpyxl import load_workbook, Workbook Load the existing workbook and create a new one with the same header:

filepath = r'C:\Users\chenx\Desktop\台账.xlsm'  # adjust the path as needed
workbook = load_workbook(filepath)
new_workbook = Workbook()
new_sheet = new_workbook.active
new_headers = ['名称', '配置', '提交日期', '受限操作', '操作时间', '状态', '存储位置']
new_sheet.append(new_headers)

Iterate through each sheet and extract matching rows:

keyword = '档案x003'  # example archive name
for sheet_name in workbook.sheetnames:
    sheet = workbook[sheet_name]
    names = sheet['A']
    flag = 0
    for cell in names:
        if cell.value == keyword:
            flag = cell.row
            break
    if flag:
        data_lst = []
        for cell in sheet[flag]:
            if cell.value:
                data_lst.append(str(cell.value))
            else:
                data_lst.append(' ')
        new_sheet.append(data_lst)

Save the consolidated workbook:

new_workbook.save(r'C:\Users\chenx\Desktop\台账查询.xlsx')

4. Summary

This example demonstrates how Python and the openpyxl library can automate the extraction of specific rows from dozens of Excel worksheets, turning a repetitive manual task into a concise script and freeing the user from tedious data‑gathering work.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Scripting openpyxl data-processing

Written by

Python Crawling & Data Mining

Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.