How to Batch Extract Specific Rows from Excel with Python (Openpyxl Guide)

This article explains how to use Python and the openpyxl library to locate rows where column A equals 10 (representing 10 am) so the same filter can be applied across many Excel files. It gives the complete script and two practical ways to handle a column whose cells reference another column.

Python Crawling & Data Mining

Oct 26, 2021

Introduction

A reader asked how to extract the rows of an Excel file where the value in column A equals 10 (the 10 am hour) and save them to a new file. The task is useful for research and for later visualization.

Idea

While Excel’s built‑in filter works for a single file, handling hundreds of files manually is impractical, so a Python script is a much more efficient solution.

Solution

The approach mirrors a previous article on batch Excel processing; the only change is using an equality check instead of a range. Below is the complete code.

# coding: utf-8
from openpyxl import load_workbook, Workbook

# Path to the folder containing the data files
path = 'C:/Users/pdcfi/Desktop/xiaoluo'

# Open the workbook and collect the row numbers where column A equals 10
workbook = load_workbook(path + '/' + '巍图1.xlsx')
sheet = workbook.active
column_a = sheet['A']
row_lst = []
for cell in column_a:
    # Only integer cells are compared, so the text header is skipped
    if isinstance(cell.value, int) and cell.value == 10:
        print(cell.row)
        row_lst.append(cell.row)

new_workbook = Workbook()
new_sheet = new_workbook.active

# Create the same header as the original (first row)
header = sheet[1]
header_lst = []
for cell in header:
    header_lst.append(cell.value)
new_sheet.append(header_lst)

# Copy rows that meet the condition into the new sheet
for row in row_lst:
    data_lst = []
    for cell in sheet[row]:
        data_lst.append(cell.value)
    new_sheet.append(data_lst)

# Save the new workbook
new_workbook.save(path + '/' + 'xiaoluo_符合筛选条件的新表.xlsx')

Running the script produces a new Excel file containing only the rows that match the condition.
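
The script above handles one workbook. A minimal sketch of how the same filter might be applied to every file in a folder follows. The function names extract_rows and extract_folder and the "filtered" output subfolder are hypothetical names chosen for illustration, and the sketch assumes each workbook keeps its header in row 1 and its data on the active sheet.

```python
from pathlib import Path

from openpyxl import Workbook, load_workbook


def extract_rows(src_path, dst_path, target=10):
    """Copy the header plus every row whose column A equals `target`."""
    sheet = load_workbook(str(src_path)).active
    new_workbook = Workbook()
    new_sheet = new_workbook.active

    matched = 0
    for index, row in enumerate(sheet.iter_rows(values_only=True), start=1):
        if index == 1:
            new_sheet.append(row)  # the header row is always copied
        elif isinstance(row[0], int) and row[0] == target:
            new_sheet.append(row)
            matched += 1

    new_workbook.save(str(dst_path))
    return matched


def extract_folder(folder, target=10, out_dir_name="filtered"):
    """Run extract_rows on every .xlsx file directly inside `folder`."""
    folder = Path(folder)
    out_dir = folder / out_dir_name
    out_dir.mkdir(exist_ok=True)

    counts = {}
    for src in sorted(folder.glob("*.xlsx")):
        if src.name.startswith("~$"):  # skip Excel's temporary lock files
            continue
        counts[src.name] = extract_rows(src, out_dir / src.name, target)
    return counts
```

Calling extract_folder('C:/Users/pdcfi/Desktop/xiaoluo') would write one filtered copy per workbook into the subfolder and return the number of matching rows per file. Writing to a separate subfolder keeps the outputs from being picked up as inputs on a second run.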

Note

In the original data, column A did not store plain numbers; its cells were references to column B, so they appeared empty when read by Python. Two ways to fix this are:

Copy column A and paste as values, then save the file.

Use column B as the source, extract the hour, and create a new column for extraction.
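
Before choosing between the two fixes, it can help to check what openpyxl actually returns for a referencing cell. The sketch below builds a small throwaway workbook (the file name and cell contents are illustrative, not the reader's data) and reads the same cell twice: once with the default settings and once with load_workbook's data_only=True option, which asks for the value cached when the file was last saved by a spreadsheet application.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Build a tiny workbook whose A1 is a formula pointing at B1.
path = os.path.join(tempfile.mkdtemp(), "formula_demo.xlsx")
wb = Workbook()
ws = wb.active
ws["B1"] = 10
ws["A1"] = "=B1"
wb.save(path)

# Default load: the cell holds the formula text, not a number.
as_formula = load_workbook(path).active["A1"].value

# data_only=True: the cached result is requested instead. This file was
# written by openpyxl and never recalculated, so there is no cached result.
as_value = load_workbook(path, data_only=True).active["A1"].value

print(repr(as_formula))
print(repr(as_value))
```

Neither result is an int, so neither would pass the isinstance(cell.value, int) check in the script. For a file last saved by Excel itself, data_only=True would normally return the computed numbers, though whether that applies to the reader's files is an assumption.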

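Below is a pandas snippet for the second method. It assumes df_raw is a DataFrame whose 时间 column holds full timestamps; the hour of a Series is read through the .dt accessor:

# Let pandas infer the timestamp format so the time of day is kept
df_raw['时间'] = pd.to_datetime(df_raw['时间']).dt.hour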

Both methods work. The first is a quick manual fix for a single file; the second needs slightly more code but requires no manual editing, which matters when there are many files.
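
As a slightly fuller sketch of the second method, the example below parses a timestamp column, derives the hour, and keeps only the 10 am rows. The sample data, the value column, and the hour column name are hypothetical stand-ins; with a real workbook, df_raw would come from pd.read_excel instead.

```python
import pandas as pd

# Hypothetical sample standing in for the real sheet.
# The column name 时间 follows the article's snippet.
df_raw = pd.DataFrame({
    "时间": ["2021-10-26 09:00:00", "2021-10-26 10:00:00", "2021-10-27 10:30:00"],
    "value": [1.5, 2.5, 3.5],
})

# Parse the timestamps, then read the hour through the .dt accessor.
df_raw["hour"] = pd.to_datetime(df_raw["时间"]).dt.hour

# Keep only the rows recorded during the 10 am hour.
df_10am = df_raw[df_raw["hour"] == 10]
print(df_10am)
```

Storing the hour in a separate column keeps the original timestamps available for later plotting.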

Conclusion

The article provides two practical solutions for extracting specific rows from Excel files with Python, offering complete code and tips to avoid common pitfalls such as hidden reference values.
