How to Batch Extract Specific Rows from Excel with Python (Openpyxl Guide)
This article explains how to use Python and the openpyxl library to automatically locate rows where column A equals 10 (representing 10 am) across multiple Excel files, presents two practical solutions, and provides complete code snippets and screenshots for easy implementation.
Introduction
A fan asked how to extract rows from an Excel file where the value in column A equals 10 (the 10 am hour) and save them into a new file, a task useful for research and later visualization.
Idea
While Excel’s built‑in filter works for a single file, handling hundreds of files manually is impractical, so a Python script is a much more efficient solution.
Solution
The approach mirrors a previous article on batch Excel processing; the only change is using an equality check instead of a range. Below is the complete code.
# coding: utf-8
from openpyxl import load_workbook, Workbook
# Path to the folder containing the data files
path = 'C:/Users/pdcfi/Desktop/xiaoluo'
# Open the workbook
workbook = load_workbook(path + '/' + '巍图1.xlsx')
sheet = workbook.active
buy_mount = sheet['A']
row_lst = []
for cell in buy_mount:
    if isinstance(cell.value, int) and cell.value == 10:
        print(cell.row)
        row_lst.append(cell.row)
new_workbook = Workbook()
new_sheet = new_workbook.active
# Create the same header as the original (first row)
header = sheet[1]
header_lst = []
for cell in header:
    header_lst.append(cell.value)
new_sheet.append(header_lst)
# Copy rows that meet the condition into the new sheet
for row in row_lst:
    data_lst = []
    for cell in sheet[row]:
        data_lst.append(cell.value)
    new_sheet.append(data_lst)
# Save the new workbook
new_workbook.save(path + '/' + 'xiaoluo_符合筛选条件的新表.xlsx')
Running the script produces a new Excel file containing only the rows that match the condition.
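The script above handles one workbook. To apply the same filter to every file in a folder, the logic can be wrapped in a function and called in a loop. The following is a minimal sketch, not the author's code: the function names `filter_rows` and `filter_folder`, the `filtered_` output prefix, and the separate output folder are all assumptions made for illustration.

```python
from pathlib import Path

from openpyxl import Workbook, load_workbook


def filter_rows(src_path, dst_path, target=10):
    """Copy the header plus every row whose column A equals `target`."""
    sheet = load_workbook(str(src_path)).active
    new_workbook = Workbook()
    new_sheet = new_workbook.active

    matched = 0
    for index, row in enumerate(sheet.iter_rows(values_only=True), start=1):
        if index == 1:
            new_sheet.append(row)  # the header row is always copied
        elif isinstance(row[0], int) and row[0] == target:
            new_sheet.append(row)
            matched += 1

    new_workbook.save(str(dst_path))
    return matched


def filter_folder(folder, out_folder, target=10):
    """Run filter_rows on every .xlsx file in `folder`; return match counts."""
    out_folder = Path(out_folder)
    out_folder.mkdir(parents=True, exist_ok=True)
    results = {}
    for src in sorted(Path(folder).glob("*.xlsx")):
        if src.name.startswith("~$"):  # skip Excel's temporary lock files
            continue
        dst = out_folder / f"filtered_{src.name}"
        results[src.name] = filter_rows(src, dst, target)
    return results
```

Writing the results to a separate folder keeps a second run from picking up its own output files as input. A call such as `filter_folder(path, path + '/filtered')` would then process every workbook under the `path` used in the script.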
Note
In the original data, column A was filled by formulas referencing column B rather than by plain numbers, so the cells appeared empty when read by Python and no rows matched. Two ways to fix this are:
Copy column A and paste as values, then save the file.
Use column B as the source, extract the hour, and create a new column for extraction.
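To check whether a workbook has this problem, it can help to compare what openpyxl returns for a cell with and without `data_only=True`. The sketch below is only an illustration: the demo file, the `compare_reads` helper, and the `=HOUR(B2)` formula are hypothetical stand-ins for the reader's data.

```python
import tempfile
from datetime import datetime
from pathlib import Path

from openpyxl import Workbook, load_workbook


def compare_reads(path, ref="A2"):
    """Return (default_read, data_only_read) for one cell of the active sheet."""
    as_formula = load_workbook(path).active[ref].value
    as_cached = load_workbook(path, data_only=True).active[ref].value
    return as_formula, as_cached


with tempfile.TemporaryDirectory() as tmp:
    demo_path = str(Path(tmp) / "demo_formula.xlsx")

    # Hypothetical workbook: A2 is a formula that refers to B2.
    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["hour", "time"])
    sheet["B2"] = datetime(2021, 1, 1, 10, 0)
    sheet["A2"] = "=HOUR(B2)"
    workbook.save(demo_path)

    formula_text, cached_value = compare_reads(demo_path)

# Default mode yields the formula text, so an integer check never matches.
print(formula_text)
# data_only=True yields the value Excel cached at its last save; this file
# was written by openpyxl and never calculated by Excel, so nothing is cached.
print(cached_value)
```

If a file was last saved by Excel itself, `data_only=True` may return the calculated numbers, but that depends on the cached values being present, which is why pasting as values or recomputing the hour is the safer route.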
Below is a pandas snippet for the second method:
df_raw['时间'] = pd.to_datetime(df_raw['时间']).dt.hour
Both methods work; the second is more automated, though slightly more complex.
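As a fuller illustration of the second method, the sketch below extracts the hour from a timestamp column and keeps only the 10 am rows. It assumes the timestamps live in a column named `时间` ("time"); the `rows_at_hour` helper, the `value` column, and the sample data are made up for the example.

```python
import pandas as pd


def rows_at_hour(df, time_column="时间", hour=10):
    """Return the rows of `df` whose timestamp in `time_column` falls in `hour`."""
    hours = pd.to_datetime(df[time_column]).dt.hour
    return df[hours == hour]


# Made-up sample data standing in for one workbook.
df_raw = pd.DataFrame(
    {
        "时间": [
            "2021-03-01 09:30:00",
            "2021-03-01 10:05:00",
            "2021-03-02 10:45:00",
        ],
        "value": [1.2, 3.4, 5.6],
    }
)

df_ten = rows_at_hour(df_raw)
print(df_ten)
```

With real files, `df_raw` would typically come from `pd.read_excel(...)` and the result could be written back with `df_ten.to_excel(...)`. Computing the hour into a separate Series, rather than overwriting the column, leaves the original timestamps intact in the output.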
Conclusion
The article provides two practical solutions for extracting specific rows from Excel files with Python, offering complete code and tips to avoid common pitfalls such as hidden reference values.
Python Crawling & Data Mining