
How to Automate Excel Data Processing with Python and openpyxl

This article walks through a real‑world Python automation task, showing how to read, filter, and update specific rows in an Excel workbook using openpyxl, complete with code examples and practical tips for extending the solution.

Python Crawling & Data Mining

Hello everyone, I am PiPi.

1. Introduction

In a Python enthusiasts group I was asked about automating an Excel‑based workflow. The goal is to transform raw data into a target format, as illustrated in the screenshots below.

The next image shows the original data (left) and the desired target data (right); the rows highlighted in yellow need processing.

2. Implementation Process

While some suggested using Excel pivot tables or Pandas, I opted for a pure Python solution with openpyxl because the dataset is relatively large and the logic is straightforward.

Below is the complete script that reads the workbook, evaluates each row, and writes the results into specific cells.

import openpyxl

# Open Excel file
workbook = openpyxl.load_workbook('测试.xlsx')
sheet = workbook.active

# Initialize IP address lists
ip_list1 = []
ip_list2 = []
ip_list3 = []

is_provided_misreport_list = []
is_provided_fixed_prove_list = []

# Vulnerability names must match the cell text exactly
SCHNORR_VULN = "OpenSSH 'schnorr.c'远程内存破坏漏洞(CVE-2014-1692)"
X11_VULN = "OpenSSH 'x11_open_helper()'函数安全限制绕过漏洞(CVE-2015-5352)"

# Becomes True once any row matches the x11_open_helper() check below
x11_not_misreported = False

# Iterate over rows starting from the second row (row 1 is assumed to be the header);
# the flag cells contain '是' (yes) or '否' (no)
for row in range(2, sheet.max_row + 1):
    system_name = sheet.cell(row=row, column=1).value
    vulnerability_name = sheet.cell(row=row, column=2).value
    ip = sheet.cell(row=row, column=3).value
    is_provided_misreport = sheet.cell(row=row, column=4).value
    is_provided_fixed_prove = sheet.cell(row=row, column=5).value

    # Check for specific OpenSSH vulnerability and mis-report flag
    if vulnerability_name == SCHNORR_VULN and is_provided_misreport == '是':
        is_provided_misreport_list.append(is_provided_misreport)
        ip_list1.append(ip)

    if vulnerability_name == SCHNORR_VULN and is_provided_fixed_prove == '是':
        is_provided_fixed_prove_list.append(is_provided_fixed_prove)
        ip_list2.append(ip)

    if vulnerability_name == SCHNORR_VULN and is_provided_misreport == '否' and is_provided_fixed_prove == '否':
        ip_list3.append(ip)

    # Remember a match for the second vulnerability; writing '否' in an else
    # branch here would overwrite an earlier '是' on every later row
    if vulnerability_name == X11_VULN and is_provided_misreport == '否':
        x11_not_misreported = True

# Write the flag once, after all rows have been checked
sheet.cell(row=3, column=16).value = '是' if x11_not_misreported else '否'

# Fill summary cells based on collected data
if '是' in is_provided_misreport_list:
    sheet.cell(row=15, column=3).value = '是'
else:
    sheet.cell(row=15, column=3).value = '否'

sheet.cell(row=15, column=4).value = ','.join(ip_list1)

if '是' in is_provided_fixed_prove_list:
    sheet.cell(row=15, column=5).value = '是'
else:
    sheet.cell(row=15, column=5).value = '否'

sheet.cell(row=15, column=6).value = ','.join(ip_list2)
sheet.cell(row=15, column=8).value = ','.join(ip_list3)

# Save the modified workbook
workbook.save('updated_excel_file.xlsx')

This script only fills in the summary cells in row 15 (plus a single flag in row 3, column 16); row 16 is left untouched. Encapsulating the logic into functions would make the solution more robust and easier to extend for additional vulnerability cases.
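One way to carry out that encapsulation is sketched below. It is not the author's code: the names `VulnSummary`, `summarize`, `write_summary`, and `process_workbook` are hypothetical, and the sketch assumes the same five input columns and the same output columns (3, 4, 5, 6, 8) as the script above. The matching logic works on plain tuples, so it can be checked without opening a workbook.

```python
from dataclasses import dataclass, field

YES, NO = '是', '否'  # the workbook's "yes" / "no" cell values


@dataclass
class VulnSummary:
    """IPs collected for one vulnerability (hypothetical helper type)."""
    misreport_ips: list = field(default_factory=list)
    fixed_ips: list = field(default_factory=list)
    neither_ips: list = field(default_factory=list)


def summarize(rows, vuln_name):
    """Collect IPs from (system, vulnerability, ip, misreport, fixed) tuples."""
    summary = VulnSummary()
    for _system, vuln, ip, misreport, fixed in rows:
        if vuln != vuln_name:
            continue
        ip = str(ip)  # join() needs strings; note an empty cell becomes 'None'
        if misreport == YES:
            summary.misreport_ips.append(ip)
        if fixed == YES:
            summary.fixed_ips.append(ip)
        if misreport == NO and fixed == NO:
            summary.neither_ips.append(ip)
    return summary


def write_summary(sheet, target_row, summary):
    """Write one summary into the columns the script above uses for row 15."""
    sheet.cell(row=target_row, column=3).value = YES if summary.misreport_ips else NO
    sheet.cell(row=target_row, column=4).value = ','.join(summary.misreport_ips)
    sheet.cell(row=target_row, column=5).value = YES if summary.fixed_ips else NO
    sheet.cell(row=target_row, column=6).value = ','.join(summary.fixed_ips)
    sheet.cell(row=target_row, column=8).value = ','.join(summary.neither_ips)


def process_workbook(src_path, dst_path, targets):
    """targets maps a vulnerability name to the row that receives its summary."""
    import openpyxl  # imported here so the functions above work without it

    workbook = openpyxl.load_workbook(src_path)
    sheet = workbook.active
    # Read all data rows first so later writes cannot affect what is read
    rows = list(sheet.iter_rows(min_row=2, max_col=5, values_only=True))
    for vuln_name, target_row in targets.items():
        write_summary(sheet, target_row, summarize(rows, vuln_name))
    workbook.save(dst_path)
```

With this split, handling another vulnerability comes down to adding an entry to `targets`, for example mapping the CVE-2015-5352 name to row 16, assuming that row follows the same layout as row 15.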

3. Conclusion

The example demonstrates how to solve a Python automation problem by directly manipulating Excel files with openpyxl. The provided code can be adapted to similar data‑processing tasks, and the approach encourages sharing reproducible snippets when seeking help.

When asking for assistance, remember to anonymize large datasets, provide a minimal reproducible example, and include error screenshots or the full script if it exceeds 50 lines.
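For the anonymization step, a small helper along these lines could swap real addresses for placeholders before a sample is shared. This is an illustrative sketch, not part of the original script: `anonymize_ips` is a hypothetical name, it assumes the IP sits in the third column as in the script above, and it draws placeholders from 192.0.2.0/24, a range reserved for documentation.

```python
def anonymize_ips(rows, ip_index=2):
    """Replace each distinct IP with a stable placeholder from 192.0.2.0/24.

    rows is an iterable of tuples; ip_index is the position of the IP value.
    The same real IP always maps to the same placeholder, so grouping by IP
    still behaves the same in the shared sample.
    """
    mapping = {}
    result = []
    for row in rows:
        row = list(row)
        ip = row[ip_index]
        if ip is not None:
            if ip not in mapping:
                if len(mapping) >= 254:
                    raise ValueError("more than 254 distinct IPs; trim the sample")
                mapping[ip] = f"192.0.2.{len(mapping) + 1}"
            row[ip_index] = mapping[ip]
        result.append(tuple(row))
    return result
```

Only the IP column is touched here; system names or other identifying fields would need the same treatment before posting.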
