Fundamentals 6 min read

Automate Excel Data Extraction with Python: Step-by-Step Guide and Code

This article walks through a Python automation task that processes Excel files to extract vulnerability data, explains the required code using openpyxl, shows before-and-after results, and highlights limitations for larger datasets, inviting readers to explore a pandas-based solution next.

Python Crawling & Data Mining
Python Crawling & Data Mining
Python Crawling & Data Mining
Automate Excel Data Extraction with Python: Step-by-Step Guide and Code

Introduction

In a recent Python automation question from a WeChat group, the author needed to process Excel data to extract vulnerability information and fill specific cells.

The original and target data are shown, with the rows to be operated highlighted.

Implementation

The solution uses the openpyxl library to iterate over rows, categorize IP addresses based on provided misreport and fix‑proof flags, and write results into designated columns.

import openpyxl

def fill_table(bug_name, sheet, row_number):
    # 初始化IP地址列表,已提供误报证明ip
    ip_list1 = []
    # 初始化IP地址列表,已提供无法整改证明ip
    ip_list2 = []
    # 初始化IP地址列表,没有误报和无法整改证明的IP
    ip_list3 = []

    is_provided_misreport_list = []
    is_provided_fixed_prove_list = []

    # 遍历每一行
    for row in range(2, sheet.max_row + 1):
        system_name = sheet.cell(row=row, column=1).value
        vulnerability_name = sheet.cell(row=row, column=2).value
        ip = sheet.cell(row=row, column=3).value
        is_provided_misreport = sheet.cell(row=row, column=4).value
        is_provided_fixed_prove = sheet.cell(row=row, column=5).value

        # 判断漏洞名称和是否提供误报证明,针对OpenSSH 'schnorr.c'远程内存破坏漏洞(CVE-2014-1692)漏洞
        if vulnerability_name == bug_name and is_provided_misreport == '是':
            is_provided_misreport_list.append(is_provided_misreport)
            ip_list1.append(ip)

        if vulnerability_name == bug_name and is_provided_fixed_prove == '是':
            is_provided_fixed_prove_list.append(is_provided_fixed_prove)
            ip_list2.append(ip)

        if vulnerability_name == bug_name and is_provided_misreport == '否' and is_provided_fixed_prove == '否':
            ip_list3.append(ip)

    # 填写C15单元格
    if '是' in is_provided_misreport_list:
        sheet.cell(row=row_number, column=3).value = '是'
    else:
        sheet.cell(row=row_number, column=3).value = '否'

    # 填写D15单元格
    sheet.cell(row=row_number, column=4).value = ','.join(ip_list1)

    # 填写E15单元格
    if '是' in is_provided_fixed_prove_list:
        sheet.cell(row=row_number, column=5).value = '是'
    else:
        sheet.cell(row=row_number, column=5).value = '否'

    # 填写F15单元格
    sheet.cell(row=row_number, column=6).value = ','.join(ip_list2)

    # 填写G15单元格
    if not ip_list3:
        sheet.cell(row=row_number, column=7).value = "无"
    else:
        sheet.cell(row=row_number, column=7).value = ','.join(ip_list3)

if __name__ == '__main__':
    # 打开Excel文件
    workbook = openpyxl.load_workbook('测试.xlsx')
    sheet = workbook.active
    bug_name1 = "OpenSSH 'schnorr.c'远程内存破坏漏洞(CVE-2014-1692)"
    bug_name2 = "OpenSSH 'x11_open_helper()'函数安全限制绕过漏洞(CVE-2015-5352)"
    fill_table(bug_name1, sheet, row_number=15)
    fill_table(bug_name2, sheet, row_number=16)
    # 保存修改后的Excel文件
    workbook.save('updated_excel_file.xlsx')

Running the script produces the expected output, as illustrated below.

While the script works for small batches, it becomes cumbersome for larger datasets; a future article will demonstrate a pandas‑based implementation.

Conclusion

The tutorial provides a concrete Python automation example for Excel processing, including full code and explanations, helping readers solve similar tasks.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Tutorialopenpyxl
Python Crawling & Data Mining
Written by

Python Crawling & Data Mining

Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.