Automate Excel Data Extraction with Python: Step-by-Step Guide and Code
This article walks through a Python automation task that processes Excel files to extract vulnerability data, explains the required code using openpyxl, shows before-and-after results, and highlights limitations for larger datasets, inviting readers to explore a pandas-based solution next.
Introduction
In a recent Python automation question from a WeChat group, the author needed to process Excel data to extract vulnerability information and fill specific cells.
The original and target data are shown, with the rows to be operated highlighted.
Implementation
The solution uses the openpyxl library to iterate over rows, categorize IP addresses based on provided misreport and fix‑proof flags, and write results into designated columns.
import openpyxl
def fill_table(bug_name, sheet, row_number):
# 初始化IP地址列表,已提供误报证明ip
ip_list1 = []
# 初始化IP地址列表,已提供无法整改证明ip
ip_list2 = []
# 初始化IP地址列表,没有误报和无法整改证明的IP
ip_list3 = []
is_provided_misreport_list = []
is_provided_fixed_prove_list = []
# 遍历每一行
for row in range(2, sheet.max_row + 1):
system_name = sheet.cell(row=row, column=1).value
vulnerability_name = sheet.cell(row=row, column=2).value
ip = sheet.cell(row=row, column=3).value
is_provided_misreport = sheet.cell(row=row, column=4).value
is_provided_fixed_prove = sheet.cell(row=row, column=5).value
# 判断漏洞名称和是否提供误报证明,针对OpenSSH 'schnorr.c'远程内存破坏漏洞(CVE-2014-1692)漏洞
if vulnerability_name == bug_name and is_provided_misreport == '是':
is_provided_misreport_list.append(is_provided_misreport)
ip_list1.append(ip)
if vulnerability_name == bug_name and is_provided_fixed_prove == '是':
is_provided_fixed_prove_list.append(is_provided_fixed_prove)
ip_list2.append(ip)
if vulnerability_name == bug_name and is_provided_misreport == '否' and is_provided_fixed_prove == '否':
ip_list3.append(ip)
# 填写C15单元格
if '是' in is_provided_misreport_list:
sheet.cell(row=row_number, column=3).value = '是'
else:
sheet.cell(row=row_number, column=3).value = '否'
# 填写D15单元格
sheet.cell(row=row_number, column=4).value = ','.join(ip_list1)
# 填写E15单元格
if '是' in is_provided_fixed_prove_list:
sheet.cell(row=row_number, column=5).value = '是'
else:
sheet.cell(row=row_number, column=5).value = '否'
# 填写F15单元格
sheet.cell(row=row_number, column=6).value = ','.join(ip_list2)
# 填写G15单元格
if not ip_list3:
sheet.cell(row=row_number, column=7).value = "无"
else:
sheet.cell(row=row_number, column=7).value = ','.join(ip_list3)
if __name__ == '__main__':
# 打开Excel文件
workbook = openpyxl.load_workbook('测试.xlsx')
sheet = workbook.active
bug_name1 = "OpenSSH 'schnorr.c'远程内存破坏漏洞(CVE-2014-1692)"
bug_name2 = "OpenSSH 'x11_open_helper()'函数安全限制绕过漏洞(CVE-2015-5352)"
fill_table(bug_name1, sheet, row_number=15)
fill_table(bug_name2, sheet, row_number=16)
# 保存修改后的Excel文件
workbook.save('updated_excel_file.xlsx')Running the script produces the expected output, as illustrated below.
While the script works for small batches, it becomes cumbersome for larger datasets; a future article will demonstrate a pandas‑based implementation.
Conclusion
The tutorial provides a concrete Python automation example for Excel processing, including full code and explanations, helping readers solve similar tasks.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Python Crawling & Data Mining
Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
