How to Automate Excel Data Processing with Pandas: A Step-by-Step Guide

This article shows how to use Python's pandas library to automate Excel data processing. The code groups, filters, and merges vulnerability data, formats the resulting IP lists as text, and writes the result to an Excel file.

Python Crawling & Data Mining

Introduction

Hello, I am PiPi. Recently a member asked about automating office tasks with Python. Below is the original data and the desired target data.

The goal is to operate on the two highlighted rows.

Implementation Process

An earlier solution using openpyxl worked for small batches, but it becomes cumbersome once there are dozens of vulnerabilities to handle. This article presents a more efficient pandas implementation.

Here is the provided code:

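import pandas as pd

# Load the source workbook ('data.xlsx' is a placeholder; the article does not give the file name)
df = pd.read_excel('data.xlsx')

# OR condition: collect the unique IPs per system, vulnerability, and proof flag
dfc1 = df.groupby(['系统名称', '漏洞名称', '是否提供误报证明']).agg({'ip': 'unique'}).rename(columns={'ip': '已提供误报证明ip'}).reset_index()
dfc2 = df.groupby(['系统名称', '漏洞名称', '是否提供无法整改证明']).agg({'ip': 'unique'}).rename(columns={'ip': '已提供无法整改证明ip'}).reset_index()
res = dfc1.merge(dfc2, how='outer', left_on=['系统名称', '漏洞名称', '是否提供误报证明'], right_on=['系统名称', '漏洞名称', '是否提供无法整改证明'])
# Keep the rows where both flags are '是' (yes)
res1 = res.loc[res['是否提供误报证明'].eq('是') & res['是否提供无法整改证明'].eq('是')].copy()
res1.set_index(['系统名称', '漏洞名称'], inplace=True)
# AND condition: IPs with neither a false-positive proof nor a cannot-remediate proof
res2 = df[df['是否提供误报证明'].eq('否') & df['是否提供无法整改证明'].eq('否')].groupby(['系统名称', '漏洞名称']).agg({'ip': 'unique'}).rename(columns={'ip': '没有误报和无法整改证明的ip'})
# Combine the results
res = res1.join(res2, how='outer').fillna('')
# Convert each array of IPs into a comma-separated string
# (on pandas >= 2.1, DataFrame.map replaces the deprecated applymap)
ip_cols = res.columns[res.columns.str.contains('ip')]
res[ip_cols] = res[ip_cols].applymap(', '.join)
# Fill cells that have no IPs with '无' ("none")
res[ip_cols] = res[ip_cols].where(res[ip_cols].ne(''), '无')
# Save the result; reset_index() keeps 系统名称 and 漏洞名称 as columns,
# which index=False alone would drop because they are the index here
res.reset_index().to_excel('result.xlsx', index=False)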
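
To make the first grouping step easier to follow, here is a minimal sketch of what `groupby(...).agg({'ip': 'unique'})` produces. The column names come from the script, but the sample rows (`SysA`, `VulnX`, and the IP addresses) are hypothetical, since the article's original data is not reproduced here.

```python
import pandas as pd

# Hypothetical sample rows; only the column names are taken from the article's script.
df = pd.DataFrame({
    "系统名称": ["SysA", "SysA", "SysA", "SysA"],
    "漏洞名称": ["VulnX", "VulnX", "VulnX", "VulnX"],
    "ip": ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.3"],
    "是否提供误报证明": ["是", "是", "否", "是"],
})

# One row per (system, vulnerability, flag); the ip column holds the distinct IPs.
dfc1 = (
    df.groupby(["系统名称", "漏洞名称", "是否提供误报证明"])
    .agg({"ip": "unique"})
    .rename(columns={"ip": "已提供误报证明ip"})
    .reset_index()
)

# Each cell in the new column is an array of strings, so it can be joined into text.
yes_row = dfc1[dfc1["是否提供误报证明"].eq("是")].iloc[0]
ips_as_text = ", ".join(yes_row["已提供误报证明ip"])
print(ips_as_text)
```

The duplicated `10.0.0.1` appears only once in the joined text, which is why the script aggregates with `unique` rather than collecting every row.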

The script builds three groupings (IPs with a false-positive proof, IPs with a cannot-remediate proof, and IPs with neither), merges them, and writes an Excel file that matches the expected outcome.
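
The merge and formatting steps can be sketched on a small scale as follows. This is an illustration under assumptions, not the author's exact code: the two intermediate frames and their values are hypothetical, the cells hold plain lists rather than the arrays that `unique` returns, and `Series.map` is used in place of `applymap` so the sketch behaves the same across pandas versions.

```python
import pandas as pd

keys = ["系统名称", "漏洞名称"]
idx = pd.MultiIndex.from_tuples([("SysA", "VulnX"), ("SysB", "VulnY")], names=keys)

# Hypothetical intermediate results; each cell holds a list of IP strings.
with_proof = pd.DataFrame({"已提供误报证明ip": [["10.0.0.1", "10.0.0.3"]]}, index=idx[:1])
without_proof = pd.DataFrame({"没有误报和无法整改证明的ip": [["10.0.0.9"]]}, index=idx[1:])

# The outer join keeps groups that appear on either side; missing cells become ''.
res = with_proof.join(without_proof, how="outer").fillna("")


def to_text(cell) -> str:
    """Join a list of IPs into one string; empty cells become '无' ("none")."""
    joined = ", ".join(cell)  # ", ".join("") is "", so empty cells stay empty here
    return joined if joined else "无"


for col in res.columns:
    res[col] = res[col].map(to_text)

# Move the group keys back into ordinary columns before exporting.
out = res.reset_index()
print(out)
```

Calling `out.to_excel('result.xlsx', index=False)` at this point would write the key columns along with the three text columns.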

This resolves the user's issue for the two displayed vulnerabilities. For handling multiple vulnerabilities, a follow‑up article will be provided.

Conclusion

The article outlines a practical Python automation problem, presents a pandas‑based solution with full code, and demonstrates how to process and export the data efficiently.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Python Crawling & Data Mining

