16 Practical Python Pandas Scripts for Excel File Processing
This article presents sixteen ready‑to‑use Python pandas examples that cover common Excel tasks such as reading, writing, sheet selection, cleaning, filtering, aggregation, styling, and exporting, providing a concise toolbox for data‑analysis automation.
Scenario 1: Read Excel file content
pip install pandas openpyxl import pandas as pd df = pd.read_excel('data.xlsx') print(df.head())# Output: display first rows of the data.
Scenario 2: Write DataFrame to Excel file
data = {'姓名': ['张三', '李四'], '分数': [90, 85]} df = pd.DataFrame(data) df.to_excel('output.xlsx', index=False) print("已写入 output.xlsx")# Output: confirmation that the file was written.
Scenario 3: Read a specific sheet
df = pd.read_excel('data.xlsx', sheet_name='销售数据') print(df.head())# Output: first rows of the "销售数据" sheet.
Scenario 4: Get all sheet names
xls = pd.ExcelFile('data.xlsx') print(xls.sheet_names)# Output: list of sheet names, e.g., ['Sheet1', '销售数据', '库存'].
Scenario 5: Filter rows where a column equals a specific value
df = pd.read_excel('data.xlsx') filtered = df[df['部门'] == '市场部'] print(filtered)# Output: records with "部门" equal to "市场部".
Scenario 6: Add a new column with calculations
df = pd.read_excel('scores.xlsx') df['总分'] = df['语文'] + df['数学'] + df['英语'] print(df.head())# Output: new column "总分" showing the sum for each row.
Scenario 7: Sort by a column
df = pd.read_excel('scores.xlsx') sorted_df = df.sort_values(by='总分', ascending=False) print(sorted_df)# Output: rows sorted descending by "总分".
Scenario 8: Drop rows with missing values
df = pd.read_excel('data.xlsx') cleaned = df.dropna() print(cleaned.shape)# Output: shape of the DataFrame after removing NaNs.
Scenario 9: Fill missing values
df = pd.read_excel('data.xlsx') df.fillna({'销售额': 0}, inplace=True) print(df.head())# Output: NaNs in "销售额" replaced with 0.
Scenario 10: Merge multiple Excel files with the same structure
import glob all_files = glob.glob("data_*.xlsx") dfs = [pd.read_excel(f) for f in all_files] combined_df = pd.concat(dfs, ignore_index=True) print(combined_df.shape)# Output: total rows and columns after merging.
Scenario 11: Export a subset based on a condition to a new file
df = pd.read_excel('data.xlsx') south_data = df[df['区域'] == '南方'] south_data.to_excel('南方数据.xlsx', index=False) print("南方数据已导出")# Output: confirmation of export.
Scenario 12: Count unique values in a column
df = pd.read_excel('orders.xlsx') unique_users = df['客户ID'].nunique() print(f"共有 {unique_users} 位客户")# Output: number of distinct customers.
Scenario 13: Group‑by aggregation
df = pd.read_excel('sales.xlsx') grouped = df.groupby('品类')['销售额'].sum() print(grouped)# Output: total sales per product category.
Scenario 14: Apply cell styles (bold header, red background)
from openpyxl import load_workbook from openpyxl.styles import Font, PatternFill wb = load_workbook('report.xlsx') ws = wb.active # Set header row bold for cell in ws[1]:
cell.font = Font(bold=True) # Set B2 background red ws['B2'].fill = PatternFill(start_color="FFC7CE", end_color="FFC7CE", fill_type="solid") wb.save('styled_report.xlsx') print("样式已应用并保存为 styled_report.xlsx")# Output: confirmation that styling was applied.
Scenario 15: Read Excel with custom header row and row limit
df = pd.read_excel('data.xlsx', header=3, nrows=100) print(df.head())# Output: first 100 rows starting from row 4.
Scenario 16: Export Excel data to an HTML table
df = pd.read_excel('data.xlsx') html_table = df.to_html(index=False) with open('table.html', 'w', encoding='utf-8') as f:
f.write(html_table) print("HTML 表格已保存为 table.html")# Output: confirmation that the HTML file was created.
Summary
These 16 Python scripts cover the most common Excel automation needs in daily office work, testing, and data analysis, including data read/write, cleaning, transformation, multi‑file merging and splitting, grouping, sorting, styling, and exporting, forming a handy "Excel automation toolbox" for boosting productivity.
Test Development Learning Exchange
Test Development Learning Exchange
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.