Master Excel Automation with Python: 20 Practical Pandas & Openpyxl Techniques

This step‑by‑step guide shows how to generate a sample Excel file and then use pandas and openpyxl to read, write, update, filter, sort, group, format, chart, and finally save Excel workbooks, covering twenty essential operations for data automation.

Test Development Learning Exchange

1. Create Sample Data (bank_data.xlsx)

Generate an example Excel file with customer information, loan details, and balances using pandas.

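import pandas as pd

# Create sample data. Column names stay in Chinese because later steps reference them:
# 客户ID = customer ID, 姓名 = name, 联系方式 = phone number, 账户余额 = account balance,
# 贷款类型 = loan type (信用贷款 = credit loan, 房贷 = mortgage, 车贷 = car loan),
# 贷款金额 = loan amount, 利率 = interest rate (%), 贷款期限(年) = loan term (years)
data = {
    '客户ID': [1, 2, 3, 4, 5],
    '姓名': ['张三', '李四', '王五', '赵六', '孙七'],
    '联系方式': ['13800000000', '13900000000', '13700000000', '13600000000', '13500000000'],
    '账户余额': [10000.0, 20000.0, 15000.0, 30000.0, 25000.0],
    '贷款类型': ['信用贷款', '房贷', '信用贷款', '车贷', '信用贷款'],
    '贷款金额': [50000.0, 300000.0, 60000.0, 100000.0, 70000.0],
    '利率': [5.0, 4.5, 5.2, 4.8, 5.1],
    '贷款期限(年)': [3, 20, 4, 5, 3]
}

df = pd.DataFrame(data)
# Writing .xlsx requires openpyxl; the default sheet name is 'Sheet1'
df.to_excel('bank_data.xlsx', index=False)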

2. Read Excel File

Load the workbook into a DataFrame for further processing.

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
print(df.head())
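
When only part of a workbook is needed, read_excel can be told which sheet and which columns to load. The sketch below is an illustration rather than part of the original walkthrough: it writes a two-row stand-in for bank_data.xlsx to an in-memory buffer (an assumption made so the snippet runs without any file on disk) and reads back two columns.

```python
import io
import pandas as pd

# Two-row stand-in for bank_data.xlsx, written to memory instead of disk
buffer = io.BytesIO()
pd.DataFrame({
    '客户ID': [1, 2],
    '姓名': ['张三', '李四'],
    '账户余额': [10000.0, 20000.0],
}).to_excel(buffer, index=False, sheet_name='Sheet1')
buffer.seek(0)

# Load one named sheet and only the columns that are needed
subset = pd.read_excel(buffer, sheet_name='Sheet1', usecols=['客户ID', '账户余额'])
print(subset)
```

The same sheet_name and usecols arguments work when a file path such as 'bank_data.xlsx' is passed instead of the buffer.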

3. Write Excel File

Save processed data to a new file.

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
df.to_excel('processed_bank_data.xlsx', index=False)

4. Update Specific Cell

Modify a particular value, e.g., the balance of the first customer.

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
df.at[0, '账户余额'] = 12000.0
df.to_excel('updated_bank_data.xlsx', index=False)
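
Updating by position works only while the row order is known. A minimal sketch of a label-based alternative, assuming the customer ID is the stable key, selects the row with a boolean mask and assigns through loc. The small DataFrame here is an in-memory stand-in for the file.

```python
import pandas as pd

# In-memory stand-in for bank_data.xlsx (same column names as step 1)
df = pd.DataFrame({
    '客户ID': [1, 2, 3],
    '账户余额': [10000.0, 20000.0, 15000.0],
})

# Select the row by customer ID rather than by position, then assign the new balance
df.loc[df['客户ID'] == 2, '账户余额'] = 22000.0

print(df)
```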

5. Add a New Worksheet

Use openpyxl to create an additional sheet in the existing workbook.

from openpyxl import load_workbook
wb = load_workbook('bank_data.xlsx')
ws = wb.create_sheet(title="新工作表")
wb.save('bank_data_with_new_sheet.xlsx')
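
create_sheet also accepts an index that controls where the new sheet appears, and the returned worksheet can be written to immediately. The sketch below uses an in-memory Workbook and a hypothetical sheet name, "Summary", purely for illustration.

```python
from openpyxl import Workbook

wb = Workbook()  # in-memory workbook with one default sheet

# index=0 places the new sheet first; "Summary" is a made-up name for this example
ws = wb.create_sheet(title="Summary", index=0)
ws['A1'] = 'Total customers'
ws['B1'] = 5

print(wb.sheetnames)
```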

6. Delete a Worksheet

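Remove a specified sheet if it exists. The file from step 1 contains only Sheet1, so this example loads the workbook saved in step 5.

from openpyxl import load_workbook
wb = load_workbook('bank_data_with_new_sheet.xlsx')  # output of step 5
if '新工作表' in wb.sheetnames:  # guard: deleting a missing sheet raises KeyError
    del wb['新工作表']
wb.save('bank_data_deleted_sheet.xlsx')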

7. Copy a Worksheet

Duplicate an existing sheet within the workbook.

from openpyxl import load_workbook
wb = load_workbook('bank_data.xlsx')
source = wb['Sheet1']
target = wb.copy_worksheet(source)
target.title = "复制的工作表"
wb.save('bank_data_copied_sheet.xlsx')

8. Rename a Worksheet

Change the title of a sheet.

from openpyxl import load_workbook
wb = load_workbook('bank_data.xlsx')
sheet = wb['Sheet1']
sheet.title = "重命名的工作表"
wb.save('bank_data_renamed_sheet.xlsx')

9. Find Specific Values

Locate rows that match a given criterion using pandas.

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
result = df[df['姓名'] == '张三']
print(result)

10. Filter Data

Select rows where the loan amount exceeds 50,000.

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
filtered_df = df[df['贷款金额'] > 50000]
print(filtered_df)
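
Filters can be combined. As a sketch on an in-memory copy of the sample data, the first expression keeps credit loans above 50,000 and the second keeps any row whose loan type is in a given list.

```python
import pandas as pd

df = pd.DataFrame({
    '姓名': ['张三', '李四', '王五', '赵六', '孙七'],
    '贷款类型': ['信用贷款', '房贷', '信用贷款', '车贷', '信用贷款'],
    '贷款金额': [50000.0, 300000.0, 60000.0, 100000.0, 70000.0],
})

# Each condition needs its own parentheses; & means "and", | means "or"
credit_over_50k = df[(df['贷款类型'] == '信用贷款') & (df['贷款金额'] > 50000)]

# isin() matches any value from a list (here: mortgage or car loan)
secured = df[df['贷款类型'].isin(['房贷', '车贷'])]

print(credit_over_50k)
print(secured)
```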

11. Sort Data

Order the DataFrame by account balance in descending order.

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
sorted_df = df.sort_values(by='账户余额', ascending=False)
print(sorted_df)
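
sort_values also takes a list of columns, with one direction per column. A small sketch, using an in-memory copy of two sample columns, sorts by loan type first and by loan amount (largest first) within each type.

```python
import pandas as pd

df = pd.DataFrame({
    '贷款类型': ['信用贷款', '房贷', '信用贷款', '车贷', '信用贷款'],
    '贷款金额': [50000.0, 300000.0, 60000.0, 100000.0, 70000.0],
})

# Loan type ascending, then loan amount descending inside each type
sorted_df = df.sort_values(by=['贷款类型', '贷款金额'], ascending=[True, False])

print(sorted_df)
```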

12. Group and Summarize

Aggregate loan amounts by loan type.

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
grouped_df = df.groupby('贷款类型')['贷款金额'].sum()
print(grouped_df)
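
When more than one summary per group is useful, agg can compute several at once. This sketch, again on an in-memory copy of the sample columns, returns the total, average, and number of loans for each loan type.

```python
import pandas as pd

df = pd.DataFrame({
    '贷款类型': ['信用贷款', '房贷', '信用贷款', '车贷', '信用贷款'],
    '贷款金额': [50000.0, 300000.0, 60000.0, 100000.0, 70000.0],
})

# One row per loan type, one column per statistic
summary = df.groupby('贷款类型')['贷款金额'].agg(['sum', 'mean', 'count'])

print(summary)
```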

13. Merge Cells

Combine a range of cells into one using openpyxl. Only the top-left cell's value is kept, so merging A1:C1 here discards the headers in B1 and C1.

from openpyxl import load_workbook
wb = load_workbook('bank_data.xlsx')
ws = wb.active
ws.merge_cells('A1:C1')
wb.save('bank_data_merged_cells.xlsx')

14. Set Cell Formatting

Apply font style and alignment to a cell.

from openpyxl import load_workbook
from openpyxl.styles import Font, Alignment
wb = load_workbook('bank_data.xlsx')
ws = wb.active
cell = ws['A1']
cell.font = Font(bold=True, color="FF0000")
cell.alignment = Alignment(horizontal='center', vertical='center')
wb.save('bank_data_formatted_cell.xlsx')

15. Insert a Chart

Create a bar chart of account balances and embed it in the sheet.

from openpyxl import load_workbook
from openpyxl.chart import BarChart, Reference

wb = load_workbook('bank_data.xlsx')
ws = wb.active
chart = BarChart()
# Column D (4) holds 账户余额 (account balance); row 1 is the header and supplies the series title
data = Reference(ws, min_col=4, min_row=1, max_row=ws.max_row)
# Column A (1) holds 客户ID (customer ID), used as the category labels
categories = Reference(ws, min_col=1, min_row=2, max_row=ws.max_row)
chart.add_data(data, titles_from_data=True)
chart.set_categories(categories)
chart.title = "账户余额柱状图"  # "Account balance bar chart"
# Anchor at J1 so the chart sits to the right of the data in columns A-H
ws.add_chart(chart, "J1")
wb.save('bank_data_with_chart.xlsx')

16. Calculate Statistics

Compute total and average account balances.

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
total_balance = df['账户余额'].sum()
average_balance = df['账户余额'].mean()
print(f"账户余额总和: {total_balance}")
print(f"账户余额平均值: {average_balance}")
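
For a quick overview beyond the total and mean, describe() returns the count, mean, standard deviation, minimum, quartiles, and maximum in one call. The sketch below runs it on the five sample balances held in memory.

```python
import pandas as pd

balances = pd.Series([10000.0, 20000.0, 15000.0, 30000.0, 25000.0], name='账户余额')

# describe() returns a Series indexed by statistic name ('mean', '50%', 'max', ...)
stats = balances.describe()

print(stats)
```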

17. Apply Conditional Formatting

Highlight cells with balances below 15,000 in red.

from openpyxl import load_workbook
from openpyxl.formatting.rule import CellIsRule
from openpyxl.styles import PatternFill
wb = load_workbook('bank_data.xlsx')
ws = wb.active
red_fill = PatternFill(start_color="FF0000", end_color="FF0000", fill_type="solid")
rule = CellIsRule(operator='lessThan', formula=['15000'], fill=red_fill)
ws.conditional_formatting.add('D2:D6', rule)
wb.save('bank_data_conditional_format.xlsx')
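
The range 'D2:D6' is hard-coded for five data rows. One way to avoid that, sketched here on an in-memory workbook with a reduced set of sample columns, is to build the range from ws.max_row so the rule follows the data as rows are added.

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import CellIsRule
from openpyxl.styles import PatternFill

wb = Workbook()
ws = wb.active
ws.append(['客户ID', '姓名', '联系方式', '账户余额'])  # header row
for row in [
    (1, '张三', '13800000000', 10000.0),
    (2, '李四', '13900000000', 20000.0),
    (3, '王五', '13700000000', 15000.0),
    (4, '赵六', '13600000000', 30000.0),
    (5, '孙七', '13500000000', 25000.0),
]:
    ws.append(row)

# Data starts in row 2; the last row is taken from the sheet itself
cell_range = f"D2:D{ws.max_row}"
red_fill = PatternFill(start_color="FF0000", end_color="FF0000", fill_type="solid")
rule = CellIsRule(operator='lessThan', formula=['15000'], fill=red_fill)
ws.conditional_formatting.add(cell_range, rule)

print(cell_range)
```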

18. Unmerge Cells

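Split previously merged cells back into individual cells. openpyxl raises a ValueError if the range is not currently merged, so this example loads the workbook saved in step 13.

from openpyxl import load_workbook
wb = load_workbook('bank_data_merged_cells.xlsx')  # output of step 13
ws = wb.active
ws.unmerge_cells('A1:C1')
wb.save('bank_data_unmerged_cells.xlsx')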
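
If it is not known in advance whether a range is merged, the sheet's merged_cells collection can be checked first. A minimal sketch on an in-memory workbook, with placeholder cell values:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(['A', 'B', 'C'])  # placeholder values for row 1
ws.merge_cells('A1:C1')

# Coordinates of every merged range currently on the sheet
merged_before = [r.coord for r in ws.merged_cells.ranges]

# Unmerge only when the range is actually merged
if 'A1:C1' in merged_before:
    ws.unmerge_cells('A1:C1')

merged_after = [r.coord for r in ws.merged_cells.ranges]
print(merged_before, merged_after)
```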

19. Clear Cell Content or Style

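Remove the value and reset the formatting of a specific cell. Styles are reset by assigning default style objects rather than None.

from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill, Border, Alignment, Protection
wb = load_workbook('bank_data.xlsx')
ws = wb.active
cell = ws['A1']
cell.value = None  # clear the content
# Reset each style attribute to its default; None is not a valid style object
cell.font = Font()
cell.fill = PatternFill()
cell.border = Border()
cell.alignment = Alignment()
cell.number_format = 'General'
cell.protection = Protection()
wb.save('bank_data_cleared_content_and_style.xlsx')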

20. Auto‑Adjust Column Widths

Iterate over columns and set an approximate width based on the longest cell value in each.

from openpyxl import load_workbook
wb = load_workbook('bank_data.xlsx')
ws = wb.active
for col in ws.columns:
    column = col[0].column_letter
    # Measure the text form of every non-empty cell, so numbers are counted too
    max_length = max(
        (len(str(cell.value)) for cell in col if cell.value is not None),
        default=0,
    )
    # Add a little padding; the result is an estimate, not an exact fit
    ws.column_dimensions[column].width = max_length + 2
wb.save('bank_data_auto_adjusted_columns.xlsx')
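
Character count underestimates the width of Chinese text, because those characters usually render about twice as wide as Latin letters. One possible refinement, sketched below with a hypothetical helper named display_width, counts wide and full-width characters as two units. It could replace the character count when computing column widths, though the result is still an approximation.

```python
import unicodedata

def display_width(value):
    """Approximate display width: wide/full-width characters count as 2."""
    if value is None:
        return 0
    return sum(
        2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
        for ch in str(value)
    )

print(display_width('张三'))     # two wide characters
print(display_width('abc'))      # three narrow characters
print(display_width(10000.0))    # measured as the text '10000.0'
```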

21. Final Save

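Write the DataFrame back to an Excel file once processing is complete. For brevity, the snippet below reloads bank_data.xlsx; in a real script, df would hold the result of the preceding steps.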

import pandas as pd
df = pd.read_excel('bank_data.xlsx')
df.to_excel('final_processed_bank_data.xlsx', index=False)