Automate Excel Reporting with Python: A Step‑by‑Step Guide
This article walks you through automating daily loan reporting for a bank using Python and pandas, covering data loading, date filtering, table splitting, column renaming, concatenation, missing‑value handling, calculated fields, and pivot‑table generation to replace repetitive Excel tasks.
1. Case Scenario
As a data analyst in a bank, you need to produce daily, weekly, and monthly loan reports. Creating these reports manually in Excel is time‑consuming and error‑prone, so automating the workflow with Python frees you to focus on business insights.
2. Excel Creation Process
Using Excel’s PivotTable feature, you would place the contract effective date in the filter area (to select the current year), the purpose field in the column area, and the unit field in the row area, then aggregate loan amounts.
Because each loan is split among three branches, you must also calculate split amounts before the PivotTable.
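To make the split step concrete, here is a minimal sketch of the arithmetic for a single loan. The function name `split_amounts` and the sample figures are made up for illustration. It assumes, as the pandas code in section 3.6 does, that split ratios are stored as percentage points and that results are reported in units of 10,000 yuan.

```python
def split_amounts(loan_amount, ratios_pct):
    """Return each branch's share of one loan, in units of 10,000 yuan.

    loan_amount: loan principal in yuan.
    ratios_pct:  split ratios in percentage points, e.g. [50, 30, 20].
    """
    return [loan_amount * r / 100 / 10_000 for r in ratios_pct]


# Hypothetical loan of 1,000,000 yuan shared 50/30/20 among three branches.
shares = split_amounts(1_000_000, [50, 30, 20])
print(shares)       # [50.0, 30.0, 20.0]
print(sum(shares))  # 100.0
```

Each share has to be computed per branch before any aggregation, which is why the split comes before the PivotTable.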
3. Python Optimizes Report Production
3.1 Load Data
Read the source Excel file with pandas, selecting only the necessary columns.
import pandas as pd
from datetime import datetime # needed for date filtering
data = pd.read_excel(r"E:\个人贷款客户信息表.xlsx", usecols=[1,4,6,7,8,9,10,11,12])
print(data.shape)  # (50585, 9)
3.2 Date Filtering
Keep only records whose contract effective date is after 2018‑12‑31 (i.e., 2019 data).
data = data[data["合同生效日"] > datetime(2018, 12, 31)]
print(data.shape)  # (1673, 9)
3.3 Split Tables
Create three sub‑tables, each containing purpose, loan amount, one unit column and its corresponding split ratio.
data1 = data[["用途", "贷款金额", "单位1", "分成比例1"]]
data2 = data[["用途", "贷款金额", "单位2", "分成比例2"]]
data3 = data[["用途", "贷款金额", "单位3", "分成比例3"]]
3.4 Rename Columns and Concatenate
Standardise column names across the three tables and stack them vertically.
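Because the split, rename, and stack steps repeat the same pattern for each unit column, they can also be written as a loop. The sketch below runs that pattern on a made-up two-loan table. The `toy` data and its values are invented for illustration; only the column names come from the article.

```python
import pandas as pd

# Made-up two-loan table using the article's column names.
toy = pd.DataFrame({
    "用途": ["housing", "business"],    # purpose
    "贷款金额": [1_000_000, 500_000],   # loan amount in yuan
    "单位1": ["A", "B"], "分成比例1": [50, 100],
    "单位2": ["B", None], "分成比例2": [30, None],
    "单位3": ["C", None], "分成比例3": [20, None],
})

parts = []
for i in (1, 2, 3):
    part = toy[["用途", "贷款金额", f"单位{i}", f"分成比例{i}"]]
    # Give every slice the same column names so the rows line up when stacked.
    part = part.rename(columns={f"单位{i}": "单位", f"分成比例{i}": "分成比例"})
    parts.append(part)

stacked = pd.concat(parts, ignore_index=True)
print(stacked.shape)  # (6, 4)
```

Written out table by table on the real data, the same step is: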
data1 = data1.rename(columns={"单位1": "单位", "分成比例1": "分成比例"})
data2 = data2.rename(columns={"单位2": "单位", "分成比例2": "分成比例"})
data3 = data3.rename(columns={"单位3": "单位", "分成比例3": "分成比例"})
data4 = pd.concat([data1, data2, data3], ignore_index=True)
3.5 Handle Missing Values
Drop rows where unit or split‑ratio is missing.
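Note that a bare `dropna()` removes a row if any column is missing, not only the unit or split ratio. If only those two columns should decide, the `subset` parameter restricts the check. A minimal sketch on made-up rows (the `toy` table is invented for illustration):

```python
import pandas as pd

# Made-up stacked rows: row 1 has no unit or split ratio,
# row 2 is missing only its purpose.
toy = pd.DataFrame({
    "用途": ["housing", "business", None],
    "贷款金额": [1_000_000, 500_000, 300_000],
    "单位": ["A", None, "C"],
    "分成比例": [50, None, 100],
})

any_missing = toy.dropna()                           # drops rows 1 and 2
unit_only = toy.dropna(subset=["单位", "分成比例"])   # drops row 1 only
print(len(any_missing), len(unit_only))  # 1 2
```

The article's code applies the bare form to the stacked table: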
data4 = data4.dropna()
data4.info()  # info() prints its summary itself and returns None, so no print() is needed
3.6 Insert Calculated Columns
Convert the split ratio from percentage points to a fraction (for example, 50 becomes 0.5), then compute each unit's share of the loan amount in units of 10,000 yuan.
# Insert percentage column
data4.insert(2, "分成百分比", data4["分成比例"] / 100)
# Compute split loan amount
data4["分成贷款金额"] = data4["贷款金额"] * data4["分成百分比"] / 10000
3.7 Pivot Table Generation
Create a pivot table that sums split loan amounts by unit and purpose.
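As a small illustration of what `pd.pivot_table` does here, the sketch below aggregates a made-up long-format table (the `toy` rows and values are invented). It also uses the `fill_value` parameter, an alternative to chaining `.fillna(0)` after the call.

```python
import pandas as pd

# Made-up long-format rows: one row per (unit, purpose) share.
toy = pd.DataFrame({
    "单位": ["A", "A", "B", "B"],
    "用途": ["housing", "housing", "business", "housing"],
    "分成贷款金额": [50.0, 10.0, 30.0, 20.0],
})

table = pd.pivot_table(
    toy,
    values="分成贷款金额",
    index="单位",
    columns="用途",
    aggfunc="sum",
    fill_value=0,  # unit/purpose pairs with no loans show 0 instead of NaN
)
print(table)
```

Unit A's two housing shares are summed into one cell, and A's missing business cell is filled with 0. On the full data, the article builds the table as follows: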
data5 = data4[["单位", "用途", "分成贷款金额"]]
result = pd.pivot_table(
data5,
values="分成贷款金额",
index="单位",
columns="用途",
aggfunc="sum",
).fillna(0).reset_index()
print(result.head())
The resulting DataFrame can be exported to Excel with to_excel() or further processed as needed.
By adjusting the date filter, you can generate daily, weekly, monthly, or quarterly reports simply by rerunning the script.
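One way to make the date filter adjustable is to wrap it in a small function that takes a start and end date. This is a minimal sketch, not the article's code: the helper name `filter_period`, the half-open window convention, and the sample contracts are all assumptions made for illustration.

```python
import pandas as pd


def filter_period(df, start, end, date_col="合同生效日"):
    """Keep rows whose date falls in the half-open window [start, end)."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    mask = (df[date_col] >= start) & (df[date_col] < end)
    return df[mask]


# Made-up contracts for illustration.
toy = pd.DataFrame({
    "合同生效日": pd.to_datetime(
        ["2019-03-01", "2019-03-04", "2019-03-08", "2019-03-20", "2019-04-02"]
    ),
    "贷款金额": [100_000, 200_000, 300_000, 400_000, 500_000],
})

day = pd.Timestamp("2019-03-04")
daily = filter_period(toy, day, day + pd.Timedelta(days=1))
weekly = filter_period(toy, day, day + pd.Timedelta(days=7))
monthly = filter_period(toy, "2019-03-01", "2019-04-01")
print(len(daily), len(weekly), len(monthly))  # 1 2 4
```

A half-open window avoids counting a contract twice when consecutive periods share a boundary date.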
Python Crawling & Data Mining
