Python xlwings & pandas tutorials for batch sorting, summarizing, and statistical analysis of Excel workbooks
This article presents a series of Python examples using xlwings and pandas to batch‑sort worksheets, filter and aggregate data across multiple workbooks, compute summary statistics, perform correlation, ANOVA, regression, and generate pivot tables and visualizations, illustrating practical Excel automation and data‑analysis techniques.
The guide demonstrates how to automate Excel tasks with xlwings and pandas . It starts with a script that opens a workbook, reads each sheet into a DataFrame, sorts the "销售利润" column in ascending order, writes the result back, and saves the file:
import xlwings as xw
import pandas as pd
app = xw.App(visible=True, add_book=False)
workbook = app.books.open(r'C:\...\产品销售统计表.xlsx')
for i in workbook.sheets:
values = i.range('A1').expand().options(pd.DataFrame).value
result = values.sort_values(by='销售利润')
i.range('A1').value = result
workbook.save()
workbook.close()
app.quit()Subsequent examples extend the concept to multiple workbooks, filtering data, grouping by product, and writing aggregated results to new sheets. One script merges all sheets, groups by "采购物品", and writes each group to a separate worksheet with a sum row.
import os, xlwings as xw, pandas as pd
app = xw.App(visible=True, add_book=False)
file_path = r'C:\...\采购表'
for file in os.listdir(file_path):
if file.endswith('.xlsx'):
wb = app.books.open(os.path.join(file_path, file))
# process each sheet ...
wb.save(); wb.close()
app.quit()Statistical analysis sections show how to compute descriptive statistics, correlation matrices, and perform ANOVA using pandas , statsmodels , and sklearn . For example, a correlation matrix is obtained with:
import pandas as pd
df = pd.read_excel(r'C:\...\相关性分析.xlsx', index_col='代理商编号')
result = df.corr()
print(result)ANOVA is performed by reshaping the data with melt() , fitting an OLS model, and calling anova_lm :
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
model = ols('Value~C(Treat)', data=df_melt).fit()
anova_table = anova_lm(model, typ=3)
print(anova_table)Regression examples use sklearn.linear_model.LinearRegression to fit a model predicting car sales from advertising spend, then compute the regression equation and a forecast for given ad budgets.
from sklearn import linear_model
model = linear_model.LinearRegression()
model.fit(X, y)
print('y = {:.2f} + {:.2f}*x1 + {:.2f}*x2'.format(model.intercept_, *model.coef_))
forecast = model.intercept_ + 20*model.coef_[0] + 30*model.coef_[1]
print('Forecast:', forecast)Visualization snippets illustrate creating pivot tables, box plots, and histograms with matplotlib , then inserting the figures back into Excel worksheets via xlwings .
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.