Fundamentals 22 min read

Python xlwings & pandas tutorials for batch sorting, summarizing, and statistical analysis of Excel workbooks

This article presents a series of Python examples using xlwings and pandas to batch‑sort worksheets, filter and aggregate data across multiple workbooks, compute summary statistics, perform correlation, ANOVA, regression, and generate pivot tables and visualizations, illustrating practical Excel automation and data‑analysis techniques.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Python xlwings & pandas tutorials for batch sorting, summarizing, and statistical analysis of Excel workbooks

The guide demonstrates how to automate Excel tasks with xlwings and pandas . It starts with a script that opens a workbook, reads each sheet into a DataFrame, sorts the "销售利润" column in ascending order, writes the result back, and saves the file:

import xlwings as xw
import pandas as pd
app = xw.App(visible=True, add_book=False)
workbook = app.books.open(r'C:\...\产品销售统计表.xlsx')
for i in workbook.sheets:
    values = i.range('A1').expand().options(pd.DataFrame).value
    result = values.sort_values(by='销售利润')
    i.range('A1').value = result
workbook.save()
workbook.close()
app.quit()

Subsequent examples extend the concept to multiple workbooks, filtering data, grouping by product, and writing aggregated results to new sheets. One script merges all sheets, groups by "采购物品", and writes each group to a separate worksheet with a sum row.

import os, xlwings as xw, pandas as pd
app = xw.App(visible=True, add_book=False)
file_path = r'C:\...\采购表'
for file in os.listdir(file_path):
    if file.endswith('.xlsx'):
        wb = app.books.open(os.path.join(file_path, file))
        # process each sheet ...
        wb.save(); wb.close()
app.quit()

Statistical analysis sections show how to compute descriptive statistics, correlation matrices, and perform ANOVA using pandas , statsmodels , and sklearn . For example, a correlation matrix is obtained with:

import pandas as pd
df = pd.read_excel(r'C:\...\相关性分析.xlsx', index_col='代理商编号')
result = df.corr()
print(result)

ANOVA is performed by reshaping the data with melt() , fitting an OLS model, and calling anova_lm :

from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
model = ols('Value~C(Treat)', data=df_melt).fit()
anova_table = anova_lm(model, typ=3)
print(anova_table)

Regression examples use sklearn.linear_model.LinearRegression to fit a model predicting car sales from advertising spend, then compute the regression equation and a forecast for given ad budgets.

from sklearn import linear_model
model = linear_model.LinearRegression()
model.fit(X, y)
print('y = {:.2f} + {:.2f}*x1 + {:.2f}*x2'.format(model.intercept_, *model.coef_))
forecast = model.intercept_ + 20*model.coef_[0] + 30*model.coef_[1]
print('Forecast:', forecast)

Visualization snippets illustrate creating pivot tables, box plots, and histograms with matplotlib , then inserting the figures back into Excel worksheets via xlwings .

machine learningPythonstatisticsdata analysispandasExcel Automationxlwings
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.