How to Perform One-Way ANOVA in Python: A Step-by-Step Guide

This article explains the concept of one‑way ANOVA, walks through a real‑world example comparing four manufacturing processes, and demonstrates how to conduct the analysis in Python using statsmodels, interpreting the resulting F‑statistic and p‑value to assess significance.

Model Perspective

One‑Way ANOVA

In a single-factor (one-way) experiment, only one factor, A, is varied while all other conditions are held constant. The goal is to decide whether the different levels of A lead to statistically significant differences in the response variable. Formally, this is a test of whether the means of several independent normal populations, assumed to share a common variance, are equal.
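In symbols, the setup described above can be sketched as follows. The notation is introduced here for illustration and is not part of the original text: r is the number of levels of A, n_i the number of observations at level i, X_ij the j-th observation at level i, mu_i the mean at level i, and sigma^2 the common variance.

```latex
\begin{align*}
X_{ij} &= \mu_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2) \ \text{independently}, \quad i = 1, \dots, r, \; j = 1, \dots, n_i \\
H_0 &: \mu_1 = \mu_2 = \cdots = \mu_r \qquad \text{versus} \qquad H_1 : \text{not all } \mu_i \text{ are equal}
\end{align*}
```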

Example

Four manufacturing processes (A1–A4) are used to produce light bulbs. The lifetimes of bulbs from each process are measured, producing the data shown below.
A1: 1620, 1670, 1700, 1750, 1800
A2: 1580, 1600, 1640, 1720
A3: 1460, 1540, 1620, 1680
A4: 1500, 1550, 1610

The data constitute four independent samples; we test the null hypothesis that the four population means are equal.

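The total sum of squares is decomposed into a between-group component (variation of the group means around the grand mean) and a within-group component (variation of the observations around their own group mean). The ANOVA F-statistic is the ratio of the between-group mean square to the within-group mean square. If the corresponding p-value is below the chosen significance level (e.g., 0.05), the null hypothesis is rejected, indicating that factor A has a significant effect.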
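To make the decomposition concrete, here is a minimal sketch that computes the sums of squares and the F-statistic by hand for the bulb data, using only the Python standard library. The variable names (`groups`, `ss_between`, `ss_within`, and so on) are chosen for this illustration and are not from the original article.

```python
# Bulb lifetimes per manufacturing process, copied from the example data
groups = {
    "A1": [1620, 1670, 1700, 1750, 1800],
    "A2": [1580, 1600, 1640, 1720],
    "A3": [1460, 1540, 1620, 1680],
    "A4": [1500, 1550, 1610],
}

all_values = [v for values in groups.values() for v in values]
n_total = len(all_values)    # 16 observations in total
n_groups = len(groups)       # 4 levels of factor A
grand_mean = sum(all_values) / n_total


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Between-group sum of squares: group means around the grand mean,
# weighted by group size
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

# Within-group sum of squares: observations around their own group mean
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

# Total sum of squares: observations around the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

df_between = n_groups - 1        # 3
df_within = n_total - n_groups   # 12

# F is the ratio of the between-group to the within-group mean square
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.1f} (df = {df_between})")
print(f"SS within  = {ss_within:.1f} (df = {df_within})")
print(f"SS total   = {ss_total:.1f}")
print(f"F          = {f_stat:.4f}")
```

Under the null hypothesis the statistic follows an F distribution with (3, 12) degrees of freedom, which is where the p-value comes from.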

In practice the calculations are often performed with software. The following Python code uses statsmodels to carry out a one‑way ANOVA on the data.

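<code>import numpy as np
import statsmodels.api as sm

# Response: bulb lifetimes, listed in the order A1, A2, A3, A4
y = np.array([1620,1670,1700,1750,1800,1580,1600,1640,1720,1460,1540,1620,1680,1500,1550,1610])
# Factor: process label (1-4), repeated once per observation in each group
x = np.repeat([1, 2, 3, 4], [5, 4, 4, 3])
d = {'x': x, 'y': y}
# C(x) tells the formula interface to treat x as a categorical factor
model = sm.formula.ols("y ~ C(x)", d).fit()
# Build the ANOVA table (degrees of freedom, sums of squares, F, p-value)
anova_res = sm.stats.anova_lm(model)
print(anova_res)
</code>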

The output is:

            df   sum_sq   mean_sq        F     PR(>F)
C(x)         3  60153.3   20051.1  3.72774  0.0420037
Residual    12  64546.7   5378.89      NaN        NaN

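Since the p-value (0.042) is less than 0.05, we reject the null hypothesis at the 5% significance level and conclude that the manufacturing process has a statistically significant effect on mean bulb lifetime. Note that the test only says that the four means are not all equal; it does not identify which processes differ from one another.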
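As a quick cross-check, the same test can be run with SciPy instead of statsmodels. This is a sketch using a different library from the one in the article: `scipy.stats.f_oneway` takes one sequence per group and returns the F-statistic and p-value directly, without building a full ANOVA table. The variable names are chosen for this illustration.

```python
from scipy import stats

# One list of bulb lifetimes per manufacturing process
a1 = [1620, 1670, 1700, 1750, 1800]
a2 = [1580, 1600, 1640, 1720]
a3 = [1460, 1540, 1620, 1680]
a4 = [1500, 1550, 1610]

# One-way ANOVA across the four independent samples
f_stat, p_value = stats.f_oneway(a1, a2, a3, a4)

alpha = 0.05  # significance level used in the article
print(f"F = {f_stat:.4f}, p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The F-statistic and p-value should agree with the `C(x)` row of the statsmodels table above, up to rounding.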

Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
