
How to Perform Two-Way ANOVA with Python’s statsmodels: Theory and Code

This article explains the theory behind two‑factor ANOVA, distinguishes cases with and without interaction, presents the mathematical model, and demonstrates a complete Python implementation using statsmodels, including data setup, model fitting, and interpretation of the ANOVA table.

Model Perspective

Jul 15, 2022


Two-Factor ANOVA

When two factors may affect a response variable, a two‑factor ANOVA is used. The basic idea is to select several levels for each factor, conduct experiments for every combination of factor levels, and then analyze the variance of the collected data.

Mathematical Model

Assume factor A has a levels and factor B has b levels. For each combination of levels (i, j), the population is assumed to be normally distributed, with the same variance across all combinations. If n replicates are taken at each combination, the k-th observation Y_{ijk} (k = 1, ..., n) is normally distributed, and all observations are independent. The model can be written as:

Y_{ijk}=\mu+\alpha_i+\beta_j+(\alpha\beta)_{ij}+\epsilon_{ijk}

where \mu is the overall mean, \alpha_i is the effect of the i-th level of factor A, \beta_j is the effect of the j-th level of factor B, (\alpha\beta)_{ij} is the interaction effect of that pair of levels, and \epsilon_{ijk} is an independent, normally distributed random error with mean zero and constant variance.
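As a rough illustration of the model above, the following sketch simulates observations Y_{ijk} from it. All numbers (levels, effects, error standard deviation) are hypothetical and chosen only for the example. The effects are centered so that they sum to zero, which is a common identifiability convention and an assumption of this sketch rather than something the text states.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

a, b, n = 3, 2, 4   # hypothetical: levels of A, levels of B, replicates per cell
mu = 10.0           # hypothetical overall mean
sigma = 1.0         # hypothetical error standard deviation

# Hypothetical effects, centered so that each set sums to zero.
alpha = np.array([-1.0, 0.0, 1.0])   # factor A effects
beta = np.array([-0.5, 0.5])         # factor B effects
ab = np.array([[0.3, -0.3],
               [0.0, 0.0],
               [-0.3, 0.3]])         # interaction effects; rows and columns sum to zero

# Cell means mu + alpha_i + beta_j + (alpha beta)_ij, shape (a, b).
cell_means = mu + alpha[:, None] + beta[None, :] + ab

# Y_ijk = cell mean + eps_ijk, with eps_ijk drawn from N(0, sigma^2); shape (a, b, n).
eps = rng.normal(0.0, sigma, size=(a, b, n))
Y = cell_means[:, :, None] + eps

print(Y.shape)
print(cell_means)
```

Setting every entry of `ab` to zero gives data from the additive (no-interaction) model.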

Two-Way ANOVA Without Interaction

If prior knowledge indicates that the two factors do not interact, the experiment can be run without replication (one observation per combination), which simplifies the analysis. The model reduces to:

Y_{ij}=\mu+\alpha_i+\beta_j+\epsilon_{ij}

The total sum of squares is decomposed into the sums of squares for factor A, factor B, and error, with a-1, b-1, and (a-1)(b-1) degrees of freedom respectively. The test statistic for each factor is the ratio of that factor's mean square to the error mean square. Under the null hypothesis that the factor has no effect, this ratio follows an F-distribution with the factor's and the error's degrees of freedom.
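The decomposition for the no-interaction case can be sketched by hand with NumPy and SciPy. The 4 x 3 table below is hypothetical data invented for illustration (rows are levels of A, columns are levels of B, one observation per cell); it is not taken from the source.

```python
import numpy as np
from scipy import stats

# Hypothetical a x b table, one observation per cell (no replication).
Y = np.array([[12.0, 14.0, 15.0],
              [10.0, 11.0, 13.0],
              [9.0, 12.0, 12.0],
              [13.0, 15.0, 17.0]])
a, b = Y.shape

grand = Y.mean()
row_means = Y.mean(axis=1)  # mean for each level of A
col_means = Y.mean(axis=0)  # mean for each level of B

# Sums of squares: total, factor A, factor B, and error.
ss_total = ((Y - grand) ** 2).sum()
ss_a = b * ((row_means - grand) ** 2).sum()
ss_b = a * ((col_means - grand) ** 2).sum()
resid = Y - row_means[:, None] - col_means[None, :] + grand
ss_e = (resid ** 2).sum()

# Degrees of freedom and F-ratios against the error mean square.
df_a, df_b, df_e = a - 1, b - 1, (a - 1) * (b - 1)
ms_e = ss_e / df_e
f_a = (ss_a / df_a) / ms_e
f_b = (ss_b / df_b) / ms_e

# Upper-tail probabilities of the F-distribution give the p-values.
p_a = stats.f.sf(f_a, df_a, df_e)
p_b = stats.f.sf(f_b, df_b, df_e)

print(f"A: SS={ss_a:.3f}, df={df_a}, F={f_a:.3f}, p={p_a:.4f}")
print(f"B: SS={ss_b:.3f}, df={df_b}, F={f_b:.3f}, p={p_b:.4f}")
print(f"Error: SS={ss_e:.3f}, df={df_e}")
```

The three components add up to `ss_total`, which is a quick sanity check on the arithmetic.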

Two-Way ANOVA With Interaction

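When interaction may exist, the full model includes the interaction term, and each combination needs at least two replicates (n ≥ 2) so that the interaction can be separated from random error. The total sum of squares is partitioned into four components: the factor A, factor B, interaction, and error sums of squares, with a-1, b-1, (a-1)(b-1), and ab(n-1) degrees of freedom respectively. Each of the first three mean squares is divided by the error mean square to give an F-statistic for testing that effect.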
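The four-way partition can also be sketched by hand. The block below uses the yield data from the statsmodels example in this article, reshaped so that the axes are (concentration, replicate, temperature); the variable names are chosen here for illustration. Because the design is balanced, these sums of squares should agree with the table that `anova_lm` produces, though that is worth confirming against your own output.

```python
import numpy as np
from scipy import stats

# Yield data from the article: rows 1-2, 3-4, 5-6 are the three concentrations,
# columns are the four temperatures. Reshape to (concentration, replicate, temperature).
y = np.array([[11, 11, 13, 10],
              [10, 11, 9, 12],
              [9, 10, 7, 6],
              [7, 8, 11, 10],
              [5, 13, 12, 14],
              [11, 14, 13, 10]], dtype=float).reshape(3, 2, 4)
n_conc, n_rep, n_temp = y.shape

grand = y.mean()
cell_means = y.mean(axis=1)        # shape (3, 4): one mean per combination
temp_means = y.mean(axis=(0, 1))   # factor A (temperature), shape (4,)
conc_means = y.mean(axis=(1, 2))   # factor B (concentration), shape (3,)

ss_total = ((y - grand) ** 2).sum()
ss_a = n_conc * n_rep * ((temp_means - grand) ** 2).sum()
ss_b = n_temp * n_rep * ((conc_means - grand) ** 2).sum()
inter = cell_means - conc_means[:, None] - temp_means[None, :] + grand
ss_ab = n_rep * (inter ** 2).sum()
ss_e = ((y - cell_means[:, None, :]) ** 2).sum()

df_a = n_temp - 1
df_b = n_conc - 1
df_ab = df_a * df_b
df_e = n_temp * n_conc * (n_rep - 1)
ms_e = ss_e / df_e

# F-ratio and upper-tail p-value for each effect.
for name, ss, df in [("A (temperature)", ss_a, df_a),
                     ("B (concentration)", ss_b, df_b),
                     ("A x B interaction", ss_ab, df_ab)]:
    f_stat = (ss / df) / ms_e
    p_val = stats.f.sf(f_stat, df, df_e)
    print(f"{name}: SS={ss:.3f}, df={df}, F={f_stat:.3f}, p={p_val:.4f}")
print(f"Error: SS={ss_e:.3f}, df={df_e}")
```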

Implementation with statsmodels

The example below uses a chemical process measured at three concentration levels and four temperature levels, with two replicates at each combination (24 observations in total). It tests whether the yield differs significantly across temperatures (factor A) and across concentrations (factor B), and whether the two factors interact significantly.

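import numpy as np
import statsmodels.api as sm

# Observed yields: each pair of rows (1-2, 3-4, 5-6) is one concentration level,
# each column is one temperature level, so every combination has 2 replicates.
y = np.array([[11,11,13,10],
              [10,11,9,12],
              [9,10,7,6],
              [7,8,11,10],
              [5,13,12,14],
              [11,14,13,10]]).flatten()

# Factor A (temperature): column index 1..4, repeated for each of the 6 rows
A = np.tile(np.arange(1,5), (6,1)).flatten()
# Factor B (concentration): levels 1, 2, 3, each covering 8 consecutive values
B = np.tile(np.arange(1,4).reshape(3,1), (1,8)).flatten()

d = {'x1': A,   # temperature level
     'x2': B,   # concentration level
     'y': y}    # yield

# C(...) marks a variable as categorical; C(x1):C(x2) is the interaction term
model = sm.formula.ols("y~C(x1)+C(x2)+C(x1):C(x2)", d).fit()
# Build the two-factor ANOVA table from the fitted model
anovat = sm.stats.anova_lm(model)
print(anovat)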

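The resulting ANOVA table has one row each for factor A (C(x1)), factor B (C(x2)), their interaction (C(x1):C(x2)), and the residual error. Every row gives the degrees of freedom, sum of squares, and mean square; the three effect rows also give an F-statistic and a p-value. An effect is judged significant when its p-value falls below the chosen significance level.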

Reference: 司守奎 (Si Shoukui), 孙玺菁 (Sun Xijing), Python数学实验与建模 (Python Mathematical Experiments and Modeling).


Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
