Why Identical Statistics Can Hide Very Different Data: The Lesson of Anscombe’s Quartet
Anscombe’s Quartet shows that four data sets can share nearly identical means, variances, regression lines and correlation coefficients yet produce completely different scatter plots. It is a standard illustration of why visualization is crucial and why summary statistics alone can mislead analysts.
Anscombe’s Quartet
Anscombe’s Quartet, introduced by statistician Francis Anscombe in 1973, consists of four distinct two‑dimensional data sets, each containing 11 (x, y) pairs. Their statistical summaries (means, variances, the regression line y = 3 + 0.5x, and a correlation coefficient r ≈ 0.82, i.e. r² ≈ 0.67) are virtually identical, yet their visual patterns differ dramatically.
Statistical characteristics shared by the four sets:
Mean: average x = 9, average y ≈ 7.5.
Variance: sample variance of x = 11, sample variance of y ≈ 4.1.
Linear regression: the same fitted line y = 3 + 0.5x, with correlation r ≈ 0.82 (r² ≈ 0.67).
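The shared statistics listed above can be checked directly. The following is a minimal sketch using only Python's standard library; the data values are the ones Anscombe published, while the names `QUARTET`, `summarize` and `SUMMARIES` are illustrative choices, not part of any library.

```python
from statistics import mean, variance

# Anscombe's (1973) published values; sets I-III share the same x column.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, the least-squares line and Pearson r."""
    mx, my = mean(xs), mean(ys)
    vx, vy = variance(xs), variance(ys)  # sample variance (n - 1 denominator)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    slope = cov / vx                     # ordinary least-squares slope
    intercept = my - slope * mx
    r = cov / (vx * vy) ** 0.5           # Pearson correlation coefficient
    return {"mean_x": mx, "mean_y": my, "var_x": vx, "var_y": vy,
            "slope": slope, "intercept": intercept, "r": r}


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, s in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in s.items()})
```

Rounded to two decimals, the four printed rows should be close to indistinguishable, which is exactly the point: nothing in these numbers hints at how different the underlying data are.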
Plotting Reveals the Truth
Scatter‑plot visualizations show distinct shapes:
Dataset 1: points scatter loosely around a straight line, the kind of noisy linear relationship that simple regression is designed for.
Dataset 2: despite the same regression result, points follow a clear curved pattern, exposing a non‑linear relationship.
Dataset 3: all points but one lie almost exactly on a straight line, and that single outlier pulls the fitted regression line away from the trend the other ten points follow.
Dataset 4: ten of the eleven points share the same x value (x = 8), so they offer no horizontal variation at all; a single point at x = 19 determines the slope on its own and forces the same equation as the other sets, which makes the regression line misleading.
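The four shapes described above can be reproduced without any plotting dependency. The sketch below draws coarse text scatter plots using only the standard library; `ascii_scatter` is a hypothetical helper written for this illustration, and its grid size and axis ranges are arbitrary assumptions. In practice a plotting library such as Matplotlib would be the usual choice.

```python
# Anscombe's (1973) published values; sets I-III share the same x column.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def ascii_scatter(xs, ys, width=40, height=12, x_range=(3, 20), y_range=(2, 14)):
    """Mark each (x, y) point with '*' on a fixed-size character grid.

    Hypothetical helper: the shared axis ranges are chosen so that all four
    data sets fit and can be compared on the same scale.
    """
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((y - y_lo) / (y_hi - y_lo) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top line, so flip y
    return "\n".join("".join(line) for line in grid)


for name, (xs, ys) in QUARTET.items():
    print(f"Dataset {name}")
    print(ascii_scatter(xs, ys))
    print("-" * 40)
```

Even at this resolution the differences should be visible: a loose diagonal band, an arch, a tight line with one stray point, and a vertical stack with one point far to the right.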
Takeaway
The quartet demonstrates that relying solely on summary statistics such as means, variances, or correlation coefficients can be deceptive. Visual inspection is essential to uncover underlying patterns, outliers, or non‑linear relationships that numbers alone may hide. In practice, always complement statistical analysis with appropriate visualizations.
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".