Just Say No to More End-to-End Tests
End-to-end tests are appealing because they mimic real user scenarios, but in practice they tend to be slow, unreliable, and costly: teams miss bugs, waste time, and struggle with long feedback loops. A balanced strategy that follows the test pyramid and emphasizes fast, reliable unit and integration tests yields better quality and faster releases.
Theoretical End-to-End Testing
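A strategy that relies heavily on end-to-end (E2E) tests sounds reasonable because the tests simulate real user flows, in line with the principle "focus on the user and all else will follow." Developers like it because it offloads most of the testing to others, managers like it because a failing test maps directly to user impact, and testers like it because it eases the worry of missing a bug that real users would hit. In practice, however, the strategy is problematic.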
Actual End-to-End Testing
Consider a team building an online document‑editing service similar to Google Docs. Each night they build the latest version, deploy it to a test environment, run all E2E tests, and email a summary report. With a deadline looming and a rule that at least 90% of E2E tests must pass, the following timeline illustrates what went wrong:
Days to Deadline | Pass Rate | What Happened That Day
1 | 5% | Everything broke; login was down, causing almost all tests to fail.
0 | 4% | A partner team deployed a low‑quality build to their test environment.
-1 | 54% | A developer broke the "save" feature; half the tests required saving documents.
-2 | 54% | The bug was identified as a front‑end issue after half a day of investigation.
-3 | 54% | An incorrect fix was submitted but quickly replaced with a correct one.
-4 | 1% | Hardware failure in the data‑center hosting the test environment.
-5 | 84% | Many small bugs had been hidden behind the large bugs (e.g., login failures, save failures).
-6 | 87% | The target of >90% pass rate was not reached, for unknown reasons.
-7 | 89.54% | Rounds up to 90%, close enough. No fixes were submitted the previous day, so the earlier failures must have come from flaky tests.
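The last row of the timeline infers flakiness from a pass rate that moved without any code change. A minimal sketch of that reasoning follows, assuming a hypothetical result format (test name mapped to pass/fail) and made-up test names; it is an illustration, not the tooling the team in the example used.

```python
from typing import Dict, Set

# Hypothetical data shape: test name -> True if the test passed in that run.
RunResults = Dict[str, bool]


def pass_rate(results: RunResults) -> float:
    """Return the percentage of tests that passed in one run."""
    if not results:
        return 0.0
    return 100.0 * sum(results.values()) / len(results)


def flaky_candidates(previous: RunResults, current: RunResults) -> Set[str]:
    """Return tests whose outcome changed between two runs.

    Only meaningful when no code changes landed between the runs;
    otherwise a changed outcome may be a real fix or a real regression.
    """
    shared = previous.keys() & current.keys()
    return {name for name in shared if previous[name] != current[name]}


# Illustrative results only, not data from the article.
night_1 = {"test_login": True, "test_save": False,
           "test_share": True, "test_export": False}
night_2 = {"test_login": True, "test_save": True,
           "test_share": True, "test_export": False}

rate_1 = pass_rate(night_1)
rate_2 = pass_rate(night_2)
suspects = sorted(flaky_candidates(night_1, night_2))
print(rate_1, rate_2, suspects)
```

If no fixes were submitted between the two nights, every name in `suspects` is a candidate flaky test rather than a fixed bug.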
Analysis
The E2E strategy uncovered real defects, but it also introduced several problems:
The team reached its coding milestone a week late and worked considerable overtime.
Root‑cause analysis of failing E2E tests was painful and time‑consuming.
Partner‑team issues and hardware failures corrupted several days of results.
Large bugs masked many smaller ones.
E2E tests proved unreliable.
Developers often had to wait until the next day to verify a fix.
Thus, while the strategy finds bugs, it fails to provide fast, reliable feedback.
The Real Value of Testing
A failing test does not directly benefit users; only a fixed bug does. The value chain is: a test fails → the bug is discovered → the bug is fixed → user value increases.
Stage | Failed Test? | Bug Found? | Bug Fixed?
Does it add value? | No | No | Yes
Establish the Right Feedback Loop
An effective feedback loop must be:
Fast – developers should not wait hours or days for results.
Reliable – flaky tests erode trust and are ignored.
Isolates failures – a failure should narrow the search to the specific lines of code that caused the bug.
Think Smaller, Not Bigger
Instead of large, brittle E2E suites, focus on smaller, faster tests:
1. Unit Tests
Fast – only a small unit has to be built, and the tests themselves are small; a tenth of a second is considered slow for a unit test.
Reliable – isolated code paths are less prone to instability.
Isolate failures – a failing unit test points directly to the problematic unit.
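As an illustration of these three properties, here is a minimal sketch of a unit test using Python's standard `unittest` module. The `word_count` function is a hypothetical unit invented for this example; it does not come from the original article.

```python
import unittest


def word_count(text: str) -> int:
    """Count whitespace-separated words (hypothetical unit under test)."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    def test_empty_document_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_counts_words_separated_by_mixed_whitespace(self):
        self.assertEqual(word_count("fast  reliable\nisolated"), 3)


# Run the tests programmatically so the outcome is available as a value.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordCountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The test needs no deployment or shared environment, and a failure can only come from `word_count` or from the test itself.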
2. Unit Tests vs. End-to-End Tests
E2E tests require building the whole product, deploying it, and then running the suite, which makes them slow and fragile. While they model real user scenarios well, their drawbacks often outweigh the benefits.
3. Integration Tests
Unit tests alone cannot guarantee that components work together; integration tests fill that gap by testing a small group of units as a whole.
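A minimal sketch of such a test, assuming two hypothetical units from the document-editing example (`Editor` and `InMemoryStore`, both invented here for illustration): the test exercises the two together, without building or deploying the whole product.

```python
import unittest


class InMemoryStore:
    """Hypothetical storage unit: keeps document text by id."""

    def __init__(self) -> None:
        self._docs = {}

    def save(self, doc_id: str, text: str) -> None:
        self._docs[doc_id] = text

    def load(self, doc_id: str) -> str:
        return self._docs[doc_id]


class Editor:
    """Hypothetical editing unit that persists its buffer through a store."""

    def __init__(self, store: InMemoryStore, doc_id: str) -> None:
        self._store = store
        self._doc_id = doc_id
        self._buffer = ""

    def type(self, text: str) -> None:
        self._buffer += text

    def save(self) -> None:
        self._store.save(self._doc_id, self._buffer)


class EditorStoreIntegrationTest(unittest.TestCase):
    def test_saved_text_can_be_reloaded(self):
        store = InMemoryStore()
        editor = Editor(store, "doc-1")
        editor.type("hello")
        editor.save()
        self.assertEqual(store.load("doc-1"), "hello")

    def test_unsaved_edits_are_not_persisted(self):
        store = InMemoryStore()
        editor = Editor(store, "doc-1")
        editor.type("hello")
        editor.save()
        editor.type(" world")  # typed after the last save
        self.assertEqual(store.load("doc-1"), "hello")


# Run the tests programmatically so the outcome is available as a value.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(EditorStoreIntegrationTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because only two units are involved, a failure here points at the editor, the store, or the contract between them, which is a far smaller search space than a failing E2E test.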
Test Pyramid
The test pyramid visualizes the ideal balance: the majority of tests are fast, reliable unit tests at the base, a smaller set of integration tests in the middle, and a thin layer of E2E tests at the top.
As a first guess, Google often suggests a 70/20/10 split (70% unit, 20% integration, 10% E2E). The exact mix differs from team to team, but it should keep the pyramid shape.
Avoid anti‑patterns such as:
Inverted pyramid / ice‑cream cone – over‑reliance on E2E tests, few unit tests.
Hourglass – many unit tests at the bottom, many E2E tests at the top, but too few integration tests in the middle.
Just as a physical pyramid tends to be the most stable structure, the test pyramid tends to be the most stable testing strategy.
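To check a suite against the 70/20/10 guideline and the anti-patterns above, the test counts can be turned into percentages and a rough shape label. The sketch below is one way to do that; the `SuiteCounts` type, the label strings, and the strict ordering rules are assumptions made for this example, not definitions from the original article.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SuiteCounts:
    """Number of tests at each level (hypothetical container)."""
    unit: int
    integration: int
    e2e: int


def shares(counts: SuiteCounts) -> Tuple[float, float, float]:
    """Return (unit, integration, e2e) as percentages of the whole suite."""
    total = counts.unit + counts.integration + counts.e2e
    if total == 0:
        raise ValueError("suite has no tests")
    return (100.0 * counts.unit / total,
            100.0 * counts.integration / total,
            100.0 * counts.e2e / total)


def shape(counts: SuiteCounts) -> str:
    """Label the suite's shape using simple ordering rules (an assumption)."""
    if counts.unit > counts.integration > counts.e2e:
        return "pyramid"
    if counts.e2e > counts.integration > counts.unit:
        return "inverted pyramid"
    if counts.integration < counts.unit and counts.integration < counts.e2e:
        return "thin middle"  # many unit and E2E tests, few integration tests
    return "other"


balanced = SuiteCounts(unit=700, integration=200, e2e=100)
top_heavy = SuiteCounts(unit=50, integration=100, e2e=400)
no_middle = SuiteCounts(unit=500, integration=20, e2e=300)

print(shares(balanced), shape(balanced))
print(shape(top_heavy))
print(shape(no_middle))
```

A team could run a check like this periodically to see whether new tests are drifting toward the top of the pyramid.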
Further Reading
Test Pyramid and Its Anti‑Patterns
Using Hermetic Servers for End‑to‑End Testing
Google's "Testing on the Toilet" Series (Part 2): Writing Good End‑to‑End Tests
Original Author: Mike Wacker
Original Link: Just Say No to More End-to-End Tests
Published: April 22, 2015