Operations · 16 min read

Understanding Software Quality Metrics and the Pitfalls of Misguided Measurement

The article uses a humorous story about two development teams manipulating KLOC defect rates to illustrate Goodhart's law and the McNamara fallacy, then presents a comprehensive set of software quality metrics and guidelines for selecting meaningful measurements that truly improve product quality.


Bill Gates once compared measuring software productivity by lines of code to measuring an aircraft's progress by its weight. The comparison highlights the danger of inappropriate metrics such as the defects-per-thousand-lines (KLOC) rate, and a joke about that metric sets the stage for the discussion.

Two development teams try to game the KLOC defect metric: Team A rewrites well-designed code in a deliberately primitive style to inflate the line count, while Team B goes further and adds the entire Tomcat source tree to the project. Both moves render the codebase unmaintainable, and the project ultimately fails, taking the company with it.
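The arithmetic behind the gaming is simple: defects per KLOC has line count in the denominator, so padding the codebase "improves" the score without fixing a single bug. A minimal sketch (the figures and the helper function are hypothetical, for illustration only):

```python
# Hypothetical illustration of gaming a defects-per-KLOC metric:
# the same 50 defects look "better" once the line count is inflated.

def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Defect rate normalized per thousand lines of code."""
    return defects / (lines_of_code / 1000)

defects = 50
original = defects_per_kloc(defects, 20_000)   # well-designed codebase
padded = defects_per_kloc(defects, 200_000)    # same code, rewritten verbosely

print(original)  # 2.5 defects per KLOC
print(padded)    # 0.25 defects per KLOC: a tenfold "improvement", quality unchanged
```

The metric moved by an order of magnitude while the product got strictly worse, which is exactly the Goodhart's-law failure mode the story dramatizes.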

The narrative introduces Goodhart's law, explaining that once a metric becomes a target it loses its usefulness, illustrated by an Indian cobra‑bounty story where rewarding snake kills creates perverse incentives that worsen the problem.

It then describes the McNamara fallacy, measuring what is easy rather than what is important, outlining its four steps: measure what can easily be measured, disregard what cannot, presume that what cannot be measured is unimportant, and finally deny that it exists at all.

To counteract metric gaming, the article suggests designing "lie‑detector" questions that verify whether a metric truly reflects the intended behavior.

A detailed list of quality measurement indicators follows, each explained with its purpose and ideal values: card opening and verification success rates; SonarQube technical debt categories; unit-test and code-coverage thresholds; defect counts (offline and online); release failure rate; defect detection percentage; defect resolution rate; mean time between failures; defect density; average change lead time; build queue time; daily merge counts; average code changes per release; patch frequency; and mean time to repair.
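Several of the indicators above are simple ratios. A minimal sketch of four of them; the function names and input figures are hypothetical, not taken from the article:

```python
# Illustrative formulas for a few of the quality indicators listed above.
# All inputs are made-up example figures.

def defect_density(defects: int, kloc: float) -> float:
    """Defects per thousand lines of code (lower is better)."""
    return defects / kloc

def defect_detection_percentage(pre_release: int, post_release: int) -> float:
    """Share of all defects caught before release (higher is better)."""
    return 100 * pre_release / (pre_release + post_release)

def mean_time_between_failures(uptime_hours: float, failures: int) -> float:
    """Average operating time between failures (higher is better)."""
    return uptime_hours / failures

def mean_time_to_repair(total_repair_hours: float, failures: int) -> float:
    """Average time to restore service after a failure (lower is better)."""
    return total_repair_hours / failures

print(defect_density(30, 120))              # 0.25 defects per KLOC
print(defect_detection_percentage(90, 10))  # 90.0 percent caught pre-release
print(mean_time_between_failures(720, 3))   # 240.0 hours between failures
print(mean_time_to_repair(6, 3))            # 2.0 hours per repair
```

Note that each of these ratios is individually gameable (defect density invites the KLOC padding from the opening story), which is why the article pairs every metric with a purpose and a verifying "lie-detector" question.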

The conclusion warns that metric selection must avoid the traps of Goodhart's law and the McNamara fallacy, emphasizing that metrics should be chosen to suit the team's context and to provide real insight rather than merely incentivizing gaming.

Tags: Testing, DevOps, software metrics, Goodhart's law, quality measurement, McNamara fallacy
Written by DevOps
Shares premium content and events on trends, applications, and practices in development efficiency, AI, and related technologies. The IDCF (International DevOps Coach Federation) trains end-to-end development-efficiency talent, connecting high-performance organizations and individuals in the pursuit of excellence.
