How MySQL Chooses the Cheapest Index for COUNT(*) and When It Gets It Wrong

This article examines whether SELECT COUNT(*) causes a full‑table scan, explains how MySQL’s cost‑based optimizer selects an index from IO and CPU costs, demonstrates on a 100k‑row table which secondary index is chosen, and shows a case where the optimizer’s estimates lead it to a slower plan.

dbaplus Community

Apr 10, 2024

Many developers wonder whether SELECT COUNT(*) without a WHERE clause forces a full‑table scan. It does not have to read the clustered (primary key) index: on InnoDB tables, MySQL 5.6+ can satisfy such a query by scanning whichever secondary (auxiliary) index it estimates to be cheapest.

SQL Index Cost Calculation

The optimizer evaluates two main costs:

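IO cost: the cost of reading a data page from disk into memory, 1 per page by default. MySQL reads whole pages rather than individual rows, following the principle of locality.

CPU cost: the cost of processing rows once they are in memory (for example, evaluating the query conditions), 0.2 per row by default in MySQL 5.7, the version used below.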
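
In MySQL 5.7 these constants are exposed as configurable cost values. As a rough sketch, assuming a 5.7 server and read access to the mysql schema, the following queries list them; a NULL cost_value means the built-in default is in effect. Mapping row_evaluate_cost to the CPU cost and io_block_read_cost to the IO cost described above is an interpretation, not something the article states.

```sql
-- Per-row (CPU) cost constants; row_evaluate_cost defaults to 0.2 in MySQL 5.7.
SELECT cost_name, cost_value FROM mysql.server_cost;

-- Per-page (IO) cost constants; io_block_read_cost defaults to 1.0 in MySQL 5.7.
SELECT engine_name, cost_name, cost_value FROM mysql.engine_cost;
```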

Example Demonstration

We create a table person (MySQL 5.7.18) with a primary key and two secondary indexes: name_score and create_time.

CREATE TABLE `person` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `score` int(11) NOT NULL,
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `name_score` (`name`(191),`score`),
  KEY `create_time` (`create_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

We insert 100,000 rows via a stored procedure:

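-- DELIMITER is a mysql client command; it lets the procedure body contain semicolons.
DELIMITER $$
CREATE PROCEDURE insert_person()
BEGIN
  DECLARE c_id INT DEFAULT 1;
  -- Insert ids 1..100000; each row's create_time lies c_id seconds in the past.
  WHILE c_id <= 100000 DO
    INSERT INTO person VALUES (c_id, CONCAT('name', c_id), c_id + 100,
      DATE_SUB(NOW(), INTERVAL c_id SECOND));
    SET c_id = c_id + 1;
  END WHILE;
END$$
DELIMITER ;

CALL insert_person();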

Running EXPLAIN on the unconditional count shows that MySQL uses the create_time secondary index rather than the primary key:

EXPLAIN SELECT COUNT(*) FROM person;
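
One way to check why create_time wins is to compare index sizes. The sketch below assumes InnoDB persistent statistics are enabled (the default in recent MySQL versions) and that the current user can read mysql.innodb_index_stats; the rows with stat_name 'size' give the number of pages in each index. The second statement forces the primary key so that its plan can be compared with the default one.

```sql
-- Pages per index of the person table, smallest first.
SELECT index_name, stat_value AS pages
FROM mysql.innodb_index_stats
WHERE database_name = DATABASE()
  AND table_name = 'person'
  AND stat_name = 'size'
ORDER BY stat_value;

-- Plan when the clustered index is forced, for comparison.
EXPLAIN SELECT COUNT(*) FROM person FORCE INDEX (PRIMARY);
```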

When we query with conditions that could use either index, MySQL chooses a full‑table scan:

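SELECT * FROM person WHERE name > 'name84059' AND create_time > '2020-05-23 14:39:18';

Narrowing the select list to create_time, to cut the cost of lookups back to the clustered index, still results in a full scan. Note that neither secondary index fully covers this query, because the WHERE clause references both name and create_time:

SELECT create_time FROM person WHERE name > 'name84059' AND create_time > '2020-05-23 14:39:18';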

Forcing an index with FORCE INDEX and comparing execution times shows the indexed plan running about twice as fast as the full scan (2 ms vs 4 ms), which suggests the optimizer’s cost estimate was off.
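
A comparison along these lines could be run as follows. This is a sketch, not the article's exact procedure: the text does not say which index was forced or how the timings were taken, so the create_time hint and the session profiling facility (deprecated but still available in MySQL 5.7) are assumptions.

```sql
SET profiling = 1;

-- Plan chosen by the optimizer (a full scan in the article's run).
SELECT create_time FROM person
WHERE name > 'name84059' AND create_time > '2020-05-23 14:39:18';

-- Same query with an index hint; create_time is an illustrative choice.
SELECT create_time FROM person FORCE INDEX (create_time)
WHERE name > 'name84059' AND create_time > '2020-05-23 14:39:18';

-- Lists the duration of each statement run in this session.
SHOW PROFILES;
```

Timings of a few milliseconds vary from run to run, so it is worth repeating both statements several times before drawing conclusions.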

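We compute the full‑scan cost manually from the table statistics (estimated row count and data length):

Rows ≈ 100,264 → CPU cost = 100,264 × 0.2 = 20,052.8

Data length 5,783,552 bytes → pages = 5,783,552 / 16 KB = 353 → IO cost = 353 × 1 = 353

Total cost = 20,052.8 + 353 = 20,405.8 ≈ 20,406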
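
The inputs to this calculation can be read from the table statistics. As a sketch, assuming the default 16 KB InnoDB page size, SHOW TABLE STATUS reports the estimated row count (Rows) and the clustered index size in bytes (Data_length), and the arithmetic can be repeated directly in SQL with the numbers quoted above.

```sql
-- Rows and Data_length are estimates and will differ between runs and servers.
SHOW TABLE STATUS LIKE 'person';

-- CPU cost (0.2 per row) plus IO cost (1 per 16 KB page), using the article's numbers.
SELECT 100264 * 0.2                          AS cpu_cost,
       5783552 / (16 * 1024)                 AS io_cost,
       100264 * 0.2 + 5783552 / (16 * 1024)  AS total_cost;
```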

Using optimizer_trace we see the estimated costs:

{
  "index": "name_score",
  "rows": 25372,
  "cost": 30447
}

{
  "index": "create_time",
  "rows": 50132,
  "cost": 60159
}

{
  "access_type": "scan",
  "rows_to_scan": 100264,
  "cost": 20406,
  "chosen": true
}

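The optimizer does pick the plan with the lowest estimated cost, the full scan at 20,406. Both index estimates work out to roughly 1.2 per row (30,447 / 25,372 and 60,159 / 50,132), which is consistent with charging one page read for looking up each matching row in the clustered index on top of the 0.2 CPU cost. Actual runtime nevertheless shows the forced index is faster, which points to inaccuracies in the statistics or in the cost model.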
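
To reproduce a trace like the one shown above, the optimizer trace can be switched on for the session. A minimal sketch, assuming MySQL 5.6 or later; the article does not say which statement was traced, so the query below is an assumption, and the JSON in the TRACE column is considerably more verbose than the excerpts quoted here.

```sql
SET optimizer_trace = 'enabled=on';

SELECT create_time FROM person
WHERE name > 'name84059' AND create_time > '2020-05-23 14:39:18';

-- The trace of the most recent statement in this session.
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```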

Conclusion

The optimizer’s plan is not always optimal; inaccurate row statistics or cost formulas can lead to sub‑optimal choices. In production, use EXPLAIN and optimizer_trace to verify and tune queries, especially when multiple indexes are available.
