Backend Development 11 min read

Why MySQL Sometimes Chooses Full Table Scans Over Indexes—and How to Diagnose It

This article explains how MySQL's optimizer calculates execution costs for different indexes, why it may select a full table scan instead of an apparently better index, and shows how to use EXPLAIN and optimizer trace to reveal and improve the chosen execution plan.

macrozheng
macrozheng
macrozheng
Why MySQL Sometimes Chooses Full Table Scans Over Indexes—and How to Diagnose It

Preface

Many wonder whether a simple

SELECT COUNT(*)

will inevitably cause a full‑table scan and slow query.

<code>SELECT COUNT(*) FROM SomeTable;</code>

There is a claim that, when the

WHERE

clause is absent, MySQL can use an auxiliary index to count rows, and that this optimization yields the best performance. The claim is true for MySQL 5.6 and later.

SQL Index Cost Calculation

MySQL chooses the execution plan with the lowest estimated cost, which consists of two main components:

IO cost : reading a data page from disk costs 1 unit. MySQL reads whole pages, so the cost depends on the number of pages accessed.

CPU cost : evaluating rows and sorting costs 0.2 units per row.

To illustrate, we create a test table (MySQL 5.7.18) and populate it with 100 000 rows:

<code>CREATE TABLE `person` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `score` int(11) NOT NULL,
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `name_score` (`name`(191),`score`),
  KEY `create_time` (`create_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;</code>
<code>CREATE PROCEDURE insert_person()
BEGIN
  DECLARE c_id INT DEFAULT 1;
  WHILE c_id <= 100000 DO
    INSERT INTO person VALUES (c_id, CONCAT('name',c_id), c_id+100,
      DATE_SUB(NOW(), INTERVAL c_id SECOND));
    SET c_id = c_id + 1;
  END WHILE;
END;</code>

After inserting the data we run

EXPLAIN

on a simple count query:

<code>EXPLAIN SELECT COUNT(*) FROM person;</code>

The optimizer selects the

create_time

secondary index, confirming that MySQL can use an auxiliary index for

COUNT(*)

.

Next we test a query with a

WHERE

clause that could use either

name_score

or

create_time

:

<code>SELECT * FROM person WHERE NAME > 'name84059' AND create_time > '2020-05-23 14:39:18';</code>

MySQL chooses a full table scan, even though both indexes could satisfy the conditions. For comparison we force the

create_time

index:

<code>SELECT create_time FROM person WHERE NAME > 'name84059' AND create_time > '2020-05-23 14:39:18' FORCE INDEX(create_time);</code>

Execution times show the forced index is roughly twice as fast (2 ms vs 4 ms), indicating the optimizer’s cost estimate was inaccurate.

Cost Computation

Using table statistics we compute:

Rows ≈ 100 264 → CPU cost = 100 264 × 0.2 = 20 052.8

Data length = 5 783 552 bytes, page size = 16 KB → pages ≈ 353 → IO cost = 353

Total full‑table‑scan cost = 20 052.8 + 353 ≈ 20 406.

Optimizer‑trace reveals the estimated costs for each access path:

<code>{"index":"name_score","rows":25372,"cost":30447}</code>
<code>{"index":"create_time","rows":50132,"cost":60159}</code>
<code>{"access_type":"scan","rows":100264,"cost":20406}</code>

Since 20 406 is the smallest, the optimizer selects the full scan.

Using Optimizer Trace

Enable the trace to see detailed cost breakdowns:

<code>SET optimizer_trace="enabled=on";
SELECT create_time FROM person WHERE NAME > 'name84059' AND create_time > '2020-05-23 14:39:18';
SELECT * FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace="enabled=off";</code>

The trace lists the cost for each candidate plan, helping you understand why a sub‑optimal plan was chosen and where statistics may need improvement.

Conclusion

MySQL’s optimizer does not always pick the plan we expect; inaccurate statistics or differing cost models can lead to full‑table scans even when indexes exist. In production, always verify execution plans with

EXPLAIN

and

optimizer_trace

before relying on the optimizer’s choice.

MySQLIndex Optimizationquery performanceEXPLAINoptimizer trace
macrozheng
Written by

macrozheng

Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.