Databases 10 min read

How MySQL Chooses Indexes: Cost Calculation and Optimization Techniques

This article explains how MySQL calculates execution costs for different indexes using IO and CPU metrics, demonstrates with a sample table and queries, compares the optimizer's choices with actual performance, and shows how to use EXPLAIN and optimizer trace to fine‑tune query plans.

Java Architect Essentials
Java Architect Essentials
Java Architect Essentials
How MySQL Chooses Indexes: Cost Calculation and Optimization Techniques

In MySQL, the optimizer selects the index with the lowest estimated execution cost, which consists of IO cost (pages read from disk) and CPU cost (row processing). By default, reading a page costs 1 and checking a row costs 0.2.

The article creates a sample person table (MySQL 5.7) with a primary key and two secondary indexes ( name_score and create_time ), inserts 100,000 rows via a stored procedure, and runs various queries to observe index selection.

CREATE TABLE `person` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `score` int(11) NOT NULL,
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `name_score` (`name`(191),`score`),
  KEY `create_time` (`create_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Running EXPLAIN SELECT COUNT(*) FROM person shows MySQL uses the create_time secondary index because its estimated cost is lower than scanning the primary key.

EXPLAIN SELECT COUNT(*) FROM person;

When querying with conditions on NAME and create_time , MySQL initially chooses a full table scan, even though a covering index exists. Forcing the create_time index reduces execution time from ~4 ms to ~2 ms, revealing that the optimizer's cost estimate (20406) was lower than the index‑based plans (30447 and 60159).

-- Full table scan (4.0 ms)
SELECT create_time FROM person WHERE NAME > 'name84059' AND create_time > '2020-05-23 14:39:18';

-- Forced index (2.0 ms)
SELECT create_time FROM person FORCE INDEX(create_time) WHERE NAME > 'name84059' AND create_time > '2020-05-23 14:39:18';

The article demonstrates how to calculate these costs manually using SHOW TABLE STATUS LIKE 'person' to obtain row count and data length, then applying the IO and CPU formulas.

SHOW TABLE STATUS LIKE 'person';

Finally, it introduces the optimizer_trace feature (available from MySQL 5.6) to view detailed cost breakdowns for each access path, helping developers understand why the optimizer prefers one plan over another and how to adjust queries or indexes for better performance.

SET optimizer_trace="enabled=on";
SELECT create_time FROM person WHERE NAME > 'name84059' AND create_time > '2020-05-23 14:39:18';
SELECT * FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace="enabled=off";

The conclusion emphasizes that MySQL's chosen execution plan may not always be optimal due to inaccurate statistics or cost assumptions, and recommends using EXPLAIN and optimizer_trace proactively to verify and improve query performance.

MySQLIndex OptimizationEXPLAINoptimizer traceSQL Cost Model
Java Architect Essentials
Written by

Java Architect Essentials

Committed to sharing quality articles and tutorials to help Java programmers progress from junior to mid-level to senior architect. We curate high-quality learning resources, interview questions, videos, and projects from across the internet to help you systematically improve your Java architecture skills. Follow and reply '1024' to get Java programming resources. Learn together, grow together.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.