Why Does MySQL Pick the Wrong Index? A Deep Dive into Query Optimization
This article examines a slow MySQL query that scans millions of rows, explains why the optimizer chose a low‑selectivity index, demonstrates how forcing the correct index dramatically improves performance, and shows how to create a covering composite index to resolve the issue permanently.
Overview
A slow query that counts content_id occurrences in the last seven days took 9.35 seconds. The MySQL optimizer chose the index idx_channel_source_id, which has low selectivity, instead of the more appropriate idx_classify_time.
Optimization Process
Identifying the Issue
Table schema (≈4 million rows):
CREATE TABLE `demo_table` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` varchar(100) NOT NULL DEFAULT '',
`source_channel` int(11) NOT NULL DEFAULT '0',
`source_id` bigint(20) NOT NULL DEFAULT '0',
`category_id` bigint(20) NOT NULL DEFAULT '0',
`classify_time` bigint(20) NOT NULL DEFAULT '0',
`content_id` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_category_id` (`category_id`),
KEY `idx_classify_time` (`classify_time`),
KEY `idx_channel_source_id` (`source_channel`,`source_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;Original query (SQL‑1) counts content_id in the recent seven days:
select content_id, count(1) as c
from demo_table
where source_channel = 2
and classify_time between 1556019882000 and 1556624682000
and content_id is not null
group by content_id;Execution time: 9.35 s. EXPLAIN output:
Key: idx_channel_source_id Type: ref Rows examined: 1,487,434
Extra: Using where; Using temporary; Using filesort
Forcing the index idx_classify_time (SQL‑2) reduces the time to 2.19 s, confirming the original plan is sub‑optimal.
The optimizer ranks index access types as all → index → range → ref → eq_ref → const. The chosen ref type is less efficient than the range type used by idx_classify_time. Additionally, the selected index has a smaller key_len, allowing more rows per page.
Solutions
Two ways to address the problem:
Explicitly force the desired index in the query (e.g., FORCE INDEX(idx_classify_time)).
Create a more suitable composite covering index.
Back‑Table (Lookup) Concept
In MySQL, a clustered (primary) index stores full rows in its leaf nodes. A secondary index stores only the indexed columns plus the primary key. When a query needs columns not present in the secondary index, MySQL performs a second lookup—called a back‑table—to the primary index to retrieve the full row.
Covering Index
By building a composite index that includes all columns required by the query ( content_id, classify_time, source_channel), the back‑table step is eliminated. This creates a covering index that can satisfy the query directly.
alter table demo_table
add index idx_content_id_classify_time_source_channel
(content_id, classify_time, source_channel);After adding the index, re‑executing SQL‑1 finishes in ~0.02 seconds. New EXPLAIN output:
Key: idx_channel_classify_time_content_id Type: range Extra: Using where; Using index; Using temporary; Using filesort
The presence of Using index indicates the query is satisfied entirely from the covering index, avoiding extra lookups.
Conclusion
Creating an appropriate composite covering index resolves the optimizer’s mis‑selection and dramatically improves query performance, turning a multi‑second scan into a few‑hundred‑microsecond operation.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
