Master MySQL Index Optimization: Principles, Explain, and Sharding Strategies
This article provides a comprehensive guide to MySQL index optimization, covering selectivity calculation, prefix length, composite index rules, EXPLAIN analysis, common slow‑query patterns, and practical sharding and partitioning techniques to improve query performance.
1 Background
As a team owner and senior interviewer on a front-line team, I mainly interview server-side developers (Java, Go) on core language knowledge, databases, caching, message middleware, and microservice frameworks. MySQL is the most common data store at large companies, and index usage and optimization is the main focus of the database questions.
2 Index Optimization Steps
2.1 Principles of Efficient Indexes
Understand and calculate the selectivity of index columns; high selectivity lets MySQL locate rows quickly, while low selectivity forces it to scan many rows for each lookup. Choose columns with high selectivity when creating indexes.
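Understand and calculate prefix index length; the right length is the shortest prefix whose selectivity is close to that of the full column, which balances selectivity against index size. In the example below, selectivity stops improving at length 6, so 6 is the best choice for that column.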
selectivity = count(distinct c_name)/count(*)
select count(distinct left(c_name, calcul_len)) / count(*) from t_name;
mysql> SELECT
count(DISTINCT LEFT(empname, 3)) / count(*) AS sel3,
count(DISTINCT LEFT(empname, 4)) / count(*) AS sel4,
count(DISTINCT LEFT(empname, 5)) / count(*) AS sel5,
count(DISTINCT LEFT(empname, 6)) / count(*) AS sel6,
count(DISTINCT LEFT(empname, 7)) / count(*) AS sel7
FROM emp;
+--------+--------+--------+--------+--------+
| sel3   | sel4   | sel5   | sel6   | sel7   |
+--------+--------+--------+--------+--------+
| 0.0012 | 0.0076 | 0.0400 | 0.1713 | 0.1713 |
+--------+--------+--------+--------+--------+
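1 row in set
Follow the left-most matching rule for composite indexes; MySQL matches index columns from left to right and stops at the first range condition (>, <, BETWEEN, LIKE), so the columns after it cannot narrow the lookup. Example: with index (depno, empname, job), a range condition on depno (such as depno > 10) means the empname and job parts of the index are not used for matching.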
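The prefix-length and left-most rules above can be sketched as follows. This reuses the emp table and the empname, depno, and job columns from the examples; the index names, the literal values, and the assumption that depno is numeric are made up for illustration.

```sql
-- Prefix index: length 6 is where selectivity stopped improving in the output above.
ALTER TABLE emp ADD INDEX idx_emp_empname_prefix (empname(6));

-- Composite index from the left-most matching example.
ALTER TABLE emp ADD INDEX idx_emp_depno_empname_job (depno, empname, job);

-- All three parts of the composite index can be used:
-- equality on every column, matched left to right.
SELECT empname, job FROM emp
WHERE depno = 10 AND empname = 'LiLei' AND job = 'CLERK';

-- Only the depno part can narrow the lookup:
-- the range on depno stops the matching.
SELECT empname, job FROM emp
WHERE depno > 10 AND empname = 'LiLei' AND job = 'CLERK';

-- The left-most column depno is missing, so the composite index
-- cannot be used to seek (MySQL may still scan the whole index).
SELECT empname, job FROM emp
WHERE job = 'CLERK';
```

Running EXPLAIN on each statement and comparing key and key_len is a quick way to see how many index parts the optimizer actually used.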
Use “need‑based” selection: avoid SELECT *, retrieve only required columns, and prefer covering indexes to reduce row look‑ups.
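Check whether a composite index is fully used (key_len in EXPLAIN shows how many index columns take part). Index condition pushdown (ICP, reported as "Using index condition" in Extra) lets the storage engine filter on index columns before reading the full row, which further reduces row look-ups.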
Avoid losing index use: do not wrap indexed columns in functions or expressions inside conditions; keep the column "clean" (bare) on one side of the comparison.
Avoid unnecessary type conversion; comparing a string column with a numeric value disables the index.
LIKE '%value%' cannot use the index for lookup because of the leading wildcard; LIKE 'value%' can use the index.
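A minimal sketch of the three pitfalls just listed, each paired with a rewrite that keeps the indexed column bare. The hiredate (DATE) and empno (assumed to be stored as VARCHAR) columns, and ordinary indexes on them, are hypothetical and only serve the illustration.

```sql
-- Function on the indexed column: an ordinary index on hiredate
-- cannot be used for a range seek.
SELECT empname FROM emp WHERE YEAR(hiredate) = 2020;
-- Rewrite as a range on the bare column so the index can be used.
SELECT empname FROM emp
WHERE hiredate >= '2020-01-01' AND hiredate < '2021-01-01';

-- Implicit conversion: a string column compared with a number.
SELECT empname FROM emp WHERE empno = 1001;
-- Compare with a literal of the column's own type instead.
SELECT empname FROM emp WHERE empno = '1001';

-- Leading wildcard: no index seek on empname.
SELECT empname FROM emp WHERE empname LIKE '%Lei%';
-- Prefix pattern: an index on empname can be used as a range.
SELECT empname FROM emp WHERE empname LIKE 'Li%';
```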
Include sorting columns in the index to avoid extra sorting steps.
Extend existing indexes when possible instead of creating new ones.
The order of equality conditions in the WHERE clause does not need to match the column order of the index; the optimizer reorders them. Writing them in index order still helps readability.
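The covering-index and sort-column principles from the list above might look like this in practice. The index name and the hiredate column are assumptions, and the second query assumes emp has columns beyond those in the index.

```sql
-- Hypothetical index holding the filter column, the sort column,
-- and the selected column.
ALTER TABLE emp ADD INDEX idx_emp_depno_hiredate_empname (depno, hiredate, empname);

-- Every referenced column is in the index, so EXPLAIN is expected to
-- report "Using index" in Extra (no row look-up), and the ORDER BY can
-- follow index order instead of a separate filesort.
EXPLAIN
SELECT empname, hiredate
FROM emp
WHERE depno = 10
ORDER BY hiredate
LIMIT 20;

-- SELECT * needs columns outside the index, so each matching entry
-- requires a look-up of the full row.
EXPLAIN
SELECT *
FROM emp
WHERE depno = 10
ORDER BY hiredate
LIMIT 20;
```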
2.2 Query Optimizer – EXPLAIN
The EXPLAIN command shows the execution plan. The "rows" column, an estimate of the rows to be examined, is the key metric: a smaller value usually means faster execution, and most optimization work aims to reduce it.
2.2.1 EXPLAIN Output Fields
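+---------------+---------------+------------------------------------------------+
| Column        | JSON Name     | Meaning                                        |
+---------------+---------------+------------------------------------------------+
| id            | select_id     | The SELECT identifier                          |
| select_type   | None          | The SELECT type                                |
| table         | table_name    | The table for the output row                   |
| partitions    | partitions    | The matching partitions                        |
| type          | access_type   | The join type                                  |
| possible_keys | possible_keys | The possible indexes to choose                 |
| key           | key           | The index actually chosen                      |
| key_len       | key_length    | The length of the chosen key                   |
| ref           | ref           | The columns compared to the index              |
| rows          | rows          | Estimate of rows to be examined                |
| filtered      | filtered      | Percentage of rows filtered by table condition |
| Extra         | None          | Additional information                         |
+---------------+---------------+------------------------------------------------+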
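As a brief illustration of the two output formats the table refers to, the statements below request the tabular plan and the JSON plan for the same query. The query itself is a made-up example against the emp table used earlier.

```sql
-- Tabular plan: one output row per table, with the columns listed above.
EXPLAIN
SELECT e.empname, e.job
FROM emp e
WHERE e.depno = 10;

-- JSON plan: the same information under the JSON names
-- (access_type, key_length, and so on).
EXPLAIN FORMAT=JSON
SELECT e.empname, e.job
FROM emp e
WHERE e.depno = 10;
```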
2.2.2 select_type Enum
Key parameters for optimization: possible_keys, key, rows, select_type.
select_type: type of each SELECT (SIMPLE, PRIMARY, DEPENDENT SUBQUERY, etc.)
possible_keys: indexes MySQL could use.
key: the index actually chosen (NULL if none).
rows: estimated number of rows scanned.
Slow query optimization steps:
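Run the query with SQL_NO_CACHE to confirm it is really slow and not just missing the cache (this matters on MySQL 5.7 and earlier; the query cache was removed in 8.0).
Apply the WHERE conditions to each table on its own to find the table that returns the fewest rows, start from that table, and check the selectivity of each filtered column.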
Use EXPLAIN to verify the plan matches expectations.
Prefer ORDER BY … LIMIT patterns that let MySQL stop early.
Adjust based on business usage scenarios.
Follow the index-building principles from section 2.1 when adding indexes.
Iterate analysis if results are unsatisfactory.
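One way to read the first three steps above, sketched against the emp and dep tables. The depname column and the literal values are assumptions for illustration only.

```sql
-- Step 1: confirm the query is slow without help from the query cache
-- (SQL_NO_CACHE only matters on MySQL 5.7 and earlier).
SELECT SQL_NO_CACHE e.empname, d.depname
FROM emp e
JOIN dep d ON d.depno = e.depno
WHERE e.job = 'CLERK' AND d.depname = 'Sales';

-- Step 2: apply each table's filter on its own to see which side
-- returns fewer rows.
SELECT COUNT(*) FROM emp WHERE job = 'CLERK';
SELECT COUNT(*) FROM dep WHERE depname = 'Sales';

-- Step 3: check the plan: which index is chosen (key) and how many
-- rows are estimated per table (rows).
EXPLAIN
SELECT e.empname, d.depname
FROM emp e
JOIN dep d ON d.depno = e.depno
WHERE e.job = 'CLERK' AND d.depname = 'Sales';
```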
2.3 Query Case Analysis
Examples demonstrate how to analyze and optimize slow queries, including a join on emp and dep tables, index selectivity calculations, and redesigning indexes to match the left‑most rule and covering requirements.
Another case shows a complex query aggregating user consumption across four categories. Although a composite index (usercode, gravalue, logdate) exists, the query still takes ~13 seconds due to DEPENDENT SUBQUERY execution, which repeats sub‑queries for each distinct user.
Solution: replace dependent sub‑queries with joins or separate queries to avoid repeated scans.
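The original query is not reproduced here, so the following is only a sketch of the general rewrite, not the author's actual statement. The table names t_user and t_consume_log, the amount column, and the category values 1 and 2 are hypothetical; only usercode and gravalue come from the case description.

```sql
-- Before: correlated scalar subqueries in the SELECT list. EXPLAIN reports
-- them as DEPENDENT SUBQUERY, and each one runs once per user row.
SELECT u.usercode,
       (SELECT SUM(c.amount) FROM t_consume_log c
         WHERE c.usercode = u.usercode AND c.gravalue = 1) AS total_cat1,
       (SELECT SUM(c.amount) FROM t_consume_log c
         WHERE c.usercode = u.usercode AND c.gravalue = 2) AS total_cat2
FROM t_user u;

-- After: each category is aggregated once in a derived table and joined
-- back, so the log table is not re-queried for every user.
SELECT u.usercode,
       c1.total AS total_cat1,
       c2.total AS total_cat2
FROM t_user u
LEFT JOIN (SELECT usercode, SUM(amount) AS total
           FROM t_consume_log
           WHERE gravalue = 1
           GROUP BY usercode) c1 ON c1.usercode = u.usercode
LEFT JOIN (SELECT usercode, SUM(amount) AS total
           FROM t_consume_log
           WHERE gravalue = 2
           GROUP BY usercode) c2 ON c2.usercode = u.usercode;
```

LEFT JOIN keeps users with no rows in a category and returns NULL for them, which matches what the scalar subqueries return in the first form.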
3 Appropriate Sharding and Partitioning
Physical resources are limited; high concurrency can cause performance bottlenecks. Splitting large tables (vertical, horizontal, or both) distributes load across multiple machines.
3.1 Vertical Sharding
Separate logical domains into different databases, e.g., Products, Orders, Scores.
3.2 Vertical Table Splitting
Move rarely used columns to an auxiliary table while keeping frequently accessed columns in the main table.
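A minimal sketch of such a split, assuming a hypothetical product table whose large, rarely read columns move to product_detail. All table and column names here are illustrative.

```sql
-- Main table: columns read on almost every request.
CREATE TABLE product (
  product_id BIGINT UNSIGNED NOT NULL PRIMARY KEY,
  name       VARCHAR(100)    NOT NULL,
  price      DECIMAL(10, 2)  NOT NULL
);

-- Auxiliary table: large, rarely read columns, sharing the same primary key.
CREATE TABLE product_detail (
  product_id  BIGINT UNSIGNED NOT NULL PRIMARY KEY,
  description TEXT,
  spec_notes  TEXT
);

-- Hot path touches only the main table.
SELECT name, price FROM product WHERE product_id = 1001;

-- The detail page joins the auxiliary table only when it is needed.
SELECT p.name, p.price, d.description
FROM product p
JOIN product_detail d ON d.product_id = p.product_id
WHERE p.product_id = 1001;
```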
3.3 In‑Database Table Partitioning
Divide a large table into partitions based on a strategy.
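As one possible strategy, the sketch below partitions a hypothetical order_log table by year using RANGE partitioning. The table, its columns, and the partition boundaries are assumptions chosen for illustration.

```sql
CREATE TABLE order_log (
  order_id   BIGINT UNSIGNED NOT NULL,
  user_id    BIGINT UNSIGNED NOT NULL,
  created_at DATE            NOT NULL,
  amount     DECIMAL(10, 2)  NOT NULL,
  -- The partitioning column must be part of every unique key, including the primary key.
  PRIMARY KEY (order_id, created_at)
)
PARTITION BY RANGE (YEAR(created_at)) (
  PARTITION p2022 VALUES LESS THAN (2023),
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- The partitions column of EXPLAIN shows which partitions are read;
-- with a date filter like this, p2022 is expected to be pruned.
EXPLAIN
SELECT SUM(amount)
FROM order_log
WHERE created_at BETWEEN '2023-01-01' AND '2023-12-31';
```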
3.4 Full Sharding
Combine in‑database partitioning with moving partitions to different hosts to fully utilize CPU, memory, and I/O resources.
4 Complete Index Knowledge System
Reference the author’s previous series on indexes and sharding for deeper study.
Architecture & Thinking
🍭 Frontline tech director and chief architect at top-tier companies 🥝 Years of deep experience in internet, e‑commerce, social, and finance sectors 🌾 Committed to publishing high‑quality articles covering core technologies of leading internet firms, application architecture, and AI breakthroughs.
