
Master MySQL Index Optimization: Principles, Explain, and Sharding Strategies

This article provides a comprehensive guide to MySQL index optimization, covering selectivity calculation, prefix length, composite index rules, EXPLAIN analysis, common slow‑query patterns, and practical sharding and partitioning techniques to improve query performance.

Architecture & Thinking

1 Background

As a team owner and senior interviewer on a front-line team, I mainly interview server-side developers (Java, Go) on core language knowledge, databases, caching, message middleware, and microservice frameworks. MySQL is the most common data store at large companies, so index usage and optimization are a frequent focus.

2 Index Optimization Steps

2.1 Principles of Efficient Indexes

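Understand and calculate the selectivity of index columns. Selectivity is at most 1; a high value lets MySQL locate rows quickly, while a low value means each index lookup still matches many rows and the optimizer may fall back to a full table scan. Choose columns with high selectivity when creating indexes.

Understand and calculate prefix index length; the appropriate length is the shortest prefix whose selectivity is close to that of the full column, which balances selectivity against index size. In the example below, selectivity stops improving after length 6 (sel6 = sel7 = 0.1713), so 6 is the length to choose for that column.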

selectivity = count(distinct c_name)/count(*)
select count(distinct left(c_name, calcul_len)) / count(*) from t_name;
mysql> SELECT
    count(DISTINCT LEFT(empname, 3)) / count(*) AS sel3,
    count(DISTINCT LEFT(empname, 4)) / count(*) AS sel4,
    count(DISTINCT LEFT(empname, 5)) / count(*) AS sel5,
    count(DISTINCT LEFT(empname, 6)) / count(*) AS sel6,
    count(DISTINCT LEFT(empname, 7)) / count(*) AS sel7
FROM emp;
+--------+--------+--------+--------+--------+
| sel3   | sel4   | sel5   | sel6   | sel7   |
+--------+--------+--------+--------+--------+
| 0.0012 | 0.0076 | 0.0400 | 0.1713 | 0.1713 |
+--------+--------+--------+--------+--------+
1 row in set
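
Once a prefix length has been chosen from figures like these, the prefix index itself can be created as in the sketch below. It assumes the emp table and empname column from the query above; the index name idx_empname_p6 is a hypothetical choice.

```sql
-- Assumes the emp table from the query above, with a string column empname.
-- idx_empname_p6 is a hypothetical index name.
ALTER TABLE emp ADD INDEX idx_empname_p6 (empname(6));

-- The Sub_part column of the output shows the indexed prefix length (6 here).
SHOW INDEX FROM emp WHERE Key_name = 'idx_empname_p6';
```

Keep in mind that a prefix index stores only part of each value, so it cannot act as a covering index for empname or satisfy an ORDER BY on it.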

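Follow the left-most matching rule for composite indexes: MySQL matches index columns from left to right and stops at the first range condition (>, <, BETWEEN, LIKE). The range column itself can still use the index, but the columns to its right cannot narrow the scan. Example: with index (depno, empname, job) and a range condition on depno, the conditions on empname and job do not use the index for lookup.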
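
The left-most rule can be checked with EXPLAIN. The sketch below uses a hypothetical table emp_demo whose column names follow the (depno, empname, job) example; the sal column and the literal values are assumptions added for illustration.

```sql
-- Hypothetical demo table; only the index column order matters here.
CREATE TABLE emp_demo (
  id      INT PRIMARY KEY AUTO_INCREMENT,
  depno   INT NOT NULL,
  empname VARCHAR(20) NOT NULL,
  job     VARCHAR(20) NOT NULL,
  sal     DECIMAL(10,2) NOT NULL,
  KEY idx_dep_name_job (depno, empname, job)
);

-- Equality on every index column: all three parts can be used.
EXPLAIN SELECT * FROM emp_demo
WHERE depno = 10 AND empname = 'Ann' AND job = 'clerk';

-- Range on the leading column: only depno bounds the index scan;
-- empname and job are applied as filters afterwards.
EXPLAIN SELECT * FROM emp_demo
WHERE depno > 10 AND empname = 'Ann' AND job = 'clerk';

-- No condition on depno: the left-most column is missing,
-- so the index cannot be used to seek.
EXPLAIN SELECT * FROM emp_demo
WHERE empname = 'Ann' AND job = 'clerk';
```

On a populated table, comparing key_len across the first two plans shows how many index parts were used. On an empty or tiny table the optimizer may pick a different plan, so the comments describe the typical case only.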

Use “need‑based” selection: avoid SELECT *, retrieve only required columns, and prefer covering indexes to reduce row look‑ups.
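
A minimal sketch of the covering-index effect, assuming InnoDB and a hypothetical table emp_cover_demo:

```sql
-- Hypothetical demo table with a two-column secondary index.
CREATE TABLE emp_cover_demo (
  id      INT PRIMARY KEY AUTO_INCREMENT,
  depno   INT NOT NULL,
  empname VARCHAR(20) NOT NULL,
  sal     DECIMAL(10,2) NOT NULL,
  KEY idx_dep_name (depno, empname)
);

-- Every selected column is stored in idx_dep_name,
-- so Extra typically shows "Using index" (no row lookup).
EXPLAIN SELECT depno, empname FROM emp_cover_demo WHERE depno = 10;

-- sal is not in the index, so each matching entry needs
-- an extra lookup in the clustered (primary key) index.
EXPLAIN SELECT depno, empname, sal FROM emp_cover_demo WHERE depno = 10;
```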

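Check whether, and how much of, a composite index is actually used (key_len in EXPLAIN helps here). Index condition pushdown (ICP, available since MySQL 5.6 and reported as "Using index condition" in Extra) lets the storage engine filter on index columns before reading full rows, which further reduces row look-ups.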
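
Index condition pushdown can be observed by toggling it for the session. The table emp_icp_demo below is hypothetical; optimizer_switch and its index_condition_pushdown flag are standard MySQL settings.

```sql
-- Hypothetical demo table.
CREATE TABLE emp_icp_demo (
  id      INT PRIMARY KEY AUTO_INCREMENT,
  depno   INT NOT NULL,
  empname VARCHAR(20) NOT NULL,
  sal     DECIMAL(10,2) NOT NULL,
  KEY idx_dep_name (depno, empname)
);

-- depno = 10 bounds the index range. The LIKE with a leading wildcard cannot,
-- but with ICP it is checked against the index entry before the row is read.
-- Extra typically shows "Using index condition".
EXPLAIN SELECT * FROM emp_icp_demo
WHERE depno = 10 AND empname LIKE '%an%';

-- Turn ICP off for this session and compare the plan.
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM emp_icp_demo
WHERE depno = 10 AND empname LIKE '%an%';

-- Restore the default.
SET optimizer_switch = 'index_condition_pushdown=on';
```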

Avoid defeating the index: do not apply functions or arithmetic to indexed columns in a condition. Keep the column "clean", that is, bare on one side of the comparison.
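
As an illustration, the sketch below contrasts a function-wrapped column with an equivalent range on the bare column. The table emp_date_demo and its hiredate column are hypothetical.

```sql
-- Hypothetical demo table with an index on hiredate.
CREATE TABLE emp_date_demo (
  id       INT PRIMARY KEY AUTO_INCREMENT,
  empname  VARCHAR(20) NOT NULL,
  hiredate DATE NOT NULL,
  KEY idx_hiredate (hiredate)
);

-- The function wraps the column, so idx_hiredate cannot be used for a range lookup.
EXPLAIN SELECT id, empname FROM emp_date_demo
WHERE YEAR(hiredate) = 2020;

-- Same filter written against the bare column: a range scan on idx_hiredate is possible.
EXPLAIN SELECT id, empname FROM emp_date_demo
WHERE hiredate >= '2020-01-01' AND hiredate < '2021-01-01';
```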

Avoid unnecessary type conversion; comparing a string column with a numeric value disables the index.

LIKE '%value%' cannot use the index for lookup because of the leading wildcard; LIKE 'value%' can.
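
The type-conversion and LIKE rules can be compared side by side. The table user_demo and the phone values below are made up for the example.

```sql
-- Hypothetical demo table: phone is stored as a string.
CREATE TABLE user_demo (
  id       INT PRIMARY KEY AUTO_INCREMENT,
  phone    VARCHAR(20) NOT NULL,
  nickname VARCHAR(30) NOT NULL,
  KEY idx_phone (phone)
);

-- String column compared with a number: every phone value is converted
-- for the comparison, so idx_phone cannot be used for lookup.
EXPLAIN SELECT id, nickname FROM user_demo WHERE phone = 13800000000;

-- Same value as a string literal: an index lookup on idx_phone is possible.
EXPLAIN SELECT id, nickname FROM user_demo WHERE phone = '13800000000';

-- Prefix pattern: can use idx_phone as a range scan.
EXPLAIN SELECT id, nickname FROM user_demo WHERE phone LIKE '138%';

-- Leading wildcard: no usable index range.
EXPLAIN SELECT id, nickname FROM user_demo WHERE phone LIKE '%0000';
```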

Include sorting columns in the index to avoid extra sorting steps.
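
A short sketch of an index that serves both the filter and the sort, using a hypothetical order_demo table:

```sql
-- Hypothetical demo table; the index ends with the sort column.
CREATE TABLE order_demo (
  id         INT PRIMARY KEY AUTO_INCREMENT,
  user_id    INT NOT NULL,
  created_at DATETIME NOT NULL,
  amount     DECIMAL(10,2) NOT NULL,
  KEY idx_user_created (user_id, created_at)
);

-- Entries for one user_id are already ordered by created_at inside the index,
-- so no separate sort step ("Using filesort") should be needed.
EXPLAIN SELECT id, amount FROM order_demo
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 10;

-- Sorting by a column outside the index generally adds "Using filesort".
EXPLAIN SELECT id, amount FROM order_demo
WHERE user_id = 42
ORDER BY amount DESC
LIMIT 10;
```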

Extend existing indexes when possible instead of creating new ones.
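
For example, if a single-column index already exists and a query also needs a second column, the existing index can be widened rather than adding another one. The table and index names here are hypothetical.

```sql
-- Hypothetical demo table that starts with an index on depno only.
CREATE TABLE emp_ext_demo (
  id      INT PRIMARY KEY AUTO_INCREMENT,
  depno   INT NOT NULL,
  empname VARCHAR(20) NOT NULL,
  KEY idx_depno (depno)
);

-- Replace (depno) with (depno, empname) in one statement.
-- Queries that filtered on depno alone can still use the new index
-- through its left-most column.
ALTER TABLE emp_ext_demo
  DROP INDEX idx_depno,
  ADD INDEX idx_depno_empname (depno, empname);
```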

The order of equality conditions in the WHERE clause does not need to match the column order of the index; the optimizer reorders them. Writing them in index order still helps readability.

2.2 Query Optimizer – EXPLAIN

The EXPLAIN command shows the execution plan for a statement. The rows column is the key metric: it is the optimizer's estimate of how many rows must be examined, and a smaller value usually means faster execution. Most optimization work aims to reduce rows.

2.2.1 EXPLAIN Output Fields

| Column | JSON Name | Meaning |
| --- | --- | --- |
| id | select_id | The SELECT identifier |
| select_type | None | The SELECT type |
| table | table_name | The table for the output row |
| partitions | partitions | The matching partitions |
| type | access_type | The join type |
| possible_keys | possible_keys | The possible indexes to choose |
| key | key | The index actually chosen |
| key_len | key_length | The length of the chosen key |
| ref | ref | The columns compared to the index |
| rows | rows | Estimate of rows to be examined |
| filtered | filtered | Percentage of rows filtered by table condition |
| Extra | None | Additional information |
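
The "JSON Name" entries are the keys that appear when the plan is requested in JSON form. A minimal sketch, assuming the emp table from section 2.1 (any table works) and a made-up literal:

```sql
-- Tabular plan: one row per table, with the columns listed above.
EXPLAIN SELECT * FROM emp WHERE empname = 'Ann';

-- JSON plan: the same information under keys such as
-- table_name, access_type, possible_keys, and key_length.
EXPLAIN FORMAT=JSON SELECT * FROM emp WHERE empname = 'Ann';
```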

2.2.2 select_type Enum

Key parameters for optimization: possible_keys, key, rows, select_type.

select_type: the type of each SELECT (SIMPLE, PRIMARY, DEPENDENT SUBQUERY, etc.).

possible_keys: indexes MySQL could use.

key: the index actually chosen (NULL if none).

rows: estimated number of rows scanned.

Slow query optimization steps:

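Run the query with SQL_NO_CACHE to confirm it is really slow and not just being served from the query cache on fast runs. (The query cache was removed in MySQL 8.0, where SQL_NO_CACHE has no effect.)

Apply the WHERE conditions to each table on its own, starting with the table that returns the fewest rows, and check which column has the highest selectivity.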

Use EXPLAIN to verify the plan matches expectations.

Prefer ORDER BY … LIMIT patterns that let MySQL stop early.

Adjust based on business usage scenarios.

Follow the index-building principles from section 2.1 when adding indexes.

Iterate analysis if results are unsatisfactory.

2.3 Query Case Analysis

Examples demonstrate how to analyze and optimize slow queries, including a join on emp and dep tables, index selectivity calculations, and redesigning indexes to match the left‑most rule and covering requirements.

Another case shows a complex query aggregating user consumption across four categories. Although a composite index (usercode, gravalue, logdate) exists, the query still takes ~13 seconds due to DEPENDENT SUBQUERY execution, which repeats sub‑queries for each distinct user.

Solution: replace dependent sub‑queries with joins or separate queries to avoid repeated scans.
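
The shape of that rewrite can be sketched as follows. The original query and schema are not shown in this article, so the tables user_list_demo and consume_log_demo, the category column, and the single-category total are all assumptions made for illustration; only usercode, gravalue, and logdate come from the text.

```sql
-- Hypothetical tables standing in for the case described above.
CREATE TABLE user_list_demo (
  usercode VARCHAR(20) PRIMARY KEY
);
CREATE TABLE consume_log_demo (
  id       BIGINT PRIMARY KEY AUTO_INCREMENT,
  usercode VARCHAR(20) NOT NULL,
  category TINYINT NOT NULL,
  gravalue DECIMAL(10,2) NOT NULL,
  logdate  DATE NOT NULL,
  KEY idx_user_val_date (usercode, gravalue, logdate)
);

-- Before: the scalar sub-query is correlated with the outer row,
-- so EXPLAIN marks it DEPENDENT SUBQUERY and it runs once per user.
EXPLAIN SELECT u.usercode,
       (SELECT SUM(l.gravalue)
          FROM consume_log_demo l
         WHERE l.usercode = u.usercode AND l.category = 1) AS cat1_total
FROM user_list_demo u;

-- After: aggregate once in a derived table, then join the result.
EXPLAIN SELECT u.usercode, t.cat1_total
FROM user_list_demo u
LEFT JOIN (
  SELECT usercode, SUM(gravalue) AS cat1_total
  FROM consume_log_demo
  WHERE category = 1
  GROUP BY usercode
) t ON t.usercode = u.usercode;
```

Both forms return NULL for users with no matching rows, so the LEFT JOIN keeps the result set the same while scanning the log table once instead of once per user.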

3 Appropriate Sharding and Partitioning

Physical resources are limited; high concurrency can cause performance bottlenecks. Splitting large tables (vertical, horizontal, or both) distributes load across multiple machines.

3.1 Vertical Sharding

Separate logical domains into different databases, e.g., Products, Orders, Scores.

3.2 Vertical Table Splitting

Move rarely used columns to an auxiliary table while keeping frequently accessed columns in the main table.
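
A minimal sketch of such a split, with hypothetical product tables and column names:

```sql
-- Main table: small, frequently read columns.
CREATE TABLE product_main_demo (
  id    BIGINT PRIMARY KEY,
  name  VARCHAR(100) NOT NULL,
  price DECIMAL(10,2) NOT NULL
);

-- Auxiliary table: large or rarely read columns, sharing the same key.
CREATE TABLE product_detail_demo (
  product_id  BIGINT PRIMARY KEY,
  description TEXT,
  spec_json   TEXT
);

-- Hot path touches only the narrow main table.
SELECT id, name, price FROM product_main_demo WHERE id = 1001;

-- The detail page joins in the auxiliary columns only when needed.
SELECT m.name, m.price, d.description
FROM product_main_demo m
JOIN product_detail_demo d ON d.product_id = m.id
WHERE m.id = 1001;
```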

3.3 In‑Database Table Partitioning

Divide a large table into partitions within the same database according to a partitioning strategy such as RANGE, LIST, HASH, or KEY.
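
As a sketch, the following creates a hypothetical log table partitioned by year with RANGE partitioning; the table name, columns, and year boundaries are assumptions.

```sql
-- Hypothetical table partitioned by the year of logdate.
-- In MySQL the partitioning column must be part of every unique key,
-- hence the composite primary key.
CREATE TABLE order_log_demo (
  id      BIGINT NOT NULL AUTO_INCREMENT,
  logdate DATE NOT NULL,
  amount  DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id, logdate)
)
PARTITION BY RANGE (YEAR(logdate)) (
  PARTITION p2022 VALUES LESS THAN (2023),
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- The partitions column of EXPLAIN shows which partitions are read;
-- with pruning, a query limited to 2023 is expected to touch only p2023.
EXPLAIN SELECT * FROM order_log_demo
WHERE logdate BETWEEN '2023-01-01' AND '2023-12-31';
```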

3.4 Full Sharding

Combine in‑database partitioning with moving partitions to different hosts to fully utilize CPU, memory, and I/O resources.

4 Complete Index Knowledge System

Reference the author’s previous series on indexes and sharding for deeper study.
