
Mastering MySQL Indexes: When to Use, Combine, and Optimize Them

This article explains why indexes are vital for MySQL performance, how to decide when to add ordinary, composite, prefix, or unique indexes, the pitfalls of using non‑sequential primary keys, and advanced optimizations such as change buffer, index condition pushdown, and MRR to reduce I/O.

dbaplus Community

Nov 28, 2021


Indexes are the heart of a database; without them MySQL behaves like a plain file system. The article explores MySQL B+‑tree indexes from practical and performance perspectives, covering common pitfalls and advanced optimization techniques.

1. Reasonable Use of Indexes

Before adding an index, check whether the column appears frequently in WHERE clauses. Consider composite indexes when queries filter on multiple columns, e.g., a user table with name, ID card, and address fields.

2. Drawbacks of Ordinary Indexes

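Looking up a row by its ID card number through a secondary index typically takes two B+‑tree traversals: first the secondary index to find the primary key, then the primary‑key index to fetch the row (the back‑table lookup). If each tree has height 3 and the root pages are cached in memory, each traversal costs two disk reads, so the query may need four random disk I/Os (≈40 ms at roughly 10 ms per random read), which is not fast.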

SELECT name FROM user WHERE id_card=xxx
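
The I/O estimate for this query can be reproduced with a little arithmetic. The Python sketch below is a back-of-the-envelope model, not a measurement: it assumes the root page of each tree is already cached in memory and that one random disk read costs about 10 ms. The function name and both defaults are illustrative.

```python
def lookup_cost_ms(tree_heights, cached_levels=1, ms_per_read=10):
    """Estimate the random reads and latency of walking several B+ trees.

    tree_heights  -- height of each tree visited, in lookup order
    cached_levels -- top levels assumed to be in memory already (assumption)
    ms_per_read   -- assumed cost of one random disk read (assumption)
    """
    reads = sum(max(height - cached_levels, 0) for height in tree_heights)
    return reads, reads * ms_per_read


if __name__ == "__main__":
    # Secondary index first, then the primary-key index (the back-table lookup).
    print(lookup_cost_ms([3, 3]))
    # A lookup that only needs the primary-key index walks a single tree.
    print(lookup_cost_ms([3]))
```

Under these assumptions, removing the second traversal halves the number of random reads, which is the motivation for the index designs discussed in the following sections.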

3. Primary‑Key Index Traps

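InnoDB stores rows in 16 KB pages, ordered by primary key. With a non‑sequential primary key such as an ID card number, a new value often falls in the middle of the existing order; if the target page is already full, InnoDB must split the page and move rows, which costs additional random I/Os. Moreover, an ID card number needs a BIGINT (8 bytes) rather than an INT (4 bytes), and because every secondary index entry also stores the primary key, the wider key enlarges every secondary index as well.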
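
To make the page-split cost concrete, here is a toy Python model of a clustered index. It is only a sketch under stated assumptions: pages hold 4 keys instead of 16 KB of rows, a split moves the upper half of a page, and appending past the end of the last page opens a fresh page without moving anything. The names `insert_all` and `PAGE_CAPACITY` are invented for the illustration.

```python
import bisect
import random

PAGE_CAPACITY = 4  # toy value; real InnoDB pages are 16 KB


def insert_all(keys, capacity=PAGE_CAPACITY):
    """Insert distinct keys into a toy clustered index; return (splits, rows_moved)."""
    pages = [[]]                      # each page is a sorted list of keys
    splits = rows_moved = 0
    for key in keys:
        # Find the page whose key range should hold this key.
        first_keys = [p[0] for p in pages[1:]]
        idx = bisect.bisect_right(first_keys, key)
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            # Append past the end of the last page: open a new page, move nothing.
            pages.append([key])
            continue
        # Insert into the middle of a full page: split it and move the upper half.
        mid = capacity // 2
        upper = page[mid:]
        del page[mid:]
        pages.insert(idx + 1, upper)
        splits += 1
        rows_moved += len(upper)
        bisect.insort(page if key < upper[0] else upper, key)
    return splits, rows_moved


if __name__ == "__main__":
    sequential = list(range(100))                      # auto-increment style keys
    scattered = random.Random(0).sample(range(100), 100)  # ID-card style keys
    print("sequential:", insert_all(sequential))
    print("scattered: ", insert_all(scattered))
```

In this model, keys that arrive in increasing order never trigger a split, while keys that arrive out of order repeatedly split pages and move rows.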

4. Composite Indexes: Benefits and Drawbacks

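Because the ID card is not suitable as a primary key, create a composite index (id_card, name) instead. Following the left‑most rule, a filter on id_card can use this index, and because name is stored in the index as well, the query from section 2 can be answered from the index alone without a back‑table lookup. To reduce space, add a crc_id_card column holding a 4‑byte CRC32 hash of the ID card and index that column in place of the 8‑byte ID card.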

crc32.ChecksumIEEE([]byte("341124199408203232"))  // Go, package hash/crc32; returns a uint32

Trade‑offs include extra CPU for CRC calculation, an additional column, and possible hash collisions that require post‑filtering.
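
The hash-then-recheck pattern can be sketched in Python. `zlib.crc32` computes the same IEEE CRC-32 as the Go call above. The in-memory `rows` list, the second ID value, the names, and the helper functions are all made up for the illustration; a real query would filter on both crc_id_card and id_card in SQL.

```python
import zlib


def crc_id_card(id_card: str) -> int:
    """Return the IEEE CRC-32 of the ID card string as an unsigned 32-bit integer."""
    return zlib.crc32(id_card.encode("ascii")) & 0xFFFFFFFF


def find_names(rows, id_card):
    """Mimic a lookup on crc_id_card followed by a check of the real id_card."""
    wanted = crc_id_card(id_card)
    # Step 1: the narrow hash column selects the candidates (the index lookup).
    candidates = [r for r in rows if r["crc_id_card"] == wanted]
    # Step 2: compare the original value to discard hash collisions.
    return [r["name"] for r in candidates if r["id_card"] == id_card]


rows = [
    {"id_card": "341124199408203232", "name": "example user"},   # value from the text
    {"id_card": "110101199001011234", "name": "another user"},   # made-up value
]
for row in rows:
    row["crc_id_card"] = crc_id_card(row["id_card"])

if __name__ == "__main__":
    print(find_names(rows, "341124199408203232"))
```

The second step is what makes the scheme safe: two different ID cards can share a CRC32 value, so the hash narrows the search but never decides the match on its own.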

5. Prefix Indexes

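For long string columns, a prefix index stores only the first N characters, saving space while still providing most of the index benefit. If many rows share the same prefix, the index becomes less selective: more entries match, and each one must be checked against the full column value in the row.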

ALTER TABLE xx ADD INDEX(name(7));  # index first 7 characters of name
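
One way to choose the prefix length is to measure how many distinct prefixes each candidate length produces. The Python sketch below does this for a small made-up list of names; the function name and sample data are illustrative.

```python
def prefix_selectivity(values, length):
    """Fraction of distinct prefixes of the given length (1.0 means all are unique)."""
    prefixes = {value[:length] for value in values}
    return len(prefixes) / len(values)


names = ["alice", "alicia", "alina", "bob", "bobby", "carol"]  # made-up sample

if __name__ == "__main__":
    for length in (1, 3, 4, 5):
        print(length, round(prefix_selectivity(names, length), 2))
```

On a real table the same ratio can be computed with COUNT(DISTINCT LEFT(name, N)) / COUNT(*). A reasonable choice is the shortest N whose ratio is close to that of the full column.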

6. Unique Index vs. Ordinary Index Speed

On an equality lookup, an ordinary index must keep scanning leaf entries until it reaches the first non‑matching value, while a unique index can stop as soon as it locates the single matching row. Because InnoDB reads whole pages into memory, the extra scan is usually cheap, so the lookup advantage tends to be small. On the write side, unique indexes add overhead because uniqueness must be checked on every insert, and they cannot use the change buffer.
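
The difference in scan length can be illustrated with a sorted Python list standing in for the leaf entries of an index. This is a simplified sketch, not InnoDB's implementation, and `entries_examined` is a hypothetical helper.

```python
import bisect


def entries_examined(sorted_keys, target, unique):
    """Count the leaf entries an equality lookup touches in a sorted key list."""
    pos = bisect.bisect_left(sorted_keys, target)   # descend to the first candidate
    examined = 0
    while pos < len(sorted_keys):
        examined += 1
        if sorted_keys[pos] != target:
            break        # ordinary index: the first non-matching entry ends the scan
        if unique:
            break        # unique index: at most one match exists, so stop here
        pos += 1
    return examined


if __name__ == "__main__":
    print(entries_examined([1, 2, 3], 2, unique=True))
    print(entries_examined([1, 2, 3], 2, unique=False))
```

For a single matching value the ordinary index examines only one entry more than the unique index, and that entry usually sits on a page that is already in memory.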

7. Do Not Add Indexes Blindly

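Adding an index on a column that appears in WHERE does not guarantee its use. For example, with an index on a gender column, each value matches about half of the rows. The optimizer often ignores such an index, because using it would mean a back‑table lookup for roughly half the table, which costs more random I/O than simply scanning the whole table.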
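
A rough comparison shows why a full scan can win. The Python sketch below is a deliberately crude model and not MySQL's actual cost model: it assumes every matching row costs one back-table page read, and that a full scan reads each data page once with 100 rows per page. All names and numbers are illustrative.

```python
import math


def cheaper_plan(total_rows, matching_rows, rows_per_page=100):
    """Pick the plan with fewer page reads under a very simple cost model."""
    index_reads = matching_rows                         # one back-table lookup per match
    scan_reads = math.ceil(total_rows / rows_per_page)  # read every data page once
    return "index" if index_reads < scan_reads else "full scan"


if __name__ == "__main__":
    print(cheaper_plan(1_000_000, 500_000))   # gender-like column
    print(cheaper_plan(1_000_000, 10))        # highly selective column
```

The model ignores caching and the difference between random and sequential reads, both of which tilt the balance further toward the full scan for low-selectivity columns.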

8. Index Ineffectiveness Due to Type Conversion

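If a numeric indexed column is compared with a string literal, MySQL converts the literal to a number once and can still use the index. In the opposite case (a string column compared with a numeric literal), MySQL does not convert the literal to a string; it compares numerically, which means converting every column value to a number. That is equivalent to wrapping the indexed column in a function, so the index cannot be used for the lookup, because the B+‑tree is ordered by the original string values.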
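
A small Python experiment shows why a tree ordered by strings cannot serve a numeric comparison. The sample values are made up, and Python's `float` is only a stand-in for MySQL's string-to-number conversion on these simple inputs.

```python
stored = ["10", "2", "9", "01", "1", "1.0"]   # values of a string column (made up)

# A B+ tree on a string column keeps its entries in string order.
index_order = sorted(stored)

# A numeric comparison needs numeric order, which is different.
numeric_order = sorted(stored, key=float)

# Several distinct strings all compare equal to the number 1, so a single
# seek for the string "1" would miss the others.
matches = [s for s in index_order if float(s) == 1]

if __name__ == "__main__":
    print(index_order)
    print(numeric_order)
    print(matches)
```

Because the two orders disagree ("10" sorts before "2" as a string) and one number corresponds to several strings, the only safe plan is to convert and check every value.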

9. Index Optimizations

9.1 Change Buffer

When an update touches a secondary index page that is not in memory, InnoDB records the change in the change buffer instead of reading the page from disk. The buffered changes are merged later, either when the page is next read into memory or by a background thread. Only non‑unique secondary indexes benefit; primary‑key and unique indexes must read the page anyway to check for duplicate values.
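
The effect on disk reads can be seen in a very small Python model. It is illustrative only: the class, its method names, and the string return values are invented, and real InnoDB merging is considerably more involved.

```python
class ToyInnoDB:
    """Tiny model of a buffer pool plus change buffer (illustrative only)."""

    def __init__(self, cached_pages=()):
        self.buffer_pool = set(cached_pages)   # page numbers already in memory
        self.change_buffer = {}                # page number -> pending changes
        self.disk_reads = 0

    def read_page(self, page):
        """Bring a page into memory and return the buffered changes merged into it."""
        if page not in self.buffer_pool:
            self.disk_reads += 1               # random read from disk
            self.buffer_pool.add(page)
        return self.change_buffer.pop(page, [])

    def update_index(self, page, change, unique):
        if page in self.buffer_pool:
            return "applied in memory"
        if unique:
            # Uniqueness can only be checked against the real page contents.
            self.read_page(page)
            return "page read, then applied"
        self.change_buffer.setdefault(page, []).append(change)
        return "buffered"


if __name__ == "__main__":
    db = ToyInnoDB()
    print(db.update_index(7, "insert k1", unique=False), db.disk_reads)
    print(db.update_index(8, "insert k2", unique=True), db.disk_reads)
```

In this model the non-unique update completes without any disk read, and the deferred change is applied only when page 7 is eventually read for some other reason.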

9.2 Index Condition Pushdown (ICP)

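Before MySQL 5.6, when only the left‑most column of a composite index could be used to locate entries, every matching entry triggered a back‑table lookup, and the condition on the second column was checked only after the full row had been fetched. ICP pushes the second column’s condition down to the index scan, so entries that fail it are discarded before any back‑table lookup.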
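
The saving can be counted in a short Python sketch. It assumes a composite index on (name, age) and a query of the form name LIKE 'Zhang%' AND age = 10; the index contents and the function name are made up for the illustration.

```python
# Entries of a composite index on (name, age); each entry also carries the primary key.
index_entries = [
    ("Li Lei", 25, 1),
    ("Zhang Liu", 30, 2),
    ("Zhang San", 10, 3),
    ("Zhang San", 10, 4),
    ("Zhang Si", 20, 5),
]   # made-up data, already in index order


def back_table_lookups(name_prefix, age, use_icp):
    """Count primary-key lookups for: name LIKE 'prefix%' AND age = ?"""
    lookups = 0
    for name, entry_age, _pk in index_entries:
        if not name.startswith(name_prefix):
            continue
        if use_icp and entry_age != age:
            continue          # rejected inside the index scan, no row fetch needed
        lookups += 1          # fetch the full row by primary key
    return lookups


if __name__ == "__main__":
    print(back_table_lookups("Zhang", 10, use_icp=False))
    print(back_table_lookups("Zhang", 10, use_icp=True))
```

Without ICP every entry with the matching name prefix is fetched from the primary-key index; with ICP only the entries that also satisfy the age condition are.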

9.3 Flush Neighboring Pages

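Dirty pages are flushed to disk periodically. When InnoDB flushes a dirty page, it can also flush adjacent dirty pages in the same operation, turning several random writes into a more sequential one. The downside is extra work if a neighbor is flushed early and becomes dirty again immediately. The behavior is controlled by the innodb_flush_neighbors setting.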
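
The idea of flushing adjacent dirty pages together can be sketched as grouping page numbers into contiguous runs. This Python snippet is a simplification with a made-up function name and page numbers; it ignores any limit on how far the grouping may extend.

```python
def flush_batches(dirty_pages):
    """Group dirty page numbers into runs of adjacent pages, one write per run."""
    batches = []
    for page in sorted(dirty_pages):
        if batches and page == batches[-1][-1] + 1:
            batches[-1].append(page)   # neighbor of the previous page: same write
        else:
            batches.append([page])     # gap: start a new write
    return batches


if __name__ == "__main__":
    print(flush_batches({3, 4, 5, 9, 10, 20}))
```

Six dirty pages become three write operations here. The gain depends on the storage device, since merging neighbors mainly helps when random writes are much more expensive than sequential ones.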

9.4 Multi‑Range Read (MRR)

MRR buffers the primary‑key values obtained from a secondary index, sorts them, and then reads the primary‑key index in that order, turning many random reads into mostly sequential ones. Example: a query on grade BETWEEN 60 AND 70 benefits from MRR because rows on the same page are fetched together instead of the page being revisited repeatedly.
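
The benefit of sorting primary keys before fetching rows can be shown with a toy Python page counter. It assumes a simplified layout where primary key k lives on page k // 100 and only the most recently read page stays in memory; the key values and names are made up.

```python
def page_reads(primary_keys, rows_per_page=100):
    """Count page fetches when rows are read in the given order.

    Toy layout: primary key k lives on page k // rows_per_page, and only the
    most recently read page is kept in memory.
    """
    reads, current = 0, None
    for pk in primary_keys:
        page = pk // rows_per_page
        if page != current:
            reads += 1
            current = page
    return reads


# Primary keys in the order a secondary index on grade might return them (made up).
pks_in_index_order = [901, 12, 455, 15, 903, 460]

if __name__ == "__main__":
    print(page_reads(pks_in_index_order))           # fetch in secondary-index order
    print(page_reads(sorted(pks_in_index_order)))   # fetch after sorting, as MRR does
```

In secondary-index order the same three pages are visited twice each, while the sorted order reads each page once.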

Overall, effective index design requires understanding query patterns, data distribution, and storage engine internals to avoid costly back‑table lookups, unnecessary page splits, and excessive I/O.
