Mastering MySQL Indexes: When to Use, Combine, and Optimize Them
This article explains why indexes are vital for MySQL performance, how to decide when to add ordinary, composite, prefix, or unique indexes, the pitfalls of using non‑sequential primary keys, and advanced optimizations such as change buffer, index condition pushdown, and MRR to reduce I/O.
Indexes are the heart of a database; without them MySQL behaves like a plain file system. The article explores MySQL B+‑tree indexes from practical and performance perspectives, covering common pitfalls and advanced optimization techniques.
1. Reasonable Use of Indexes
Before adding an index, check whether the column appears frequently in WHERE clauses. Consider composite indexes when queries filter on multiple columns, e.g., a user table with name, ID card, and address fields.
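As a minimal sketch of the composite‑index idea, assume a hypothetical table user_demo with the three fields mentioned above. The table name, column types, and index name are illustrative assumptions, not the article's actual schema.

```sql
-- Hypothetical table; names and types are assumptions for illustration.
CREATE TABLE user_demo (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name    VARCHAR(64)  NOT NULL,
    id_card BIGINT       NOT NULL,
    address VARCHAR(128) NOT NULL,
    PRIMARY KEY (id),
    -- One composite index serves filters on name alone and on name + address.
    KEY idx_name_address (name, address)
) ENGINE = InnoDB;

-- Both predicates can be resolved through idx_name_address.
EXPLAIN SELECT id FROM user_demo
WHERE name = 'Alice' AND address = '1 Example Road';
```

A filter on address alone would not match the left‑most column of this index, so the column order should follow the queries that actually run.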
2. Drawbacks of Ordinary Indexes
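With only an ordinary (secondary) index on the ID card column, a lookup by ID card needs two B+‑tree traversals: first the secondary index to find the primary key, then the primary‑key index to fetch the row (the back‑table lookup). If each tree has height 3 and the root pages are assumed to be already in memory, each traversal costs two disk reads, so the operation may require four random disk I/Os (≈40 ms at roughly 10 ms each), which is not fast.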
SELECT name FROM user WHERE id_card=xxx
3. Primary‑Key Index Traps
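InnoDB stores rows in primary‑key order inside 16 KB pages. With a non‑sequential primary key such as an ID card number, a new row often has to go into the middle of a page that is already full, which may cause page splits, data movement, and additional random I/Os. Moreover, ID cards require BIGINT (8 bytes), consuming more space than an INT (4 bytes) primary key, and because every secondary index stores the primary‑key value, that extra space is repeated in each index.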
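One common way around this trap, shown here only as a sketch, is an auto‑increment surrogate key with the ID card kept in a secondary index. The table name user_seq_demo, the column types, and the choice of a unique index are assumptions for illustration.

```sql
-- Hypothetical table; an auto-increment key means new rows are appended
-- at the end of the primary-key index rather than in the middle of a page.
CREATE TABLE user_seq_demo (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    id_card BIGINT       NOT NULL,
    name    VARCHAR(64)  NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uk_id_card (id_card)   -- lookups by ID card still use an index
) ENGINE = InnoDB;

INSERT INTO user_seq_demo (id_card, name) VALUES
    (341124199408203232, 'first'),
    (110101199001011234, 'second');   -- smaller id_card, still appended by id
```

The random insert order now affects only the smaller secondary index, not the pages that hold the full rows.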
4. Composite Indexes: Benefits and Drawbacks
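Because ID card is not suitable as a primary key, keep it as an ordinary column and create a composite index (id_card, name). Under the left‑most rule the index still serves lookups by id_card, and because it also contains name, the query above can be answered from the index alone without a back‑table lookup. To reduce space, replace the 8‑byte ID card in the index with a 4‑byte CRC32 hash stored in an additional crc_id_card column.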
crc32.ChecksumIEEE([]byte("341124199408203232"))
Trade‑offs include extra CPU for CRC calculation, an additional column, and possible hash collisions that require post‑filtering.
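The crc_id_card approach can be sketched in SQL as follows. This version uses MySQL's built‑in CRC32() function instead of computing the hash in application code. The table name user_crc_demo, the column types, and the index name are assumptions for illustration.

```sql
-- Hypothetical table; crc_id_card holds a 4-byte hash of the ID card string.
CREATE TABLE user_crc_demo (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
    id_card     BIGINT       NOT NULL,
    name        VARCHAR(64)  NOT NULL,
    crc_id_card INT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    KEY idx_crc_name (crc_id_card, name)
) ENGINE = InnoDB;

INSERT INTO user_crc_demo (id_card, name, crc_id_card)
VALUES (341124199408203232, 'Alice', CRC32('341124199408203232'));

-- The hash narrows the search through idx_crc_name; the extra id_card
-- predicate is the post-filter that discards rows with colliding hashes.
SELECT name FROM user_crc_demo
WHERE crc_id_card = CRC32('341124199408203232')
  AND id_card = 341124199408203232;
```

Because id_card is not part of idx_crc_name, the post‑filter is evaluated on the fetched rows, which is one of the costs of this design.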
5. Prefix Indexes
For long string columns, a prefix index stores only the first N characters, saving space while still providing index benefits. Performance may degrade if the prefix has high duplication.
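One way to choose N is to compare how many distinct values each candidate prefix keeps relative to the full column. The query below is a sketch under assumptions: prefix_demo and the prefix lengths 5 and 7 are hypothetical, and the table needs representative data for the ratios to mean anything.

```sql
-- Hypothetical table used only to illustrate the measurement.
CREATE TABLE prefix_demo (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

INSERT INTO prefix_demo (name) VALUES
    ('alexander'), ('alexandra'), ('alexis'), ('benjamin');

-- Ratio of distinct values to rows: the closer a prefix gets to the
-- full-column ratio, the less selectivity the shorter index gives up.
SELECT
    COUNT(DISTINCT name) / COUNT(*)          AS full_selectivity,
    COUNT(DISTINCT LEFT(name, 5)) / COUNT(*) AS prefix5_selectivity,
    COUNT(DISTINCT LEFT(name, 7)) / COUNT(*) AS prefix7_selectivity
FROM prefix_demo;
```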
ALTER TABLE xx ADD INDEX(name(7)); # index first 7 characters of name
6. Unique Index Speed vs. Ordinary Index
Ordinary indexes must continue scanning leaf nodes to find the first non‑matching value, while a unique index can stop immediately after locating the single matching row, making lookups faster. However, unique indexes add overhead on inserts because uniqueness must be checked, and they do not use the change buffer.
7. Do Not Add Indexes Blindly
Adding an index on a column that appears in WHERE does not guarantee its use. For example, an index on a gender column, where each value matches about 50 % of the rows, is often ignored by the optimizer, because using it would mean a back‑table lookup for roughly half the table, which costs more I/O than a full table scan.
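Whether the optimizer actually picks such an index can be checked before relying on it. The statements below are a sketch; gender_demo and its columns are hypothetical, and the plan depends on the data in the table.

```sql
-- Hypothetical table with a low-cardinality indexed column.
CREATE TABLE gender_demo (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name   VARCHAR(64)  NOT NULL,
    gender TINYINT      NOT NULL,
    PRIMARY KEY (id),
    KEY idx_gender (gender)
) ENGINE = InnoDB;

-- The Cardinality column shows the estimated number of distinct values.
SHOW INDEX FROM gender_demo;

-- The key and rows columns show whether idx_gender is chosen
-- and how many rows the optimizer expects to examine.
EXPLAIN SELECT name FROM gender_demo WHERE gender = 1;
```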
8. Index Ineffectiveness Due to Type Conversion
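If a numeric indexed column is compared with a string literal, MySQL converts the literal to a number once and can still use the index. In the opposite case (a string column compared with a numeric literal), the literal is not converted to a string; instead every column value has to be converted to a number for the comparison. That is equivalent to applying a function to the indexed column, so the index cannot be used for the lookup, because the B+‑tree is built on the original string values.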
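The difference is easy to observe with EXPLAIN. In this sketch, conv_demo and the phone column are hypothetical names chosen for illustration.

```sql
-- Hypothetical table: a string column with an ordinary index.
CREATE TABLE conv_demo (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    phone VARCHAR(20)  NOT NULL,
    PRIMARY KEY (id),
    KEY idx_phone (phone)
) ENGINE = InnoDB;

-- String literal against a string column: idx_phone can be used for lookup.
EXPLAIN SELECT id FROM conv_demo WHERE phone = '13800000000';

-- Numeric literal against a string column: idx_phone cannot be used for a
-- direct lookup; at best the whole index is scanned.
EXPLAIN SELECT id FROM conv_demo WHERE phone = 13800000000;

-- String literal against a numeric column: the primary key is still usable.
EXPLAIN SELECT phone FROM conv_demo WHERE id = '42';
```

Quoting literals to match the column type keeps the comparison on the indexed values.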
9. Index Optimizations
9.1 Change Buffer
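When an update touches a secondary‑index page that is not in memory, InnoDB records the change in the change buffer instead of reading the page from disk. The buffered changes are merged later, either when the page is next read into memory or by a background thread. Only non‑unique, non‑primary indexes benefit; primary and unique indexes always read the page, because uniqueness has to be checked against the existing entries.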
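The change buffer settings can be inspected as follows. This is a sketch assuming a MySQL version that exposes these system variables; defaults differ between releases.

```sql
-- Which operations are buffered, and the maximum share of the buffer pool
-- (in percent) that the change buffer may occupy.
SHOW VARIABLES LIKE 'innodb_change_buffering';
SHOW VARIABLES LIKE 'innodb_change_buffer_max_size';

-- The "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of this report
-- shows the change buffer size and merge activity.
SHOW ENGINE INNODB STATUS;
```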
9.2 Index Condition Pushdown (ICP)
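Before MySQL 5.6, a query that could match only the left‑most column of a composite index still required a back‑table lookup for every matching entry just to evaluate the condition on the second column. ICP pushes the second column’s condition down to the index scan, so entries that fail it are discarded before the back‑table lookup.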
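A sketch of how to observe ICP, assuming MySQL 5.6 or later. The table icp_demo, its columns, and the index name are hypothetical.

```sql
-- Hypothetical table with a composite index on (name, age).
CREATE TABLE icp_demo (
    id      INT UNSIGNED     NOT NULL AUTO_INCREMENT,
    name    VARCHAR(64)      NOT NULL,
    age     TINYINT UNSIGNED NOT NULL,
    address VARCHAR(128)     NOT NULL,
    PRIMARY KEY (id),
    KEY idx_name_age (name, age)
) ENGINE = InnoDB;

-- The range on name ends left-most matching, so age = 20 cannot narrow the
-- index range; with ICP it is still checked inside the index scan.
-- When ICP is used, the Extra column typically shows "Using index condition".
EXPLAIN SELECT address FROM icp_demo WHERE name LIKE 'Zhang%' AND age = 20;

-- Switch ICP off for the session to compare the plan, then restore it.
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT address FROM icp_demo WHERE name LIKE 'Zhang%' AND age = 20;
SET optimizer_switch = 'index_condition_pushdown=on';
```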
9.3 Flushing Adjacent Pages
Dirty pages are flushed to disk periodically. If adjacent pages are also dirty, they can be flushed together to reduce random I/O, though this may cause extra work if the neighbor becomes dirty again immediately.
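In InnoDB this behavior is controlled by the innodb_flush_neighbors system variable. The statements below are a sketch; changing a global variable requires sufficient privileges, and the right value depends on the storage hardware.

```sql
-- 0 = flush only the page itself
-- 1 = also flush contiguous dirty pages in the same extent
-- 2 = also flush all dirty pages in the same extent
SHOW VARIABLES LIKE 'innodb_flush_neighbors';

-- On SSDs, where random writes are cheap, neighbor flushing is often disabled.
SET GLOBAL innodb_flush_neighbors = 0;
```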
9.4 Multi‑Range Read (MRR)
MRR collects the primary‑key IDs found through a secondary index into a buffer, sorts them, and then reads the primary index in key order, which turns many random reads into mostly sequential ones. Example: a query with grade BETWEEN 60 AND 70 benefits from MRR because each data page is read only once.
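The grade example can be tried with the sketch below. The table mrr_demo and its columns are hypothetical, and whether MRR appears in the plan depends on the MySQL version and the data.

```sql
-- Hypothetical table with a secondary index on grade.
CREATE TABLE mrr_demo (
    id    INT UNSIGNED     NOT NULL AUTO_INCREMENT,
    name  VARCHAR(64)      NOT NULL,
    grade TINYINT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    KEY idx_grade (grade)
) ENGINE = InnoDB;

-- By default the optimizer uses MRR only when its cost model favors it;
-- turning mrr_cost_based off makes it use MRR whenever possible.
SET optimizer_switch = 'mrr=on,mrr_cost_based=off';

-- name is not in idx_grade, so each match needs a primary-key lookup;
-- when MRR is used, the Extra column typically includes "Using MRR".
EXPLAIN SELECT name FROM mrr_demo WHERE grade BETWEEN 60 AND 70;

-- Restore cost-based selection for the session.
SET optimizer_switch = 'mrr_cost_based=on';
```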
Overall, effective index design requires understanding query patterns, data distribution, and storage engine internals to avoid costly back‑table lookups, unnecessary page splits, and excessive I/O.
dbaplus Community
