Understanding MySQL Indexes: Models, Maintenance, Utilization, and Optimization
This article explains MySQL index fundamentals—including hash tables, ordered arrays, and B+‑tree structures—covers index maintenance, demonstrates how indexes improve query execution, discusses best practices such as left‑most prefix, covering indexes, index push‑down, and unique indexes, and provides practical tips for index selection, avoiding index loss, and using EXPLAIN for performance tuning.
When working with MySQL, indexes are essential for speeding up data retrieval, much like a book's table of contents. The article begins by introducing three common index models: hash tables, ordered arrays, and the B+‑tree, which is the default structure used by InnoDB.
The B+‑tree derives from a balanced binary search tree but expands the branching factor to reduce disk seeks; a typical InnoDB leaf node can hold around 1,200 entries, allowing a tree height of four for billions of rows, which dramatically cuts the number of disk accesses.
Index maintenance is described next. Each InnoDB index corresponds to a B+‑tree. Primary (clustered) indexes store the full row in leaf nodes, while secondary indexes store only the primary key. When inserting data, pages may need to be split, and updates can be buffered in the change buffer for non‑unique indexes, reducing random I/O.
Using indexes in queries is illustrated with a step‑by‑step execution flow. For example, the query SELECT * FROM T WHERE k BETWEEN 1 AND 3; first searches the secondary index for k=1 , retrieves the primary key, then fetches the full row from the primary index, repeating until the range ends. If the query selects only indexed columns, it becomes a covering index and avoids the extra lookup.
Advanced concepts such as the left‑most prefix rule, index push‑down (filtering rows during index traversal), and the behavior of unique indexes are explained. Unique indexes stop scanning after the first match and cannot use the change buffer, which may affect write‑heavy workloads.
Practical index selection advice is given: create indexes on frequently queried, filtered, or sorted columns; prefer composite indexes that match query patterns (e.g., (city, name) for WHERE city='杭州' ORDER BY name ) to eliminate filesort and enable covering indexes. The article also warns about common causes of index loss, such as using functions or type conversions on indexed columns, leading to full table scans.
Examples of problematic queries are shown, like SELECT COUNT(*) FROM T WHERE MONTH(modified) = 6; and SELECT * FROM T WHERE age = 20; where function calls or implicit casts prevent index usage. Encoding mismatches (utf8 vs utf8mb4) can also cause implicit conversions that invalidate indexes.
Additional topics include the differences between B‑trees and B+‑trees, the purpose of the EXPLAIN statement for viewing execution plans, and why a table should not have too many indexes (typically no more than five) to avoid optimizer overhead.
The article concludes with a brief Q&A and references to related reading material.
JD Tech
Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.