Mastering MySQL Indexes: From Hash Tables to B+ Trees and Beyond
This article explains the main index structures used in MySQL—including hash tables, ordered arrays, search trees, and B+ trees—covers why InnoDB prefers B+ trees, the benefits of auto‑increment primary keys, and key concepts such as covering indexes, the left‑most prefix rule, and index‑pushdown for query optimization.
1. Common Index Models
1. Hash Table
A key‑value storage structure that retrieves a value directly from a key. It can be viewed as an array where the key is hashed to a fixed position; collisions are handled with linked lists. Suitable only for equality queries; range queries require full scans.
2. Ordered Array
Data stored in increasing order of the key (e.g., ID numbers). Supports both equality and range queries efficiently using binary search, but updates are costly because inserting a record requires shifting subsequent entries. Best for static storage engines where data rarely changes.
3. Search Tree
The classic binary tree where left child < parent < right child. While binary trees offer high search efficiency, they are impractical for disk‑based databases because each level requires a random I/O operation. Therefore, N‑ary trees (e.g., B‑trees) are used to reduce the number of disk accesses.
2. InnoDB Index
1. Why Use B+ Tree Index Model?
B+ trees have a fixed height, store data only in leaf nodes, and link leaf nodes sequentially, making range queries very efficient. Compared to binary trees and red‑black trees, B+ trees provide predictable height and better disk‑friendly performance.
2. Why Prefer Auto‑Increment Primary Keys?
Auto‑increment IDs avoid page splits by always inserting at the end of the index, allowing continuous allocation of pages and improving space utilization and operation efficiency.
3. Three Important MySQL Index Concepts
Consider a table with two indexes: one on ID (primary) and one on k.
In InnoDB, the leaf nodes of the primary key index store the full row data, while non‑primary indexes store only the primary key.
Querying rows where k is between 3 and 5 involves locating matching IDs in the k index, then fetching the rows via the primary key index—a process known as "back‑lookup".
1. Covering Index
If a query selects only the indexed column (e.g., SELECT ID FROM T WHERE k BETWEEN 3 AND 5), the engine can satisfy the query using only the k index without back‑lookup, which is called a covering index and can significantly improve performance.
2. Left‑most Prefix Rule
Index entries are ordered by the sequence of indexed columns. Queries that specify the leftmost columns of a composite index can use the index efficiently, whether the condition is an exact match or a prefix like LIKE 'Zhang%'.
3. Index Pushdown
For a composite index on (name, age), a query such as
SELECT * FROM tuser WHERE name LIKE 'Zhang%' AND age=10 AND ismale=1can filter rows by age directly within the index (introduced in MySQL 5.6), reducing the number of back‑lookups and improving query speed.
Content compiled from Ding Qi’s "MySQL in Practice 45 Lectures".
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
