Why MySQL Indexes Matter: Understanding B+Tree vs B-Tree
This article explains MySQL index fundamentals, compares B+Tree and B-Tree structures, shows how page size and tree height affect capacity, outlines which query patterns benefit from indexes, and discusses practical limits and the adaptive hash index feature.
1. What Is an Index?
An index works like a dictionary: it lets the database jump directly to the page containing the desired record instead of scanning every row, dramatically improving query speed, especially on large tables. Indexes (also called keys) may seem irrelevant on tiny datasets, but they can boost performance by orders of magnitude on big ones. The optimization principles apply to both HDD and SSD storage.
2. Index Data Structures
2.1 B+Tree and B-Tree
In MySQL, the storage engine implements indexes. InnoDB, the most common engine, uses a B+Tree structure. The following example table shows sample data (username, age, address, gender) and a composite index on username and age. The resulting on‑disk structure is a B+Tree, illustrated by the two diagrams below.
Both diagrams depict a multi‑way balanced search tree (not a binary tree).
Green squares are pointers to child nodes; red squares point to leaf nodes (absent in B‑Tree). Shaded rectangles hold the indexed key values.
In a B+Tree, non‑leaf nodes store only keys and child pointers; all actual row data reside in leaf nodes, giving a stable number of I/O operations per lookup.
In a B‑Tree, non‑leaf nodes also store pointers to the actual rows, so a search may stop at an internal node, leading to variable I/O cost.
Because B+Tree non‑leaf nodes store fewer bytes, they can point to more children, reducing tree height and I/O.
Leaf nodes are ordered; each leaf contains a pointer to the next leaf, enabling efficient range scans.
Leaf ordering also aids sorting; B‑Tree leaves are not fully ordered, making sorting slower.
B+Tree uses a left‑closed, right‑open interval for key placement (e.g., (af,88) splits left < (af,88) and right ≥ (af,88)). B‑Tree uses left‑open, right‑open.
Full‑table scans are faster with B+Tree because all rows appear in leaf nodes linked together.
Leaf pointers reference the full row in a clustered index or the primary key in a secondary index; the latter requires a “lookup” (back‑track) to fetch the row.
2.2 Tree Height Problem
InnoDB pages are 16 KB (default). Each page can hold 1024 records of 16 bytes, so a logical page holds 1024 index entries. 16384/16=1024 Assuming a 1 KB row size, a leaf page stores 16 rows. A non‑leaf entry consists of an 8‑byte BIGINT primary key plus a 6‑byte pointer, giving roughly 1170 children per non‑leaf node: 16*1024/(8+6)=1170 A three‑level B+Tree can therefore store about 21.9 million rows: 1170*1170*16=21902400 In practice, InnoDB B+Trees are 2–4 levels deep, meaning most queries need only 2–4 page reads.
2.3 Which Queries Can Use an Index?
Full‑value match (e.g., username='ac' AND age=98).
Left‑most prefix match (e.g., search by username only, because it is the first column in the composite index).
Prefix match (e.g., usernames starting with “a”).
Range match (e.g., usernames between ‘ab’ and ‘cc’).
Left‑full + right‑range (e.g., username='bw' AND age BETWEEN 90 AND 99).
Covering index (the needed columns are stored entirely in the index, so the table row is never accessed).
2.4 Usage Limitations
Searching by age alone cannot use the composite index because age is not ordered within the tree.
Skipping the first column of a composite index renders the remaining columns unusable for index lookups.
When a range condition appears on the leftmost column, columns to its right cannot be used for direct index navigation (they may be filtered after the index scan). Example: WHERE username LIKE 'a%' AND age=99 Here age=99 cannot be used for index navigation; it is applied as a filter after the username range scan.
2.5 Adaptive Hash Index
InnoDB does not support native hash indexes, but it can automatically build an adaptive hash index on top of a B+Tree when a particular key is accessed frequently. This feature is enabled by default; you can verify it with the appropriate SQL command (illustrated in the screenshot).
3. Summary
Indexes reduce the amount of data the server must scan.
Indexes help avoid sorting and temporary table creation.
Indexes turn random I/O into sequential I/O, improving performance.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
