Why MySQL Indexes Fail: B+ Tree Mechanics and Common Pitfalls
This article explains how InnoDB stores data pages and builds B+‑tree indexes, illustrates the structure of clustered, secondary and composite indexes, and enumerates typical scenarios—such as left‑most prefix violations, LIKE patterns, and function use—that cause index loss and full‑table scans.
Interviewer: I see you know database indexes. What are common cases where an index becomes ineffective?
Candidate: ... (conversation continues)
Preface
This article examines index invalidation from the perspective of InnoDB's B+‑tree and data pages. For deeper details on data pages and index structures, refer to the author’s earlier posts.
Outline
We first present the outline so readers know what will be covered.
What Is Index Inefficiency?
Creating an index aims to speed up queries, but an index may be unused if the optimizer chooses a full‑table scan despite the index existing.
Viewing B+‑Tree from Data Pages
(1) In leaf nodes, all records are ordered by primary key and linked as a doubly‑linked list. Each leaf key points to a record.
(2) Non‑leaf nodes store the smallest key of each child leaf, forming another doubly‑linked list.
Understanding data pages, their storage, and B+‑tree construction is essential before discussing index failure.
Data Page Structure
MySQL reads data by 16 KB pages, not individual rows. Pages are linked via pointers in the file header, forming a doubly‑linked list.
Each page contains seven parts, including user records, Infimum and Supremum pseudo‑records, and a page header.
How does MySQL know where the next page is?
The file header stores pointers to the previous and next pages, forming a linked list.
Data Page User Records
Key fields in a record header include:
delete_flag: 0 = not deleted, 1 = deleted
n_owned: number of records in the same group
record_type: 0 = ordinary, 1 = non‑leaf, 2 = Infimum, 3 = Supremum
next_record: offset to the next record’s header
Records are stored in a single‑linked list and grouped into slots. The first slot holds one record; the last slot holds 1‑8 records; intermediate slots hold 4‑8 records. Slots point to the last record of each group.
Building B+‑Tree Indexes on Data Pages
To simplify illustration, the underlying structure remains unchanged.
How does MySQL optimize the long chain of pages?
When many pages exist, InnoDB records each page’s number and its smallest primary key in a higher‑level index page (a directory entry). If an index page overflows, another level of index pages is created, forming the classic B+‑tree.
B+‑Tree Query Process
Querying proceeds in two stages: locating the correct page, then locating the record within the page.
Start at the top non‑leaf node (e.g., page 10). Use binary search on the key range to descend to the appropriate child page.
Repeat until reaching a leaf page.
Within the leaf page, binary‑search the slot directory to find the correct slot, then scan the records in that slot to locate the target row.
This two‑level binary search dramatically reduces I/O.
Index B+‑Tree Structures
Different index types produce different B+‑tree shapes.
CREATE TABLE `test_index` (
`id` INT NOT NULL AUTO_INCREMENT,
`col1` VARCHAR(45) NULL,
`col2` VARCHAR(45) NULL,
`crate_time` DATETIME NULL,
PRIMARY KEY (`id`)
) COMMENT='';Clustered Index
Leaf nodes store the full table rows; the tree is ordered by the primary key.
Secondary Index
Leaf nodes store the indexed column plus the primary key. The index page stores the column value and the page number.
Composite Index
Indexes on multiple columns are ordered first by the leftmost column, then by the next column.
Why Indexes Fail
Understanding B+‑tree mechanics clarifies many failure cases.
Left‑most Prefix Rule
SELECT * FROM test_index WHERE col1='a' AND col2='bb';Both columns can use the index because the query respects the left‑most prefix.
SELECT * FROM test_index WHERE col2='bb';
SELECT * FROM test_index WHERE col1 > 'a' AND col2='bb';In the first query, MySQL cannot use the index because col1 is not constrained.
In the second query, only col1 can use the index; col2 cannot because the left‑most prefix is broken.
LIKE Pattern Matching
Patterns with a leading % or surrounding % prevent index usage because the string order cannot be leveraged.
% at the left matches suffixes, which have no order.
Two % signs mean only the first character is indexed.
Other Common Pitfalls
Applying functions or expressions to indexed columns.
Implicit type conversion on indexed columns.
Using OR in the WHERE clause.
…and many more.
IS NULL / IS NOT NULL
Does IS NULL cause index loss?
MySQL can use indexes for IS NULL and IS NOT NULL just like equality checks; they do not inherently cause full‑table scans.
MySQL can perform the same optimization on col_name IS NULL that it can use for col_name = constant_value.
Indexes and ranges can be used to search for NULL with IS NULL.Index failure usually stems from large “back‑table” (回表) ranges, not from the null checks themselves.
Summary
InnoDB reads and writes data by 16 KB pages linked in a doubly‑linked list. B+‑tree indexes, built on these pages, accelerate lookups; each node is a page, and non‑leaf pages act as directory entries.
Page directories store slots that index grouped records, enabling binary search within a page.
Clustered, secondary, and composite indexes have distinct leaf structures, and understanding these differences helps avoid index loss, use index covering, and minimize back‑table lookups.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
