How InnoDB Locates the First Record in a B‑Tree Scan Range
This article explains why locating a record in a B‑Tree index is crucial for MySQL InnoDB operations, describes scan intervals, index page structures, and details the step‑by‑step binary and sequential search process used to find the first matching record, including performance optimizations.
For SQL execution, locating a record within a B‑Tree index is a critical capability.
InnoDB stores data organized by indexes, so update, delete, and query operations must first locate the specific record in the index; insert operations must also determine the correct position in the index.
When a WHERE clause can use an index, the engine first finds the first record of the scan interval defined by the clause, then reads subsequent records via the single‑direction chain within an index page and the double‑direction chain between pages.
This article is the first in a MySQL 8 series about the query optimizer and aims to lay the foundation for understanding subsequent topics.
The next article will estimate the number of records in the WHERE scan interval based on the concepts introduced here.
Content is based on MySQL 8.0.29 source code.
Table of Contents
1. Overview
2. What Is a Scan Interval?
3. Index Page Structure
4. Locating the First Record of the Scan Interval
4.1 Abstract Process Description
4.2 Preparing a B+ Tree
4.3 Which Leaf Node Holds the Record?
4.4 Position of the Record Within the Leaf Node
5. Performance Optimizations
5.1 Root and Inner Node Optimizations
5.2 Leaf Node Optimizations
6. Summary
1. Overview
Update, delete, and query operations locate a record in the index; insert operations find the insertion point, and all these processes are implemented in the same source method.
This article assumes the WHERE clause can hit an index and introduces how query operations locate the first record of the scan interval.
The locating process involves binary search and sequential search, which touch parts of the index page structure.
2. What Is a Scan Interval?
A scan interval is the range defined by the fields and relational operators (>, >=, <, <=, =) in a WHERE clause.
Scan intervals can be classified by: Boundedness: bounded interval, single‑sided bounded interval. Open/Closed: open, closed, half‑open/half‑closed intervals. Special: single‑point interval.
Bounded intervals Open interval: e.g., WHERE a > 100 AND a < 200 → (100, 200). Closed interval: e.g., WHERE a >= 100 AND a <= 200 → [100, 200]. Left‑open right‑closed: e.g., WHERE a > 100 AND a <= 200 → (100, 200]. Left‑closed right‑open: e.g., WHERE a >= 100 AND a < 200 → [100, 200).
Single‑sided bounded intervals Lower‑bounded left‑open: WHERE a > 100 → (100, +∞). Lower‑bounded left‑closed: WHERE a >= 100 → [100, +∞). Upper‑bounded right‑open: WHERE a < 200 → (‑∞, 200). Upper‑bounded right‑closed: WHERE a <= 200 → (‑∞, 200].
Single‑point interval Single value: e.g., WHERE a = 100 → [100, 100].
3. Index Page Structure
Root, inner, and leaf nodes of a B‑Tree index are all index pages.
The internal structure of an index page is complex; this article focuses on the parts needed for locating records: pseudo‑records, record chains, and slots (SLOT).
Record chain
Each record header contains a 2‑byte field storing the offset of the next record within the same page.
The offset is the address of the first byte of the record’s data (excluding the header) minus the address of the first byte of the page.
InnoDB index pages can be up to 64 KB, so 2 bytes can represent any byte offset within the page.
This 2‑byte field is called next_record and forms a singly linked list of records on the page.
Starting from any record and following next_record reaches the last record on the page.
Pseudo‑record
A pseudo‑record is a record inserted by InnoDB internally, not by the user.
Every index page contains two pseudo‑records regardless of user records: infimum: the first record on the page. If user records exist, its next_record points to the first user record; otherwise it points to supremum. supremum: the last record on the page.
Slot (SLOT)
Slots are of three types: infimum slot: contains only the infimum pseudo‑record. supremum slot: contains 1‑8 records, the last being the supremum pseudo‑record, the others are user records. ordinary slot: contains 4‑8 user records.
Each slot occupies 2 bytes storing the offset of the largest record in that slot.
The largest record is the last record in the slot when records are sorted by the index key in ascending order.
Slots are stored in the page directory area of the index page, ordered in reverse and packed tightly, with the first slot at the end of the page.
4. Locating the First Record of the Scan Interval
4.1 Abstract Process Description
A B+‑Tree index consists of root, inner, and leaf nodes. In a three‑level B+‑Tree, locating the first record of a scan interval follows these steps:
Start at the root node to determine which inner node contains the record.
Enter the inner node to determine which leaf node contains the record.
Enter the leaf node to determine the exact position of the record.
When the tree has more or fewer levels, the number of steps adjusts accordingly.
Each step performs a binary search followed by a sequential search.
If the current node is a root or inner node, the process continues to the next level; if it is a leaf node, the search ends.
4.2 Preparing a B+ Tree
Consider a primary key index on an int id column, forming a two‑level B+‑Tree (root and leaf). The example will locate the first record satisfying id >= 700, i.e., the interval [700, +∞).
4.3 Which Leaf Node Holds the Record?
Starting from the root, binary search determines the slot that may contain the first record of [700, +∞). The root has 8 slots; initial bounds are low = 0, high = 7.
Binary search rounds
Round 1: mid = (0+7)/2 = 3. Slot 3’s max id = 41 < low 700, so the target lies in the high side; set low = 3.
Round 2: mid = (3+7)/2 = 5. Slot 5’s max id = 81 < low 700 → target in high side; set low = 5.
Round 3: mid = (5+7)/2 = 6. Slot 6’s max id = 901 > low 700 → target in low side; set high = 6.
Now high - low = 1, binary search stops. The first record lies in slot 6 (the high slot).
Sequential search
Start from the record after the last record of slot 5 (id 81). Read id 101 (still < 700), then id 888 (> 700). The first record satisfying id >= 700 is found before id 888, i.e., in the leaf page containing id 101.
4.4 Position of the Record Within the Leaf Node
The leaf node has 10 slots; binary search bounds start at low = 0, high = 9.
Round 1: mid = 4. Slot 4’s max id = 404 < 700 → target in high side; set low = 4.
Round 2: mid = 6. Slot 6’s max id = 606 < 700 → target in high side; set low = 6.
Round 3: mid = 7. Slot 7’s max id = 707 > 700 → target in low side; set high = 7. Now high - low = 1, binary search ends.
Sequential search starts from the record after slot 6’s max (id 606). It reads id 666 (< 700), then id 688 (< 700), then id 700 (= 700) and stops, having found the first matching record.
5. Performance Optimizations
InnoDB reduces the number of field comparisons during binary and sequential searches by skipping already‑compared equal fields or bytes.
5.1 Root and Inner Node Optimizations
When a range’s left endpoint is known to lie within a set of slots whose maximum records share the same field values, InnoDB can omit re‑comparing those fields in subsequent rounds.
By intersecting the ranges identified in each binary‑search round, the engine narrows the candidate slots and avoids redundant comparisons.
5.2 Leaf Node Optimizations
For leaf nodes, InnoDB can also skip byte‑wise comparisons of fields such as int, blob, and internal row identifiers once it knows the relevant bytes are equal across the candidate slots.
These fields are compared byte‑by‑byte in the source code; skipping equal bytes reduces work.
Thus, after locking a range (e.g., slots 6‑9) where both i1 and the first 1000 bytes of b1 are identical, the engine starts comparisons from byte 1001 of b1, further cutting unnecessary work.
6. Summary
The article first defined scan intervals and illustrated each type, then introduced the index‑page structures most relevant to locating records: record chains, pseudo‑records, and slots.
It described the abstract process of binary search to locate the appropriate slot and sequential search to find the exact record, and walked through a concrete two‑level B‑Tree example step by step.
Finally, it explained InnoDB’s optimizations that reduce comparison counts: for root and inner nodes, equal fields are skipped; for leaf nodes, equal fields and equal leading bytes of variable‑length fields are also skipped.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
