How MySQL Indexes and B+ Trees Speed Up Queries: From Pages to Disk I/O
This article explains the fundamentals of MySQL indexing, detailing how InnoDB pages, B+‑tree structures, clustered and non‑clustered indexes, and disk prefetching work together to reduce I/O operations and dramatically improve query performance.
Indexes are essential for every engineer; understanding their principles is crucial for writing high‑quality SQL. This article walks through the concepts from basic table creation to the inner workings of InnoDB pages and B+‑tree indexes.
Practical Example
Consider a simple user table:
CREATE TABLE `user` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` int(11) DEFAULT NULL COMMENT '姓名',
`age` tinyint(3) unsigned DEFAULT NULL COMMENT '年龄',
`height` int(11) DEFAULT NULL COMMENT '身高',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='用户表';Typical queries on this table include selections by primary key, ordering, and filtering by age:
1. SELECT * FROM user WHERE id = xxx;
2. SELECT * FROM user ORDER BY id ASC/DESC;
3. SELECT * FROM user WHERE age = xxx;
4. SELECT age FROM user WHERE age = xxx;
5. SELECT age FROM user ORDER BY age ASC/DESC;After inserting sample rows (the id column is auto‑incremented), InnoDB stores each record in a page. The auto‑incremented IDs are sequential, allowing the engine to link records in order.
When querying SELECT * FROM user WHERE id = 3, InnoDB reads pages sequentially from the smallest ID, comparing each record until it finds the target. Each page read from disk to memory counts as one I/O operation. Scanning many records can cause many I/Os, which is why reducing I/O is critical.
Page and Locality Principle
Reading a whole page (default 16 KB) instead of a single record leverages the program locality principle: data near the requested record is likely needed soon, so loading an entire page reduces the number of I/Os.
InnoDB reads 16 KB (approximately 100 rows) per I/O, turning a 100‑row scan into a single I/O, improving performance by roughly 100×.
Page Directory and B+‑Tree Evolution
To avoid scanning many pages, MySQL builds a page directory (a higher‑level index) that points to the first record of each page. Extending this idea across multiple pages yields a B+‑tree structure with three node types:
Root node (top‑level directory page)
Intermediate directory pages
Leaf pages that store the full records
Searching for id = 55 involves loading the root, then the appropriate directory page, and finally the leaf page, resulting in one I/O per tree level (typically 3–4 I/Os).
Clustered vs. Non‑Clustered Indexes
A primary‑key index in InnoDB is a clustered index: leaf nodes store the complete row data. Any defined primary key automatically becomes clustered.
Secondary (non‑clustered) indexes store only the indexed column(s) and the primary‑key value. To satisfy SELECT * using a secondary index, MySQL must perform a “back‑lookup” (回表) from the leaf node to the clustered index, which can cause random I/Os.
Index covering avoids this by selecting only columns that exist in the secondary index, eliminating the need for a back‑lookup.
SELECT age FROM user WHERE age = xxx; -- covered index
SELECT age, id FROM user WHERE age = xxx; -- also coveredDisk Prefetch and I/O Characteristics
Operating systems manage memory in 4 KB pages; InnoDB uses 16 KB pages (four OS pages). When a page is read, the OS can prefetch contiguous pages, turning a single 16 KB read into one sequential I/O.
Sequential disk I/O (reading adjacent sectors) is much faster than random I/O. InnoDB pages are stored contiguously, so reading a page is a sequential I/O operation.
Summary
Understanding how MySQL builds indexes—from simple page reads to multi‑level B+‑trees—clarifies why proper indexing dramatically reduces I/O and improves query speed. Clustered indexes store full rows, while secondary indexes require back‑lookups unless they are covering. Disk prefetch and sequential I/O further boost performance.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
