Understanding Index Height and Page I/O in MySQL InnoDB
This article explains how the height of MySQL InnoDB B+‑tree indexes determines the number of page I/O operations for different query types, provides theoretical calculations of index height, and shows practical methods using information_schema and hexdump to inspect the actual index height of tables.
Problem
Many developers wonder how many page I/O operations a single MySQL query triggers, given that page I/O heavily influences query performance.
Analysis
MySQL implements indexes with B+‑tree structures. The leaf nodes store the actual rows (clustered index stores full rows, secondary index stores index value + primary key). Accessing a leaf node requires reading a number of pages equal to the index height h. For a point query on the primary key, the number of page I/O operations equals the clustered index height h1. For secondary indexes, the I/O count depends on whether the index is covering ( h2) or requires a table lookup ( h2 + h1).
Typical cases:
Point query – clustered index: h1 Point query – secondary index: covering h2, non‑covering h2+h1 Range query – similar to point query but may read additional leaf pages.
Full‑table scan – traverses all leaf nodes via linked list.
Theoretical calculation of index height
Assuming a fan‑out factor k for internal nodes and n rows per leaf node, the number of leaf nodes is k^(h‑1) and total rows k^(h‑1) * n. In InnoDB a 16 KB page stores a 4 B primary key, a 4 B page number, and 6 B overhead, giving k ≈ 1170. With 1 KB rows, n = 16. Height h = 3 can index about 2.19 million rows; h = 4 can index roughly 25.6 billion rows.
How to view the real index height
The PAGE_LEVEL field in each InnoDB page indicates its level in the index tree; the root page’s level plus one equals the index height. The root page number can be obtained from information_schema.INNODB_SYS_INDEXES and information_schema.INNODB_SYS_TABLES. Example query:
SELECT b.name, a.name, index_id, type, a.space, a.PAGE_NO
FROM information_schema.INNODB_SYS_INDEXES a,
information_schema.INNODB_SYS_TABLES b
WHERE a.table_id = b.table_id
AND a.space <> 0;Using the returned PAGE_NO, run hexdump -C -s 16384*PAGE_NO+64 -n 10 user.ibd; the first two bytes (plus one) give the index height.
Verification example
A user table is created with several indexes. Data is inserted with varying row sizes, and hexdump is used to read the PAGE_LEVEL of the clustered index, a unique secondary index, and a name index. The observed heights for different row counts are tabulated, showing that the clustered index height reaches 4 only after about 27 million rows, slightly higher than the theoretical 21.9 million due to smaller actual row sizes.
The name index reaches height 4 earlier (≈23 million rows) because its key length (32 B) yields a smaller fan‑out factor ( k ≈ 390), making the tree “tall and thin”.
Summary
Page I/O for a query is positively correlated with index height h; point queries use h1 for clustered indexes and h2 or h2+h1 for secondary indexes.
Typical index heights are 2–4 levels; with h = 3, int primary key and 1 KB rows, about 21.9 million rows can be indexed, which explains why many large systems use 20 million rows as a sharding threshold.
Real‑world height can be obtained via information_schema and hexdump as described.
Index height also depends on column data types: narrower keys (e.g., VARCHAR(32)) reduce fan‑out, leading to “tall and thin” indexes with higher I/O cost.
Quick commands for checking index heights:
# Clustered index (first)
hexdump -C -s 49216 -n 10 user.ibd
# Second index
hexdump -C -s 65600 -n 10 user.ibd
# Third index
hexdump -C -s 81984 -n 10 user.ibdSigned-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
