Can a Single InnoDB B+ Tree Store 20 Million Rows? The Full Calculation
This article explains how InnoDB’s B+‑tree index stores data, calculates the maximum number of rows a single tree can hold (around 20 million), shows how to determine the tree height from the tablespace file, and answers why MySQL uses B+ trees instead of other structures.
1. How many rows can an InnoDB B+ tree store?
InnoDB’s B+ tree can hold roughly 20 million rows. To understand why, we first review InnoDB’s index data structure and storage organization.
Data on disk is stored in the smallest unit called a sector (512 bytes). A filesystem groups sectors into blocks (typically 4 KB). InnoDB introduces its own smallest unit – a page – which is 16 KB.
2. Visualizing the smallest storage units
File systems allocate a full block (4 KB) even for a 1‑byte file. InnoDB data files (the *.ibd files) are always multiples of 16 KB.
Each InnoDB page can store data rows or key‑pointer entries. Leaf nodes store the actual rows; non‑leaf nodes store keys and pointers.
3. Estimating rows per page
If a row is about 1 KB, a 16 KB page can hold 16 rows (16 KB / 1 KB).
Non‑leaf nodes store a primary‑key (8 bytes) plus a pointer (6 bytes) = 14 bytes per entry. A page can therefore hold 16384 / 14 ≈ 1170 pointers.
For a B+ tree of height 2 (root + leaf), the maximum rows are 1170 × 16 = 18 720.
For height 3, the capacity becomes 1170 × 1170 × 16 = 21 902 400 rows.
In practice InnoDB B+ trees have heights of 1–3, easily supporting tens of millions of rows with only 1–3 I/O operations per primary‑key lookup.
4. Determining the B+‑tree height
The root page of a primary‑key index is always page number 3. At offset 64 within that page the page level is stored. Tree height = page level + 1.
Using hexdump on the tablespace file at offset 16384 × 3 + 64 (49216) reveals the page level for each table.
Example results:
lineitem: page level 2 → height 3
region: page level 0 → height 1
customer: page level 2 → height 3
5. Summary
Both lineitem (≈6 million rows) and customer (≈150 k rows) have a B+‑tree height of 3, showing that even with large data size the lookup cost remains about three I/O operations. A table with only a few rows (e.g., region with 5 rows) has height 1.
6. Interview question: Why does MySQL use B+ trees for indexes?
Because B‑trees store data in both leaf and internal nodes, reducing the fan‑out and increasing tree height, which leads to more I/O. B+ trees keep data only in leaf nodes, allowing a higher fan‑out, shallower trees, and fewer I/O operations.
7. Final remarks
The article walks through the principle of InnoDB’s index‑organized tables, how queries traverse the B+ tree, and how to verify the tree height in practice. It also notes that secondary indexes rely on the primary‑key index for row retrieval.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java Backend Technology
Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
