Databases 10 min read

Can a Single InnoDB B+ Tree Store 20 Million Rows? The Full Calculation

This article explains how InnoDB’s B+‑tree index stores data, calculates the maximum number of rows a single tree can hold (around 20 million), shows how to determine the tree height from the tablespace file, and answers why MySQL uses B+ trees instead of other structures.

Java Backend Technology

May 30, 2019

Can a Single InnoDB B+ Tree Store 20 Million Rows? The Full Calculation

1. How many rows can an InnoDB B+ tree store?

InnoDB’s B+ tree can hold roughly 20 million rows. To understand why, we first review InnoDB’s index data structure and storage organization.

Data on disk is stored in the smallest unit called a sector (512 bytes). A filesystem groups sectors into blocks (typically 4 KB). InnoDB introduces its own smallest unit – a page – which is 16 KB.

2. Visualizing the smallest storage units

File systems allocate a full block (4 KB) even for a 1‑byte file. InnoDB data files (the *.ibd files) are always multiples of 16 KB.

Each InnoDB page can store data rows or key‑pointer entries. Leaf nodes store the actual rows; non‑leaf nodes store keys and pointers.

3. Estimating rows per page

If a row is about 1 KB, a 16 KB page can hold 16 rows (16 KB / 1 KB).

Non‑leaf nodes store a primary‑key (8 bytes) plus a pointer (6 bytes) = 14 bytes per entry. A page can therefore hold 16384 / 14 ≈ 1170 pointers.

For a B+ tree of height 2 (root + leaf), the maximum rows are 1170 × 16 = 18 720.

For height 3, the capacity becomes 1170 × 1170 × 16 = 21 902 400 rows.

In practice InnoDB B+ trees have heights of 1–3, easily supporting tens of millions of rows with only 1–3 I/O operations per primary‑key lookup.

4. Determining the B+‑tree height

The root page of a primary‑key index is always page number 3. At offset 64 within that page the page level is stored. Tree height = page level + 1.

Using hexdump on the tablespace file at offset 16384 × 3 + 64 (49216) reveals the page level for each table.

Example results:

lineitem: page level 2 → height 3

region: page level 0 → height 1

customer: page level 2 → height 3

5. Summary

Both lineitem (≈6 million rows) and customer (≈150 k rows) have a B+‑tree height of 3, showing that even with large data size the lookup cost remains about three I/O operations. A table with only a few rows (e.g., region with 5 rows) has height 1.

6. Interview question: Why does MySQL use B+ trees for indexes?

Because B‑trees store data in both leaf and internal nodes, reducing the fan‑out and increasing tree height, which leads to more I/O. B+ trees keep data only in leaf nodes, allowing a higher fan‑out, shallower trees, and fewer I/O operations.

7. Final remarks

The article walks through the principle of InnoDB’s index‑organized tables, how queries traverse the B+ tree, and how to verify the tree height in practice. It also notes that secondary indexes rely on the primary‑key index for row retrieval.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Storage Engine InnoDB mysql B+Tree Database Index

Written by

Java Backend Technology

Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.