Why MySQL InnoDB Chooses B+ Trees: A Deep Dive into Index Design
This article explains why MySQL's default InnoDB storage engine uses B+ trees instead of B trees or hash indexes, covering storage engine basics, query performance, I/O considerations, and the structural advantages that make B+ trees ideal for OLTP workloads.
Overview
MySQL itself does not directly use B+ trees; the default storage engine InnoDB does. InnoDB stores both primary and secondary indexes with B+ trees, while other engines such as MyISAM or MEMORY are also available.
When creating a table without specifying an engine, InnoDB is used by default:
CREATE TABLE t1 (
a INT,
b CHAR(20),
PRIMARY KEY (a)
) ENGINE=InnoDB;For a deeper understanding of InnoDB’s storage format, see earlier articles on InnoDB pages, locks, and indexes.
In InnoDB, primary keys are stored as <id, row> pairs, while secondary indexes store <index, id> pairs, allowing the row to be fetched via the primary key.
Primary index: the id is the primary key; looking up the id returns the full row.
Secondary index: the index columns form the key; the index stores the id, which can then be used to fetch the full row.
Design Considerations
Two main reasons drive InnoDB’s choice of B+ trees:
Support for OLTP query patterns that require strong performance on a wide range of operations.
CPU‑to‑disk data loading costs, where minimizing random I/O is crucial.
Read/Write Performance
OLTP workloads involve INSERT, UPDATE, DELETE, and SELECT statements. Using a B+ tree gives an O(log n) time complexity for single‑row operations, which is acceptable because the tree height is small. Hash indexes can achieve O(1) for single‑row lookups but perform poorly for range queries and sorting, often forcing a full table scan.
SELECT * FROM posts WHERE author = 'draven' ORDER BY created_at DESC;
SELECT * FROM posts WHERE comments_count > 10;
UPDATE posts SET github = 'github.com/draven' WHERE author = 'draven';
DELETE FROM posts WHERE author = 'draven';When a hash index is used for the above queries, the database cannot efficiently satisfy the ORDER BY or range condition, leading to costly full‑table scans.
Data Loading
Operating systems load data from disk to memory in pages, typically 4 KB in size. The command below shows how to query the page size on a Unix‑like system:
$ getconf PAGE_SIZE
4096On macOS the page size is also 4 KB, though it may differ on other platforms.
Random I/O to load a page can take around 10 ms, while sequential reads can reach 40 MB/s. Reducing random I/O therefore has a large impact on query latency.
B+ trees store all data rows in leaf nodes, which are linked together. This ordered, sequential layout allows range scans and ordered queries to be satisfied by walking the leaf list, avoiding repeated random page loads.
In contrast, a B tree stores data in internal nodes as well, requiring additional random I/O to locate each qualifying row.
Load the root page and follow the pointer to the left child to find values greater than 4.
Load the left child page and locate the value 5.
Reload the root page to discover the right subtree does not contain the second element.
Load the right child page and retrieve values 7 and 8.
Because B+ tree leaves are linked, the same search can be performed by a single sequential walk after reaching the leftmost qualifying leaf.
The height of a B+ tree storing millions of rows is typically 3–5, so the extra O(log n) cost is negligible.
Conclusion
InnoDB’s B+‑tree index strikes a balance:
Hash indexes give O(1) for single‑row operations but cannot handle range queries or sorting without full scans.
B trees store data in internal nodes, causing extra random I/O for ordered scans.
B+ trees keep all data in ordered leaf nodes linked together, minimizing random I/O and supporting efficient range and ordered queries, which matches typical OLTP workloads.
While more exotic designs (e.g., hybrid B+‑tree + hash structures) can offer marginal gains, they increase complexity and maintenance cost. The enduring use of B+ trees in InnoDB demonstrates the lasting value of a well‑matched engineering decision.
Reference
B+ tree – Wikipedia
What is the difference between MySQL InnoDB B+ tree index and hash index? Why does MongoDB use B‑tree?
B+Trees and why I love them, part I
Main differences between InnoDB and MyISAM
B+ Tree File Organization
Database Index: A Re‑visit to B+ Tree
Fundamentals of Database Systems
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
