Why MySQL Chooses B+ Trees Over Other Tree Structures: A Deep Dive
This article explains the evolution from binary search trees to AVL and red‑black trees, introduces B and B+ trees, compares their properties and performance, and clarifies why B+ trees are the preferred indexing structure for MySQL databases.
1. Binary Search Tree
(1) Introduction: A binary search tree (BST) is an ordered binary tree where each node's left subtree contains smaller keys and the right subtree contains larger keys, with no duplicate keys.
(2) Limitations: An unbalanced BST can degenerate into a linear chain, leading to poor search performance; balanced trees such as AVL are needed.
2. AVL Tree
(1) Introduction: AVL trees are height‑balanced BSTs where the height difference between left and right sub‑trees of any node does not exceed 1. Rotations are used to maintain balance, making AVL suitable for read‑heavy workloads with few inserts/deletes.
(2) Limitations: Maintaining strict balance incurs higher rotation cost, so AVL is less common than red‑black trees in practice.
(3) Applications: Used in Windows NT kernel.
3. Red‑Black Tree
(1) Introduction: A binary search tree where each node has a color (red or black) and follows specific coloring rules that guarantee the longest path is at most twice the shortest, providing a weakly balanced structure.
(2) Properties: Every node is red or black; root is black; leaves (NULL) are black; red nodes have black children; every path from a node to its descendant leaves contains the same number of black nodes.
(3) Applications: Implementations of C++ STL map/set, Linux CFS scheduler, epoll, Nginx timer management, Java TreeMap.
4. B / B+ Tree
(1) Introduction: B‑trees are multi‑way balanced trees designed for disk storage, reducing I/O by increasing the branching factor and keeping all leaf nodes at the same depth.
(2) B‑Tree Properties: Each internal node (except root) has between ⌈M/2⌉ and M children; nodes store M‑1 keys; all leaves are on the same level.
(3) B+ Tree Introduction: A variant where only leaf nodes store actual data while internal nodes store only keys as indexes, forming a dense index with a linked list of leaves.
(4) B+ Tree Properties: Internal node pointers equal the number of keys; leaf nodes contain all keys in order; leaves are linked for sequential access; suitable for file systems and range queries.
(5) Applications: Used as the primary index structure in file systems and databases such as MySQL.
5. B / B+ Tree Performance Analysis
For n nodes, a balanced binary tree has height O(log n), while a B/B+ tree has height O(log_t((n+1)/2)) where t is the branching factor. In memory, B‑trees may not outperform binary trees unless the branching factor is small; however, on disk the reduced height greatly lowers I/O.
6. Why B+ Trees Are Better for Database Indexes
1) Lower disk I/O cost: Internal nodes are smaller, allowing more keys per disk block, reducing the number of reads.
2) Stable query performance: All keys require the same root‑to‑leaf path length.
3) Efficient range scans: Data resides only in leaves, enabling sequential leaf traversal without extra in‑order scans.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java Backend Technology
Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
