Databases 5 min read

Why MySQL Indexes Use B‑Tree, Not Hash: Understanding Index Data Structures

The article explains how MySQL indexes rely on ordered data structures—binary trees, red‑black trees, hash tables, and especially B‑Trees—to accelerate queries, compares their performance characteristics, illustrates how tree height affects I/O operations, and shows why B‑Tree is preferred for range searches.

Open Source Tech Hub

Apr 22, 2021

Why MySQL Indexes Use B‑Tree, Not Hash: Understanding Index Data Structures

1. Index Data Structure Overview

Indexes are ordered data structures that help MySQL retrieve rows efficiently.

Example table t with 7 rows; query select * from t where t.clo2 = 89.

If no index exists, MySQL performs a full table scan, reading each row and causing at least six disk I/O operations.

2. Common Index Structures

Binary Tree

Red‑Black Tree

Hash Table

B‑Tree

1. Binary Tree

Rule: left child < right child.

When the indexed column (e.g., Col1) is auto‑incrementing, inserts create a skewed tree similar to a linked list, increasing height.

In such a case, a lookup may need four node traversals.

In the worst case of sequential inserts, a binary‑tree index degenerates to a full scan.

2. Red‑Black Tree

Self‑balancing binary tree; rebalances when a branch exceeds three nodes.

Finding the value 6 requires six node visits in a plain binary tree but only three in a red‑black tree. MySQL does not use red‑black trees for indexes because the tree height can still become large with massive data sets.

With 5 million rows, a red‑black tree height may reach 23, leading to 23 disk I/O operations for a leaf‑node lookup, which is inefficient.

3. Hash Table

If an index were stored as a hash, MySQL would compute a hash value to locate the disk pointer directly, giving constant‑time lookups regardless of table size.

However, hash indexes cannot support range queries (e.g., select * from t where clo2 > 6) or ordering, and hash collisions, though rare, are possible.

Range queries and ORDER BY cannot use a hash index.

Hash collisions are handled internally but add complexity.

4. B‑Tree

All leaf nodes have the same depth; nodes store multiple sorted keys; pointers to child nodes are null at leaves; keys are unique.

With a maximum degree of 4, the tree shown requires only two node accesses to locate a row.

While B‑Tree lookups are fast, if a node’s page size (e.g., 16 KB) is filled with large row data, the number of keys per node drops, increasing tree height. Nevertheless, B‑Tree remains the preferred MySQL index because it supports both point and range queries efficiently.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

mysql Data Structures index B+Tree

Written by

Open Source Tech Hub

Sharing cutting-edge internet technologies and practical AI resources.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.