Databases 35 min read

Why MySQL Uses B+ Trees Over B Trees: Deep Dive into Index Structures

Explore MySQL's comprehensive index guide, covering basic concepts, B‑tree and B+‑tree structures, storage engine differences, index types, optimization strategies, and practical tips for designing effective primary, secondary, and composite indexes to boost query performance and reduce I/O overhead.

Java Interview Crash Guide
Java Interview Crash Guide
Java Interview Crash Guide
Why MySQL Uses B+ Trees Over B Trees: Deep Dive into Index Structures
First a classic server interview question: If a query select * from table where a=1 group by b order by c has a single‑column index on each field, will the indexes be used? What about composite indexes? This article provides a complete overview of MySQL indexes, including fundamentals, B‑tree structure, index principles, and practical strategies.

1. Index Basics Review

What is an index

MySQL defines an index as a data structure that helps MySQL retrieve data efficiently; fundamentally, an index is a data structure.

The purpose of an index is to improve query efficiency, similar to a dictionary, a train schedule, or a book's table of contents.

An index can be understood as a sorted data structure that references the actual data, enabling advanced search algorithms.

Indexes are stored on disk as index files because they cannot all reside in memory.

Advantages

Indexes dramatically reduce the amount of data the server must scan, improving retrieval speed.

Indexes help avoid sorting and temporary tables, reducing CPU consumption.

Indexes turn random I/O into sequential I/O, lowering I/O cost.

Disadvantages

Indexes occupy space; they store the primary key and indexed columns and point to the data rows.

While indexes speed up reads, they slow down INSERT, UPDATE, and DELETE because the index files must also be maintained.

Index Classification

From three perspectives:

Logical perspective

Primary key index – a special unique index that cannot be NULL.

Ordinary (single‑column) index – each index contains a single column; a table can have many.

Composite (multi‑column) index – only used when the query references the leftmost indexed column.

Unique vs. non‑unique index.

Full‑text index – searches for keywords in text.

Spatial index – indexes spatial data types.

Data‑structure perspective

Hash index – uses a hash algorithm to map column values to fixed‑length hash values; suitable only for equality queries.

B+ tree index – discussed in detail later.

Physical storage perspective

Clustered index (clustered index).

Non‑clustered index (secondary index).

2. MySQL Index Structure

MySQL supports many index structures to suit different scenarios.

Indexes are implemented at the storage‑engine layer, not at the server layer. Not all storage engines support every index type, and implementations may differ.

Disk I/O

Disk I/O consists of seek time, rotational latency, and transfer time. For a typical 7200 RPM disk, rotational latency is about 4.17 ms, making a single I/O roughly 9 ms. Since a modern CPU can execute hundreds of thousands of instructions in that time, excessive I/O becomes a performance bottleneck.

Operating systems mitigate this by pre‑reading adjacent data into memory (pages). A page is usually 4 KB or 8 KB, so one I/O fetches many rows, which is crucial for index design.

Why B+ Tree

B+ trees reduce the height of the tree compared to B trees because only leaf nodes store actual row data, while internal nodes store only keys. This lowers the number of I/O operations needed for a lookup.

In InnoDB, leaf nodes are linked in a doubly‑linked list, allowing efficient range scans.

MyISAM vs. InnoDB Index Principles

MyISAM

MyISAM stores index files (.myi) separately from data files. Leaf nodes contain the physical address of the row, making it a non‑clustered index. The primary key is a unique, non‑NULL index named PRIMARY.

InnoDB

InnoDB's leaf nodes store the full row data for the primary key (clustered index). Secondary indexes store the primary key value as a pointer to the row. The data file (.ibd) is the primary index itself.

InnoDB can store each table in its own .ibd file when innodb_file_per_table is ON (default since 5.6.6).

3. Index Strategies

When to Create Indexes

Primary key automatically creates a unique index.

Columns frequently used in query predicates.

Foreign‑key columns used for joins.

High‑concurrency scenarios often benefit from composite indexes.

Columns used for ORDER BY.

Columns used for GROUP BY or aggregation.

When Not to Create Indexes

Very small tables.

Tables with heavy INSERT/UPDATE/DELETE activity.

Columns with low selectivity or uniform distribution.

Columns that are updated frequently.

Columns not used in WHERE clauses.

High‑Efficiency Indexes

Independent Columns

MySQL can use an index only if the column appears alone in the predicate; expressions or functions prevent index usage.

Prefix Indexes

Prefix indexes store only the first N characters of a column, saving space. They are required for very long TEXT or BLOB columns. The prefix length must balance selectivity and size.

Covering Indexes

A covering index contains all columns needed by the query, allowing MySQL to satisfy the query using only the index (no row lookup).

Multi‑Column (Composite) Indexes

Composite indexes are defined as CREATE INDEX idx ON tbl(col1, col2, col3). They are useful when queries filter on the leftmost columns. The “leftmost prefix” rule applies: (col1), (col1,col2), (col1,col2,col3) can be used, but not (col2) alone.

Leftmost Prefix Principle

Only conditions that include the leading column(s) of a composite index can use that index. For example, with an index on (name, age, sex), a query with WHERE name='Bob' uses the index, but WHERE age=19 does not.

Index Condition Pushdown

Since MySQL 5.6, the optimizer can evaluate additional predicates while scanning a secondary index, reducing the number of row lookups (index pushdown).

Using Index Scan for Sorting

If the index order matches the ORDER BY clause, MySQL can produce sorted results by scanning the index, avoiding an extra sort operation.

Compressed (Prefix‑Compressed) Indexes

MyISAM can compress index prefixes to reduce size, allowing more indexes to fit in memory.

Duplicate and Redundant Indexes

Duplicate indexes have the same columns in the same order; they should be removed. Redundant indexes are prefixes of existing indexes (e.g., an index on (a) is redundant if there is already an index on (a,b)).

Unused Indexes

Indexes that are never used waste space. Tools like Percona Toolkit’s pt‑index‑usage or the information_schema.index_statistics table can help identify them.

4. Index Optimization

Common Causes of Slow SQL

Hardware bottlenecks (slow network, insufficient memory, limited I/O throughput, full disks).

Missing or ineffective indexes.

Excessive data volume (consider sharding).

Suboptimal server configuration (my.cnf settings).

Optimization Techniques

Prefer full‑value matches (equality predicates).

Apply the leftmost prefix rule for composite indexes.

Avoid functions or type conversions on indexed columns.

Do not place range conditions on columns that appear after a range in the index.

Use covering indexes whenever possible.

IS NULL / IS NOT NULL cannot use indexes.

LIKE 'prefix%' can use an index; LIKE '%suffix' cannot.

Strings without quotes cause index failure.

Minimize OR conditions; they may trigger index merge.

Use =, IN, BETWEEN, <, > for indexable predicates; avoid <>, NOT IN, !=.

Reference

https://zh.wikipedia.org/wiki/B%E6%A0%91

https://medium.com/@mena.meseha/what-is-the-difference-between-mysql-innodb-b-tree-index-and-hash-index-ed8f2ce66d69

https://www.javatpoint.com/b-tree

https://blog.csdn.net/Abysscarry/article/details/80792876

《MySQL 实战 45 讲》

《高性能 MySQL》

InnoDBMySQLDatabase OptimizationB+Tree
Java Interview Crash Guide
Written by

Java Interview Crash Guide

Dedicated to sharing Java interview Q&A; follow and reply "java" to receive a free premium Java interview guide.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.