Databases 8 min read

Understanding MySQL Indexes: Data Structures and Implementation Principles

This article explains why slow SQL queries occur, how MySQL indexes dramatically improve query speed, and details the underlying data structures—binary tree, hash, B‑Tree, B+Tree—as well as storage engine differences and composite index rules for optimal performance.

JD Retail Technology
JD Retail Technology
JD Retail Technology
Understanding MySQL Indexes: Data Structures and Implementation Principles

During data queries, slow SQL often causes online problems; using MySQL indexes can greatly increase query speed. This article first explains what an index is and why it improves performance.

It then describes the disk‑access principle: each data lookup triggers an I/O operation, so reducing I/O through caching and efficient structures speeds up queries.

1. What an Index Is – An index is a sorted data structure stored on disk that helps MySQL retrieve rows quickly.

2. Disk Access Principle – Data resides on disk; each read is an I/O. Caching reduces I/O, but the underlying index still determines how many disk reads are needed.

3. MySQL Data Structure Details

3.1 Binary Tree – Stores single indexes; height grows with data size, offering faster lookup than linear scans.

3.2 Red‑Black Tree – Self‑balancing, avoids single‑sided growth, but MySQL does not use it for large datasets because tree height can still become large.

3.3 Hash – Provides O(1) lookup in memory, but on disk the hash buckets are non‑contiguous, making range queries impossible and overall disk performance poor.

3.4 B‑Tree – Adds a degree (horizontal size) to reduce height, decreasing the number of I/O operations needed.

3.5 B+Tree – Stores data pages (typically 4 KB) as leaf nodes; a single I/O reads an entire page, improving sequential access speed.

MySQL uses two main storage engines:

MyISAM – Consists of .frm (table definition), .myd (data), and .myi (index) files; primary key indexes are stored separately from data.

InnoDB – Data and indexes are stored together in .ibd files using a clustered B+Tree; the leaf nodes contain full row data for the primary key.

Composite (multi‑column) indexes follow the "leftmost" rule: the index is sorted by the first column, then the second, etc. Queries can use the index only if they filter on the leftmost columns in order.

Examples illustrate when an index can or cannot be used based on column order, demonstrating the importance of designing indexes that match query patterns.

END

performanceInnoDBMySQLdata structuresIndexB-Tree
JD Retail Technology
Written by

JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.