MySQL Index Fundamentals: Types, Structures, and Design Principles
This article introduces MySQL index basics, covering index concepts, advantages and disadvantages, various index types such as B‑Tree, B+Tree, hash, and full‑text, their storage structures, creation syntax, left‑most prefix and covering indexes, and practical design guidelines for effective indexing.
1. Getting Started with MySQL Indexes
MySQL defines an index as an ordered data structure that helps the server retrieve rows efficiently. The index stores key values and pointers to the physical location of the corresponding rows, allowing fast lookup using tree‑based search algorithms.
The left side shows a table with two columns and seven rows; the right side shows a binary search tree that maps the indexed column (Col 2) to the physical address of each row. Because indexes are usually stored on disk, they are kept in separate index files to improve query performance.
1.1 Overview of Indexes
Indexes act like a book's table of contents, providing an ordered structure that points to data rows, enabling high‑speed retrieval.
1.2 Advantages and Disadvantages of Indexes
Improves data retrieval efficiency and reduces I/O cost.
Allows sorting by index, lowering CPU usage.
Disadvantages:
Indexes occupy space because they are separate tables storing key‑value pairs.
Maintaining indexes slows down INSERT, UPDATE, DELETE operations, as the engine must also modify the index structures.
1.3 When to Create Indexes
Indexes are a crucial part of application design. Too many indexes increase write overhead, while too few hurt read performance. Developers should add indexes early based on data access patterns rather than relying on DBAs to add them later.
2. Index Data Structures
2.1 Index Types and Engine Support
MySQL implements indexes at the storage‑engine layer. The four main index types are:
B‑TREE: supported by most engines.
HASH: supported only by the Memory engine.
R‑TREE (spatial): supported by MyISAM for geographic data.
FULL‑TEXT: supported by MyISAM; InnoDB supports it from version 5.6.
Index
InnoDB
MyISAM
Memory
B‑TREE
Supported
Supported
Supported
HASH
Not supported
Not supported
Supported
R‑TREE
Not supported
Supported
Not supported
FULL‑TEXT
Supported from 5.6
Supported
Not supported
Most non‑specialized indexes are B+‑Tree structures (multi‑way balanced trees). The following sections describe B‑Tree and B+‑Tree in detail.
2.2 B‑Tree Structure
A B‑Tree is an m‑ary balanced search tree with the following properties:
Each node has at most m children.
Every non‑root, non‑leaf node has at least ⌈m/2⌉ children.
The root has at least two children if it is not a leaf.
All leaf nodes are on the same level.
Each internal node contains n keys and n+1 pointers, where ⌈m/2⌉‑1 ≤ n ≤ m‑1.
Example: inserting the letters C N G A H E K Q M into a 5‑ary B‑Tree demonstrates node splits and key promotions.
2.3 B+‑Tree Structure
A B+‑Tree is a variant of B‑Tree where:
Leaf nodes store all keys and the actual row pointers.
Internal nodes store only keys as an index, not the data.
All leaves are linked together in a sorted linked list, enabling efficient range scans.
3. Why MySQL Uses B+‑Tree
3.1 Disk I/O and Prefetching
Disk access time consists of seek time (~5 ms), rotational latency (~4 ms), and transfer time (negligible). A single I/O therefore costs about 9 ms, during which a modern CPU can execute millions of instructions, making I/O a major bottleneck.
Operating systems mitigate this by prefetching adjacent pages (typically 4 KB or 8 KB) into memory, reducing the number of I/O operations required for index traversal.
3.2 B+‑Tree in MySQL
Because B+‑Tree nodes are stored on disk pages, a lookup for a key may involve several page reads. For example, finding the value 3 may require loading three pages (root, intermediate, leaf) before the row data is accessed.
3.2.1 Why B+‑Tree Instead of B‑Tree?
Higher space utilization (no data stored in internal nodes).
Consistent search depth, leading to stable query performance.
Supports both random and sequential access efficiently.
Insert/delete operations are cheaper because only leaf nodes hold the data.
3.2.2 What About Hash Indexes?
Hash indexes cannot perform range queries (e.g., SELECT * FROM table WHERE age > 10).
They do not support pattern matching (e.g., LIKE 'JoJo%').
While fast for equality lookups, hash collisions can degrade performance dramatically.
4. Index Categories
4.1 Single‑Column Index
CREATE TABLE customer (
id INT(10) UNSIGNED AUTO_INCREMENT,
customer_no VARCHAR(200),
customer_name VARCHAR(200),
PRIMARY KEY(id),
KEY (customer_name)
);
CREATE INDEX idx_customer_name ON customer(customer_name);
DROP INDEX idx_customer_name;4.2 Primary Key Index
CREATE TABLE customer (
id INT(10) UNSIGNED AUTO_INCREMENT,
customer_no VARCHAR(200),
customer_name VARCHAR(200),
PRIMARY KEY(id)
);
ALTER TABLE customer ADD PRIMARY KEY (customer_no);
ALTER TABLE customer DROP PRIMARY KEY;4.3 Unique Index
CREATE TABLE customer (
id INT(10) UNSIGNED AUTO_INCREMENT,
customer_no VARCHAR(200),
customer_name VARCHAR(200),
PRIMARY KEY(id),
KEY (customer_name),
UNIQUE (customer_no)
);
CREATE UNIQUE INDEX idx_customer_no ON customer(customer_no);
DROP INDEX idx_customer_no ON customer;4.4 Composite (Compound) Index
CREATE TABLE customer (
id INT(10) UNSIGNED AUTO_INCREMENT,
customer_no VARCHAR(200),
customer_name VARCHAR(200),
PRIMARY KEY(id),
KEY (customer_name),
UNIQUE (customer_name),
KEY (customer_no,customer_name)
);
CREATE INDEX idx_no_name ON customer(customer_no,customer_name);
DROP INDEX idx_no_name ON customer;4.5 Leftmost Prefix Matching
If a composite index is defined as CREATE INDEX idx_a_b_c ON demo_table(a, b, c);, queries that filter on a, a,b, or a,b,c can use the index. Queries that start with b or c alone cannot, unless the selected columns are covered by the index.
4.6 Covering Index
A covering index contains all columns needed by a query, allowing MySQL to satisfy the query using only the index without touching the table rows. Example: SELECT b, c FROM demo_table WHERE a = "xxx"; Running EXPLAIN shows Using index in the Extra column, indicating a covering index is used.
4.7 Clustered Index
In InnoDB, the primary key is implemented as a clustered index: the B‑Tree stores both the key and the full row data, so locating the key directly yields the row.
4.8 Secondary (Non‑Clustered) Index
Secondary indexes store only the indexed columns and a pointer to the row. MyISAM caches indexes in the key buffer; when a query uses a secondary index, MySQL first searches the index in memory, then follows the pointer to fetch the actual row from disk.
5. Index Design Principles
Create indexes on columns that are frequently used in WHERE clauses and on large tables.
Select columns with high selectivity; combine columns when the query filters on multiple conditions.
Prefer unique indexes when possible, as they provide better discrimination.
Avoid excessive indexes; each additional index adds write overhead and can cause the optimizer to spend more time choosing an index.
Use prefix indexes for long string columns to reduce index size and improve I/O efficiency.
Leverage the leftmost‑prefix rule for composite indexes to maximize their usefulness.
References:
MySQL Index Principles – https://www.cnblogs.com/exceptioneye/p/5452068.html
MySQL Index Classification – https://blog.csdn.net/qq_17707713/article/details/90059408
MySQL Optimization – Index Chapter – https://www.cnblogs.com/cryin107/p/13166865.html
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
