Understanding MySQL Indexes: Types, B+Tree vs Hash, and Practical Examples
This article explains MySQL index fundamentals, classifies indexes by data structure, storage, field characteristics and column count, compares B+Tree with B‑Tree, hash and red‑black trees, and demonstrates clustered and secondary indexes with practical SQL examples and diagrams.
What is an Index?
An index is a data structure that helps the storage engine retrieve data efficiently, similar to a directory for quick location.
Index Classification
Indexes can be classified from several perspectives:
By Data Structure
B+Tree
Hash
Full‑text
By Physical Storage
Clustered index
Secondary (auxiliary) index
By Field Characteristics
Primary key index
Unique index
Normal index
Prefix index
By Number of Columns
Single‑column index
Composite (multi‑column) index
Data‑Structure View of Indexes
Common MySQL storage engines support the following index types:
InnoDB: B+Tree, Full‑text (no Hash)
MyISAM: B+Tree, Full‑text (no Hash)
Memory: B+Tree, Hash (no Full‑text)
In practice, InnoDB is the default storage engine for MySQL tables.
The B+Tree index is the most widely used index type in MySQL.
B+Tree vs B‑Tree
1970, R. Bayer and E. McCreight proposed the B‑Tree, a balanced multi‑way tree used for disk indexing and database indexes.
B+Tree is a variant of B‑Tree; B+Tree stores data only in leaf nodes, while B‑Tree also stores data in internal nodes.
Because leaf nodes contain only keys, a B+Tree node holds fewer entries, allowing more keys per node and reducing the number of disk I/Os needed for a search.
Leaf nodes are linked by a singly linked list, which benefits range queries in MySQL.
B+Tree vs Red‑Black Tree
For a B+Tree with N leaf nodes and degree d, the search complexity is O(log d N). In practice d > 100, so even with millions of rows the tree height stays around 3‑4, requiring only 3‑4 disk I/Os.
Red‑Black trees are binary (degree 2) with O(log N) complexity, leading to more I/O operations than B+Tree.
B+Tree Index vs Hash Table
Hash tables are suitable for equality queries but not for range queries, and they suffer from hash‑function selection and collision issues. Therefore B+Tree indexes have broader applicability.
Physical Storage View of Indexes
InnoDB Indexes
InnoDB uses a clustered index for the primary key and secondary indexes for other columns.
Clustered index leaf nodes store full rows; secondary index leaf nodes store only the primary‑key values.
Example DDL to create a test table:
create table workers (
id int(11) not null auto_increment comment 'employee id',
name varchar(16) not null comment 'employee name',
sales int(11) default null comment 'sales',
primary key (id)
) engine=InnoDB auto_increment=10 default charset=utf8;Insert sample data:
insert into workers(id, name, sales) values (1, '江南', 12744);
insert into workers(id, name, sales) values (3, '今何在', 14082);
insert into workers(id, name, sales) values (7, '路明非', 14738);
insert into workers(id, name, sales) values (8, '吕归尘', 7087);
insert into workers(id, name, sales) values (11, '姬野', 8565);
insert into workers(id, name, sales) values (15, '凯撒', 8501);
insert into workers(id, name, sales) values (20, '绘梨衣', 7890);Create a secondary index on the name column:
alter table workers add index index_name(name);Secondary index leaf nodes store the primary‑key values (e.g., id), so a lookup requires a “back‑lookup” to the clustered index to retrieve the full row.
When the query only needs columns present in the secondary index, MySQL can perform an index‑only scan (covering index) and avoid the back‑lookup:
select id, name from workers where name='吕归尘';
explain select id, name from workers where name='吕归尘';The EXPLAIN output shows “Using where; Using index”, indicating an index‑only scan.
MyISAM Indexes
MyISAM does not have clustered indexes; both primary and secondary indexes have the same structure, storing only the row’s address in leaf nodes.
Index Field Characteristics
Primary Key Index
Built on the primary‑key column.
Only one per table.
Column values cannot be NULL.
Usually created when the table is defined.
Unique Index
Built on columns declared UNIQUE.
Multiple unique indexes per table; column values may be NULL.
Normal Index
Indexes on non‑primary, non‑unique columns.
Prefix Index
Indexes only the first N characters (or bytes) of a string or binary column, reducing storage and improving query efficiency.
create index index_name on persons(name(5)) comment 'prefix index';
show index from persons;Number of Columns in an Index
Single‑column index.
Composite (multi‑column) index.
Example of creating a composite index on workers(id, name):
create index index_id_name on workers(id, name) comment 'composite index';The composite index stores both column values as the key; ordering follows the column list.
Java Interview Crash Guide
Dedicated to sharing Java interview Q&A; follow and reply "java" to receive a free premium Java interview guide.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
