Mastering MySQL Indexes: From Hash Tables to B+ Trees Explained
This article explains the fundamentals of database indexing, covering what indexes are, their advantages and drawbacks, underlying structures such as hash tables, B‑tree and B+‑tree, different index types (clustered, non‑clustered, primary, secondary) in MySQL, and practical examples with code snippets.
1. Index Basics
1.1 What is an Index
Index is a data structure for fast query and retrieval, such as B‑tree, B+‑tree, and hash table. It works like a directory, allowing quick location of entries.
Advantages: ① Greatly speeds up query rate; ② Uniqueness ensures each row is unique.
Disadvantages: ① Creation and maintenance consume time and can slow DML operations; ② Requires physical storage space.
1.2 Underlying Data Structures
1.2.1 Hash Table
Hash tables use a hash algorithm to map a key to a value quickly.
hash = hashfunc(key)
index = hash % array_sizeHash collisions occur when different keys map to the same index. Common resolutions include open addressing, rehashing, and chaining. In JDK 1.8 HashMap, when a bucket’s linked list exceeds a threshold (default 8), it is converted to a red‑black tree.
Why MySQL does not use hash tables for indexes – hash tables do not support ordered or range queries.
// Example of a range query
SELECT * FROM tb1 WHERE id < 500
// Hash would need to hash every row, which is inefficient for range conditions.1.2.3 B‑Tree and B+‑Tree
B‑tree is a multi‑way balanced search tree; B+‑tree is a variant. MyISAM and InnoDB use B+‑tree for indexes. From the primary key perspective, indexes are divided into primary and secondary. From whether index and data are stored together, they are clustered (e.g., InnoDB) and non‑clustered (e.g., MyISAM).
Differences between B‑tree and B+‑tree:
All nodes in B‑tree store both key and value; only leaf nodes in B+‑tree store values.
B‑tree leaf nodes are independent; B+‑tree leaf nodes are linked.
B‑tree may stop searching before reaching leaf; B+‑tree always traverses to leaf, giving stable performance.
2. Types of Indexes and Their Principles
2.1 Index Classification by Data Separation
2.1.1 Clustered Index
Clustered index stores index structure together with data. Primary key indexes are clustered.
In InnoDB, the .ibd file contains both index (B+‑tree) and data; non‑leaf nodes store index, leaf nodes store index and corresponding data.
2.1.2 Non‑Clustered Index
Non‑clustered index stores index structure separately from data.
In MyISAM, the .MYI file holds only the index (B+‑tree). Leaf and non‑leaf nodes store index; leaf nodes also store pointers to data in the .MYD file. Secondary indexes are non‑clustered, so their leaf nodes may store the primary key.
2.2 Index Classification by Primary Key
2.2.1 Primary Key Index
Tables use primary key indexes, which are clustered.
If InnoDB is used and no explicit primary key is defined, InnoDB first looks for a unique index to use as the primary key; otherwise it creates a hidden 6‑byte auto‑increment key.
Example: create table pl_ranking with id as primary key, then run:
select id, plname, ranking from pl_ranking where id=16;2.2.2 Secondary Index
Secondary indexes are non‑clustered; their leaf nodes store the primary key, so locating data requires an extra step via the primary key.
Secondary Index Types
Unique Index : enforces uniqueness (NULL allowed), often used for constraints.
Normal Index : improves query speed; duplicates and NULLs allowed.
Prefix Index : indexes the first few characters of string columns, reducing size.
Full‑Text Index : enables keyword search in large text fields, used by search engines.
Example: create table pl_ranking with plname as a secondary index, then run:
select id, plname, ranking from pl_ranking where plname='Java';Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architect's Guide
Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
