Databases 9 min read

Understanding Database Indexes: Structures, Types, and Best Practices

This article explains the fundamentals of database indexes, covering their purpose, underlying data structures such as B‑Tree, B+‑Tree, and hash tables, the advantages and drawbacks, different index categories in MySQL, and practical SQL examples for primary, secondary, and unique indexes.

Architect's Must-Have
Architect's Must-Have
Architect's Must-Have
Understanding Database Indexes: Structures, Types, and Best Practices

1. Basics of Indexes

1.1 What is an Index

An index is a data structure used for fast query and retrieval, such as B‑Tree, B+‑Tree, and hash tables. It functions like a catalog, enabling quick location of data.

Advantages of indexes : ① Significantly accelerate query speed; ② Uniqueness ensures each row is distinct. Disadvantages of indexes : ① Creation and maintenance consume time and can slow write operations; ② Indexes occupy physical storage space.

1.2 Underlying Data Structures

1.2.1 Hash Table

Hash tables use a hash algorithm to map a key to a value quickly.

hash = hashfunc(key)
index = hash % array_size

Hash collisions occur when different keys produce the same index. Common resolutions include open addressing , rehashing , and chaining . In JDK 1.8, HashMap switches to a red‑black tree when a bucket’s chain exceeds a threshold.

Why MySQL Does Not Use Hash Tables for Indexes

The main reason is that hash tables do not support ordered or range queries .

// Example of a range query
SELECT * FROM tb1 WHERE id < 500;
// Hash would need to compute a hash for every row, which is inefficient for range conditions.

1.2.3 B‑Tree and B+‑Tree

B‑Tree is a multi‑way balanced search tree; B+‑Tree is a variant where only leaf nodes store key‑value pairs. MyISAM and InnoDB use B+‑Tree for indexes. Indexes can be classified as primary (clustered) or secondary (non‑clustered) , and as clustered or non‑clustered based on whether data is stored together with the index.

Differences between B‑Tree and B+‑Tree: ① B‑Tree stores keys and values in all nodes; B+‑Tree stores them only in leaf nodes. ② B‑Tree leaf nodes are independent; B+‑Tree leaf nodes are linked. ③ B‑Tree may stop searching before reaching leaves; B+‑Tree always traverses from root to leaf, providing stable performance.

2. Types and Principles of Indexes

2.1 Classification by Data Separation

2.1.1 Clustered Index

A clustered index stores the index structure together with the data; primary key indexes are clustered.

In InnoDB, the .ibd file contains both index (B+‑Tree) and data. Non‑leaf nodes store index entries, leaf nodes store both index keys and the corresponding row data.

2.1.2 Non‑Clustered Index

A non‑clustered index stores the index structure separately from the data.

In MyISAM, the .MYI file holds the index (B+‑Tree). Leaf nodes store index keys and pointers to the data in the .MYD file. Secondary indexes are non‑clustered, so their leaf nodes may contain primary keys.

2.2 Classification by Primary Key Usage

2.2.1 Primary Key Index

The table's primary key uses a primary key index, which is clustered.

If InnoDB is used and no explicit primary key is defined, InnoDB will first look for a unique index to use as the primary key; otherwise it creates a hidden 6‑byte auto‑increment key.

Example: create table pl_ranking with id as primary key and run:

select id, plname, ranking from pl_ranking where id=16;

2.2.2 Secondary Index

Secondary indexes are non‑clustered; their leaf nodes store the primary key, allowing the primary key to be located via the secondary index.

Types of secondary indexes: ① Unique Index : enforces uniqueness but allows NULLs; multiple unique indexes can exist. ② Normal Index : improves query speed; duplicates and NULLs are allowed. ③ Prefix Index : applies to string columns, indexing only the leading characters to save space. ④ Full‑Text Index : used for searching keywords within large text fields.

Example: create table pl_ranking with plname as a secondary index and run:

select id, plname, ranking from pl_ranking where plname='Java';
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

SQLMySQLhash tableB-TreeDatabase Indexes
Architect's Must-Have
Written by

Architect's Must-Have

Professional architects sharing high‑quality architecture insights. Covers high‑availability, high‑performance, high‑stability designs, big data, machine learning, Java, system, distributed and AI architectures, plus internet‑driven architectural adjustments and large‑scale practice. Open to idea‑driven, sharing architects for exchange and learning.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.