
Understanding MySQL Indexes: Types, Data Structures, and Best Practices

This article explains MySQL indexing, especially InnoDB indexes, covering what indexes are, the various index types, underlying B‑Tree and B+Tree structures, clustered vs. secondary indexes, hash indexes, when to create indexes, and optimization techniques such as index condition push‑down.

Selected Java Interview Questions

Preface

Because the default storage engine of MySQL is InnoDB, this article focuses on indexes under InnoDB and briefly mentions other engines, aiming to give readers a clearer understanding of MySQL indexes.

Index Introduction

First, we look at some common questions.

What is an index?

What kinds of indexes can we create?

Which columns are suitable for indexing?

Are more indexes always better?

Why are UUIDs or ID numbers not recommended as primary keys?

Why should we avoid SELECT * FROM table?

Do leading or trailing wildcards in LIKE affect index usage?

What Is an Index

In a relational database, an index is a separate physical structure that stores sorted values of one or more columns and contains logical pointers to the data pages where those values reside. It works like a book's table of contents, allowing fast location of needed rows.

Types of Indexes in MySQL

Ordinary Index

Ordinary indexes are the most basic type and add no constraint such as uniqueness; they can be created on any column.

-- Basic syntax for creating an index
-- (length is an optional prefix length for string columns; BLOB and TEXT columns require it)
CREATE INDEX index_name ON table_name (column_name(length));
-- Example (length omitted)
CREATE INDEX idx_name ON user (name);

Primary Key Index

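Most tables define a primary key. MySQL builds a unique index on it automatically, and the key columns must be NOT NULL, so the primary key index is a special case of a unique index. If an InnoDB table has no primary key, InnoDB uses the first unique index whose columns are all NOT NULL as the clustered index, and otherwise generates a hidden row ID.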
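As a minimal sketch, the statements below create a table with an auto-increment primary key and then list its indexes. The table layout (columns name, phone, and gender, and their sizes) is an assumption made for illustration; the article only names the user table and some of its columns.

```sql
-- Hypothetical schema; column names and sizes are assumptions.
CREATE TABLE user (
  id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  name   VARCHAR(64)     NOT NULL,
  phone  CHAR(11)        NOT NULL,
  gender TINYINT         NOT NULL,
  PRIMARY KEY (id)       -- becomes the index named PRIMARY
) ENGINE = InnoDB;

-- Lists every index on the table, including PRIMARY.
SHOW INDEX FROM user;
```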

Composite Index

A composite (or combined) index uses multiple columns, e.g., a combination of ID card and phone number. It follows the left‑most prefix rule.

-- Basic syntax for creating a composite index
CREATE INDEX index_name ON table_name (column1(length), column2(length));
-- Example: phone is the leftmost column, so a query must constrain phone to seek on this index
CREATE INDEX idx_phone_name ON user (phone, name);
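The left-most prefix rule can be explored with EXPLAIN. The sketch below assumes the idx_phone_name (phone, name) index from the example above and a user table with an id primary key; the literal values are made up. The plan the optimizer actually picks depends on the data and the MySQL version, so the comments describe what the index allows rather than guaranteed output.

```sql
-- Leftmost column constrained: the index can be used for a seek.
EXPLAIN SELECT id FROM user WHERE phone = '13800000000';

-- Both columns constrained, in index order: both parts of the key can be used.
EXPLAIN SELECT id FROM user WHERE phone = '13800000000' AND name = 'Alice';

-- Leftmost column missing: no seek is possible on idx_phone_name;
-- at best the optimizer scans the whole index.
EXPLAIN SELECT id FROM user WHERE name = 'Alice';
```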

Full‑Text Index

Full‑text indexes are used for keyword search in text columns (CHAR, VARCHAR, TEXT). They work with MATCH ... AGAINST rather than simple WHERE ... LIKE clauses.
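A minimal sketch of a full-text index and the matching query form follows. The article table and its columns are hypothetical. Note that the column list in MATCH() has to be the same as the column list of the FULLTEXT index.

```sql
-- Hypothetical table for illustration.
CREATE TABLE article (
  id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  title VARCHAR(200)    NOT NULL,
  body  TEXT            NOT NULL,
  PRIMARY KEY (id),
  FULLTEXT KEY ft_title_body (title, body)
) ENGINE = InnoDB;

-- Keyword search through the full-text index instead of LIKE '%...%'.
SELECT id, title
FROM article
WHERE MATCH(title, body) AGAINST('index' IN NATURAL LANGUAGE MODE);
```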

Spatial Index

Spatial indexes are built on spatial data types (GEOMETRY, POINT, LINESTRING, POLYGON) and require the indexed column to be NOT NULL. MyISAM has long supported them, and InnoDB has supported them since MySQL 5.7.
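The following sketch shows a spatial index on a NOT NULL POINT column. It assumes MySQL 8.0, where the column can carry an SRID attribute; the place table, the Cartesian SRID 0, and the search rectangle are all assumptions chosen to keep the example simple.

```sql
-- Hypothetical table; SRID 0 means plain Cartesian coordinates.
CREATE TABLE place (
  id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  location POINT NOT NULL SRID 0,
  PRIMARY KEY (id),
  SPATIAL INDEX sp_location (location)
) ENGINE = InnoDB;

INSERT INTO place (location) VALUES (ST_GeomFromText('POINT(3 4)', 0));

-- Find points whose bounding box lies inside the search rectangle.
SELECT id
FROM place
WHERE MBRContains(
        ST_GeomFromText('POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))', 0),
        location);
```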

Index Data Structures

B+Tree

InnoDB's default index structure is a B+Tree, a balanced multi‑way search tree that stores keys in sorted order and links its leaf pages for efficient range scans. The sections below build up to it from simpler trees.

Binary Search Tree (BST)

A BST stores keys so that left children are smaller and right children are larger. Unbalanced BSTs can degenerate into linked lists, causing many node accesses.

AVL Tree

AVL trees are self‑balancing binary trees that keep the height difference between left and right sub‑trees at most one, ensuring logarithmic search time.

B‑Tree

B‑Tree stores multiple keys per node, reducing tree depth and I/O operations compared to AVL trees. It is a multi‑way balanced search tree.

B+Tree (Enhanced B‑Tree)

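B+Tree keeps all actual row data in the leaf nodes, so internal nodes hold only keys and fit more entries per page. The leaves are also linked in a list. Every lookup descends the same number of levels, which gives a stable I/O cost, and a range query can walk the leaf list instead of returning to the root.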
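As a rough back-of-the-envelope illustration of why a shallow B+Tree can address a large table, the query below estimates the capacity of a three-level tree. The figures are assumptions, not measurements: a 16 KB page (InnoDB's default page size), about 14 bytes per internal entry (an 8-byte BIGINT key plus an assumed 6 bytes of pointer overhead), and about 1 KB per row. Page headers and fill factor are ignored.

```sql
SELECT
  FLOOR(16384 / 14)   AS entries_per_internal_page,  -- 1170
  FLOOR(16384 / 1024) AS rows_per_leaf_page,         -- 16
  FLOOR(16384 / 14) * FLOOR(16384 / 14) * FLOOR(16384 / 1024)
                      AS approx_rows_at_height_3;    -- 1170 * 1170 * 16 = 21902400
```

Under these assumptions, three page reads are enough to reach any one of roughly 21.9 million rows.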

Clustered and Secondary (Non‑Clustered) Indexes

A clustered index stores rows in the same order as the index keys; a table can have only one. In InnoDB, the primary key is the clustered index, and the index contains the full row data.

Secondary indexes store only the indexed columns and a pointer (the primary key) to the clustered index row.
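This layout also explains one of the opening questions, why SELECT * is discouraged. The sketch below assumes the idx_phone_name (phone, name) index shown earlier and a user table that has an id primary key plus at least one column outside that index; the literal value is made up.

```sql
-- All requested columns live in the secondary index (phone, name, plus the
-- primary key id stored with every entry), so no clustered-index lookup is
-- needed. EXPLAIN typically reports "Using index" in the Extra column.
EXPLAIN SELECT id, phone, name FROM user WHERE phone = '13800000000';

-- SELECT * needs columns that are not in the secondary index, so each
-- matching entry is followed by a lookup in the clustered index.
EXPLAIN SELECT * FROM user WHERE phone = '13800000000';
```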

Hash Index

Hash indexes store a hash of the indexed column together with a pointer to the row, so they support only equality comparisons. InnoDB does not allow user‑defined hash indexes; only the MEMORY and NDB engines do, and MEMORY tables are held entirely in memory.
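A minimal sketch of a user-defined hash index on a MEMORY table follows. The session_cache table and its columns are hypothetical.

```sql
-- Hypothetical in-memory lookup table.
CREATE TABLE session_cache (
  session_id CHAR(32)        NOT NULL,
  user_id    BIGINT UNSIGNED NOT NULL,
  PRIMARY KEY (session_id) USING HASH
) ENGINE = MEMORY;

-- Equality lookup: the hash index can be used.
EXPLAIN SELECT user_id FROM session_cache WHERE session_id = 'abc';

-- Range condition: a hash index cannot help, so the table is scanned.
EXPLAIN SELECT user_id FROM session_cache WHERE session_id > 'abc';
```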

Adaptive Hash Index (InnoDB)

InnoDB can automatically build a hash index on frequently accessed B‑Tree pages, improving point‑lookup performance without sacrificing transactional safety.
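The adaptive hash index is controlled by a server variable rather than by DDL. The statements below are a sketch of how to inspect and toggle it; changing a global variable requires sufficient privileges, and whether disabling it helps depends on the workload.

```sql
-- Check whether the adaptive hash index is enabled.
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';

-- Turn it off at runtime, for example when it causes contention.
SET GLOBAL innodb_adaptive_hash_index = OFF;

-- The "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of the status
-- output reports hash searches versus non-hash searches.
SHOW ENGINE INNODB STATUS;
```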

Advantages and Disadvantages of Hash Indexes

Fast for equality queries (=, IN, <=>) but cannot handle range queries.

Cannot be used to avoid a sort, and lookups need the whole key, not a leftmost prefix of it.

May cause contention under heavy workloads.

Low selectivity leads to many collisions and poorer performance than B‑Tree.

Q&A (Focused on InnoDB)

Why do secondary indexes store the primary‑key value instead of the row address? Because a row's physical address changes when pages split or merge, whereas its primary‑key value stays the same, so secondary indexes do not need to be rewritten when rows move.

Are more indexes always better? No. Every index takes storage and must be updated on each write, which adds page splits and merges.

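Why prefer auto‑increment IDs over UUIDs or ID numbers? Auto‑increment values are always appended to the rightmost leaf page, so pages fill in order and splits are rare. UUIDs arrive in effectively random order, so inserts land in the middle of already full pages and cause frequent page splits and fragmentation. They are also longer, and because every secondary index stores the primary key, they enlarge every index.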
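The contrast can be sketched with two table definitions. Both tables and their columns are hypothetical and exist only to show the two key styles side by side.

```sql
-- Sequential key: 8 bytes, each new row sorts after all existing rows.
CREATE TABLE orders_seq (
  id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  note VARCHAR(100)    NOT NULL,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

-- UUID key: 36 characters, new rows sort at effectively random positions.
CREATE TABLE orders_uuid (
  id   CHAR(36)     NOT NULL,
  note VARCHAR(100) NOT NULL,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

INSERT INTO orders_seq  (note)     VALUES ('appended at the right edge of the tree');
INSERT INTO orders_uuid (id, note) VALUES (UUID(), 'lands wherever the UUID sorts');
```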

Which Columns Should Be Indexed?

Columns frequently used in WHERE clauses.

Columns used for JOIN conditions.

Columns used for ORDER BY.

Columns used for GROUP BY.
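One index can serve more than one of these roles. The sketch below uses a hypothetical orders table in which a composite index supports both the WHERE filter and the ORDER BY, so the rows come out of the index already sorted.

```sql
-- Hypothetical table for illustration.
CREATE TABLE orders (
  id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id BIGINT UNSIGNED NOT NULL,
  created_at  DATETIME        NOT NULL,
  PRIMARY KEY (id),
  KEY idx_customer_created (customer_id, created_at)
) ENGINE = InnoDB;

-- customer_id narrows the range; within it, entries are already ordered
-- by created_at, so no separate sort step should be needed.
EXPLAIN
SELECT id, created_at
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 10;
```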

Should We Index a Gender Column?

In a test on a 3‑million‑row table, adding an index on a low‑cardinality column (gender) actually slowed queries. Each value matches a large share of the table, so MySQL reads many secondary index entries and then fetches each row through the primary key, which costs more than a single full scan.

How to Estimate Index Selectivity

Use the formula COUNT(DISTINCT(column)) / COUNT(*). The closer the ratio is to 1, the more selective and useful the index.
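The formula can be run directly against candidate columns. The sketch below assumes a user table with gender and phone columns; the comments state the expected tendency, not measured results.

```sql
SELECT
  COUNT(DISTINCT gender) / COUNT(*) AS gender_selectivity,  -- close to 0: poor index candidate
  COUNT(DISTINCT phone)  / COUNT(*) AS phone_selectivity    -- close to 1 if phones are mostly unique
FROM user;
```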

Does LIKE '%text' Use an Index?

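If the pattern starts with a wildcard (LIKE '%text'), MySQL cannot use the index on that column to narrow the search, because the sorted order of the index only helps when the beginning of the value is known. A pattern with only a trailing wildcard (LIKE 'text%') can still use the index as a range scan. With a composite index, the query may still use the index if an earlier column is constrained, for example an equality condition on phone in idx_phone_name (phone, name).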
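The three cases can be compared with EXPLAIN. The sketch assumes the idx_name (name) and idx_phone_name (phone, name) indexes created earlier on the user table; the literal values are made up, and the actual plan depends on the data and the MySQL version.

```sql
-- Trailing wildcard: the known prefix allows a range scan on idx_name.
EXPLAIN SELECT id FROM user WHERE name LIKE 'Ali%';

-- Leading wildcard: no range can be derived from the pattern for name.
EXPLAIN SELECT id FROM user WHERE name LIKE '%ice';

-- Leading wildcard on name, but phone is constrained: idx_phone_name can
-- still be used to seek on phone.
EXPLAIN SELECT id FROM user WHERE phone = '13800000000' AND name LIKE '%ice';
```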

Index Condition Push‑Down (ICP)

ICP pushes part of the WHERE clause down to the storage engine, allowing it to filter rows using only the index entries, reducing row fetches and I/O.

ICP works for range, ref, eq_ref, and ref_or_null access methods on InnoDB secondary indexes.
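A sketch of how to observe ICP follows. It assumes the idx_phone_name (phone, name) index on the user table; the literal patterns are made up. When ICP applies, EXPLAIN typically shows "Using index condition" in the Extra column, but whether the optimizer chooses this index at all depends on the data.

```sql
-- phone gives the range; the name condition can be checked against the
-- index entry before the full row is fetched.
EXPLAIN SELECT * FROM user
WHERE phone LIKE '138%' AND name LIKE '%ice';

-- Disable ICP for the current session to compare plans, then restore it.
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM user
WHERE phone LIKE '138%' AND name LIKE '%ice';
SET optimizer_switch = 'index_condition_pushdown=on';
```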

References

Source: blog.csdn.net/qq_30062181/article/details/112712362

