Mastering Database Indexes: 10 Essential Q&A for Faster Queries
This guide walks developers through the fundamentals of database indexes, explaining why they matter, how they work, common pitfalls, and best‑practice design tips with concrete SQL examples and visual illustrations.
What is an index and why use it?
An index (typically a B+Tree) stores sorted references to rows, allowing the optimizer to perform an index scan instead of a full‑table scan.
-- Full table scan
SELECT * FROM users WHERE name = '苏三';
-- Create and use an index
CREATE INDEX idx_name ON users(name);
SELECT * FROM users WHERE name = '苏三';Why can an index be slow?
Leading wildcard : LIKE '%苏三%' disables index usage.
Low selectivity : many rows share the same value, reducing filtering power.
Back‑table lookup : the index does not contain all requested columns.
Solutions: remove the leading wildcard, create a covering index, or use a full‑text index for text search.
-- Avoid leading wildcard
SELECT * FROM users WHERE name LIKE '苏三%';
-- Covering index
CREATE INDEX idx_name_covering ON users(name, email, age);
SELECT name, email, age FROM users WHERE name = '苏三';Are more indexes always better?
Each index consumes storage and adds overhead to INSERT/UPDATE/DELETE because the index must be maintained. A practical rule is to keep the number of indexes per table below 5‑7.
Left‑most prefix rule for composite indexes
For a composite index (col1, col2), the optimizer can use it only if the query predicates start with col1. Queries filtering on col2 alone cannot use the index.
-- Usable
SELECT * FROM users WHERE name = '苏三';
SELECT * FROM users WHERE name = '苏三' AND age = 30;
-- Not usable (violates left‑most prefix)
SELECT * FROM users WHERE age = 30;Choosing column order in a composite index
Place high‑selectivity columns first.
Put frequently queried columns before rarely used ones.
Order equality predicates before range predicates.
-- Compute selectivity
SELECT COUNT(DISTINCT name)/COUNT(*) AS name_sel,
COUNT(DISTINCT age)/COUNT(*) AS age_sel,
COUNT(DISTINCT city)/COUNT(*) AS city_sel
FROM users;
-- Build index based on selectivity
CREATE INDEX idx_name_city_age ON users(name, city, age);Covering indexes
A covering index contains all columns required by the query, eliminating the need for a back‑table lookup and reducing I/O.
-- Non‑covering index (needs back‑table)
SELECT * FROM users WHERE name = '苏三';
-- Covering index (no back‑table)
CREATE INDEX idx_name_covering ON users(name, email, age);
SELECT name, email, age FROM users WHERE name = '苏三';NULL values and indexes
Indexes may not be used for IS NULL or IS NOT NULL predicates. Workarounds include using default values or functional indexes (MySQL 8.0+).
-- Functional index to treat NULL as empty string
CREATE INDEX idx_email_null ON users((COALESCE(email, '')));
SELECT * FROM users WHERE COALESCE(email, '') = '';Indexes and ORDER BY / GROUP BY
If the ORDER BY/GROUP BY clause matches the index column order, the optimizer can avoid extra sorting.
-- Index helps sorting
CREATE INDEX idx_age_name ON users(age, name);
SELECT * FROM users ORDER BY age, name; -- uses index
-- Different order requires filesort
SELECT * FROM users ORDER BY name, age; -- may require filesortDetecting index misuse
Common causes of index loss: functions on indexed columns, implicit type conversion, arithmetic on columns, and leading wildcards. Use EXPLAIN to inspect the execution plan and look for type = ALL or Using filesort.
EXPLAIN SELECT * FROM users WHERE name = '苏三';Index differences across databases
MySQL supports B+Tree, Hash, and Full‑text indexes.
PostgreSQL adds GiST, SP‑GiST, expression indexes, and covering indexes via INCLUDE.
Both support partial indexes.
Best‑practice checklist
Create indexes only for columns frequently used in queries.
Choose the appropriate index type for the workload.
Prefer composite indexes to reduce total index count.
Avoid over‑indexing; each index adds write overhead.
Periodically monitor usage (e.g., sys.schema_index_statistics in MySQL) and rebuild fragmented indexes.
Java Backend Technology
Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
