Why Do Database Indexes Speed Up Queries? A Deep Dive into Storage and Index Mechanics
This article explains how databases store data on physical storage devices, describes the role of indexes and binary search in accelerating query performance, discusses clustered versus non‑clustered indexes, outlines the trade‑offs of excessive indexing, and lists common SQL optimization techniques.
Overview
Information storage has progressed from simple physical media to modern databases, which persist data on computer storage devices. Databases offer fast data access largely because of indexes.
Computer Storage Principles
A database persists its data on physical storage devices such as hard drives and uses RAM as a working cache. Faster, more expensive storage (e.g., RAM) has lower latency but is volatile, while slower, cheaper storage (e.g., hard disks) provides persistence. Hard disks consist of rotating platters divided into tracks and sectors, so every access carries mechanical overhead in three steps:
Locate the correct track and move the read/write head (seek).
Rotate the platter so the desired sector is under the head.
Read the data from the sector.
Data is typically stored on the slowest device (hard disk) for durability, while RAM caches frequently accessed data to avoid mechanical delays.
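The three steps of a disk access can be turned into a rough cost model. The following is a minimal sketch; the figures used (9 ms average seek, 7,200 RPM, 150 MB/s transfer rate) are illustrative assumptions for a generic hard disk, not measurements from this article.

```python
# Rough cost model for one random read on a spinning disk.
# All figures passed in below are illustrative assumptions, not measurements.

def random_read_ms(seek_ms: float, rpm: int, transfer_mb_s: float, block_bytes: int) -> float:
    """Estimate one block read: seek + rotational latency + transfer."""
    rotation_ms = 60_000 / rpm               # time for one full revolution
    rotational_latency_ms = rotation_ms / 2  # on average, half a revolution
    transfer_ms = block_bytes / (transfer_mb_s * 1_000_000) * 1_000
    return seek_ms + rotational_latency_ms + transfer_ms

cost = random_read_ms(seek_ms=9.0, rpm=7200, transfer_mb_s=150.0, block_bytes=1024)
print(f"one random 1,024-byte read: about {cost:.2f} ms")
```

Under these assumptions the seek and the rotation account for almost all of the time, which is why reducing the number of random block reads matters more than reducing the amount of data read.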
How Indexes Work
An index works like the index at the back of a book: it lets the database locate rows without scanning the entire table. Because the index keeps its keys sorted, it enables binary search, dramatically reducing the number of I/O operations needed to find a record.
For example, a dictionary without an index requires page‑by‑page scanning, whereas an index lets you jump directly to the relevant section.
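The dictionary analogy can be sketched in a few lines. This is a simplified illustration, not how any particular database engine lays out its data: the `rows` list stands in for a hypothetical table stored in arrival order, and `name_index` is a sorted list of (key, row position) pairs.

```python
import bisect

# Hypothetical table: rows are stored in arrival order, not sorted by name.
rows = [
    {"id": 3, "name": "mango"},
    {"id": 1, "name": "apple"},
    {"id": 4, "name": "peach"},
    {"id": 2, "name": "cherry"},
]

# The "index": (key, row position) pairs kept sorted by key.
name_index = sorted((row["name"], pos) for pos, row in enumerate(rows))
index_keys = [key for key, _ in name_index]

def find_by_scan(name):
    """Check every row in turn, like reading a book page by page."""
    for row in rows:
        if row["name"] == name:
            return row
    return None

def find_by_index(name):
    """Binary-search the sorted keys, then jump straight to the row."""
    i = bisect.bisect_left(index_keys, name)
    if i < len(index_keys) and index_keys[i] == name:
        return rows[name_index[i][1]]
    return None

print(find_by_index("cherry"))
```

Both functions return the same row; the difference is that the scan touches every row in the worst case, while the indexed lookup touches only a logarithmic number of keys.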
Binary Search
Binary search requires sorted data and reduces search complexity from O(N) to O(log₂N). Take a table of 100,000 rows with a fixed record size of 204 bytes and a block size of 1,024 bytes: each block holds 5 records, so the table occupies 20,000 blocks. A full scan would examine all 20,000 blocks, while binary search needs only about 14 block reads (log₂ 20,000 ≈ 14.3).
That is roughly 1,400 times fewer block reads than a full scan, or roughly 700 times fewer than an average linear search that stops halfway through the table.
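The arithmetic behind this example can be checked directly. The sketch below uses only the figures given in the text (100,000 records, 204-byte records, 1,024-byte blocks); the assumption that an average linear search reads half of the blocks holds only when the key exists and is unique.

```python
import math

RECORDS = 100_000
RECORD_BYTES = 204
BLOCK_BYTES = 1024

records_per_block = BLOCK_BYTES // RECORD_BYTES   # whole records that fit in one block
blocks = math.ceil(RECORDS / records_per_block)   # blocks needed for the table

full_scan_reads = blocks                  # every block is read
avg_linear_reads = blocks / 2             # assumes the key exists and is unique
binary_search_reads = math.log2(blocks)   # block reads for binary search

print(f"records per block: {records_per_block}, blocks: {blocks}")
print(f"binary search: about {binary_search_reads:.1f} block reads")
print(f"vs full scan: about {full_scan_reads / binary_search_reads:.0f}x fewer reads")
print(f"vs average linear search: about {avg_linear_reads / binary_search_reads:.0f}x fewer reads")
```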
Why Indexes Speed Up Queries
Indexes store keys in a sorted structure (often a B‑tree), so the database engine can locate rows by descending the tree in a way that resembles binary search instead of reading every row. Primary keys are ideal candidates because they are unique and naturally ordered.
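One way to see the effect is to ask an engine for its query plan before and after creating an index. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a convenient stand-in; the `users` table, its columns, and the index name are hypothetical, and other engines word their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(i, f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)

query = "SELECT * FROM users WHERE email = 'user42@example.com'"

plan_without_index = plan(query)   # expected to be a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = plan(query)      # expected to be a search using the index

print(plan_without_index)
print(plan_with_index)
```

In SQLite the first plan starts with SCAN and the second with SEARCH and names the index; the exact wording varies between SQLite versions.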
Why Not Too Many Indexes
Having an index on every column can degrade performance, as the index itself becomes a large structure that must be consulted, similar to an overly detailed book index.
Drawbacks of Indexes
Each indexed column slows write operations because both the row and the index must be updated.
Indexes consume additional disk space.
Foreign‑key columns should be indexed to aid joins, but excessive indexing can be counterproductive.
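The space cost is easy to observe. The following sketch, again using SQLite as an assumed stand-in with a hypothetical `orders` table, compares the number of database pages before and after adding an index. Every one of the extra pages also has to be maintained on later inserts, updates, and deletes, which is where the write overhead comes from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO orders (customer, note) VALUES (?, ?)",
    [(f"customer{i % 500}", "x" * 50) for i in range(20_000)],
)
conn.commit()

def page_count():
    """Number of pages currently used by the database."""
    return conn.execute("PRAGMA page_count").fetchone()[0]

pages_before = page_count()
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
conn.commit()
pages_after = page_count()

print(f"pages before index: {pages_before}, after index: {pages_after}")
```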
Clustered Index
A clustered index stores table rows in the same physical order as the index key, typically the primary key. Because the rows can be stored in only one order, a table can have only one clustered index. Non‑clustered indexes are separate structures that store pointers to the data rows.
Clustered indexes are beneficial for columns with many distinct values, range queries, columns used in ORDER BY/GROUP BY, and foreign‑key columns. They are unsuitable for frequently updated columns because row movement can be costly.
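The difference between the two kinds of index can be sketched with plain lists. This is a conceptual model only, under the assumption that a list position stands in for a physical row location; real engines use B‑trees and pages rather than Python lists.

```python
import bisect

# Clustered layout: the rows themselves are kept sorted by the key (id).
clustered_rows = [(1, "eve"), (2, "bob"), (3, "dan"), (4, "ann"), (5, "cat")]
clustered_keys = [row[0] for row in clustered_rows]

def range_by_id(low, high):
    """Range query on the clustered key: one search, then a contiguous slice."""
    start = bisect.bisect_left(clustered_keys, low)
    end = bisect.bisect_right(clustered_keys, high)
    return clustered_rows[start:end]

# Non-clustered index on name: sorted (name, position) pairs pointing at rows.
name_index = sorted((name, pos) for pos, (_, name) in enumerate(clustered_rows))
name_keys = [name for name, _ in name_index]

def find_by_name(name):
    """Search the secondary index, then follow the pointer to the row."""
    i = bisect.bisect_left(name_keys, name)
    if i < len(name_keys) and name_keys[i] == name:
        return clustered_rows[name_index[i][1]]
    return None

print(range_by_id(2, 4))
print(find_by_name("dan"))
```

The range query returns neighbouring rows in one pass because they are stored next to each other, which is why clustered indexes suit range conditions. The lookup by name needs an extra hop from the index entry to the row.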
Typical Index Invalidations
Using OR in a WHERE clause can prevent index usage in some engines; when the alternatives test the same column, prefer IN instead.
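The rewrite from OR to IN does not change the result, only the form the optimizer sees. The sketch below, on a hypothetical `users` table in SQLite, checks that both forms return the same rows and prints both plans. Whether OR actually defeats the index depends on the engine and version, so the plans are printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany(
    "INSERT INTO users (city) VALUES (?)",
    [(c,) for c in ["Oslo", "Lima", "Rome", "Oslo", "Kyiv", "Lima"]],
)
conn.execute("CREATE INDEX idx_users_city ON users (city)")

or_sql = "SELECT id FROM users WHERE city = 'Oslo' OR city = 'Lima' ORDER BY id"
in_sql = "SELECT id FROM users WHERE city IN ('Oslo', 'Lima') ORDER BY id"

or_ids = [row[0] for row in conn.execute(or_sql)]
in_ids = [row[0] for row in conn.execute(in_sql)]
print(or_ids == in_ids)

# Plans differ between engines and versions, so inspect them rather than guess.
for sql in (or_sql, in_sql):
    details = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    print(details)
```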
Common SQL Optimization Techniques
1. Avoid Full Table Scans
Ensure columns used in ON/WHERE clauses are indexed.
For very small tables, a full scan may be cheaper than using an index.
2. Prevent Index Loss
Do not apply functions or type conversions to indexed columns.
Use range conditions appropriately; avoid patterns that often render the index ineffective (e.g., not-equal comparisons such as <> or !=, IS NULL in some engines, and LIKE patterns with a leading wildcard).
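The effect of wrapping an indexed column in a function, or of a leading wildcard, shows up in the query plan. This is a sketch on SQLite with a hypothetical `users` table; other engines may behave differently, for example when an expression index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [(f"user{i}", i % 90) for i in range(1000)],
)
conn.execute("CREATE INDEX idx_users_name ON users (name)")

def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)

# Direct comparison on the indexed column: the index can be used.
plain_plan = plan("SELECT * FROM users WHERE name = 'user42'")
# Function applied to the indexed column: the index on name no longer applies.
function_plan = plan("SELECT * FROM users WHERE upper(name) = 'USER42'")
# Leading wildcard: the sorted order of the index gives no starting point.
wildcard_plan = plan("SELECT * FROM users WHERE name LIKE '%42'")

for label, detail in [("plain", plain_plan), ("function", function_plan), ("wildcard", wildcard_plan)]:
    print(f"{label}: {detail}")
```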
3. Prefer Index‑Based Sorting
When possible, let the index provide the required order instead of sorting after retrieval.
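In SQLite, a sort that the engine has to perform itself appears in the plan as a temporary B‑tree. The sketch below uses a hypothetical `events` table to show the step disappearing once an index on the ORDER BY column exists. Plan wording is specific to SQLite and may vary by version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events (created_at, payload) VALUES (?, ?)",
    [(f"2024-01-{(i % 28) + 1:02d}", "data") for i in range(500)],
)

def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)

query = "SELECT * FROM events ORDER BY created_at"

plan_before = plan(query)   # sort performed after retrieval
conn.execute("CREATE INDEX idx_events_created_at ON events (created_at)")
plan_after = plan(query)    # rows read in index order, no separate sort

print(plan_before)
print(plan_after)
```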
4. Select Only Needed Columns
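Avoid SELECT * and list only the columns you need. When an index contains every requested column (a covering index), the query can be answered from the index alone, which reduces I/O.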
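A covering index can be recognized in the query plan. In this SQLite sketch, with a hypothetical `users` table and a composite index on (name, email), the query that selects only indexed columns is reported as using a covering index, while SELECT * still has to visit the table rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, bio TEXT)")
conn.executemany(
    "INSERT INTO users (name, email, bio) VALUES (?, ?, ?)",
    [(f"user{i}", f"user{i}@example.com", "long text " * 20) for i in range(1000)],
)
conn.execute("CREATE INDEX idx_users_name_email ON users (name, email)")

def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)

# SELECT * needs the bio column, which is not in the index.
star_plan = plan("SELECT * FROM users WHERE name = 'user42'")
# Only indexed columns are requested, so the index alone can answer the query.
narrow_plan = plan("SELECT name, email FROM users WHERE name = 'user42'")

print(star_plan)
print(narrow_plan)
```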
5. Minimize Temporary Table Usage
Avoid creating and dropping temporary tables when possible.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
