
Why Do Database Indexes Speed Up Queries? A Deep Dive into Storage and Index Mechanics

This article explains how databases store data on physical storage devices, describes the role of indexes and binary search in accelerating query performance, discusses clustered versus non‑clustered indexes, outlines the trade‑offs of excessive indexing, and lists common SQL optimization techniques.

Programmer DD

Overview

Human information storage has progressed from simple physical media to modern databases, which persist data on computer storage devices. Much of a database's fast data access comes from indexes.

Computer Storage Principles

Databases persist data on non‑volatile storage devices such as hard disks and use RAM as a volatile cache. Faster, more expensive storage (e.g., RAM) has lower latency but loses its contents when power is lost, while slower, cheaper storage (e.g., hard disks) provides persistence. Hard disks consist of rotating platters divided into tracks and sectors; reading data involves the following mechanical steps, each of which adds latency:

Locate the correct track and move the read/write head (seek).

Rotate the platter so the desired sector is under the head.

Read the data from the sector.
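The three steps above correspond to the usual back-of-the-envelope estimate of disk access time: seek time plus rotational latency plus transfer time. The sketch below works through that arithmetic. The drive figures (7,200 RPM, 9 ms average seek, 150 MB/s transfer rate, 4 KiB block) are illustrative assumptions, not values from this article.

```python
# Rough access-time estimate for reading one block from a spinning disk.
# All drive parameters are illustrative assumptions, not measurements.
RPM = 7200                 # assumed rotational speed
AVG_SEEK_MS = 9.0          # assumed average seek time
TRANSFER_MB_PER_S = 150.0  # assumed sustained transfer rate
BLOCK_BYTES = 4096         # assumed block size

# Step 1: move the head to the correct track.
seek_ms = AVG_SEEK_MS

# Step 2: wait for the sector; on average this is half a revolution.
rotation_ms = 0.5 * 60_000 / RPM

# Step 3: read the block as it passes under the head.
transfer_ms = BLOCK_BYTES / (TRANSFER_MB_PER_S * 1_000_000) * 1000

access_ms = seek_ms + rotation_ms + transfer_ms

print(f"seek:     {seek_ms:.3f} ms")
print(f"rotation: {rotation_ms:.3f} ms")
print(f"transfer: {transfer_ms:.3f} ms")
print(f"total:    {access_ms:.3f} ms")
```

Under these assumptions the two mechanical steps (seek and rotation) dominate, and the transfer itself takes a small fraction of a millisecond. That is why reducing the number of block reads matters more than reducing the amount of data read per block.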

Data is therefore kept on the slower but durable device (the hard disk), while RAM caches frequently accessed data to avoid the mechanical delays.

How Indexes Work

An index functions like a book's table of contents, allowing rapid location of rows without scanning the entire table. By pre‑sorting data, an index enables binary search, dramatically reducing the number of I/O operations needed to find a record.

For example, a dictionary without an index requires page‑by‑page scanning, whereas an index lets you jump directly to the relevant section.

Binary Search

Binary search requires sorted data and reduces search complexity from O(N) to O(log₂ N). Consider a table of 100,000 rows stored in 20,000 blocks (each block holding 5 records of 204 bytes). A full scan examines all 20,000 blocks, while a binary search over the sorted blocks needs at most 15 block reads (log₂ 20,000 ≈ 14.3).

Fixed record size = 204 bytes, block size = 1,024 bytes

This is more than 1,000× fewer block reads than scanning all 20,000 blocks.
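The block arithmetic in this section can be checked with a small simulation. The sketch below rebuilds the example's figures (204-byte records, 1,024-byte blocks, 100,000 rows) and counts block reads for a linear scan and for a binary search over sorted blocks. The function names are hypothetical, and the model treats each probe as exactly one block read.

```python
import math

RECORD_SIZE = 204   # bytes, from the example above
BLOCK_SIZE = 1024   # bytes, from the example above
ROWS = 100_000

records_per_block = BLOCK_SIZE // RECORD_SIZE       # whole records per block
total_blocks = math.ceil(ROWS / records_per_block)  # blocks needed for the table


def linear_block_reads(target_block: int) -> int:
    """Blocks read when scanning from the start until the target block."""
    return target_block + 1


def binary_block_reads(target_block: int, n_blocks: int) -> int:
    """Blocks read by a binary search over blocks sorted by key."""
    lo, hi = 0, n_blocks - 1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        reads += 1  # each probe reads one block
        if mid == target_block:
            return reads
        if mid < target_block:
            lo = mid + 1
        else:
            hi = mid - 1
    raise ValueError("target block not found")


worst_linear = linear_block_reads(total_blocks - 1)
worst_binary = max(binary_block_reads(b, total_blocks) for b in range(total_blocks))

print(f"records per block: {records_per_block}")
print(f"total blocks:      {total_blocks}")
print(f"worst-case linear: {worst_linear} block reads")
print(f"worst-case binary: {worst_binary} block reads")
```

The worst case for the binary search is floor(log₂ 20,000) + 1 block reads, which is the rounded-up version of the log₂ N estimate.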

Why Indexes Speed Up Queries

Indexes keep keys in a sorted structure (often a B‑tree), so the database engine can locate rows with a search that, like binary search, discards most of the remaining candidates at each step. Primary keys are ideal candidates because they are unique and have a well-defined order.
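One way to observe this is to ask an engine for its query plan before and after an index exists. The sketch below uses SQLite only because it ships with Python; the `users` table, the `idx_users_email` index, and the `query_plan` helper are hypothetical names, and other engines report plans in different formats.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the description


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

lookup = "SELECT name FROM users WHERE email = 'a@example.com'"

# Without an index on email, every row has to be examined.
plan_without_index = query_plan(conn, lookup)

# With an index, the engine can search the sorted B-tree instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = query_plan(conn, lookup)

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

In SQLite's plan output, `SCAN` indicates that the whole table is read, while `SEARCH ... USING INDEX` indicates that the index narrows the lookup to the matching keys.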

Why Not Too Many Indexes

Indexing every column can degrade performance: each index is a separate structure that must be stored, kept up to date on every write, and considered by the query planner, much as an overly detailed book index becomes slow to use.

Drawbacks of Indexes

Each additional index slows write operations (INSERT, UPDATE, DELETE) because both the row and every affected index must be updated.

Indexes consume additional disk space.

Foreign‑key columns should be indexed to aid joins, but excessive indexing can be counterproductive.
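The space cost listed above is easy to measure on a small scale. The sketch below is a minimal illustration in SQLite (chosen only because it ships with Python): it counts database pages before and after creating an index. The `users` table, its 10,000 synthetic rows, and the index name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

# Load synthetic rows (illustrative data only).
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    ((f"user{i}@example.com", f"User {i}") for i in range(10_000)),
)
conn.commit()

# Pages used by the table alone.
pages_before = conn.execute("PRAGMA page_count").fetchone()[0]

# The index is a second B-tree that stores every email plus a row reference.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
conn.commit()
pages_after = conn.execute("PRAGMA page_count").fetchone()[0]

page_size = conn.execute("PRAGMA page_size").fetchone()[0]
extra_bytes = (pages_after - pages_before) * page_size

print(f"pages before index: {pages_before}")
print(f"pages after index:  {pages_after}")
print(f"extra space:        {extra_bytes} bytes")
```

The same extra structure is what every later INSERT, UPDATE, or DELETE on `email` has to maintain, which is where the write overhead comes from.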

Clustered Index

A clustered index stores table rows in the same physical order as the index key, typically the primary key. Because rows can be physically ordered in only one way, a table can have only one clustered index. Non‑clustered indexes are separate structures that store pointers to the data rows.

Clustered indexes are beneficial for columns with many distinct values, range queries, columns used in ORDER BY/GROUP BY, and foreign‑key columns. They are unsuitable for frequently updated columns because row movement can be costly.
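The difference between the two index types can be modeled in a few lines. The sketch below is a toy in-memory model, not how any particular engine lays out pages: a list kept sorted by `id` stands in for a clustered table, and a sorted list of `(name, position)` pairs stands in for a non‑clustered index. All names and data are hypothetical.

```python
import bisect

# Rows stored physically in primary-key order: a model of a clustered index.
clustered_rows = [
    (1, "mallory"), (2, "alice"), (3, "trent"),
    (4, "bob"), (5, "eve"), (6, "carol"),
]
primary_keys = [row[0] for row in clustered_rows]

# Non-clustered index on name: sorted keys, each pointing at a row position.
name_index = sorted((name, pos) for pos, (_, name) in enumerate(clustered_rows))
names = [entry[0] for entry in name_index]


def positions_for_id_range(low: int, high: int) -> list:
    """Row positions touched by a range query on the clustered key."""
    start = bisect.bisect_left(primary_keys, low)
    end = bisect.bisect_right(primary_keys, high)
    return list(range(start, end))


def positions_for_name_range(low: str, high: str) -> list:
    """Row positions touched by a range query through the name index."""
    start = bisect.bisect_left(names, low)
    end = bisect.bisect_right(names, high)
    return [pos for _, pos in name_index[start:end]]


id_positions = positions_for_id_range(2, 4)
name_positions = positions_for_name_range("alice", "carol")

print("clustered range touches positions:    ", id_positions)
print("non-clustered range touches positions:", name_positions)
```

In this model the range on the clustered key touches adjacent positions, while the range through the secondary index follows pointers to scattered positions. That is the intuition behind preferring the clustered key for range queries.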

Typical Index Invalidations

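Using OR in a WHERE clause can prevent index usage, for example when one of its branches references an unindexed column. When the alternatives are values of the same column, prefer IN instead.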
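Whether OR defeats an index depends on the engine and on the query. As one illustration, the sketch below compares two plans in SQLite (used only because it ships with Python): an OR whose second branch touches an unindexed column, and an IN list on the indexed column. The `users` table, the `idx_users_email` index, and the `query_plan` helper are hypothetical.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")  # name is NOT indexed

# OR across an indexed and an unindexed column: the index alone cannot answer it.
plan_or = query_plan(
    conn,
    "SELECT id FROM users WHERE email = 'a@example.com' OR name = 'Alice'",
)

# IN on the indexed column: each listed value is looked up in the index.
plan_in = query_plan(
    conn,
    "SELECT id FROM users WHERE email IN ('a@example.com', 'b@example.com')",
)

print("OR:", plan_or)
print("IN:", plan_in)
```

Other engines may handle OR differently (some can merge several indexes), so checking the plan on the target database is the safer habit.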

Common SQL Optimization Techniques

1. Avoid Full Table Scans

Ensure columns used in ON/WHERE clauses are indexed.

For very small tables, a full scan may be cheaper than using an index.

2. Prevent Index Invalidation

Do not apply functions or type conversions to indexed columns.

Write range conditions so the index remains usable; avoid patterns that often render it ineffective (e.g., <> or !=, leading wildcards in LIKE, and, in some engines, IS NULL).
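The effect of wrapping an indexed column in a function can be seen in a query plan. The sketch below is a minimal SQLite illustration (SQLite is used only because it ships with Python); the `users` table, the `idx_users_email` index, and the `query_plan` helper are hypothetical, and engines that support expression or function-based indexes may behave differently.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# Comparing the bare column lets the engine search the index.
plan_plain = query_plan(
    conn, "SELECT name FROM users WHERE email = 'a@example.com'"
)

# Applying a function to the column hides it from the index on email.
plan_function = query_plan(
    conn, "SELECT name FROM users WHERE lower(email) = 'a@example.com'"
)

print("bare column:  ", plan_plain)
print("with function:", plan_function)
```

A common workaround is to apply the transformation to the constant instead of the column, or to store the normalized value in its own indexed column.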

3. Prefer Index‑Based Sorting

When possible, let the index provide the required order instead of sorting after retrieval.
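As a small illustration, SQLite's query plan shows when a separate sort step is needed. In the sketch below (SQLite is used only because it ships with Python), the `orders` table, the `idx_orders_created_at` index, and the `query_plan` helper are hypothetical names.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, created_at TEXT)"
)

ordered = "SELECT id, created_at FROM orders ORDER BY created_at"

# Without an index, rows are fetched and then sorted in a temporary structure.
plan_sorted_after = query_plan(conn, ordered)

# With an index on the sort column, rows can be read already in order.
conn.execute("CREATE INDEX idx_orders_created_at ON orders (created_at)")
plan_sorted_by_index = query_plan(conn, ordered)

print("before:", plan_sorted_after)
print("after: ", plan_sorted_by_index)
```

The `USE TEMP B-TREE FOR ORDER BY` step in the first plan is the post-retrieval sort; it disappears once the index can supply the rows in the requested order.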

4. Select Only Needed Columns

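Avoid SELECT *; list only the columns you need so that a covering index (one that contains every column the query references) can answer the query without reading the table rows, reducing I/O.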
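A query plan makes the covering-index effect visible. The sketch below is a minimal SQLite illustration (SQLite is used only because it ships with Python); the `users` table, the composite `idx_users_email_name` index, and the `query_plan` helper are hypothetical.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT)"
)
conn.execute("CREATE INDEX idx_users_email_name ON users (email, name)")

# SELECT * needs bio, which is not in the index, so table rows must be read too.
plan_star = query_plan(
    conn, "SELECT * FROM users WHERE email = 'a@example.com'"
)

# Selecting only indexed columns lets the index answer the query by itself.
plan_columns = query_plan(
    conn, "SELECT email, name FROM users WHERE email = 'a@example.com'"
)

print("SELECT *:      ", plan_star)
print("needed columns:", plan_columns)
```

Both queries use the index to find matching keys, but only the second is reported as `USING COVERING INDEX`, meaning no extra lookup into the table is needed.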

5. Minimize Temporary Table Usage

Avoid creating and dropping temporary tables when possible.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
