Databases 7 min read

Why MySQL Pagination Is So Slow and How to Fix It

This article explores why MySQL queries with large LIMIT‑OFFSET pagination become painfully slow, explains the underlying B+‑tree index mechanics and MySQL's logical operator layers, and presents two practical solutions—key‑based pagination and index‑covering queries—to dramatically improve performance.

Programmer DD

Apr 6, 2021

Why MySQL Pagination Is So Slow and How to Fix It

Starting from a Question

Five years ago I noticed that a MySQL query with pagination was extremely slow: selecting from a table with only 100k rows took 2‑3 seconds when using LIMIT 10 OFFSET 10000. My mentor asked what the time complexity of finding the N‑th largest value using an index is.

Chasing the Answer

Confirming the Scenario

Assuming an index on status, a query like SELECT * FROM table WHERE status = xx LIMIT 10 OFFSET 10000 becomes very slow, even with modest data size.

Initial Guess

I guessed O(log N) and was told to investigate myself.

Further Investigation

Analyzing the problem revealed that MySQL indexes are implemented as B+ trees, and scanning leaf nodes sequentially still incurs O(N) cost. Moreover, MySQL must read the clustered index for each row discarded by the offset, causing thousands of random I/Os.

The B+‑tree structure shows that leaf nodes form a linked list, allowing O(N) traversal to the 100‑th largest entry, but the offset still forces many unnecessary reads.

Systematic Learning

Recommended books: MySQL Technical Internals – InnoDB Storage Engine for deep understanding of MVCC, index implementation, and file storage, and High Performance MySQL for practical optimization techniques.

Two key concepts emerged:

Clustered index: primary key index whose leaf nodes contain the actual row data.

Secondary index: a non‑primary index whose leaf nodes store the primary key values.

When the engine processes LIMIT … OFFSET …, it first scans the secondary index, then fetches each matching primary key from the clustered index, resulting in many random I/Os.

Extending the Insight

After three years of work I examined source code of etcd and TiDB, discovering that a query is composed of logical operators.

Logical Operators

DataSource – the table source.

Selection – filter conditions (WHERE).

Projection – column selection (SELECT).

Join – combining tables.

Example: select b from t1, t2 where t1.c = t2.c and t1.a > 5 After logical planning, the query becomes DataSource → Join → Selection → Projection.

Thus, the limit‑offset cannot be pushed down to the storage engine because the logical operators hide the exact row count.

How to Solve It

Solution 1

Replace traditional pagination with “next/previous” based on the primary key or secondary index value, eliminating the need for large offsets.

Solution 2 – Index Covering

Use an index‑covering query so that all required columns are present in the secondary index, avoiding a lookup in the clustered index. Example:

select xxx, xxx from (select id from table where second_index = xxx limit 10 offset 10000) as t where id in (select id from t)

This reduces random I/Os to the number of rows returned by the limit.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

mysql pagination Index Optimization Database Performance B+Tree

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.