Why MySQL Pagination Is So Slow and How to Fix It
This article explores why MySQL queries with large LIMIT‑OFFSET pagination become painfully slow, explains the underlying B+‑tree index mechanics and MySQL's logical operator layers, and presents two practical solutions—key‑based pagination and index‑covering queries—to dramatically improve performance.
Starting from a Question
Five years ago I noticed that a MySQL query with pagination was extremely slow: selecting from a table with only 100k rows took 2‑3 seconds when using LIMIT 10 OFFSET 10000. My mentor asked what the time complexity of finding the N‑th largest value using an index is.
Chasing the Answer
Confirming the Scenario
Assuming an index on status, a query like SELECT * FROM table WHERE status = xx LIMIT 10 OFFSET 10000 becomes very slow, even with modest data size.
Initial Guess
I guessed O(log N) and was told to investigate myself.
Further Investigation
Analyzing the problem revealed that MySQL indexes are implemented as B+ trees, and scanning leaf nodes sequentially still incurs O(N) cost. Moreover, MySQL must read the clustered index for each row discarded by the offset, causing thousands of random I/Os.
The B+‑tree structure shows that leaf nodes form a linked list, allowing O(N) traversal to the 100‑th largest entry, but the offset still forces many unnecessary reads.
Systematic Learning
Recommended books: MySQL Technical Internals – InnoDB Storage Engine for deep understanding of MVCC, index implementation, and file storage, and High Performance MySQL for practical optimization techniques.
Two key concepts emerged:
Clustered index: primary key index whose leaf nodes contain the actual row data.
Secondary index: a non‑primary index whose leaf nodes store the primary key values.
When the engine processes LIMIT … OFFSET …, it first scans the secondary index, then fetches each matching primary key from the clustered index, resulting in many random I/Os.
Extending the Insight
After three years of work I examined source code of etcd and TiDB, discovering that a query is composed of logical operators.
Logical Operators
DataSource – the table source.
Selection – filter conditions (WHERE).
Projection – column selection (SELECT).
Join – combining tables.
Example: select b from t1, t2 where t1.c = t2.c and t1.a > 5 After logical planning, the query becomes DataSource → Join → Selection → Projection.
Thus, the limit‑offset cannot be pushed down to the storage engine because the logical operators hide the exact row count.
How to Solve It
Solution 1
Replace traditional pagination with “next/previous” based on the primary key or secondary index value, eliminating the need for large offsets.
Solution 2 – Index Covering
Use an index‑covering query so that all required columns are present in the secondary index, avoiding a lookup in the clustered index. Example:
select xxx, xxx from (select id from table where second_index = xxx limit 10 offset 10000) as t where id in (select id from t)This reduces random I/Os to the number of rows returned by the limit.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
