Analyzing and Optimizing MySQL Pagination Performance with Large Offsets
The article examines a production MySQL query that slows down severely under large LIMIT offsets, shows how to reproduce the issue with a large test data set, analyzes the root cause, and presents three optimization strategies (an index-covering sub-query, keyset pagination, and offset limiting) that dramatically improve query performance.
Background
After a new version was released, an unexpected surge of API calls hit a MySQL-backed endpoint and slowed the MySQL cluster dramatically. The logged request was a POST to a paginated API with offset=1800000 and limit=500, which corresponds to page 3601 (1800000/500 + 1). More than 8,000 such calls were observed, and their limit of 500 far exceeded the normal page size of 25 items.
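The page arithmetic above can be checked with a few lines of Python. This is only a sketch; the function name `page_number` is a hypothetical helper, not something from the original service.

```python
def page_number(offset: int, limit: int) -> int:
    """Return the 1-based page that an offset/limit pair corresponds to."""
    if limit <= 0 or offset < 0:
        raise ValueError("limit must be positive and offset non-negative")
    return offset // limit + 1


# The request from the incident log: offset=1800000, limit=500.
print(page_number(1_800_000, 500))  # 3601
```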
Problem Analysis
The query itself is syntactically fine and uses proper indexes, but the huge offset forces MySQL to scan and discard millions of rows before returning the requested 25 rows. For example:
SELECT * FROM t_name WHERE c_name1='xxx' ORDER BY c_name2 LIMIT 2000000,25;
MySQL must read 2,000,025 rows and discard the first 2,000,000, which is extremely inefficient.
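The cost described above grows linearly with the offset. A minimal sketch of that arithmetic follows; `offset_cost` is a hypothetical helper, and it models only the row counts stated in the text, not actual MySQL internals.

```python
def offset_cost(offset: int, limit: int) -> tuple[int, float]:
    """Return (rows_read, wasted_fraction) for a LIMIT offset,limit query.

    Models the behaviour described in the text: the server reads
    offset + limit rows and throws the first `offset` of them away.
    """
    rows_read = offset + limit
    wasted_fraction = offset / rows_read if rows_read else 0.0
    return rows_read, wasted_fraction


rows_read, wasted = offset_cost(2_000_000, 25)
print(rows_read)  # 2000025
print(f"{wasted:.4%} of the rows read are discarded")
```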
Data Simulation
To reproduce the issue, two tables (dep and emp) are created, random data generators are defined, and stored procedures insert 5,000,000 employee rows and 120 department rows. After data insertion, indexes on the primary key and foreign key columns are added:
CREATE INDEX idx_emp_id ON emp(id);
CREATE INDEX idx_emp_depno ON emp(depno);
CREATE INDEX idx_dep_depno ON dep(depno);
Testing the Original Query
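To try the query shapes without a 5,000,000-row MySQL instance, a scaled-down stand-in can be built with Python's built-in sqlite3 module. This is a sketch under several assumptions: SQLite replaces MySQL, the column types and random value ranges are guesses (only the column names come from the queries in this article), and plain inserts replace the stored procedures, which are not reproduced here. It will not reproduce the timings reported below.

```python
import random
import sqlite3
import string


def build_demo_db(emp_rows: int = 10_000, dep_rows: int = 120,
                  seed: int = 42) -> sqlite3.Connection:
    """Create in-memory dep/emp tables filled with random data."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    # Column types are assumptions; the article only shows column names.
    conn.executescript("""
        CREATE TABLE dep (id INTEGER PRIMARY KEY,
                          depno INTEGER NOT NULL,
                          depname TEXT NOT NULL);
        CREATE TABLE emp (id INTEGER PRIMARY KEY,
                          empno INTEGER NOT NULL,
                          empname TEXT NOT NULL,
                          job TEXT NOT NULL,
                          sal INTEGER NOT NULL,
                          depno INTEGER NOT NULL);
    """)

    def rand_name(length: int = 6) -> str:
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

    # Department numbers 100..100+dep_rows-1, each used exactly once.
    conn.executemany(
        "INSERT INTO dep (depno, depname) VALUES (?, ?)",
        [(100 + i, rand_name()) for i in range(dep_rows)],
    )
    conn.executemany(
        "INSERT INTO emp (empno, empname, job, sal, depno) VALUES (?, ?, ?, ?, ?)",
        ((100_000 + i, rand_name(), rand_name(4),
          rng.randint(1000, 9000), 100 + rng.randrange(dep_rows))
         for i in range(emp_rows)),
    )
    # Same indexes as in the article, created after the data is loaded.
    conn.executescript("""
        CREATE INDEX idx_emp_id ON emp(id);
        CREATE INDEX idx_emp_depno ON emp(depno);
        CREATE INDEX idx_dep_depno ON dep(depno);
    """)
    conn.commit()
    return conn


# The article's query shape, with the offset and page size as parameters.
ORIGINAL_QUERY = """
    SELECT a.empno, a.empname, a.job, a.sal, b.depno, b.depname
    FROM emp a LEFT JOIN dep b ON a.depno = b.depno
    ORDER BY a.id DESC
    LIMIT ? OFFSET ?
"""

conn = build_demo_db(emp_rows=1000)
rows = conn.execute(ORIGINAL_QUERY, (25, 100)).fetchall()
print(len(rows))  # 25
```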
Two test queries illustrate the performance gap:
SELECT a.empno, a.empname, a.job, a.sal, b.depno, b.depname
FROM emp a LEFT JOIN dep b ON a.depno = b.depno
ORDER BY a.id DESC
LIMIT 100,25; -- fast (0.001 s)
SELECT a.empno, a.empname, a.job, a.sal, b.depno, b.depname
FROM emp a LEFT JOIN dep b ON a.depno = b.depno
ORDER BY a.id DESC
LIMIT 4800000,25; -- slow (12.275 s)
The second query must read and discard 4,800,000 rows before it can return the 25 rows requested.
Solution 1 – Index Covering + Sub‑query
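First locate the id at the target offset with a sub-query that selects only id, so it can be answered from the index alone without reading full rows or performing the join. Then fetch 25 rows starting from that id: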
SELECT a.empno, a.empname, a.job, a.sal, b.depno, b.depname
FROM emp a LEFT JOIN dep b ON a.depno = b.depno
WHERE a.id <= (SELECT id FROM emp ORDER BY id DESC LIMIT 100,1) -- same sort direction as the outer query
ORDER BY a.id DESC
LIMIT 25;
The same pattern works for the large offset (4,800,000) and reduces execution time from over 12 s to about 0.1 s.
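For the rewrite to be a drop-in replacement, it has to return exactly the rows the plain OFFSET query would. The sketch below checks that on a small table, with the sub-query sorted in the same direction as the outer query. Assumptions: SQLite stands in for MySQL, the join to dep is left out for brevity, and the table contents and function names are illustrative. The ids are deliberately non-contiguous to show that the pattern does not depend on gap-free ids.

```python
import sqlite3


def make_emp(n: int) -> sqlite3.Connection:
    """Create an in-memory emp table with n rows and gaps between ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, empname TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO emp (id, empname) VALUES (?, ?)",
        ((i, f"emp{i}") for i in range(3, 3 * n + 1, 3)),  # ids 3, 6, 9, ...
    )
    return conn


def page_by_offset(conn: sqlite3.Connection, offset: int, limit: int) -> list:
    """Plain OFFSET pagination: the baseline query."""
    return conn.execute(
        "SELECT id, empname FROM emp ORDER BY id DESC LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()


def page_by_subquery(conn: sqlite3.Connection, offset: int, limit: int) -> list:
    """Find the starting id via an id-only sub-query, then read from there."""
    return conn.execute(
        "SELECT id, empname FROM emp "
        "WHERE id <= (SELECT id FROM emp ORDER BY id DESC LIMIT 1 OFFSET ?) "
        "ORDER BY id DESC LIMIT ?",
        (offset, limit),
    ).fetchall()


conn = make_emp(1000)
assert page_by_subquery(conn, 100, 25) == page_by_offset(conn, 100, 25)
print(page_by_subquery(conn, 100, 25)[0][0])  # 2700
```

When the offset is past the end of the table, the sub-query yields NULL, the comparison matches nothing, and both versions return an empty page.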
Solution 2 – Keyset Pagination (Remember Last Id)
Instead of using OFFSET, remember the last id of the previous page and filter on it. Because the pages are ordered by id descending, the next page consists of the rows with id < last_id (use id > last_id when the order is ascending):
SELECT a.id, a.empno, a.empname, a.job, a.sal, b.depno, b.depname
FROM emp a LEFT JOIN dep b ON a.depno = b.depno
WHERE a.id < 100
ORDER BY a.id DESC
LIMIT 25;
This approach always scans only the required rows, no matter how deep the page is, yielding sub‑millisecond response times. It is ideal for infinite‑scroll scenarios but not for random page jumps.
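On the application side, keyset pagination amounts to carrying the last id from one request to the next. A minimal sketch, assuming SQLite as a stand-in for MySQL, a simplified emp table without the join, and hypothetical helper names:

```python
import sqlite3
from typing import Iterator


def make_emp(n: int) -> sqlite3.Connection:
    """Create an in-memory emp table with ids 1..n."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, empname TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO emp (id, empname) VALUES (?, ?)",
        ((i, f"emp{i}") for i in range(1, n + 1)),
    )
    return conn


def keyset_pages(conn: sqlite3.Connection, page_size: int) -> Iterator[list]:
    """Yield pages in descending id order without ever using OFFSET."""
    last_id = None
    while True:
        if last_id is None:
            # First page: no cursor yet, start from the highest id.
            rows = conn.execute(
                "SELECT id, empname FROM emp ORDER BY id DESC LIMIT ?",
                (page_size,),
            ).fetchall()
        else:
            # Later pages: continue strictly below the last id already seen.
            rows = conn.execute(
                "SELECT id, empname FROM emp WHERE id < ? "
                "ORDER BY id DESC LIMIT ?",
                (last_id, page_size),
            ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


conn = make_emp(60)
print([len(page) for page in keyset_pages(conn, 25)])  # [25, 25, 10]
```

In an HTTP API the `last_id` value would typically travel back to the client as a cursor and come in with the next request; that wiring is omitted here.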
Solution 3 – Offset Limiting (De‑grade Strategy)
Set a maximum allowed offset; if a request exceeds it, return an empty result or a 4xx error. This prevents abusive “data‑scraping” and forces users to narrow their search criteria.
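A guard of this kind can sit in front of the query as a small validation step. The sketch below is illustrative only: the threshold values, the `PaginationError` class, and the function name are assumptions, and the page-size cap is an extra check that goes beyond the offset limit described above. How the error maps to a 4xx response or an empty result depends on the web framework in use.

```python
MAX_OFFSET = 10_000  # assumed threshold; tune to what the UI really needs
MAX_LIMIT = 100      # assumed page-size cap (not part of the original text)


class PaginationError(ValueError):
    """Raised when a request asks for a page the service refuses to serve."""


def validate_pagination(offset: int, limit: int) -> tuple[int, int]:
    """Return (offset, limit) unchanged if acceptable, else raise."""
    if limit < 1 or limit > MAX_LIMIT:
        raise PaginationError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0 or offset > MAX_OFFSET:
        raise PaginationError(
            f"offset must be between 0 and {MAX_OFFSET}; narrow the search instead"
        )
    return offset, limit


# A normal request passes through; the request from the incident is rejected.
print(validate_pagination(100, 25))  # (100, 25)
try:
    validate_pagination(1_800_000, 500)
except PaginationError as exc:
    print("rejected:", exc)
```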
Conclusion
After applying the first two optimizations and a hard limit on offsets, the pagination performance improved dramatically. The article emphasizes the importance of testing extreme cases, adding rate‑limiting, and choosing the right pagination strategy based on the usage pattern.
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
