Why OFFSET/LIMIT Slows Down Large Datasets and How Cursor Pagination Fixes It
This article explains why using OFFSET and LIMIT for pagination becomes a performance bottleneck on massive tables, illustrates the cost of scanning and discarding skipped rows, and introduces a cursor‑based pagination technique that leverages primary‑key ordering for efficient data retrieval.
Understanding the Offset Problem
When dealing with huge amounts of data, the traditional OFFSET and LIMIT pagination approach can degrade performance dramatically because the database must scan and discard rows before reaching the requested page.
For small datasets this method works, but once the table grows beyond what fits in memory, every deep‑page request forces the database to read all rows up to the offset. In the worst case this amounts to a full‑table scan, the slowest type of query because of heavy disk I/O and memory transfer overhead.
For example, with 100 million rows and an offset of 50 million, the database still reads all 50 million preceding rows, loads them into memory, and then returns the 20 rows requested by LIMIT. This results in query times that can be dozens of times slower than necessary.
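The growth in work described above can be observed on a small scale. The following is a minimal sketch, not a benchmark from the original article: it assumes an in-memory SQLite database, a hypothetical `items` table with 100,000 rows, and uses SQLite's progress handler to count executed virtual-machine instructions as a rough stand-in for the work the engine performs. Other database engines will show different absolute numbers.

```python
import sqlite3


def count_vm_steps(conn, sql, params=()):
    """Run a query and return (rows, number of SQLite VM instructions executed)."""
    steps = 0

    def on_progress():
        nonlocal steps
        steps += 1
        return 0  # a non-zero return value would abort the query

    conn.set_progress_handler(on_progress, 1)  # call back after every VM instruction
    try:
        rows = conn.execute(sql, params).fetchall()
    finally:
        conn.set_progress_handler(None, 0)  # disable the handler again
    return rows, steps


# Hypothetical demo table: ids 1..100000 with a small payload column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO items (id, payload) VALUES (?, ?)",
    ((i, f"row-{i}") for i in range(1, 100_001)),
)

work = {}
for offset in (0, 1_000, 50_000):
    rows, steps = count_vm_steps(
        conn,
        "SELECT id, payload FROM items ORDER BY id LIMIT 20 OFFSET ?",
        (offset,),
    )
    work[offset] = steps
    print(f"offset={offset:>6}  rows returned={len(rows)}  VM steps={steps}")
```

Each query returns only 20 rows, yet the step count rises with the offset because the skipped rows are still visited one by one.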
OFFSET and LIMIT: What’s the problem?
Running a benchmark on a 100 k‑row table shows the original query taking at least 30× longer than an optimized version. The test fetches rows 50,000 through 50,020 out of 100,000. See the live comparison at https://www.db-fiddle.com/f/3JSpBxVgcqL3W2AzfRNCyq/1.
Cursor‑Based Pagination as an Alternative
Instead of storing OFFSET, keep the last retrieved primary key (or another unique sequential column) together with the LIMIT. Each subsequent query filters with a condition such as WHERE id > last_id ORDER BY id ASC LIMIT 20, so the database can seek directly to the start of the next page through the index.
This method requires a unique, indexed column (e.g., an auto‑increment ID, or a timestamp provided its values are unique). Because the index seek skips the earlier rows instead of scanning them, query time drops drastically.
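As an illustration of the WHERE id > last_id pattern, here is a minimal sketch using Python's built-in sqlite3 module. The `items` table, the `fetch_page` and `iterate_pages` helpers, and the starting cursor of 0 (which assumes ids are positive integers) are all hypothetical names and choices made for this example, not part of the original article.

```python
import sqlite3


def fetch_page(conn, last_id=0, limit=20):
    """Return the next page: rows whose id is greater than last_id, in id order."""
    return conn.execute(
        "SELECT id, payload FROM items WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_id, limit),
    ).fetchall()


def iterate_pages(conn, limit=20):
    """Yield pages until the table is exhausted, carrying the cursor forward."""
    last_id = 0  # assumption: every id is a positive integer
    while True:
        page = fetch_page(conn, last_id, limit)
        if not page:
            break
        yield page
        last_id = page[-1][0]  # the last id seen becomes the next cursor


# Hypothetical demo table with gaps in the id sequence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO items (id, payload) VALUES (?, ?)",
    ((i, f"row-{i}") for i in range(1, 101)),
)
conn.execute("DELETE FROM items WHERE id % 7 = 0")  # remove 14 rows to create gaps

pages = list(iterate_pages(conn, limit=20))
all_ids = [row[0] for page in pages for row in page]
print(len(pages), "pages,", len(all_ids), "rows")
```

Gaps in the id sequence do not matter here: the cursor is the last id actually returned, not a row position, so every page still holds up to 20 rows and none are skipped or repeated.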
Example comparison:
Original query (12.80 s) vs. optimized cursor query (0.01 s).
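A comparison of this kind can be reproduced in miniature. The sketch below is an assumption-laden stand-in, not the setup behind the timings quoted above: it uses an in-memory SQLite table named `items` with contiguous ids 1 to 100,000 and a hypothetical `timed` helper. Absolute timings depend on the engine, hardware, and data size, so the printed numbers should not be expected to match the article's figures.

```python
import sqlite3
import time

# Hypothetical demo table: contiguous ids 1..100000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO items (id, payload) VALUES (?, ?)",
    ((i, f"row-{i}") for i in range(1, 100_001)),
)


def timed(sql, params):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start


# Original style: skip 50,000 rows, then take 20.
offset_rows, offset_secs = timed(
    "SELECT id, payload FROM items ORDER BY id LIMIT 20 OFFSET ?", (50_000,)
)

# Cursor style: seek past the last id of the previous page, then take 20.
cursor_rows, cursor_secs = timed(
    "SELECT id, payload FROM items WHERE id > ? ORDER BY id LIMIT 20", (50_000,)
)

print(f"OFFSET query: {offset_secs:.6f} s")
print(f"cursor query: {cursor_secs:.6f} s")
print("same rows:", offset_rows == cursor_rows)
```

Both queries return the same 20 rows only because the ids in this demo table are contiguous; with gaps, an offset of 50,000 and a cursor value of 50,000 would point at different rows.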
If a table lacks a suitable primary key, you may still need to fall back to OFFSET/LIMIT, but be aware of the potential slow‑query issues.
Recommendation: use an auto‑increment primary key for any table that requires pagination.
For deeper guidance on handling large‑scale queries, refer to Rick James’s article at http://mysql.rjweb.org/doc.php/lists.
Java Backend Technology
