
Optimizing Large-Scale Pagination Queries in MySQL: Data Generation and Index Strategies

This article demonstrates how to generate millions of test rows in MySQL, analyzes the performance impact of deep pagination using LIMIT, explains why non‑clustered index lookups cause costly table scans, and presents two optimization approaches—sub‑query ID filtering and key‑set pagination—to dramatically reduce query latency.

Top Architect

The author, a senior architect, first creates a test table and a stored function to generate 2,000,000 rows quickly (about 29 seconds):

```sql
-- Switch the statement delimiter so the function body can contain semicolons
DELIMITER $$
CREATE DEFINER=`mytoor`@`%` FUNCTION `JcTestData`() RETURNS int(11)
BEGIN
  DECLARE num INT DEFAULT 2000000;
  DECLARE i INT DEFAULT 0;
  WHILE i < num DO
    INSERT INTO test_order(`platform_sn`, `third_sn`, `type`, `create_time`)
    VALUES (CONCAT('SN', i), UUID(), 1, NOW());
    SET i = i + 1;
  END WHILE;
  RETURN i;  -- number of rows inserted
END$$
DELIMITER ;
```
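The article does not show the table's DDL. A plausible sketch, inferred from the columns the function inserts and the `index_type` secondary index referenced later in the EXPLAIN analysis (the column types and engine choice here are assumptions, not from the article):

```sql
-- Hypothetical definition of test_order, consistent with the INSERT above
CREATE TABLE test_order (
  id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- clustered primary key
  platform_sn VARCHAR(64) NOT NULL,
  third_sn    VARCHAR(64) NOT NULL,
  type        TINYINT     NOT NULL,
  create_time DATETIME    NOT NULL,
  PRIMARY KEY (id),
  KEY index_type (type)   -- non-clustered index; stores (type, id)
) ENGINE=InnoDB;

-- Once the table exists, invoke the generator:
SELECT JcTestData();
```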

After the data is ready, the article runs a simple pagination query:

```sql
SELECT * FROM test_order WHERE type = 1 LIMIT 0, 50;
```

It returns in 0.022 seconds because the offset is small.

When the offset grows, the same query takes 3.765 seconds:

```sql
SELECT * FROM test_order WHERE type = 1 LIMIT 1200000, 50;
```

The EXPLAIN output shows that the non-clustered index `index_type` is used, but because the query selects all columns, each matching row requires an additional lookup back to the clustered index (a "back to table" row lookup), which is costly at this scale.
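To reproduce the analysis, the plan can be inspected with EXPLAIN. The article reports that the plan names `index_type` as the chosen key; the exact row estimates will vary by server:

```sql
-- The article's EXPLAIN shows index_type as the chosen key, yet every one
-- of the offset + limit (1,200,050) index entries still has to be scanned,
-- and each returned row fetched from the clustered index.
EXPLAIN SELECT * FROM test_order WHERE type = 1 LIMIT 1200000, 50;
```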

**Reason 1:** The non-clustered index contains only `type` and the primary key `id`. Retrieving the remaining columns forces a second lookup on the clustered index, which slows the query.
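The cost of that second lookup can be isolated by requesting only columns the secondary index already holds; this query is a hypothetical comparison, not one the article times on its own:

```sql
-- id and type both live in the index_type secondary index, so this is a
-- covering-index scan: no clustered-index lookup is needed per row.
SELECT id FROM test_order WHERE type = 1 LIMIT 1200000, 50;
```

This is exactly the observation the article's first solution builds on: do the expensive offset scan over the small covering index, and fetch full rows only for the final 50 ids.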

**Reason 2:** The LIMIT offset forces MySQL to scan and discard the first 1,200,000 matching rows before returning the next 50, which is inherently inefficient.

**Solution 1 – Sub-query ID filtering:** Retrieve the primary keys first, then fetch the full rows:

```sql
SELECT *
FROM test_order
WHERE id IN (
  SELECT id FROM (
    SELECT id FROM test_order WHERE type = 1 LIMIT 1200000, 50
  ) child
);
```

This reduces the execution time to 1.56 seconds: the inner query is satisfied entirely by the `index_type` index, so only the final 50 rows need a clustered-index lookup. (The extra derived table `child` is required because MySQL does not allow LIMIT directly inside an IN subquery.)

**Solution 2 – Key-set pagination (minimum-ID condition):** Use the last fetched `id` as the starting point for the next page:

```sql
SELECT * FROM test_order WHERE type = 1 AND id >= 1200008 LIMIT 50;
```

This query runs in 0.022 seconds, matching the performance of the first page, because MySQL can seek directly into the primary-key index instead of scanning and discarding 1,200,000 rows. The article advises developers to store the last row's `id` after each batch and use it as the lower bound for the next batch, eliminating the expensive offset scan; note that this assumes results are ordered by `id`. It also suggests that front-end infinite-scroll implementations can pass this minimum `id` to the back end to achieve smooth, fast pagination.
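The advice above can be sketched as a page-fetch pattern. The session variable `@last_seen_id` is a placeholder for state the application would carry between requests; it is illustrative, not from the article:

```sql
-- First page: no lower bound yet
SELECT * FROM test_order
WHERE type = 1
ORDER BY id
LIMIT 50;

-- Subsequent pages: the application remembers the id of the last row it
-- received and uses it as the lower bound, so MySQL seeks straight to
-- that key in the primary index instead of scanning an offset.
SET @last_seen_id = 1200007;  -- example value: the last id previously returned
SELECT * FROM test_order
WHERE type = 1 AND id > @last_seen_id
ORDER BY id
LIMIT 50;
```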

Performance · SQL · Data Generation · MySQL · Pagination · Index Optimization
Written by

Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
