How to Efficiently Query Millions of Rows in MySQL: Pagination Tricks and Optimizations
This article walks through generating a 10‑million‑row MySQL table, inserting data with a fast batch script, measuring plain LIMIT pagination performance, and applying several optimization techniques—including sub‑queries, ID‑range filtering, and column selection—to dramatically reduce query time on large data sets.
Introduction
A common interview question asks how to query a table with ten million rows. The usual answer is LIMIT pagination; this article goes further and measures the whole process hands-on.
Data Preparation
A table user_operation_log is created with many varchar columns. Inserting ten million rows one by one is impractical, so a stored procedure batch_insert_log() inserts rows in batches of 1000 with commits.
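DELIMITER ;;
CREATE PROCEDURE batch_insert_log()
BEGIN
  DECLARE i INT DEFAULT 1;
  DECLARE userId INT DEFAULT 10000000;
  -- Column list shared by every batch: 15 columns, so each row tuple needs 15 values.
  SET @execSql = 'INSERT INTO `test`.`user_operation_log`(`user_id`, `ip`, `op_data`, `attr1`, `attr2`, `attr3`, `attr4`, `attr5`, `attr6`, `attr7`, `attr8`, `attr9`, `attr10`, `attr11`, `attr12`) VALUES';
  SET @execData = '';
  -- Placeholder attribute value (a very long test attribute), already wrapped in SQL quotes.
  SET @attr = "'测试很长很长...的属性'";
  WHILE i <= 10000000 DO
    -- Append one row tuple: user_id, ip, op_data (a user login operation), then attr1..attr12.
    SET @execData = CONCAT(@execData, '(', userId + i, ", '10.0.69.175', '用户登录操作'", ',', @attr, ',', @attr, ',', @attr, ',', @attr, ',', @attr, ',', @attr, ',', @attr, ',', @attr, ',', @attr, ',', @attr, ',', @attr, ',', @attr, ')');
    IF i % 1000 = 0 THEN
      -- Every 1000 rows: run the accumulated multi-row INSERT, commit, and reset the buffer.
      SET @stmtSql = CONCAT(@execSql, @execData, ';');
      PREPARE stmt FROM @stmtSql;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
      COMMIT;
      SET @execData = '';
    ELSE
      -- More rows will follow in this batch, so separate the tuples with a comma.
      SET @execData = CONCAT(@execData, ',');
    END IF;
    SET i = i + 1;
  END WHILE;
END;;
DELIMITER ;
Test Environment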
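The test runs on a low-end Windows 10 machine with an i5 CPU and an SSD with roughly 500 MB/s read/write throughput. Because of the limited hardware, only 3,148,000 rows (about 5 GB on disk) were inserted rather than the full ten million, and that took about 38 minutes.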
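For readers who want to try the batching idea from the data-preparation step on modest hardware, the following is a minimal, scaled-down sketch. It is not the article's stored procedure: it assumes Python's built-in sqlite3 module as a stand-in for MySQL, a trimmed four-column version of user_operation_log, and a hypothetical helper named build_log_table. Only the pattern (insert 1000 rows at a time, commit once per batch) mirrors the original.

```python
import sqlite3


def build_log_table(conn, total_rows=10_000, batch_size=1_000):
    """Insert total_rows rows in batches and commit once per batch.

    Hypothetical helper: the table is a trimmed stand-in for the
    article's user_operation_log, and SQLite stands in for MySQL.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_operation_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "user_id INTEGER, ip TEXT, op_data TEXT, attr1 TEXT)"
    )
    insert_sql = (
        "INSERT INTO user_operation_log (user_id, ip, op_data, attr1) "
        "VALUES (?, ?, ?, ?)"
    )
    batches = 0
    for start in range(1, total_rows + 1, batch_size):
        stop = min(start + batch_size, total_rows + 1)
        rows = [
            (10_000_000 + i, "10.0.69.175", "user login", "long test attribute")
            for i in range(start, stop)
        ]
        conn.executemany(insert_sql, rows)  # one multi-row batch
        conn.commit()                       # one commit per batch
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
batch_count = build_log_table(conn)
row_count = conn.execute("SELECT COUNT(*) FROM user_operation_log").fetchone()[0]
print(batch_count, row_count)
```

Committing per batch rather than per row is the point of the exercise; the batch size of 1000 is taken from the article, and a different size may suit other hardware better.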
Plain LIMIT Pagination Test
Three runs of SELECT * FROM user_operation_log LIMIT 10000, 10 take 59 ms, 49 ms and 50 ms respectively, showing acceptable speed on a local DB.
Impact of Result Size
Increasing the LIMIT row count (10, 100, 1000, 10000, 100000, 1000000) shows query time grows with result size.
Impact of Offset
Keeping the row count constant (100) while increasing the offset (100, 1000, 10000, 100000, 1000000) shows that larger offsets increase query time: MySQL still has to read and discard every row in front of the offset before it can return the page.
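The offset experiment can be reproduced in miniature with a small timing harness. This is only a sketch under stated assumptions: it uses Python's sqlite3 module and an in-memory 200,000-row table instead of the article's MySQL instance, and time_offset_query is a hypothetical helper. Absolute numbers will differ from the article's and the effect may be small at this scale, so the script prints timings rather than promising any particular result.

```python
import sqlite3
import time

# Assumption: SQLite in memory stands in for the article's MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_operation_log (id INTEGER PRIMARY KEY, op_data TEXT)")
conn.executemany(
    "INSERT INTO user_operation_log (id, op_data) VALUES (?, ?)",
    ((i, "user login") for i in range(1, 200_001)),
)
conn.commit()


def time_offset_query(offset, page_size=100):
    """Run one LIMIT/OFFSET page and return (row count, elapsed seconds)."""
    started = time.perf_counter()
    rows = conn.execute(
        "SELECT * FROM user_operation_log ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return len(rows), time.perf_counter() - started


# Same page size every time; only the offset grows.
timings = {offset: time_offset_query(offset) for offset in (100, 1_000, 10_000, 100_000)}
for offset, (count, seconds) in timings.items():
    print(f"offset={offset:>7} rows={count} seconds={seconds:.5f}")
```

Running each offset several times and comparing medians gives a steadier picture than a single measurement.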
Optimization Strategies
1. Reduce large offsets: locate the starting ID with a sub-query, then query by ID range.
-- Baseline: plain pagination with a large offset
SELECT * FROM user_operation_log LIMIT 1000000, 10;
-- Step 1: look up only the id at that offset
SELECT id FROM user_operation_log LIMIT 1000000, 1;
-- Step 2: use that id as the starting point for the page
SELECT * FROM user_operation_log WHERE id >= (SELECT id FROM user_operation_log LIMIT 1000000, 1) LIMIT 10;
This uses the primary-key index and is much faster, but it only works when IDs are monotonically increasing.
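As a sanity check that the sub-query form returns the same page as plain LIMIT pagination, here is a minimal sketch. It assumes Python's sqlite3 module as a stand-in for MySQL, a reduced three-column table with 50,000 rows, and illustrative constants OFFSET and PAGE_SIZE. An explicit ORDER BY id is added to both queries, which the article's statements omit, so that the row order is well defined.

```python
import sqlite3

# Assumption: a small in-memory SQLite table stands in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_operation_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, op_data TEXT)"
)
conn.executemany(
    "INSERT INTO user_operation_log (id, user_id, op_data) VALUES (?, ?, ?)",
    ((i, 10_000_000 + i, "user login") for i in range(1, 50_001)),
)
conn.commit()

OFFSET, PAGE_SIZE = 40_000, 10  # illustrative values, not from the article

# Plain pagination: skip OFFSET full rows, then return one page.
plain_page = conn.execute(
    "SELECT * FROM user_operation_log ORDER BY id LIMIT ? OFFSET ?",
    (PAGE_SIZE, OFFSET),
).fetchall()

# Sub-query form: find the starting id first, then read the page by id.
subquery_page = conn.execute(
    """
    SELECT * FROM user_operation_log
    WHERE id >= (
        SELECT id FROM user_operation_log ORDER BY id LIMIT 1 OFFSET ?
    )
    ORDER BY id
    LIMIT ?
    """,
    (OFFSET, PAGE_SIZE),
).fetchall()

print(plain_page == subquery_page, plain_page[0][0])
```

The two result sets only match because ids increase monotonically and both queries sort by id, which is the same precondition the article states for this technique.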
2. ID-range filtering: when IDs are continuous, use WHERE id BETWEEN x AND y or WHERE id >= x with a LIMIT.
SELECT * FROM user_operation_log WHERE id BETWEEN 1000000 AND 1000100 LIMIT 100;
SELECT * FROM user_operation_log WHERE id >= 1000000 LIMIT 100;
These queries run very quickly.
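One way to apply the WHERE id >= x LIMIT n idea across a whole table is to remember the last id seen and start the next page just after it, often called keyset pagination. The sketch below is a variant of the article's queries, not a copy: it assumes Python's sqlite3 module in place of MySQL, a hypothetical iter_pages generator, and a table whose ids have gaps (only even numbers) to show that this form does not depend on continuous ids the way BETWEEN does.

```python
import sqlite3

# Assumption: in-memory SQLite table with gaps in the id sequence (even ids only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_operation_log (id INTEGER PRIMARY KEY, op_data TEXT)")
conn.executemany(
    "INSERT INTO user_operation_log (id, op_data) VALUES (?, ?)",
    ((i, "user login") for i in range(2, 2_001, 2)),  # 1000 rows
)
conn.commit()


def iter_pages(conn, page_size=100):
    """Yield pages of rows, using the last id seen instead of an offset."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, op_data FROM user_operation_log "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # the next page starts after this id


pages = list(iter_pages(conn))
print(len(pages), pages[1][0][0])
```

The trade-off is that pages can only be walked in order; jumping straight to an arbitrary page number still needs an offset or a known starting id.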
3. Avoid SELECT *: selecting only needed columns reduces parsing, permission checks, and network transfer.
Using SELECT * forces the server to retrieve all column metadata and data, increasing CPU and I/O.
Fetching fewer columns can cut query time noticeably.
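The effect of column selection on the amount of data returned can be illustrated with a small comparison. This is a sketch under assumptions: Python's sqlite3 module replaces MySQL, the table has only two of the attr columns, and payload_chars is a hypothetical helper that counts characters as a rough proxy for transfer size. It does not measure MySQL's parsing or permission-check costs.

```python
import sqlite3

# Assumption: reduced in-memory stand-in for the article's wide table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_operation_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, ip TEXT, "
    "op_data TEXT, attr1 TEXT, attr2 TEXT)"
)
long_attr = "x" * 200  # placeholder for the long attribute values
conn.executemany(
    "INSERT INTO user_operation_log VALUES (?, ?, ?, ?, ?, ?)",
    (
        (i, 10_000_000 + i, "10.0.69.175", "user login", long_attr, long_attr)
        for i in range(1, 1_001)
    ),
)
conn.commit()


def payload_chars(rows):
    """Rough size proxy: total characters across all returned values."""
    return sum(len(str(value)) for row in rows for value in row)


all_columns = conn.execute(
    "SELECT * FROM user_operation_log ORDER BY id LIMIT 100"
).fetchall()
needed_columns = conn.execute(
    "SELECT id, user_id, op_data FROM user_operation_log ORDER BY id LIMIT 100"
).fetchall()

print(payload_chars(all_columns), payload_chars(needed_columns))
```

The wider the unused columns are, the larger the gap between the two numbers, which is why the saving matters most on tables like the article's with many long varchar columns.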
Conclusion
Large data sets and high offsets hurt pagination performance; using indexed ID lookups, limiting result columns, and avoiding full table scans dramatically improve speed. Readers are encouraged to try the provided scripts and explore further optimizations.
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.
