Databases 11 min read

How to Efficiently Paginate 10 Million MySQL Records: Real‑World Tests & Optimizations

This article walks through creating a 10‑million‑row MySQL table, measuring pagination query times under different offsets and result sizes, and presents practical optimizations such as sub‑queries, ID‑range filtering, and column selection to dramatically improve performance while explaining why using SELECT * is discouraged.

Java High-Performance Architecture

May 9, 2023

How to Efficiently Paginate 10 Million MySQL Records: Real‑World Tests & Optimizations

Introduction

An interview question asks how to query ten million rows; the author demonstrates a hands‑on test using MySQL 5.7.26.

Data Preparation

A table user_operation_log is created and a stored procedure inserts ten million rows in batches, committing every 1,000 rows for speed.

CREATE TABLE `user_operation_log` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `ip` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `op_data` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr1` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  ...
  PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1 CHARACTER SET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=Dynamic;

DELIMITER ;;
CREATE PROCEDURE batch_insert_log()
BEGIN
  DECLARE i INT DEFAULT 1;
  DECLARE userId INT DEFAULT 10000000;
  SET @execSql = 'INSERT INTO `test`.`user_operation_log`(`user_id`,`ip`,`op_data`,`attr1`,`attr2`,`attr3`,`attr4`,`attr5`,`attr6`,`attr7`,`attr8`,`attr9`,`attr10`,`attr11`,`attr12`) VALUES';
  SET @execData = '';
  WHILE i<=10000000 DO
    SET @attr = '"测试很长很长的属性"';
    SET @execData = CONCAT(@execData, "(", userId+i, ",'10.0.69.175','用户登录操作'", ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ")");
    IF i % 1000 = 0 THEN
      SET @stmtSql = CONCAT(@execSql, @execData, ';');
      PREPARE stmt FROM @stmtSql;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
      COMMIT;
      SET @execData = '';
    ELSE
      SET @execData = CONCAT(@execData, ',');
    END IF;
    SET i = i + 1;
  END WHILE;
END;;
DELIMITER ;

Testing

The test machine is a low‑end Windows 10 i5 with a ~500 MB/s SSD. About 3.148 million rows (5 GB) are inserted, taking 38 minutes. SELECT count(1) FROM `user_operation_log`; Result: 3,148,000 rows. Three consecutive count queries take 14,060 ms, 13,755 ms, and 13,447 ms.

Simple Pagination

MySQL uses LIMIT to fetch a specific range; Oracle uses ROWNUM.

SELECT * FROM table LIMIT [offset,] rows | rows OFFSET offset;

Test query:

SELECT * FROM `user_operation_log` LIMIT 10000, 10;

Three runs take 59 ms, 49 ms, and 50 ms respectively.

Same offset, different result size

SELECT * FROM `user_operation_log` LIMIT 10000, 10;
SELECT * FROM `user_operation_log` LIMIT 10000, 100;
SELECT * FROM `user_operation_log` LIMIT 10000, 1000;
SELECT * FROM `user_operation_log` LIMIT 10000, 10000;
SELECT * FROM `user_operation_log` LIMIT 10000, 100000;
SELECT * FROM `user_operation_log` LIMIT 10000, 1000000;

Execution time grows with the number of rows returned, as shown in the chart.

Same result size, different offset

SELECT * FROM `user_operation_log` LIMIT 100, 100;
SELECT * FROM `user_operation_log` LIMIT 1000, 100;
SELECT * FROM `user_operation_log` LIMIT 10000, 100;
SELECT * FROM `user_operation_log` LIMIT 100000, 100;
SELECT * FROM `user_operation_log` LIMIT 1000000, 100;

Execution time increases with larger offsets, as illustrated in the second chart.

Optimization

Large Offset Problem

Sub‑query Method

SELECT * FROM `user_operation_log` LIMIT 1000000, 10;
SELECT id FROM `user_operation_log` LIMIT 1000000, 1;
SELECT * FROM `user_operation_log` WHERE id >= (SELECT id FROM `user_operation_log` LIMIT 1000000, 1) LIMIT 10;

The sub‑query uses the primary‑key index and is noticeably faster, though it only works when id is monotonically increasing.

ID Range Method

SELECT * FROM `user_operation_log` WHERE id BETWEEN 1000000 AND 1000100 LIMIT 100;
SELECT * FROM `user_operation_log` WHERE id >= 1000000 LIMIT 100;

This approach is extremely fast because it avoids the offset entirely.

Large Data‑Volume Problem

SELECT * FROM `user_operation_log` LIMIT 1, 1000000;
SELECT id FROM `user_operation_log` LIMIT 1, 1000000;
SELECT id, user_id, ip, op_data, attr1, attr2, attr3, attr4, attr5, attr6, attr7, attr8, attr9, attr10, attr11, attr12 FROM `user_operation_log` LIMIT 1, 1000000;

Fetching only the needed columns dramatically reduces query time, as shown in the chart.

Why Not SELECT * ?

Using SELECT * forces the database to parse all column metadata and can pull unnecessary large text fields, increasing both CPU load and network traffic, especially when the client is remote.

Conclusion

Readers are encouraged to reproduce the experiments and explore further optimizations on their own.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Optimization SQL mysql pagination Large Data

Written by

Java High-Performance Architecture

Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.