Databases 11 min read

How to Efficiently Query Millions of Rows in MySQL: Pagination & Optimization

This article demonstrates how to generate, insert, and query a ten‑million‑row MySQL table, measures the performance of ordinary LIMIT pagination versus optimized techniques, and provides practical SQL snippets and tips for handling large offsets and massive data sets.

Architect's Guide
Architect's Guide
Architect's Guide
How to Efficiently Query Millions of Rows in MySQL: Pagination & Optimization

Preparation

To test pagination on a large dataset, a MySQL 5.7.26 database is used. Since inserting ten million rows directly would be extremely slow, a script is employed to generate the data efficiently.

Create Table

CREATE TABLE `user_operation_log` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `ip` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `op_data` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr1` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr2` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr3` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr4` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr5` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr6` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr7` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr8` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr9` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr10` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr11` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr12` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1 CHARACTER SET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=Dynamic;

Create Data Script

DELIMITER ;;
CREATE PROCEDURE batch_insert_log()
BEGIN
  DECLARE i INT DEFAULT 1;
  DECLARE userId INT DEFAULT 10000000;
  SET @execSql = 'INSERT INTO `test`.`user_operation_log`(`user_id`, `ip`, `op_data`, `attr1`, `attr2`, `attr3`, `attr4`, `attr5`, `attr6`, `attr7`, `attr8`, `attr9`, `attr10`, `attr11`, `attr12`) VALUES';
  SET @execData = '';
  WHILE i <= 10000000 DO
    SET @attr = "'测试很长很长很长...的属性'";
    SET @execData = CONCAT(@execData, "(", userId + i, ", '10.0.69.175', '用户登录操作'", ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ",", @attr, ")");
    IF i % 1000 = 0 THEN
      SET @stmtSql = CONCAT(@execSql, @execData, ';');
      PREPARE stmt FROM @stmtSql;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
      COMMIT;
      SET @execData = '';
    ELSE
      SET @execData = CONCAT(@execData, ",");
    END IF;
    SET i = i + 1;
  END WHILE;
END;;
DELIMITER ;

Test Execution

On a low‑end Windows 10 PC (≈500 MB SSD, i5 CPU), 3,148,000 rows were inserted, consuming ~5 GB disk. Counting rows took 38 minutes. The following query confirmed the row count: SELECT count(1) FROM `user_operation_log`; Result: 3,148,000 rows.

Ordinary Pagination Query

MySQL LIMIT syntax:

SELECT * FROM table LIMIT [offset,] rows | rows OFFSET offset;

Example query:

SELECT * FROM `user_operation_log` LIMIT 10000, 10;

Three runs yielded 59 ms, 49 ms, and 50 ms respectively, showing acceptable speed for small offsets on a local database.

Impact of Data Volume

Running the same offset (10,000) with increasing row counts (10, 100, 1 000, 10 000, 100 000, 1 000 000) demonstrated that larger result sets take longer. The chart below visualizes the trend:

Impact of Offset Size

Keeping the result size constant (100 rows) while increasing the offset (100, 1 000, 10 000, 100 000, 1 000 000) showed a clear increase in query time as the offset grew. The following chart illustrates this:

Optimization Strategies

Large Offset

Use a sub‑query to locate the starting id and then fetch rows based on that id, which leverages the primary‑key index:

SELECT * FROM `user_operation_log` WHERE id >= (
  SELECT id FROM `user_operation_log` LIMIT 1000000, 1
) LIMIT 10;

This approach reduces the cost of scanning large offsets. It works best when id is monotonically increasing.

For non‑monotonic ids, a nested IN query can be used (note: some MySQL versions may not support LIMIT inside IN).

SELECT * FROM `user_operation_log` WHERE id IN (
  SELECT t.id FROM (
    SELECT id FROM `user_operation_log` LIMIT 1000000, 10
  ) AS t
);

Large Result Set

Limit the columns returned; selecting only needed fields dramatically improves performance compared to SELECT *. Example:

SELECT id, user_id, ip FROM `user_operation_log` LIMIT 1, 1000000;

Benchmark results show that fetching fewer columns reduces both parsing overhead and network transfer size.

Why Avoid SELECT *

Database must resolve metadata for every column, increasing parsing cost.

Unnecessary columns increase network payload, especially when the client is remote.

Conclusion

When dealing with tens of millions of rows, avoid large offsets and SELECT *. Prefer index‑based range queries or sub‑queries, and always select only the columns you need. These techniques keep query latency low even as data volume grows.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

query optimizationmysqlpaginationlarge dataset
Architect's Guide
Written by

Architect's Guide

Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.