Why Large OFFSET Queries Slow Down MySQL and How to Speed Them Up
This article examines why a MySQL query with a large OFFSET is extremely slow, demonstrates the performance gap with a join‑based rewrite, and validates the findings by comparing InnoDB buffer‑pool page usage, highlighting how unnecessary random I/O pollutes the cache.
Introduction
First, the MySQL version is 5.7.17.
Table structure
mysql> desc test;
+--------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| val | int(10) unsigned | NO | MUL | 0 | |
| source | int(10) unsigned | NO | | 0 | |
+--------+---------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)id is an auto‑increment primary key, and val has a non‑unique index.
Insert 5,242,882 rows.
mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 5242882 |
+----------+
1 row in set (4.25 sec)When using select * from test where val=4 limit 300000,5; the query is slow because MySQL must scan 300,005 index leaf nodes and then fetch the same number of rows from the clustered index, discarding the first 300,000 rows.
mysql> select * from test where val=4 limit 300000,5;
+---------+-----+--------+
| id | val | source |
+---------+-----+--------+
| 3327622 | 4 | 4 |
| 3327632 | 4 | 4 |
| 3327642 | 4 | 4 |
| 3327652 | 4 | 4 |
| 3327662 | 4 | 4 |
+---------+-----+--------+
5 rows in set (15.98 sec)Rewriting the query as a join reduces the time dramatically:
mysql> select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id;
+---------+-----+--------+---------+
| id | val | source | id |
+---------+-----+--------+---------+
| 3327622 | 4 | 4 | 3327622 |
| 3327632 | 4 | 4 | 3327632 |
| 3327642 | 4 | 4 | 3327642 |
| 3327652 | 4 | 4 | 3327652 |
| 3327662 | 4 | 4 | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.38 sec)The difference is due to the number of random I/O operations on the clustered index.
Verification
We verified the hypothesis by checking InnoDB buffer‑pool pages after each query. The plain LIMIT query loads thousands of data pages, while the join loads only five.
select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY | 4098 |
| val | 208 |
+------------+----------+After the join:
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY | 5 |
| val | 390 |
+------------+----------+This confirms that the first query pollutes the buffer pool with many useless pages.
Problems Encountered
To ensure the buffer pool is cleared on each restart we disabled innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java Backend Technology
Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
