
How to Efficiently Paginate 100M User IDs in MySQL

This article examines three SQL pagination strategies for a 100‑million‑row favorites table, compares their correctness and performance using EXPLAIN analysis, and demonstrates why a GROUP BY approach with proper indexing yields the most reliable and fast results.

21CTO

Oct 23, 2015

Programming skill is reflected in rigorous thinking; even seemingly simple problems can hide many subtle details.

Consider a favorites table that stores user IDs and book IDs and holds about 100 million rows. The task is to retrieve the distinct user IDs one page at a time.

Table definition:

CREATE TABLE favorites (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  uid BIGINT UNSIGNED NOT NULL DEFAULT 0 COMMENT 'uid',
  status TINYINT(3) UNSIGNED NOT NULL DEFAULT 0 COMMENT 'status',
  book_id BIGINT UNSIGNED NOT NULL DEFAULT 0 COMMENT 'book Id',
  create_time INT(11) UNSIGNED NOT NULL DEFAULT 0 COMMENT 'create time',
  PRIMARY KEY (id),
  UNIQUE KEY uid_book_id (uid, book_id),
  KEY uid_status (uid, status)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=gbk COMMENT='User favorite info';

Three pagination designs

Design 1 – Simple LIMIT

SELECT DISTINCT uid FROM favorites ORDER BY uid DESC LIMIT 0,10;
SELECT DISTINCT uid FROM favorites ORDER BY uid DESC LIMIT 10,10; -- next page (offset 10, not 11, so no row is skipped)

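This approach can skip user IDs. If rows are deleted between page requests, the remaining rows shift toward the front of the result set, so a uid that should have opened the next page slides back into a range that was already read and is never returned.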
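The skipping effect is easy to reproduce on a toy data set. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, a cut-down favorites table with 30 users, and a hypothetical helper named offset_page. It is not the article's benchmark setup.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL favorites table (toy data, an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE favorites ("
    "uid INTEGER NOT NULL, book_id INTEGER NOT NULL, UNIQUE (uid, book_id))"
)
conn.executemany(
    "INSERT INTO favorites VALUES (?, ?)", [(uid, 1) for uid in range(1, 31)]
)


def offset_page(conn, offset, size=10):
    """Design 1: fetch one page of distinct uids by offset (hypothetical helper)."""
    rows = conn.execute(
        "SELECT DISTINCT uid FROM favorites ORDER BY uid DESC LIMIT ? OFFSET ?",
        (size, offset),
    ).fetchall()
    return [row[0] for row in rows]


page1 = offset_page(conn, 0)  # uids 30 down to 21

# Between the two requests, a user who appeared on page 1 loses all favorites.
conn.execute("DELETE FROM favorites WHERE uid = 25")

page2 = offset_page(conn, 10)  # now starts at 19, because uid 20 moved to offset 9

missed = sorted(set(range(10, 31)) - {25} - set(page1) - set(page2))
print(missed)  # uids that still exist but were never returned
```

In this run uid 20 still exists in the table, yet neither page contains it, which is the gap described above.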

Design 2 – Using a last‑seen UID

-- First page
SELECT DISTINCT uid FROM favorites ORDER BY uid DESC LIMIT 10;
-- Subsequent pages
SELECT DISTINCT uid FROM favorites WHERE uid < $last_min_uid ORDER BY uid DESC LIMIT 10;

According to EXPLAIN, this query does not use the unique index. It performs a range scan over roughly 7 million rows and needs both a temporary table and a filesort, which causes serious performance problems.
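In application code, $last_min_uid would normally be passed as a bound parameter rather than interpolated into the SQL string. The following is a minimal sketch of that idea, assuming Python's built-in sqlite3 module as a stand-in for MySQL; the function name distinct_uid_page and the toy data are hypothetical, and a MySQL driver may use a different placeholder style than `?`. It shows only the query shape, not the performance behavior reported by EXPLAIN.

```python
import sqlite3


def distinct_uid_page(conn, last_min_uid=None, size=10):
    """Design 2: one page of distinct uids, descending, below last_min_uid."""
    if last_min_uid is None:
        # First page: no lower bound yet.
        sql = "SELECT DISTINCT uid FROM favorites ORDER BY uid DESC LIMIT ?"
        params = (size,)
    else:
        # Subsequent pages: continue below the smallest uid already seen.
        sql = (
            "SELECT DISTINCT uid FROM favorites "
            "WHERE uid < ? ORDER BY uid DESC LIMIT ?"
        )
        params = (last_min_uid, size)
    return [row[0] for row in conn.execute(sql, params).fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE favorites ("
    "uid INTEGER NOT NULL, book_id INTEGER NOT NULL, UNIQUE (uid, book_id))"
)
# Each user favorites two books, so DISTINCT has real duplicates to collapse.
conn.executemany(
    "INSERT INTO favorites VALUES (?, ?)",
    [(uid, book) for uid in range(1, 26) for book in (1, 2)],
)

first = distinct_uid_page(conn)
second = distinct_uid_page(conn, last_min_uid=first[-1])
print(first, second)
```

The caller only has to remember the last uid of the previous page, which is the value the article writes as $last_min_uid.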

Design 3 – GROUP BY with HAVING

-- First page
SELECT uid FROM favorites GROUP BY uid ORDER BY uid DESC LIMIT 10;
-- Subsequent pages
SELECT uid FROM favorites GROUP BY uid HAVING uid < $last_min_uid ORDER BY uid DESC LIMIT 10;

This method uses the composite unique index uid_book_id (uid, book_id). It limits the scan to roughly 1,200 rows and avoids both the temporary table and the filesort, so it offers the best performance of the three designs.
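A full traversal with Design 3 can be sketched as a loop that feeds the last uid of each page into the next query. As before, this is an illustrative sketch under stated assumptions: Python's sqlite3 module stands in for MySQL, the generator name iter_uid_pages and the 25-user data set are hypothetical, and the SQL mirrors the article's GROUP BY ... HAVING shape. It demonstrates correctness under concurrent deletes, not the row counts reported by EXPLAIN.

```python
import sqlite3


def iter_uid_pages(conn, size=10):
    """Yield pages of distinct uids, descending, using Design 3's query shape."""
    last_min_uid = None
    while True:
        if last_min_uid is None:
            sql = "SELECT uid FROM favorites GROUP BY uid ORDER BY uid DESC LIMIT ?"
            params = (size,)
        else:
            sql = (
                "SELECT uid FROM favorites GROUP BY uid "
                "HAVING uid < ? ORDER BY uid DESC LIMIT ?"
            )
            params = (last_min_uid, size)
        page = [row[0] for row in conn.execute(sql, params).fetchall()]
        if not page:
            return  # no uids remain below the last one seen
        yield page
        last_min_uid = page[-1]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE favorites ("
    "uid INTEGER NOT NULL, book_id INTEGER NOT NULL, UNIQUE (uid, book_id))"
)
conn.executemany(
    "INSERT INTO favorites VALUES (?, ?)",
    [(uid, book) for uid in range(1, 26) for book in (1, 2)],
)

seen = []
for page in iter_uid_pages(conn):
    seen.extend(page)
    # Simulate a delete between page requests: remove a user already returned.
    conn.execute("DELETE FROM favorites WHERE uid = ?", (page[0],))

print(len(seen), len(set(seen)))
```

Because each page is anchored to a uid value rather than a row position, deleting rows that were already returned does not shift what the next page contains.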

Analysis

Design 1 can miss user IDs when deletions occur during pagination. Design 2 uses the index inefficiently and scans millions of rows. Design 3 returns accurate results while touching very few rows, which makes it the preferred solution for pagination at this scale.
