
Why MySQL Deep Pagination Slows Down Your E‑commerce Site and How to Fix It

The article explains how deep pagination on massive MySQL tables causes full‑table scans, massive I/O, and memory pressure, then presents six concrete optimization techniques (delayed join, cursor pagination, covering indexes, ID‑range pagination, caching, and partitioning), backed by a real‑world e‑commerce case study and detailed execution‑plan analysis.


Incident Overview

At 7 pm, a monitoring alarm fired: the response time of the user‑centre order‑query API had jumped from 300 ms to 15 s, and MySQL CPU had hit 95 %.

1. Why Deep Pagination Becomes a Performance Nightmare

In systems with millions of rows, a simple LIMIT offset, size query forces MySQL to read and sort every row up to the end of the offset before discarding all but the requested page. Example query:

SELECT * FROM user_orders
WHERE user_id = 12345
ORDER BY order_time DESC
LIMIT 1024000, 20;

As the page number grows, response time grows roughly linearly with the offset; the 51 201st page (offset 1 024 000) can take tens of seconds.

Root causes:

Full table scan and sort: MySQL must read 1 024 020 rows and sort them before it can discard all but the last 20.

Massive useless I/O: only the final 20 rows are needed, yet more than a million are fetched.

Memory pressure: large intermediate result sets may spill to disk.

2. Execution‑Plan Deep Dive

Running EXPLAIN on the query shows type=ALL, no index used, rows ≈ 1 024 020 and Extra=Using filesort. The table below summarises the key fields.

type | key  | rows    | Extra
-----+------+---------+----------------------------
ALL  | NULL | 1024020 | Using where; Using filesort

Adding a composite index on (user_id, order_time) changes the plan to type=ref: the query now uses the new index and the filesort disappears.

type | key           | rows | Extra
-----+---------------+------+------
ref  | idx_user_time |  20  | NULL
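
For reference, a minimal sketch of the statements behind the improved plan above (the index name is illustrative):

-- composite index that supports both the equality filter and the sort
ALTER TABLE user_orders ADD INDEX idx_user_time (user_id, order_time);

-- re-check the plan: type should change from ALL to ref and the filesort should disappear
EXPLAIN
SELECT * FROM user_orders
WHERE user_id = 12345
ORDER BY order_time DESC
LIMIT 1024000, 20;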

3. Six Practical Performance‑Boosting Solutions

Solution 1 – Delayed Join (Most General)

Idea: use a covering index to fetch primary‑key IDs first, then join to retrieve full rows.

-- Before (slow)
SELECT * FROM user_orders
WHERE user_id = 1001
ORDER BY order_time DESC
LIMIT 80000,20;

-- After (fast)
SELECT o.* FROM user_orders o
INNER JOIN (
    SELECT id FROM user_orders
    WHERE user_id = 1001
    ORDER BY order_time DESC
    LIMIT 80000,20
) tmp ON o.id = tmp.id;

Performance comparison: scan 80 020 rows → 8.5 s vs scan 20 rows → 120 ms (≈70× faster).

Solution 2 – Cursor Pagination (Continuous Scrolling)

Suitable for infinite scroll or chat history. The next page uses the last row's send_time and id as the cursor.

-- First page
SELECT * FROM messages
WHERE chat_id = 500
ORDER BY send_time DESC, id DESC
LIMIT 20;

-- Next page
SELECT * FROM messages
WHERE chat_id = 500
  AND (send_time < '2024-03-07 10:30:00'
       OR (send_time = '2024-03-07 10:30:00' AND id < 7890))
ORDER BY send_time DESC, id DESC
LIMIT 20;

Advantages: the cost of fetching a page does not grow with how deep the user has scrolled, so performance stays stable regardless of total rows.
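
For the cursor query to stay fast, a supporting composite index is assumed; a minimal sketch:

-- with InnoDB the primary key id is appended to every secondary index anyway,
-- so (chat_id, send_time) is usually enough; listing id just makes the intent explicit
ALTER TABLE messages ADD INDEX idx_chat_time_id (chat_id, send_time, id);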

Solution 3 – Covering‑Index Optimization

Create a composite index that contains every column used in the SELECT list, WHERE clause and ORDER BY, so the engine can satisfy the query from the index alone.

ALTER TABLE products
ADD INDEX idx_category_price_name (category_id, price DESC,
                                   product_name, stock_count,
                                   create_time);
SELECT product_id, product_name, price, stock_count
FROM products
WHERE category_id = 3
ORDER BY price DESC
LIMIT 40000,20;

Key points: the index column order must match the query's predicates and sort; keep the index size reasonable; and note that the DESC key part above is only honoured on MySQL 8.0+ (earlier versions parse it but ignore it).

Solution 4 – ID‑Range Pagination

When the primary key is monotonic, paginate by a range of IDs instead of offset.

-- Traditional (slow)
SELECT * FROM logs
WHERE type='error'
ORDER BY id DESC
LIMIT 250000,20;

-- ID‑range (fast)
SELECT * FROM logs
WHERE type='error' AND id < 5000000   -- previous page's smallest ID
ORDER BY id DESC
LIMIT 20;

Preconditions: IDs must be (approximately) continuous; random page jumps are not supported.
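
The fast variant also assumes an index whose leading column is the filter; a minimal sketch (index name illustrative):

-- with InnoDB the primary key id is implicitly appended, so an index on (type)
-- lets WHERE type='error' AND id < ? ORDER BY id DESC run as a backward range scan
ALTER TABLE logs ADD INDEX idx_type (type);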

Solution 5 – Application‑Layer Cache

Cache the result set in Redis for a short period to avoid hitting the database repeatedly.

# Python example (sketch: redis_client and db stand in for your own connections)
import redis, json, datetime

# assumption: a local Redis instance; decode_responses=True returns str instead of bytes
redis_client = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_user_orders_page(user_id, page, page_size=20):
    cache_key = f"user_orders:{user_id}:page:{page}"
    cached = redis_client.get(cache_key)
    if cached:                      # cache hit: skip the database entirely
        return json.loads(cached)
    offset = (page - 1) * page_size
    # delayed join: walk the offset on the index only, then fetch full rows by id
    query = """
        SELECT o.* FROM user_orders o
        INNER JOIN (
            SELECT id FROM user_orders
            WHERE user_id = %s
            ORDER BY order_time DESC
            LIMIT %s, %s
        ) tmp ON o.id = tmp.id
    """
    # db is a placeholder for your data-access layer; assumed to return
    # JSON-serialisable rows (e.g. a list of dicts)
    result = db.execute(query, (user_id, offset, page_size))
    redis_client.setex(cache_key, datetime.timedelta(minutes=5), json.dumps(result))
    return result

Cache strategy suggestions: permanent cache for hot read‑only data, short TTL for normal queries, read‑write separation.

Solution 6 – Partitioned Table

Partition the order table by month so that queries on a specific time range only scan the relevant partitions.

CREATE TABLE partitioned_orders (
    id BIGINT AUTO_INCREMENT,
    user_id INT NOT NULL,
    amount DECIMAL(10,2),
    order_time DATETIME NOT NULL,
    status VARCHAR(20),
    PRIMARY KEY (id, order_time)
) PARTITION BY RANGE (YEAR(order_time)*100 + MONTH(order_time)) (
    PARTITION p202401 VALUES LESS THAN (202402),
    PARTITION p202402 VALUES LESS THAN (202403),
    PARTITION p202403 VALUES LESS THAN (202404),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);
SELECT * FROM partitioned_orders
WHERE order_time >= '2024-01-01' AND order_time < '2024-02-01'
ORDER BY order_time DESC
LIMIT 100000,20;
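
To verify that partition pruning actually happens, check the partitions column of EXPLAIN (shown by default since MySQL 5.7); a minimal sketch:

-- only p202401 should appear in the partitions column of the plan
EXPLAIN
SELECT * FROM partitioned_orders
WHERE order_time >= '2024-01-01' AND order_time < '2024-02-01'
ORDER BY order_time DESC
LIMIT 100000,20;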

4. Real‑World Case: E‑commerce Order System

The order_info table holds 8.5 M rows. A query that orders by create_time and skips 400 000 rows took 12.3 s, scanned 400 020 rows and used a filesort.

EXPLAIN result shows type=ref on idx_user_id, rows=400020, Extra=Using where; Using filesort. The bottleneck is the large offset and the inability to use the index for ordering.

Optimization steps:

Create a composite index (user_id, status, create_time DESC) and drop the old single‑column indexes.

Rewrite the query with a delayed join to fetch only the needed IDs first (sketched after these steps).

Add an application‑level cache for repeated accesses (the MySQL query cache is deprecated in 5.7 and removed in 8.0, so it is no longer a reliable option).
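
A minimal sketch of steps 1 and 2 (table and column names taken from the case above; the literal filter values are placeholders):

-- step 1: composite index matching the filter and the sort
-- (the DESC key part requires MySQL 8.0+)
ALTER TABLE order_info
    ADD INDEX idx_user_status_ctime (user_id, status, create_time DESC);

-- step 2: delayed join, so the 400 000-row offset is walked on the index only
SELECT o.* FROM order_info o
INNER JOIN (
    SELECT id FROM order_info
    WHERE user_id = 123 AND status = 'PAID'
    ORDER BY create_time DESC
    LIMIT 400000, 20
) tmp ON o.id = tmp.id;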

After changes the query runs in 180 ms, scans only 20 rows, and CPU drops from 95 % to 15 %.

5. Monitoring & Tuning

Enable slow‑query logging with a 1‑second threshold and analyse the log with pt‑query‑digest to spot regressions.

-- in MySQL:
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';

# from the shell:
pt-query-digest /var/log/mysql/slow.log > slow_report.txt

Key metrics after optimisation: QPS ↑ from 50 to 300, average latency ↓ from 2.5 s to 200 ms, CPU peak ↓ from 95 % to 40 %, disk I/O ↓ 70 %.

6. Deep Execution‑Plan Optimisation Techniques

6.1 Index Selection Strategy

Left‑most prefix: a query can only use a composite index if its predicates cover a left‑most prefix of the index columns, so order the columns to match the most common filters.

Covering index: include every column the query needs so no lookup back to the table is required.

Selectivity: prefer columns with a high ratio of distinct values.

Example to compute selectivity:

SELECT
    COUNT(DISTINCT user_id)/COUNT(*) AS user_id_selectivity,
    COUNT(DISTINCT status)/COUNT(*)   AS status_selectivity,
    COUNT(DISTINCT DATE(create_time))/COUNT(*) AS create_time_selectivity
FROM order_info;
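
To illustrate the left‑most prefix rule, assume a hypothetical index on (user_id, status, create_time):

-- assumed index: idx_user_status_ctime (user_id, status, create_time)

-- can seek on the index (filters form a left-most prefix):
SELECT id FROM order_info WHERE user_id = 1001;
SELECT id FROM order_info WHERE user_id = 1001 AND status = 'PAID';

-- cannot seek on the index (the leading column user_id is missing):
SELECT id FROM order_info WHERE status = 'PAID';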

6.2 Query‑Rewrite Tricks

Avoid SELECT *; list only required columns.

Replace OR conditions with UNION ALL so the optimizer can use a separate index for each branch (sketched below).

Rewrite large IN (subquery) predicates as EXISTS or explicit joins.
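
A minimal sketch of the OR‑to‑UNION ALL rewrite (column names are illustrative and user_id is assumed NOT NULL):

-- before: the OR across two different columns often prevents index use
SELECT id, amount FROM order_info WHERE user_id = 1001 OR status = 'REFUNDING';

-- after: each branch can use its own index; the extra predicate avoids duplicate rows
SELECT id, amount FROM order_info WHERE user_id = 1001
UNION ALL
SELECT id, amount FROM order_info WHERE status = 'REFUNDING' AND user_id <> 1001;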

6.3 Database Parameter Tuning

# my.cnf snippet
[mysqld]
sort_buffer_size   = 8M
join_buffer_size   = 4M
tmp_table_size     = 64M
max_heap_table_size= 64M
open_files_limit   = 65535
innodb_buffer_pool_size = 4G   # ~70‑80% of RAM
innodb_log_file_size   = 512M
innodb_flush_log_at_trx_commit = 2   # balance performance and safety

7. Pagination‑Strategy Selection Guide

Choose the method that matches your workload:

Delayed Join – universal, high gain, SQL a bit complex.

Cursor Pagination – best for mobile infinite scroll; cost per page stays constant at any depth, but random page jumps are not possible.

Covering Index – excellent when query fields are few, but index size grows.

ID‑Range – simple, works when IDs are continuous.

Business Cache – reduces DB load for read‑heavy scenarios, consistency challenges.

Partitioned Table – cuts scan range for time‑series data, adds management overhead.

8. Core Principles & Implementation Steps

Diagnose with EXPLAIN to locate full scans and filesorts.

Design appropriate indexes (left‑most, covering, selective).

Rewrite SQL using delayed join, cursor, or range techniques.

Consider architectural aids: read‑write splitting, caching layers.

Tune MySQL parameters to match workload.

Set up monitoring (slow‑query log, QPS, latency, CPU) and iterate.

9. Common Pitfalls

Over‑indexing – each index adds write overhead.

Implicit type conversion – can disable index usage (see the sketch after this list).

NULL values – may prevent index lookup.

Neglecting periodic table analysis and index maintenance.
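
A minimal sketch of the implicit‑conversion pitfall, assuming a VARCHAR phone column:

-- the numeric literal forces a per-row cast, so an index on phone cannot be used
SELECT id FROM order_info WHERE phone = 13800000000;

-- quoting the value keeps the comparison string-to-string and the index usable
SELECT id FROM order_info WHERE phone = '13800000000';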

Tags: performance optimization, SQL, indexing, MySQL, deep pagination
Written by Architecture & Thinking

🍭 Frontline tech director and chief architect at top-tier companies 🥝 Years of deep experience in internet, e‑commerce, social, and finance sectors 🌾 Committed to publishing high‑quality articles covering core technologies of leading internet firms, application architecture, and AI breakthroughs.
