Why Is MySQL OFFSET Slow? B+ Tree Index Insights & K‑th Largest Query Optimization
This article examines how MySQL uses secondary indexes to find the K‑th largest value, analyzes the O(N) time complexity caused by offset pagination, explains the underlying B+‑tree double‑linked list behavior, and presents three optimization strategies—including index covering, pagination redesign, and boundary prediction—to dramatically improve performance.
Problem Overview
We need to determine the time complexity of finding the K‑th largest number using a secondary index in MySQL.
Approach
Align the SQL statement with the interviewer's intent.
Analyze the time‑complexity of the query.
Identify potential performance bottlenecks.
Propose multiple optimization techniques.
SQL Alignment
Ask the interviewer about table schema and index definition. Assume a simplified table t_player with columns id (primary key), score (indexed), and name.
Only a secondary index on score (ascending) exists. To get the K‑th largest value we can use:
SELECT * FROM t_player ORDER BY score DESC OFFSET k-1 LIMIT 1;Complexity Analysis
The core question is the time complexity of the above statement. Although a secondary index suggests O(log N) for point lookups, retrieving the K‑th largest requires scanning the index because the tree branches cannot be predicted. In practice the operation behaves like O(N).
MySQL must traverse the B+‑tree and then perform a “back‑table” lookup for each row, leading to many random I/Os when OFFSET is large (e.g., > 10 000).
Optimization Strategies
Three typical solutions:
Business‑level pagination : Replace LIMIT … OFFSET … with “next page” logic, allowing the engine to use the index directly.
Index covering : Ensure all required columns are present in the secondary index so the engine can avoid back‑table lookups. Example queries:
-- Record previous score
SELECT score FROM t_player ORDER BY score DESC LIMIT 20;
-- Next page
SELECT score FROM t_player WHERE score < prev_score ORDER BY score DESC LIMIT 20;Boundary prediction : Use business knowledge to limit the search range, as described in “High Performance MySQL”.
Offset Performance Issue
When OFFSET exceeds 10 000, MySQL performs about 10 000 random I/Os because each skipped row still triggers a back‑table lookup.
Practical Test
On a 5 million‑row table, a query with OFFSET 10 000 without index covering can take seconds, while adding covering reduces it to milliseconds.
Why MySQL Doesn’t Push Down OFFSET
Two reasons: (1) the variety of scenarios makes a generic implementation costly, and (2) offset handling belongs to the SQL layer, not the storage engine.
Community Patches
Alibaba Cloud and Tencent Cloud have implemented index‑push‑down patches that reduce the execution time of offset queries by up to 75×. The patch adds a function in the engine to skip rows based on offset without back‑table lookups.
Takeaways
The seemingly simple “find K‑th largest” question actually touches B+‑tree structure, index‑covering, pagination design, and MySQL’s layered architecture. Understanding these concepts helps answer interview questions and write high‑performance SQL.
NiuNiu MaTe
Joined Tencent (nicknamed "Goose Factory") through campus recruitment at a second‑tier university. Career path: Tencent → foreign firm → ByteDance → Tencent. Started as an interviewer at the foreign firm and hopes to help others.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
