Why MySQL LIMIT + ORDER BY Returns Unexpected Rows – Priority Queue Threshold
This article investigates the nondeterministic behavior of MySQL when combining ORDER BY with LIMIT, reproduces the official example, identifies a critical row count threshold where the optimizer switches to a priority‑queue (heap) sort, and explains the underlying source‑code logic and its impact on query results.
What is the side task?
When discussing pagination, the LIMIT keyword inevitably appears. The MySQL documentation on limit optimization provides a detailed example that shows how ORDER BY combined with LIMIT can produce nondeterministic results.
https://dev.mysql.com/doc/refman/8.0/en/limit-optimization.html
The example uses a table called ratings. Without LIMIT the query returns all rows ordered by category:
With LIMIT 5 the official result differs:
To reproduce the behavior locally, the author created the table and inserted the same seven rows:
CREATE TABLE `ratings` (
`id` int NOT NULL AUTO_INCREMENT,
`category` int DEFAULT NULL,
`rating` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
INSERT INTO `ratings`(`id`,`category`,`rating`) VALUES
(1,1,'4.5'),
(2,3,'5.0'),
(3,2,'3.7'),
(4,2,'3.5'),
(5,1,'3.2'),
(6,2,'3.5'),
(7,3,'2.7');Running SELECT * FROM ratings ORDER BY category; matches the documentation. However, SELECT * FROM ratings ORDER BY category LIMIT 5; yields a different order, confirming the nondeterminism.
Forceful Miracle
Confident that the documentation was correct, the author inserted many rows with the same category value to trigger the issue. After inserting 20 rows with category = 2, the LIMIT 5 query again produced an unexpected row (ID 27) near the top.
By gradually adding rows, the author discovered a critical count at which the result changes: when the table contains 16 rows, the query returns IDs 1,5,16,3,4 instead of 1,5,3,4,6.
Using optimizer_trace
The optimizer_trace feature records the optimizer’s decision process. Enabling it around the query shows that the filesort_priority_queue_optimization field switches from chosen: false (when there are 15 rows) to chosen: true (when there are 16 rows).
SET optimizer_trace='enabled=on';
SELECT * FROM ratings ORDER BY category LIMIT 5;
SELECT * FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace='enabled=off';Finding the source code
The relevant source file is sql/filesort.cc in the MySQL server repository. The function check_if_pq_applicable decides whether to use a priority‑queue (heap) for the top‑N operation.
https://github.com/mysql/mysql-server/blob/trunk/sql/filesort.cc
The function checks several conditions, the most important being:
if (param->max_rows < num_rows / PQ_slowness)
// use priority queueHere max_rows is the LIMIT value, num_rows is the total number of rows that satisfy the ORDER BY, and PQ_slowness is a constant equal to 3. With LIMIT 5, the expression becomes true only when num_rows > 15, which explains the observed threshold at 16 rows.
Sort modes
The MySQL optimizer can use three sort buffer formats:
<sort_key, rowid> : stores the sort key and a pointer to the original row.
<sort_key, additional_fields> : stores the sort key plus all columns needed for the query, avoiding a table lookup.
<sort_key, packed_additional_fields> : like the previous mode but packs variable‑length columns tightly to save space.
These modes are chosen based on the query and buffer size.
Heap sort demonstration
Using the priority‑queue logic is equivalent to building a min‑heap of size LIMIT. The article walks through inserting rows, maintaining the heap, and extracting the top‑N IDs, showing how ID 16 moves ahead of IDs 3 and 4 when the heap is used.
The heap sort is not stable; rows with identical sort keys can change relative order, which explains why the ID 16 appears before IDs 3 and 4.
Conclusion
The nondeterministic result of ORDER BY … LIMIT stems from the optimizer’s decision to use a priority‑queue when the row count exceeds a threshold derived from LIMIT and the constant PQ_slowness. Understanding this behavior helps developers predict query results and tune MySQL settings.
For deeper details, refer to the official MySQL documentation on limit optimization:
https://dev.mysql.com/doc/refman/8.0/en/limit-optimization.html
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
