Databases 19 min read

Why MySQL LIMIT + ORDER BY Returns Unexpected Rows – Priority Queue Threshold

This article investigates the nondeterministic behavior of MySQL when combining ORDER BY with LIMIT, reproduces the official example, identifies a critical row count threshold where the optimizer switches to a priority‑queue (heap) sort, and explains the underlying source‑code logic and its impact on query results.

Su San Talks Tech
Su San Talks Tech
Su San Talks Tech
Why MySQL LIMIT + ORDER BY Returns Unexpected Rows – Priority Queue Threshold

What is the side task?

When discussing pagination, the LIMIT keyword inevitably appears. The MySQL documentation on limit optimization provides a detailed example that shows how ORDER BY combined with LIMIT can produce nondeterministic results.

https://dev.mysql.com/doc/refman/8.0/en/limit-optimization.html

The example uses a table called ratings. Without LIMIT the query returns all rows ordered by category:

With LIMIT 5 the official result differs:

To reproduce the behavior locally, the author created the table and inserted the same seven rows:

CREATE TABLE `ratings` (
  `id` int NOT NULL AUTO_INCREMENT,
  `category` int DEFAULT NULL,
  `rating` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

INSERT INTO `ratings`(`id`,`category`,`rating`) VALUES
(1,1,'4.5'),
(2,3,'5.0'),
(3,2,'3.7'),
(4,2,'3.5'),
(5,1,'3.2'),
(6,2,'3.5'),
(7,3,'2.7');

Running SELECT * FROM ratings ORDER BY category; matches the documentation. However, SELECT * FROM ratings ORDER BY category LIMIT 5; yields a different order, confirming the nondeterminism.

Forceful Miracle

Confident that the documentation was correct, the author inserted many rows with the same category value to trigger the issue. After inserting 20 rows with category = 2, the LIMIT 5 query again produced an unexpected row (ID 27) near the top.

By gradually adding rows, the author discovered a critical count at which the result changes: when the table contains 16 rows, the query returns IDs 1,5,16,3,4 instead of 1,5,3,4,6.

Using optimizer_trace

The optimizer_trace feature records the optimizer’s decision process. Enabling it around the query shows that the filesort_priority_queue_optimization field switches from chosen: false (when there are 15 rows) to chosen: true (when there are 16 rows).

SET optimizer_trace='enabled=on';
SELECT * FROM ratings ORDER BY category LIMIT 5;
SELECT * FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace='enabled=off';

Finding the source code

The relevant source file is sql/filesort.cc in the MySQL server repository. The function check_if_pq_applicable decides whether to use a priority‑queue (heap) for the top‑N operation.

https://github.com/mysql/mysql-server/blob/trunk/sql/filesort.cc

The function checks several conditions, the most important being:

if (param->max_rows < num_rows / PQ_slowness)
    // use priority queue

Here max_rows is the LIMIT value, num_rows is the total number of rows that satisfy the ORDER BY, and PQ_slowness is a constant equal to 3. With LIMIT 5, the expression becomes true only when num_rows > 15, which explains the observed threshold at 16 rows.

Sort modes

The MySQL optimizer can use three sort buffer formats:

<sort_key, rowid> : stores the sort key and a pointer to the original row.

<sort_key, additional_fields> : stores the sort key plus all columns needed for the query, avoiding a table lookup.

<sort_key, packed_additional_fields> : like the previous mode but packs variable‑length columns tightly to save space.

These modes are chosen based on the query and buffer size.

Heap sort demonstration

Using the priority‑queue logic is equivalent to building a min‑heap of size LIMIT. The article walks through inserting rows, maintaining the heap, and extracting the top‑N IDs, showing how ID 16 moves ahead of IDs 3 and 4 when the heap is used.

The heap sort is not stable; rows with identical sort keys can change relative order, which explains why the ID 16 appears before IDs 3 and 4.

Conclusion

The nondeterministic result of ORDER BY … LIMIT stems from the optimizer’s decision to use a priority‑queue when the row count exceeds a threshold derived from LIMIT and the constant PQ_slowness. Understanding this behavior helps developers predict query results and tune MySQL settings.

For deeper details, refer to the official MySQL documentation on limit optimization:

https://dev.mysql.com/doc/refman/8.0/en/limit-optimization.html
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

query optimizationmysqlpriority-queueOrder Byheap sortOptimizer_trace
Su San Talks Tech
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.