Mastering MySQL ORDER BY: Principles, Optimizations, and Index Pitfalls
This article explains how MySQL processes ORDER BY, compares conventional, optimized, and priority‑queue sorting algorithms, shows how to leverage indexes, and provides practical tuning tips such as adjusting max_length_for_sort_data and sort_buffer_size to avoid costly temporary files and random I/O.
1. Introduction
For DBAs and developers who frequently use ORDER BY to satisfy front‑end display requirements, sorting can become a major source of slow queries. This article examines the underlying principles, different sorting algorithms, and practical optimization techniques for MySQL.
2. Sorting Principles
MySQL implements two main sorting methods—conventional and optimized—and, starting with version 5.6, adds a third method for ORDER BY LIMIT M,N that uses a priority‑queue (heap) algorithm.
2.1 Conventional Sorting
The conventional flow consists of the following steps:
Retrieve rows that satisfy the WHERE clause from table t1.
For each row, extract the primary key and the sorting key (e.g., id, col2) into the sort buffer.
If the sort buffer can hold all (id, col2) pairs, sort them in memory; otherwise, sort what fits and spill the remainder to a temporary file using quicksort.
If a temporary file is created, merge‑sort it to keep the file ordered.
Repeat the above until all qualifying rows have been processed.
Scan the sorted (id, col2) pairs and fetch the required columns ( col1, col2, col3) using the primary key.
Return the final result set to the client.
The need for a temporary file depends on the sort_buffer_size setting. Two I/O passes are required: one to read the sorting keys and another to fetch the full rows, which can cause random I/O unless the second pass is optimized by pre‑sorting the primary keys in a buffer controlled by read_rnd_buffer_size.
2.2 Optimized Sorting
Optimized sorting reduces the second I/O pass by placing the full result columns ( col1, col2, col3) into the sort buffer instead of only the key columns. After sorting, MySQL can return rows directly without a second lookup. This approach works only when the total length of the sorting tuple is smaller than the max_length_for_sort_data parameter; otherwise MySQL falls back to conventional sorting.
2.3 Priority‑Queue Sorting
MySQL 5.6 introduces a heap‑based priority‑queue algorithm for ORDER BY LIMIT M,N. The heap stores only the M+N smallest (or largest) tuples, dramatically reducing the required sort buffer space for small M and N. For ascending order a max‑heap is used, and for descending order a min‑heap.
3. Optimization Strategies
Understanding the sorting mechanics leads to two main ways to avoid costly sorts: use the natural order of a B‑tree index or eliminate sorting altogether.
CREATE TABLE t1 (
id INT NOT NULL PRIMARY KEY,
key_part1 INT(10) NOT NULL,
key_part2 VARCHAR(10) NOT NULL DEFAULT '',
key_part3 ...,
KEY idx_kp1_kp2 (key_part1, key_part2, key_part4),
KEY idx_kp3 (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;Queries that can exploit idx_kp1_kp2 include:
SELECT * FROM t1 ORDER BY key_part1, key_part2; SELECT * FROM t1 WHERE key_part1 = constant ORDER BY key_part2; SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 DESC; SELECT * FROM t1 WHERE key_part1 = 1 ORDER BY key_part1 DESC, key_part2 DESC; SELECT * FROM t1 WHERE key_part1 > constant ORDER BY key_part1 ASC; SELECT * FROM t1 WHERE key_part1 < constant ORDER BY key_part1 DESC; SELECT * FROM t1 WHERE key_part1 = const1 AND key_part2 > const2 ORDER BY key_part2;Common situations where MySQL cannot use index sorting:
The index used for filtering ( key2) differs from the index used for ordering ( key1).
Sorting columns belong to different indexes.
The order of columns in the ORDER BY clause does not match the index column order.
Ascending/descending directions in ORDER BY conflict with the index's default order.
A range condition on the first indexed column prevents the second column from using index ordering.
The ORDER BY columns differ from those in a GROUP BY clause.
Hash indexes (which are unordered) cannot provide ordered results.
Only a prefix of a column is indexed (e.g., col(10)).
In join queries, the sorting column does not originate from the first table, causing the optimizer to choose a non‑const access type.
3.1 Tuning Parameters When Sorting Is Unavoidable
Increase max_length_for_sort_data : When the combined length of all returned columns is less than this value, MySQL prefers the optimized sorting algorithm. Raising the parameter allows more rows to be sorted in memory.
Remove Unnecessary Columns : If memory is limited, dropping large or unused columns from the SELECT list reduces the tuple size, helping it fit within max_length_for_sort_data.
Enlarge sort_buffer_size : A larger buffer reduces the number of segments MySQL must create, decreasing temporary file usage and random I/O. However, this is a per‑connection setting; setting it too high can exhaust system memory under high concurrency, and values above ~2 MiB may trigger mmap() instead of malloc(), which can degrade performance.
4. References
MySQL official documentation on ORDER BY optimization.
In‑depth analysis of MySQL sorting algorithms and case studies.
Taobao MySQL monthly report.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
