Databases 21 min read

Understanding MySQL Sorting Modes, External Sorting, and Optimization Techniques

This article comprehensively explains MySQL's sorting mechanisms, including various sorting modes, index-based optimizations, external sorting algorithms, trace analysis, and configuration parameters, providing practical guidance for developers to diagnose and improve query performance.

Architecture Digest
Architecture Digest
Architecture Digest
Understanding MySQL Sorting Modes, External Sorting, and Optimization Techniques

Introduction

The article addresses common MySQL sorting questions, explains how to identify when MySQL performs sorting (e.g., Using filesort in the Extra column of EXPLAIN), and outlines the impact of sorting on query performance.

Index‑Based Sorting Optimization

Using B‑tree indexes can avoid extra sorting. For example:

select * from film where actor_name='苍老师' order by prod_time;

Creating a composite index on (actor_name, prod_time) allows MySQL to retrieve rows in the required order without a filesort.

Typical index‑friendly query patterns are illustrated:

SELECT * FROM t1
ORDER BY key_part1,key_part2,... ;

SELECT * FROM t1
WHERE key_part1 = constant
ORDER BY key_part2;

SELECT * FROM t1
ORDER BY key_part1 DESC, key_part2 DESC;

SELECT * FROM t1
WHERE key_part1 = 1
ORDER BY key_part1 DESC, key_part2 DESC;

SELECT * FROM t1
WHERE key_part1 > constant
ORDER BY key_part1 ASC;

SELECT * FROM t1
WHERE key_part1 < constant
ORDER BY key_part1 DESC;

SELECT * FROM t1
WHERE key_part1 = constant1 AND key_part2 > constant2
ORDER BY key_part2;

Sorting Modes Overview

MySQL supports three sorting modes, identified by the sort_mode field in the optimizer trace: <sort_key, rowid> – original mode (pre‑5.1). <sort_key, additional_fields> – modified mode (5.1+). <sort_key, packed_additional_fields> – packed‑data mode (5.7.3+).

Each mode differs in how key values and row identifiers are stored in the sort buffer.

4.2.1 Return‑Table Sorting (回表排序模式)

Keys and row IDs are stored; if the buffer overflows, temporary files are created.

After sorting, MySQL reads rows by row ID, causing random I/O; it mitigates this by batching reads into read_rnd_buffer_size chunks.

4.2.2 No‑Return‑Table Sorting (不回表排序模式)

Keys and required output columns are stored together, eliminating the need for a second table lookup.

Temporary files contain the final sorted rows directly.

4.2.3 Packed‑Data Sorting (打包数据排序模式)

Variable‑length columns (e.g., VARCHAR) are stored in a compact form, reducing memory usage dramatically.

External Sorting

When the sort buffer cannot hold all keys, MySQL falls back to external sorting.

5.1 Two‑Way External Sort

Read a chunk (e.g., 100 MB) into memory, sort with quicksort, write the sorted chunk to disk.

Repeat until all data are chunked.

Merge the sorted chunks (e.g., 9‑way merge) into the final result.

5.2 Multi‑Way External Sort

For very large datasets, MySQL merges groups of chunks (e.g., 25 at a time) to reduce the number of merge passes.

MySQL’s Internal External‑Sort Algorithm

The process (illustrated with code) works as follows:

int merge_many_buff(Sort_param *param, Sort_buffer sort_buffer,
    Merge_chunk_array chunk_array,
    size_t *p_num_chunks, IO_CACHE *t_file)
{
    ...
    for (i=0 ; i < num_chunks - MERGEBUFF * 3 / 2 ; i+= MERGEBUFF)
    {
        if (merge_buffers(param, from_file, to_file,
            sort_buffer, last_chunk++,
            Merge_chunk_array(&chunk_array[i], MERGEBUFF), 0))
            goto cleanup;
    }
    ...
}

Each call to merge_buffers increments sort_merge_passes:

int merge_buffers(Sort_param *param, IO_CACHE *from_file,
    IO_CACHE *to_file, Sort_buffer sort_buffer,
    Merge_chunk *last_chunk,
    Merge_chunk_array chunk_array,
    int flag)
{
    ...
    current_thd->inc_status_sort_merge_passes();
    ...
}

Sort_merge_passes

The variable counts how many merge passes were required; a high value indicates that sort_buffer_size is too small for the dataset. Increasing sort_buffer_size or reducing key size lowers the count.

Trace Result Interpretation

Key fields in the optimizer trace include:

"attaching_conditions_to_tables": {
    "original_condition": "((`film`.`Producer` like '东京热%') and (`film`.`prod_time` > '2015-12-01'))",
    "attached_conditions_summary": [{"table": "`film`", "attached": "((`film`.`Producer` like '东京热%') and (`film`.`prod_time` > '2015-12-01'))"}]
}

"join_execution": {
    "select#": 1,
    "steps": [{
        "filesort_information": [{"direction": "asc", "table": "`film`", "field": "actor_age"}],
        "filesort_priority_queue_optimization": {"usable": false, "cause": "not applicable (no LIMIT)"},
        "filesort_summary": {"rows": 1, "examined_rows": 5, "number_of_tmp_files": 0, "sort_buffer_size": 261872, "sort_mode": "<sort_key, packed_additional_fields>"}
    }]
}

These entries confirm that MySQL performed an ascending sort on actor_age using the packed‑data mode without temporary files.

Other Relevant Parameters max_sort_length – allocates a fixed byte size for each sort key when the size cannot be determined, potentially causing excess memory usage. innodb_disable_sort_file_cache – disables OS file‑system caching for sort temporary files (similar to O_DIRECT). innodb_sort_buffer_size – controls the buffer size used when building InnoDB indexes, not directly related to query sorting.

Optimization Checklist

Select only needed columns; avoid SELECT * and use LIMIT when possible.

Avoid functions on indexed columns that force MySQL to allocate max_sort_length.

Leverage appropriate indexes to eliminate sorting.

Increase sort_buffer_size to reduce external sorting.

If the original sort algorithm is unavoidable, raise read_rnd_buffer_size to speed up row retrieval.

Define column lengths sensibly to avoid unnecessary space.

Place tmpdir on fast storage.

References

MySQL 5.7 Order‑by Optimization documentation, coding‑geek.com article on database internals, and various MySQL source files (e.g., sql/filesort.cc).

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

databaseMySQLsortingexternal sorting
Architecture Digest
Written by

Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.