Databases 16 min read

Understanding MySQL Index Merge Optimization and Its Performance Implications

This article explains the MySQL Index Merge access method, details the underlying merge‑sort algorithm, enumerates the three index‑merge strategies, analyzes source‑code snippets, presents a real‑world case where index merge hurts performance, and offers practical optimization recommendations.

Aikesheng Open Source Community

Aug 20, 2024

Understanding MySQL Index Merge Optimization and Its Performance Implications

In production MySQL databases, queries often combine a regular index equality, a primary‑key range, and an ORDER BY LIMIT clause, which can cause the optimizer to choose an index‑merge plan even when a single index would be faster. This article investigates that behavior.

Index Merge Overview

The official definition states that the Index Merge access method retrieves rows with multiple range scans and merges their results into one. For a single table the optimizer usually picks one index, but with index merge it can use several indexes and combine their results.

Note: The optimizer first obtains primary keys via secondary indexes, merges them in order, and then performs a table lookup to avoid random I/O.

Merge Sort Algorithm

Index merge relies on the merge‑sort algorithm, a classic divide‑and‑conquer technique that recursively splits a list, sorts sub‑lists, and merges them into a sorted list. This avoids an extra sorting step when the input lists are already ordered.

MySQL Index Merge Types

index_merge_intersection : performs an intersection of multiple indexes; the execution plan shows Extra: Using intersect(...) and corresponds to the source class QUICK_ROR_INTERSECT_SELECT.

index_merge_union : performs a union of multiple indexes; the plan shows Extra: Using union(...) and maps to QUICK_ROR_UNION_SELECT.

index_merge_sort_union : performs a sorted union; the plan shows Extra: Using sort_union(...) and maps to QUICK_INDEX_MERGE_SELECT. This variant may need an additional sort because the result set is not pre‑ordered.

For the first two variants, the optimizer relies on the Rowid‑Ordered Retrieval (ROR) property: the secondary index returns row IDs in primary‑key order, allowing a simple merge without extra sorting.

Key Conditions for Index Merge

All columns of a secondary index appear in the WHERE clause with equality predicates (e.g., a=? AND b=? AND c=? for an index idx(a,b,c)).

A primary‑key range condition is present.

Source‑Code Insight

The function QUICK_INDEX_MERGE_SELECT::read_keys_and_merge() scans all relevant secondary indexes, merges their row IDs, skips rows already covered by a primary‑key range scan, and inserts remaining keys into a tree structure for de‑duplication and ordering. A simplified excerpt:

sql opt_range.cc
int QUICK_INDEX_MERGE_SELECT::read_keys_and_merge() {
  // skip row if it will be retrieved by clustered PK scan
  if (pk_quick_select && pk_quick_select->row_in_ranges())
    continue;
  cur_quick->file->position(cur_quick->record);
  result = unique->unique_add((char*)cur_quick->file->ref);
  if (result)
    DBUG_RETURN(1);
}

The unique_add routine inserts elements into a balanced tree, ensuring no duplicates.

Practical Case Study

A query

SELECT * FROM t WHERE c1='d' AND id>=600000 ORDER BY id LIMIT 1000

triggers an index‑merge plan ( idx_c1,PRIMARY) that scans ~27k rows and incurs a filesort, taking ~170 ms. Removing the ORDER BY clause lets MySQL still use index merge but eliminates the filesort, reducing execution time to ~10 ms.

Forcing the secondary index with FORCE INDEX(idx_c1) changes the plan to a simple range scan on idx_c1, achieving the same 10 ms runtime.

Optimization Recommendations

Assess whether the ORDER BY clause is necessary; if the secondary index already returns rows in primary‑key order, it can be omitted.

Modify the primary‑key range condition (e.g., use id+0>=) to encourage the optimizer to choose the secondary index directly.

Note: This approach depends on the specific indexes used; if the secondary index does not guarantee primary‑key ordering, the result set may not be sorted.

Conclusion

MySQL typically selects a single index, but when the WHERE clause satisfies both secondary‑index equality and a primary‑key range, the optimizer may employ index merge. Understanding the underlying merge‑sort mechanism and the optimizer’s cost‑based rechecking (especially the LOW_LIMIT case) helps developers decide when to keep or avoid index‑merge plans for optimal performance.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

mysql SQL Optimization Database Performance merge sort index merge

Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.