Why Does an OR Between Two Indexed Columns Still Trigger a Full Table Scan?
Even when the phone and email columns each have a single‑column index, an OR condition can still lead MySQL's cost‑based optimizer to choose a full table scan. This happens when the estimated cost of an index merge (random I/O plus a possible sort‑union step) exceeds the cost of a sequential scan. This article explains the underlying mechanics and practical workarounds.
When a query uses OR between two indexed columns, such as phone = '13800000000' OR email = '[email protected]', MySQL can report type = ALL in EXPLAIN, which indicates a full table scan despite the presence of single‑column indexes.
MySQL Optimizer Cost Calculation
MySQL employs a cost‑based optimizer that estimates the cost of each possible execution path and selects the cheapest one. For a single‑column equality condition, the optimizer can use the secondary index to locate the matching primary keys quickly and then fetch the full rows from the clustered index, a step this article calls a back‑table lookup.
With an OR condition, the query semantics become a union of two result sets. If the optimizer chooses only the phone index, it must still scan the entire table to find rows satisfying the email predicate, leading to a mixed path: index scan + back‑table lookup + full table scan.
Random I/O (back‑table lookup) is treated as considerably more expensive per row than sequential I/O (full scan) in MySQL's cost estimates. The mixed path (Path A) already includes a full scan, so every back‑table lookup only adds to its cost. Its estimate therefore cannot fall below that of a pure full scan (Path B), and the optimizer prefers Path B.
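The comparison between the two paths can be sketched with a toy model. The weights and function names below are illustrative assumptions chosen for this example; they are not MySQL's actual cost constants or internals.

```python
# Toy cost model. The two weights are assumptions for illustration only,
# not the constants MySQL really uses.
RANDOM_READ_COST = 4.0  # assumed cost of one back-table lookup (random I/O)
SEQ_READ_COST = 1.0     # assumed cost of reading one page sequentially


def full_scan_cost(table_pages: int) -> float:
    """Path B: read every page of the table once, sequentially."""
    return table_pages * SEQ_READ_COST


def mixed_path_cost(table_pages: int, phone_matches: int) -> float:
    """Path A: back-table lookups via the phone index plus a full scan for email."""
    return phone_matches * RANDOM_READ_COST + full_scan_cost(table_pages)


def cheapest_path(table_pages: int, phone_matches: int) -> str:
    """Return the label of the cheaper path under this toy model."""
    path_a = mixed_path_cost(table_pages, phone_matches)
    path_b = full_scan_cost(table_pages)
    return "A (index + full scan)" if path_a < path_b else "B (full scan)"


if __name__ == "__main__":
    for matches in (0, 10, 1_000):
        print(matches, mixed_path_cost(10_000, matches), cheapest_path(10_000, matches))
```

Because Path A contains the full scan as one of its terms, it never wins in this model, regardless of how the two weights are set.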
Why Index Merge Does Not Activate
MySQL does have an index_merge_union algorithm that scans each secondary index, merges the primary‑key sets with a two‑pointer O(N) algorithm, and then performs an ordered back‑table lookup. However, this algorithm requires both index scans to produce primary‑key lists that are already sorted. In InnoDB, entries that share the same secondary‑key value are stored in primary‑key order, so this holds for pure equality predicates but not for ranges.
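The two‑pointer merge itself is easy to sketch. The function below is a minimal illustration of the idea, not MySQL's implementation; the name merge_union is made up for this example, and it assumes each input list is ascending and free of duplicates.

```python
from typing import List


def merge_union(left: List[int], right: List[int]) -> List[int]:
    """Union two ascending primary-key lists in O(len(left) + len(right))."""
    merged: List[int] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        elif left[i] > right[j]:
            merged.append(right[j])
            j += 1
        else:
            # Same key found by both index scans: emit it once.
            merged.append(left[i])
            i += 1
            j += 1
    # At most one of the two lists still has elements left.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


if __name__ == "__main__":
    print(merge_union([3, 8, 15], [5, 8, 21]))  # [3, 5, 8, 15, 21]
```

The output is itself ascending, which is what allows the subsequent row fetches to proceed in primary‑key order.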
When a range predicate (e.g., phone LIKE '138%' or phone > '138') is involved, the primary‑key order becomes interleaved, forcing a costly Sort‑Union step. The additional CPU and memory overhead often makes the optimizer abandon index merge in favor of a full scan.
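A small simulation shows why a range scan loses the ordering. The index is modeled as a sorted list of (phone, primary key) pairs; the sample values and helper names are invented for this sketch.

```python
from typing import List, Tuple

# A secondary index modeled as (phone, primary_key) entries, ordered by phone
# and then by primary key. All values are made-up sample data.
PHONE_INDEX: List[Tuple[str, int]] = sorted([
    ("13800000000", 7),
    ("13800000000", 42),
    ("13811112222", 3),
    ("13899990000", 19),
    ("13900000000", 5),
])


def scan_equal(index: List[Tuple[str, int]], phone: str) -> List[int]:
    """Primary keys in index order for phone = <value>."""
    return [pk for value, pk in index if value == phone]


def scan_prefix(index: List[Tuple[str, int]], prefix: str) -> List[int]:
    """Primary keys in index order for phone LIKE '<prefix>%'."""
    return [pk for value, pk in index if value.startswith(prefix)]


def is_ascending(keys: List[int]) -> bool:
    return all(a < b for a, b in zip(keys, keys[1:]))


if __name__ == "__main__":
    eq_keys = scan_equal(PHONE_INDEX, "13800000000")
    range_keys = scan_prefix(PHONE_INDEX, "138")
    print(eq_keys, is_ascending(eq_keys))        # [7, 42] True
    print(range_keys, is_ascending(range_keys))  # [7, 42, 3, 19] False
    print(sorted(range_keys))                    # the extra sort step: [3, 7, 19, 42]
```

The equality scan returns keys that can be merged directly, while the prefix scan needs the additional sort before any merge can run.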
Other hard‑fail cases include implicit type conversion (e.g., comparing a VARCHAR phone column to a numeric literal), which prevents the index on that column from being used for the lookup, and an OR branch on an unindexed column (e.g., OR age = 18), which forces a full scan regardless of the other indexes.
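The type‑conversion case can be illustrated with a rough sketch. It approximates MySQL's string‑to‑number comparison with Python's float(), which is an assumption made for this example, and the sample values are invented. The point is that strings that are numerically equal do not sit next to each other in string order, so a single index range cannot locate them.

```python
from typing import List

# Hypothetical VARCHAR values in the order a string index would store them.
INDEXED_VALUES: List[str] = sorted(["0138", "1379", "138", "138.0", "1380"])


def numeric_matches(values: List[str], number: float) -> List[int]:
    """Positions whose string value equals `number` once converted to float."""
    return [pos for pos, text in enumerate(values) if float(text) == number]


if __name__ == "__main__":
    print(INDEXED_VALUES)                        # ['0138', '1379', '138', '138.0', '1380']
    print(numeric_matches(INDEXED_VALUES, 138))  # [0, 2, 3]
```

The matching positions are not contiguous, so every entry has to be converted and checked. Quoting the literal (phone = '138') keeps the comparison in string order and lets the index be used.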
Practical Workarounds Used by Large‑Scale Systems
Replace OR with UNION/UNION ALL
Split the original query into two independent sub‑queries and combine them with UNION or UNION ALL:
SELECT * FROM users WHERE phone = '13800000000'
UNION ALL
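SELECT * FROM users WHERE email = '[email protected]';
This approach lets each sub‑query use its own index, so the optimizer does not have to estimate a complex merge cost. Note that a row matching both predicates is returned by both sub‑queries. UNION removes that duplicate and so preserves the semantics of the original OR, at the price of a deduplication step. UNION ALL skips that step and is preferable only when no row can match both predicates or when duplicates are acceptable.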
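One detail worth checking before choosing between the two operators is whether a single row can satisfy both predicates. The sketch below mimics the two sub‑queries on an in‑memory list; the rows, e‑mail addresses, and helper names are placeholders invented for this example.

```python
from typing import Dict, List

Row = Dict[str, object]

# Hypothetical table contents; row 1 matches both the phone and the email filter.
USERS: List[Row] = [
    {"id": 1, "phone": "13800000000", "email": "a@example.com"},
    {"id": 2, "phone": "13911112222", "email": "b@example.com"},
    {"id": 3, "phone": "13700000000", "email": "a@example.com"},
]


def select_where(rows: List[Row], column: str, value: str) -> List[Row]:
    """Stand-in for one sub-query: SELECT * FROM users WHERE column = value."""
    return [row for row in rows if row[column] == value]


def union_all(first: List[Row], second: List[Row]) -> List[Row]:
    """UNION ALL: concatenate both result sets and keep duplicates."""
    return first + second


def union_distinct(first: List[Row], second: List[Row]) -> List[Row]:
    """UNION: drop duplicates. Comparing ids is enough here because SELECT *
    includes the primary key, so identical ids mean identical rows."""
    seen = set()
    result: List[Row] = []
    for row in first + second:
        if row["id"] not in seen:
            seen.add(row["id"])
            result.append(row)
    return result


if __name__ == "__main__":
    by_phone = select_where(USERS, "phone", "13800000000")
    by_email = select_where(USERS, "email", "a@example.com")
    print([r["id"] for r in union_all(by_phone, by_email)])       # [1, 1, 3]
    print([r["id"] for r in union_distinct(by_phone, by_email)])  # [1, 3]
```

Row 1 appears twice under UNION ALL, which differs from what the original OR query would return.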
Covering Index
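If the query only needs columns that are already present in an index, a covering index eliminates the back‑table lookup entirely. Each single‑column secondary index stores only its own column plus the primary key, so a query that selects id, phone, and email is covered only by a composite index containing both columns, such as (phone, email). The optimizer can then answer the query from index pages alone with mostly sequential I/O, which makes index usage far more likely than a full table scan, although the final choice still depends on the cost estimate.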
Conclusion
The core of SQL optimization is to simplify the execution path and make its cost predictable. OR‑based queries often trigger a full table scan because the estimated random I/O cost of back‑table lookups outweighs that of sequential I/O. Rewriting such queries with UNION (or UNION ALL when no row can match both predicates) or designing covering indexes gives more predictable, index‑driven execution and usually improves performance.
Programmer XiaoFu
xiaofucode.com – a programmer learning guide driven by the pursuit of profit
