Didi Interview: JOIN Order Optimization and the ‘Small Table Drives Large Table’ Principle
This article explains the JOIN order optimization technique known as the ‘small table drives large table’ principle, demonstrates its impact with a real e‑commerce query case, and reviews various MySQL join algorithms, their trade‑offs, and practical tuning tips.
Problem: A query joining users and orders in an e‑commerce system took 600‑800 ms, exceeding the 200 ms latency target.
Diagnosis: Tracing with SkyWalking showed a total request time of 666 ms, of which the SQL statement consumed 585 ms. The execution plan used an Index Nested‑Loop Join with users as the driver table. After filtering on registration_channel='APP' and city='北京', the driver still produced about 1 M rows, causing roughly 1 M index lookups on orders.
Root cause: The driver table remained large after filtering, leading to a high number of index lookups on the larger table.
Optimization approaches
Programmatic step‑by‑step
Query orders for the needed statuses (≈50 k rows).
Deduplicate user_id (≈30 k IDs) and fetch the corresponding users rows.
Join the two result sets in application memory.
SQL rewrite using the “small table drives large table” principle
SELECT u.*
FROM (
SELECT user_id
FROM orders
WHERE order_status IN (1, 2)
) t
JOIN users u ON t.user_id = u.user_id
WHERE u.registration_channel = 'APP' AND u.city = '北京';Result: Query latency dropped from >600 ms to under 100 ms, meeting the performance requirement.
General JOIN optimization rules
Identify the true “small” table after applying WHERE filters.
Drive the join with the small table and ensure the driven (inner) table has an index on the join column.
Avoid full table scans; the type column in EXPLAIN should show ref or eq_ref, and Extra should show Using index rather than Using filesort.
Set a slow‑query threshold to catch regressions:
SET GLOBAL long_query_time = 0.2;JOIN algorithms in MySQL
Index Nested‑Loop Join
The driver table rows are read sequentially; for each row the inner table is accessed via an index lookup. This is efficient when the driver set is small and the inner table has a suitable index.
-- Example
SELECT *
FROM orders o
JOIN users u ON o.user_id = u.id;
-- The driver (orders) provides a user_id, which is looked up in the indexed users.id column.Simple Nested‑Loop Join
A brute‑force double loop that compares every driver row with every inner row (O(M·N)). Suitable only for very small tables because it incurs massive I/O.
Block Nested‑Loop Join
Driver rows are read in blocks (e.g., 100 rows) and kept in memory; the inner table is scanned once per block, reducing the number of inner scans. The block size is controlled by join_buffer_size (default 256 KB).
Batched Key Access (BKA)
Introduced in MySQL 8.0. The engine collects a batch of join keys from the driver, sorts them, and performs a bulk index lookup on the inner table, turning random I/O into sequential I/O. Enable with:
SET SESSION optimizer_switch='mrr=on,batched_key_access=on';Typical performance gain (random‑I/O scenario):
Index Nested‑Loop: 1200 ms
BKA: 150 ms
Hash Join
The smaller side is built into an in‑memory hash table; the larger side probes this hash table. It is fast when the hash fits in memory, but may spill to disk and degrade performance if the hash is large.
Merge Join
Both tables must be sorted on the join keys. The engine merges the two sorted streams, similar to a merge sort. Sorting cost can be high, but the join itself is less sensitive to table order.
Choosing the right algorithm
Nested‑Loop : most sensitive to join order; keep the smallest filtered table as the driver.
Hash Join : ensure the build side (usually the smaller table) fits in memory; otherwise performance may drop.
Merge Join : beneficial when both tables are already ordered on the join key.
When manual intervention is needed
Even though modern optimizers use statistics to choose join order, intervene when:
Statistics are stale or missing (e.g., after bulk data loads).
Complex multi‑table joins cause the optimizer to pick a sub‑optimal order.
Indexes exist but are not being used (e.g., missing index hints).
Typical hints:
-- Force driver order
SELECT /*+ STRAIGHT_JOIN */ *
FROM small_table
JOIN large_table ON ...;Or enable BKA/Multiple‑Range‑Read:
SET SESSION optimizer_switch='mrr=on,batched_key_access=on';Key take‑aways
Apply the “small table drives large table” rule after WHERE filtering, not based on raw table size.
Ensure the driven table has an appropriate index on the join column.
Choose the join algorithm that matches data distribution and memory availability (Nested‑Loop → BKA → Hash → Merge).
Monitor execution plans ( EXPLAIN) and adjust statistics or hints when the optimizer makes a poor choice.
Tech Freedom Circle
Crazy Maker Circle (Tech Freedom Architecture Circle): a community of tech enthusiasts, experts, and high‑performance fans. Many top‑level masters, architects, and hobbyists have achieved tech freedom; another wave of go‑getters are hustling hard toward tech freedom.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
