How to Diagnose and Optimize a Slow MySQL Join Query: From Explain to Indexes
This article walks through a real-world slow‑SQL case, explains the inner workings of Join and Order‑by algorithms, demonstrates using EXPLAIN and optimizer_trace to trace execution, and shows how adding indexes and choosing the right join method can dramatically improve MySQL query performance.
Background
Slow SQL queries degrade user experience and can even crash servers. The author discovered an online query that took minutes to run. The original statement (simplified) is:
select t1.*, t2.x from t_table1 t1 left join t_table2 t2 on t1.a = t2.a order by t1.c desc;The involved tables have the following structures (simplified):
CREATE TABLE `t_table1` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
`a` varchar(64) NOT NULL,
`b` varchar(64) NOT NULL,
`c` varchar(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `t_table2` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
`a` varchar(64) NOT NULL,
`x` varchar(64) NOT NULL,
`y` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;Key parameters: MySQL 5.6.x, t_table1 has 3,000 rows, t_table2 has 70,000 rows.
Running EXPLAIN shows both tables are scanned fully, and the plan includes
Using temporary; Using filesort; Using join buffer (Block Nested Loop). The author includes the EXPLAIN image.
Principle Exploration
Join algorithm basics
The table scanned first is the driving table ; the other is the driven table . When no semantic constraints exist, the optimizer picks the table that yields the lowest cost, usually the smaller one.
Two join algorithms are discussed:
Block Nested‑Loop Join (BNL) : The driver table is read in chunks (K rows) and stored in a join buffer; the driven table is scanned fully for each chunk. Cost ≈ m + λ·m·n (λ∈(0,1)).
Index Nested‑Loop Join (NLJ) : For each driver row, the optimizer uses an index on the driven table’s join key, turning the scan into an index lookup (≈ log₂ n). Cost ≈ m + 2·m·log₂ n.
Because the cost is dominated by the size of the driver table (m), the optimizer prefers the smaller table as driver (“small table drives large table”).
Order‑by algorithm basics
MySQL allocates a sort buffer per thread. If the buffer is insufficient, it spills to a temporary file, which is much slower. Two sorting methods are described:
Full‑field sort : All selected columns are stored and sorted.
Rowid sort : Only the sort key and primary key are stored; after sorting, the primary keys are used to fetch full rows (requires a “back‑table” lookup).
Rowid sort reduces memory usage but adds extra disk reads for the back‑table lookup.
Optimization Process
Based on the principles, the author hypothesizes the execution steps:
Join using BNL (as shown by Using join buffer (Block Nested Loop)).
Store the join result in a temporary table ( Using temporary).
Sort the temporary table ( Using filesort).
Optimizer_trace confirms the plan and provides cost estimates: scanning t_table1 costs 615, scanning t_table2 repeatedly costs ~4.19 × 10⁷, and sorting costs ~2.1 × 10⁸.
Actual execution details (from Optimizer_trace) show:
The temporary table is an in‑memory table.
Sorting 3,000 rows used the Rowid sort and did not spill to disk.
Thus, although the optimizer predicts sorting to be the biggest cost, the real bottleneck is the BNL algorithm repeatedly scanning the large driven table.
To fix this, the author adds an index on the join column of t_table2, converting the join to an Index Nested‑Loop Join. After the change, monitoring shows the query latency drops below 20 ms.
Conclusion
The article demonstrates a complete SQL‑performance‑tuning workflow: understand the semantics of key clauses (JOIN, ORDER BY), use diagnostic tools (EXPLAIN, optimizer_trace), verify hypotheses with real execution data, and apply targeted changes such as adding indexes to let the optimizer choose a more efficient join algorithm.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
vivo Internet Technology
Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
