Databases 12 min read

How to Diagnose and Optimize a Slow MySQL Join Query: From Explain to Indexes

This article walks through a real-world slow‑SQL case, explains the inner workings of Join and Order‑by algorithms, demonstrates using EXPLAIN and optimizer_trace to trace execution, and shows how adding indexes and choosing the right join method can dramatically improve MySQL query performance.

vivo Internet Technology
vivo Internet Technology
vivo Internet Technology
How to Diagnose and Optimize a Slow MySQL Join Query: From Explain to Indexes

Background

Slow SQL queries degrade user experience and can even crash servers. The author discovered an online query that took minutes to run. The original statement (simplified) is:

select t1.*, t2.x from t_table1 t1 left join t_table2 t2 on t1.a = t2.a order by t1.c desc;

The involved tables have the following structures (simplified):

CREATE TABLE `t_table1` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `a` varchar(64) NOT NULL,
  `b` varchar(64) NOT NULL,
  `c` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `t_table2` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `a` varchar(64) NOT NULL,
  `x` varchar(64) NOT NULL,
  `y` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Key parameters: MySQL 5.6.x, t_table1 has 3,000 rows, t_table2 has 70,000 rows.

Running EXPLAIN shows both tables are scanned fully, and the plan includes

Using temporary; Using filesort; Using join buffer (Block Nested Loop)

. The author includes the EXPLAIN image.

Principle Exploration

Join algorithm basics

The table scanned first is the driving table ; the other is the driven table . When no semantic constraints exist, the optimizer picks the table that yields the lowest cost, usually the smaller one.

Two join algorithms are discussed:

Block Nested‑Loop Join (BNL) : The driver table is read in chunks (K rows) and stored in a join buffer; the driven table is scanned fully for each chunk. Cost ≈ m + λ·m·n (λ∈(0,1)).

Index Nested‑Loop Join (NLJ) : For each driver row, the optimizer uses an index on the driven table’s join key, turning the scan into an index lookup (≈ log₂ n). Cost ≈ m + 2·m·log₂ n.

Because the cost is dominated by the size of the driver table (m), the optimizer prefers the smaller table as driver (“small table drives large table”).

Order‑by algorithm basics

MySQL allocates a sort buffer per thread. If the buffer is insufficient, it spills to a temporary file, which is much slower. Two sorting methods are described:

Full‑field sort : All selected columns are stored and sorted.

Rowid sort : Only the sort key and primary key are stored; after sorting, the primary keys are used to fetch full rows (requires a “back‑table” lookup).

Rowid sort reduces memory usage but adds extra disk reads for the back‑table lookup.

Optimization Process

Based on the principles, the author hypothesizes the execution steps:

Join using BNL (as shown by Using join buffer (Block Nested Loop)).

Store the join result in a temporary table ( Using temporary).

Sort the temporary table ( Using filesort).

Optimizer_trace confirms the plan and provides cost estimates: scanning t_table1 costs 615, scanning t_table2 repeatedly costs ~4.19 × 10⁷, and sorting costs ~2.1 × 10⁸.

Actual execution details (from Optimizer_trace) show:

The temporary table is an in‑memory table.

Sorting 3,000 rows used the Rowid sort and did not spill to disk.

Thus, although the optimizer predicts sorting to be the biggest cost, the real bottleneck is the BNL algorithm repeatedly scanning the large driven table.

To fix this, the author adds an index on the join column of t_table2, converting the join to an Index Nested‑Loop Join. After the change, monitoring shows the query latency drops below 20 ms.

Conclusion

The article demonstrates a complete SQL‑performance‑tuning workflow: understand the semantics of key clauses (JOIN, ORDER BY), use diagnostic tools (EXPLAIN, optimizer_trace), verify hypotheses with real execution data, and apply targeted changes such as adding indexes to let the optimizer choose a more efficient join algorithm.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

IndexingMySQLSQL OptimizationEXPLAINOrder ByJoin Algorithms
vivo Internet Technology
Written by

vivo Internet Technology

Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.