Databases 10 min read

Why MySQL Joins Aren’t Always Slow: Inside the Four Join Algorithms

This article uses a realistic interview scenario to explore why multi-table joins in MySQL are not inherently slow, detailing four join algorithms—simple nested-loop, index nested-loop, block nested-loop, and batched key access—along with their pseudo‑code, execution steps, performance trade‑offs, and the impact of features like MRR and recent optimizer changes.

Senior Tony

Jul 4, 2024

Why MySQL Joins Aren’t Always Slow: Inside the Four Join Algorithms

In a typical interview scenario, a candidate claims that multi‑table joins are used in every project despite concerns about performance. The article explains why joins are not inherently inefficient and presents four MySQL join algorithms that the optimizer may choose.

1. Simple Nested‑Loops Join

This is the most straightforward implementation: for each row from the driving table (A), the engine scans the entire inner table (B) to find matching rows.

for each row in table A matching range {
    for each row in table B matching join column {
        if row satisfies join conditions, send to client
    }
}

Example SQL:

SELECT * FROM product p INNER JOIN `order` o ON p.id = o.product_id;

Execution steps:

Outer loop reads a row from product.

Inner loop scans all rows of order to find matches.

Matching rows are added to the result set.

Steps 1‑3 repeat until all rows of product are processed.

This approach can be extremely slow when the driving table yields many rows because it triggers a full scan of the inner table for each outer row.

2. Index Nested‑Loops Join

If the join column of the inner table (B) has an index, the optimizer can use it to locate matching rows directly, avoiding a full table scan.

for each row in table A matching range {
    use index on B's join column to find matching rows
    if row satisfies join conditions, send to client
}

Execution steps are similar to the simple nested‑loops join, but step 2 uses the index lookup, which dramatically reduces I/O.

3. Block Nested‑Loops Join

This variant buffers rows from the driving table in a join buffer. When the buffer fills, the engine processes the buffered rows against the inner table in bulk, reducing the number of full scans.

for each row in table A matching range {
    store join column in join buffer
    if (join buffer is full) {
        for each row in table B matching join column {
            if row satisfies join conditions, send to client
        }
    }
}

From MySQL 8.0.20 onward, block nested‑loops joins have been replaced by hash joins, so this algorithm is now deprecated.

4. Batched Key Access (BKA) Join

BKA combines the benefits of Multi‑Range Read (MRR) and the join buffer. It batches keys from the driving table, sorts them, and then performs a range read on the inner table, turning random I/O into sequential I/O.

for each row in table A matching range {
    store join column in join buffer
    if (join buffer is full) {
        for each row in table B matching join column {
            send to MRR interface and order by primary key
            if row satisfies join conditions, send to client
        }
    }
}

When the optimizer selects BKA, the EXTRA column of the execution plan shows Using where; Using join buffer (Batched Key Access).

Additional Optimizations

MySQL 5.6 introduced Multi‑Range Read (MRR), which sorts primary‑key IDs obtained from a secondary‑index range scan, converting random disk I/O into sequential I/O and reducing the number of page reads.

From MySQL 8.0.20, the optimizer prefers hash joins over block nested‑loops joins for suitable cases.

Conclusion

The article covered four MySQL join algorithms, their implementation details, and performance considerations. Future topics will compare these with hash joins and with application‑level data merging approaches.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Performance MySQL Nested Loop Join Algorithms Index Join

Written by

Senior Tony

Former senior tech manager at Meituan, ex‑tech director at New Oriental, with experience at JD.com and Qunar; specializes in Java interview coaching and regularly shares hardcore technical content. Runs a video channel of the same name.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.