Databases 8 min read

Understanding MySQL join_buffer_size and Join Algorithms (Nested Loop, Block Nested Loop, Hash Join)

This article explains the purpose and configuration of MySQL's join_buffer_size for INNER JOIN operations without indexed join keys, compares Nested Loop, Block Nested Loop, and the newly introduced Hash Join algorithms, and shows how each affects execution plans and query performance.

Aikesheng Open Source Community
Aikesheng Open Source Community
Aikesheng Open Source Community
Understanding MySQL join_buffer_size and Join Algorithms (Nested Loop, Block Nested Loop, Hash Join)

This article explains the MySQL parameter join_buffer_size and its use in INNER JOIN scenarios where the join key has no index, i.e., full‑table‑scan cases.

join_buffer_size is a memory buffer that caches rows from the outer table for joins that lack indexed keys. It is usually set to a small global value and adjusted per session or query; the default is 512 KB, but it can be temporarily increased, e.g., to 1 GB.

Example session commands:

set session join_buffer_size = 1024 * 1024 * 1024;

select * from ...;

set session join_buffer_size = default;

The article then examines three MySQL join algorithms.

1. Nested‑Loop Join (NLJ) is the simplest algorithm. For a query like select * from p1 join p2 using(r1) , MySQL reads each row from the outer table (p2) and scans the inner table (p1) for matching rows. An example execution plan (JSON format) shows the outer table accessed first, with a high read_cost for the inner table.

explain format=json select * from p1 inner join p2 as b using(r1)\G

The plan indicates that table b (p2) is the driving table, scanning 1 000 rows, while table p1 is scanned about 10 million rows, resulting in a large read cost.

2. Block Nested‑Loop Join (BNLJ) adds a buffer that stores a block of rows from the outer table, reducing the number of scans of the inner table. If the buffer can hold all join‑key values from the outer table, the inner table is scanned only once, improving performance dramatically (e.g., 1 000× faster in the example). The execution plan includes the flag using_join_buffer: "Block Nested Loop" and shows a much lower read_cost (≈5 199 instead of 5 198 505.87).

explain format=json select * from p1 inner join p2 as b using(r1)\G

3. Hash Join was introduced in MySQL 8.0.18. It also uses a join buffer, but builds a hash table on the outer table and probes it with rows from the inner table, further reducing inner‑table matches. The plan (tree format) shows an Inner hash join with a hash built on table b and a table scan on p1 .

explain format=tree select * from p1 inner join p2 as b using(r1)\G

In summary, the article demonstrates how join_buffer_size influences join execution, compares NLJ, BNLJ, and Hash Join, and provides concrete EXPLAIN output to illustrate cost differences.

After the technical discussion, the article includes community announcements: free Mycat diagnostics, a call for technical article submissions on MySQL, DBLE, and DTLE, and contact information for the open‑source community.

PerformanceMySQLnested-loopHash Joinjoin algorithmsblock nested loopjoin_buffer_size
Aikesheng Open Source Community
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.