Understanding ON vs WHERE in MySQL LEFT JOIN and DBLE Execution Plans
This article explains the difference between ON and WHERE clauses in MySQL LEFT JOIN, illustrates why a proposed optimization for DBLE sharding does not work, and provides practical examples and pseudo‑code to clarify how MySQL processes left joins.
Background : The author, a DBLE development team member, presents a DBLE execution plan for a LEFT JOIN query that includes a sharding column filter in the ON clause, and investigates why routing the query to a single node (Suppose‑1) is not feasible.
ON and WHERE Definitions
ON : The join condition applied while forming the Cartesian product of two tables, creating a virtual table V.
WHERE : The filter applied to the virtual table V after the join, retaining only rows that satisfy the condition.
MySQL LEFT JOIN Behavior
Using an Employee (left) and Info (right) table, the article shows three result scenarios:
Result‑1 : All rows from the left table are displayed; matching rows from the right table are shown, non‑matching rows are NULL.
Result‑2 : Adding an ON condition (e.g., b.country = 'China' ) still shows all left rows, but right‑side columns become NULL when the ON condition fails.
Result‑3 : Adding a WHERE condition (e.g., b.country = 'China' ) filters the virtual table, returning only rows that satisfy the condition.
Why Supposed Optimization Fails
In LEFT JOIN, the left table must always be fully displayed. Supposing the query could be routed to a single node based on a.id = 2 would violate the ON clause semantics, because rows that do not meet the ON condition must still appear with NULLs on the right side. Therefore, the DBLE Explain plan that distributes the sharding table across nodes is correct.
Practical Example and Pseudo‑Code
Data preparation:
create table A(c1 int, c2 int); create table B(c3 int, c4 int);
insert into A values(1,1),(22,22),(333,333); insert into B values(0,1),(1,1);MySQL uses a nested‑loop join algorithm for LEFT JOIN. The following pseudo‑code illustrates the process:
for (leftRow in leftTable) {
boolean matched = false;
for (rightRow in rightTable) {
if (ON condition satisfied) {
print(leftRow + rightRow);
matched = true;
}
}
if (!matched) {
print(leftRow + NULL);
}
}In the provided diagrams, when the ON condition is always true (e.g., A.c1 > 0 ), the result is the Cartesian product. When the ON condition is selective (e.g., A.c1 = 333 ), only matching rows are combined; non‑matching left rows are paired with NULL.
Conclusion
The article reinforces that ON clauses affect how rows are paired, while WHERE clauses filter the resulting virtual table, and that LEFT JOIN always preserves all left‑table rows, making the Supposed‑1 optimization invalid for such queries.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.