Databases 8 min read

Understanding ON vs WHERE in MySQL LEFT JOIN and DBLE Execution Plans

This article explains the difference between ON and WHERE clauses in MySQL LEFT JOIN, illustrates why a proposed optimization for DBLE sharding does not work, and provides practical examples and pseudo‑code to clarify how MySQL processes left joins.

Aikesheng Open Source Community
Aikesheng Open Source Community
Aikesheng Open Source Community
Understanding ON vs WHERE in MySQL LEFT JOIN and DBLE Execution Plans

Background : The author, a DBLE development team member, presents a DBLE execution plan for a LEFT JOIN query that includes a sharding column filter in the ON clause, and investigates why routing the query to a single node (Suppose‑1) is not feasible.

ON and WHERE Definitions

ON : The join condition applied while forming the Cartesian product of two tables, creating a virtual table V.

WHERE : The filter applied to the virtual table V after the join, retaining only rows that satisfy the condition.

MySQL LEFT JOIN Behavior

Using an Employee (left) and Info (right) table, the article shows three result scenarios:

Result‑1 : All rows from the left table are displayed; matching rows from the right table are shown, non‑matching rows are NULL.

Result‑2 : Adding an ON condition (e.g., b.country = 'China' ) still shows all left rows, but right‑side columns become NULL when the ON condition fails.

Result‑3 : Adding a WHERE condition (e.g., b.country = 'China' ) filters the virtual table, returning only rows that satisfy the condition.

Why Supposed Optimization Fails

In LEFT JOIN, the left table must always be fully displayed. Supposing the query could be routed to a single node based on a.id = 2 would violate the ON clause semantics, because rows that do not meet the ON condition must still appear with NULLs on the right side. Therefore, the DBLE Explain plan that distributes the sharding table across nodes is correct.

Practical Example and Pseudo‑Code

Data preparation:

create table A(c1 int, c2 int); create table B(c3 int, c4 int);

insert into A values(1,1),(22,22),(333,333);   insert into B values(0,1),(1,1);

MySQL uses a nested‑loop join algorithm for LEFT JOIN. The following pseudo‑code illustrates the process:

for (leftRow in leftTable) {
  boolean matched = false;
  for (rightRow in rightTable) {
    if (ON condition satisfied) {
      print(leftRow + rightRow);
      matched = true;
    }
  }
  if (!matched) {
    print(leftRow + NULL);
  }
}

In the provided diagrams, when the ON condition is always true (e.g., A.c1 > 0 ), the result is the Cartesian product. When the ON condition is selective (e.g., A.c1 = 333 ), only matching rows are combined; non‑matching left rows are paired with NULL.

Conclusion

The article reinforces that ON clauses affect how rows are paired, while WHERE clauses filter the resulting virtual table, and that LEFT JOIN always preserves all left‑table rows, making the Supposed‑1 optimization invalid for such queries.

SQLDatabaseMySQLLEFT JOINWHERE clauseDBLEON clause
Aikesheng Open Source Community
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.