Databases 12 min read

Understanding JOIN LATERAL: From Traditional Databases to Apache Flink

This article explains the special JOIN LATERAL operator, compares it with INNER JOIN and correlated subqueries, shows how SQL Server implements it via CROSS APPLY, and demonstrates its support in Apache Flink using Calcite and user‑defined table functions with concrete code examples.

Big Data Technology & Architecture
Big Data Technology & Architecture
Big Data Technology & Architecture
Understanding JOIN LATERAL: From Traditional Databases to Apache Flink

The article continues the series on SQL JOIN operators by introducing the special JOIN LATERAL construct, which differs from ordinary joins because the right side is a view or table‑valued function rather than a physical table.

Using a concrete example with Customers and Orders tables, the author first shows a standard INNER JOIN query, then attempts to solve the same requirement with correlated subqueries in the WHERE and FROM clauses, highlighting their limitations and errors.

To overcome these issues, the article explains the ANSI‑SQL LATERAL keyword and its equivalent in SQL Server, CROSS APPLY. It details why SQL Server chose CROSS APPLY historically and demonstrates that both syntaxes produce identical execution plans.

Performance and functional differences between INNER JOIN and LATERAL / CROSS APPLY are discussed, showing that CROSS APPLY often yields a simpler plan when selecting the first matching row per customer.

The article then shifts to Apache Flink, describing how Flink leverages Calcite for SQL parsing and fully supports the LATERAL syntax. An example using a lateral join with a user‑defined table function (UDTF) is provided.

SELECT e.NAME, e.DEPTNO, d.NAME
FROM EMPS e, LATERAL (
    SELECT *
    FROM DEPTS d
    WHERE e.DEPTNO = d.DEPTNO
) AS d;

Flink’s support for LATERAL is illustrated through two scenarios: UDTF (user‑defined table function) and temporal tables. The article focuses on the UDTF case, presenting the abstract base class for user‑defined functions and the concrete TableFunction implementation.

/**
  * Base class for all user‑defined functions such as scalar functions, table functions,
  * or aggregation functions.
  */
abstract class UserDefinedFunction extends Serializable {
  def open(context: FunctionContext): Unit = {}
  def close(): Unit = {}
}
abstract class TableFunction[T] extends UserDefinedFunction {
  def getResultType: TypeInformation[T] = null
}

A concrete UDTF SplitTVF is defined to split strings of the form "<name>#<age>" into separate fields.

case class SimpleUser(name: String, age: Int)
class SplitTVF extends TableFunction[SimpleUser] {
  def eval(user: String): Unit = {
    if (user.contains("#")) {
      val splits = user.split("#")
      collect(SimpleUser(splits(0), splits(1).toInt))
    }
  }
}

A full integration test shows how to register the table and function, then run a lateral join query.

@Test
def testLateralTVF(): Unit = {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  val tEnv = TableEnvironment.getTableEnvironment(env)
  env.setStateBackend(getStateBackend)
  StreamITCase.clear

  val userData = new mutable.MutableList[(String)]
  userData += ("Sunny#8")
  userData += ("Kevin#36")
  userData += ("Panpan#36")

  val SQLQuery = "SELECT data, name, age FROM userTab, LATERAL TABLE(splitTVF(data)) AS T(name, age)"

  val users = env.fromCollection(userData).toTable(tEnv, 'data)

  val tvf = new SplitTVF()
  tEnv.registerTable("userTab", users)
  tEnv.registerFunction("splitTVF", tvf)

  val result = tEnv.SQLQuery(SQLQuery).toAppendStream[Row]
  result.addSink(new StreamITCase.StringSink[Row])
  env.execute()
  StreamITCase.testResults.foreach(println(_))
}

The article concludes that JOIN LATERAL provides a powerful way to join each row of a left table with the result of a table‑valued function, offering both functional flexibility and performance benefits, and that Apache Flink fully supports this pattern through Calcite and UDTFs.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

SQLdatabaseApache FlinkLATERALUDTF
Big Data Technology & Architecture
Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.