
Detailed Analysis of Doris SQL Execution Process: Optimizer, Scheduler, and Executor

This article provides a comprehensive walkthrough of Doris's SQL execution pipeline, covering the query optimizer's parsing, rewriting, and plan generation, the scheduler's fragment distribution, and the executor's fragment processing, including code examples of expression rewrite rules, join strategies, and data flow between FE and BE nodes.

Big Data Technology & Architecture

The article begins by introducing the overall SQL execution flow in Doris, which consists of three main modules: the query optimizer, the query scheduler, and the execution engine. It explains how a SQL statement is transformed into a logical plan, then into a distributed physical plan, and finally executed across BE nodes.

1. Query Optimizer – The optimizer parses the SQL text, performing lexical and syntactic analysis to build an abstract syntax tree (AST), and then applies several stages of transformation: semantic analysis, query rewriting, single-node plan generation, and distributed plan generation. The article details expression rewrite rules such as the ExprRewriteRule interface and its implementation CompoundPredicateWriteRule, showing how a predicate like c1 = c2 AND TRUE is simplified to c1 = c2.
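The compound-predicate simplification described above can be sketched in a few lines. This is a minimal illustrative Python model of the idea, not Doris's actual Java rewrite rule; the Expr class and operator names here are assumptions made for the example.

```python
# Toy model of a compound-predicate rewrite rule: drop neutral boolean
# literals (X AND TRUE -> X, X OR FALSE -> X) and short-circuit absorbing
# ones (X AND FALSE -> FALSE, X OR TRUE -> TRUE). Illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class Expr:
    op: str                  # "AND", "OR", "LITERAL", "SLOT", "="
    children: tuple = ()
    value: object = None     # literal value or slot name

TRUE = Expr("LITERAL", value=True)
FALSE = Expr("LITERAL", value=False)

def rewrite_compound(e: Expr) -> Expr:
    """Bottom-up rewrite pass over the predicate tree."""
    kids = tuple(rewrite_compound(c) for c in e.children)
    if e.op == "AND":
        if FALSE in kids:
            return FALSE                               # X AND FALSE -> FALSE
        kids = tuple(k for k in kids if k != TRUE)     # X AND TRUE  -> X
        if not kids:
            return TRUE
        return kids[0] if len(kids) == 1 else Expr("AND", kids)
    if e.op == "OR":
        if TRUE in kids:
            return TRUE                                # X OR TRUE  -> TRUE
        kids = tuple(k for k in kids if k != FALSE)    # X OR FALSE -> X
        if not kids:
            return FALSE
        return kids[0] if len(kids) == 1 else Expr("OR", kids)
    return Expr(e.op, kids, e.value)

# c1 = c2 AND TRUE  simplifies to  c1 = c2
pred = Expr("AND", (Expr("=", (Expr("SLOT", value="c1"),
                               Expr("SLOT", value="c2"))), TRUE))
print(rewrite_compound(pred).op)  # prints "="
```

The real rule set in Doris runs such passes repeatedly until the expression tree reaches a fixed point.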

It also covers join reordering using a greedy algorithm, predicate push‑down (predicate assignment), and the conversion of the logical plan into a physical plan tree. Code snippets illustrate the creation of join plans and the evaluation of join costs, including the decision between broadcast and shuffle joins.
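The broadcast-versus-shuffle decision reduces to comparing estimated network cost. The sketch below uses an assumed, simplified cost model (broadcast ships the entire right side to every left-side instance; shuffle moves each side over the network once); it illustrates the shape of the decision, not Doris's exact formula.

```python
# Simplified cost comparison between broadcast and shuffle joins.
# Cost model and function names are assumptions for illustration.

def broadcast_cost(right_rows: int, left_instances: int) -> int:
    # Right table is replicated to every instance executing the left side.
    return right_rows * left_instances

def shuffle_cost(left_rows: int, right_rows: int) -> int:
    # Both sides are hash-repartitioned across the network once.
    return left_rows + right_rows

def choose_join_distribution(left_rows: int, right_rows: int,
                             left_instances: int) -> str:
    if broadcast_cost(right_rows, left_instances) <= shuffle_cost(left_rows, right_rows):
        return "BROADCAST"
    return "SHUFFLE"

print(choose_join_distribution(1_000_000, 1_000, 8))    # small right side -> BROADCAST
print(choose_join_distribution(1_000_000, 900_000, 8))  # large right side -> SHUFFLE
```

Intuitively, broadcasting wins when the right (build) side is small relative to the cost of repartitioning both inputs.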

2. Query Scheduler – After the distributed plan is generated, the scheduler assigns each PlanFragment to specific BE nodes, selects tablet replicas for scanning, and balances load using round‑robin or bucket‑shuffle strategies. The scheduler also handles colocate joins and bucket‑shuffle joins, ensuring data locality and minimizing network traffic.
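Replica selection with load balancing can be sketched as picking, for each tablet, the replica whose backend currently has the fewest assigned scans. This is a hypothetical simplification of the scheduler's behavior; the data shapes and names are assumptions.

```python
# Assign each tablet's scan to one of its replica backends, spreading
# load evenly. Illustrative sketch, not Doris's scheduler code.
from collections import defaultdict

def assign_tablets(tablet_replicas: dict[int, list[str]]) -> dict[int, str]:
    load = defaultdict(int)      # number of scans assigned per backend
    assignment = {}
    for tablet_id, replicas in sorted(tablet_replicas.items()):
        # Pick the replica on the least-loaded backend (ties: first listed).
        be = min(replicas, key=lambda b: load[b])
        load[be] += 1
        assignment[tablet_id] = be
    return assignment

tablets = {1: ["be1", "be2"], 2: ["be1", "be3"], 3: ["be2", "be3"]}
print(assign_tablets(tablets))  # each backend ends up with one tablet
```

Colocate and bucket-shuffle joins constrain this choice further: the scan must land on the backend that owns the matching bucket, trading pure load balance for data locality.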

3. Execution Engine – The FE Coordinator dispatches fragments to BE nodes, where each fragment is wrapped in a PlanFragmentExecutor. Each executor builds an ExecNode tree, opens the plan, and repeatedly calls get_next on the root node (e.g., a HashJoinNode) to pull data from child nodes, apply joins and aggregations, and finally send results to the upstream fragment or the FE through DataSink and VDataStreamSender. The article explains how local and remote data transfers are handled, including the use of VDataStreamRecvr for receiving data and the RPC path PInternalServiceImpl::transmit_block.
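The open/get_next control flow is the classic Volcano pull model. The toy sketch below mirrors that flow in Python (Doris's executor is C++); the node classes and row representation here are assumptions made for illustration.

```python
# Volcano-style pull model: the root node's get_next() drives execution by
# pulling batches from its children. Toy sketch, not Doris's implementation.

class ScanNode:
    def __init__(self, rows):
        self.rows = rows
    def open(self):
        pass
    def get_next(self):
        # Emit everything in one batch, then signal end-of-stream (eos).
        batch, self.rows = self.rows, []
        return batch, True

class HashJoinNode:
    def __init__(self, left, right, key):
        self.left, self.right, self.key = left, right, key
    def open(self):
        # Build phase: drain the right (build) child into a hash table.
        self.left.open()
        self.right.open()
        self.ht = {}
        batch, _ = self.right.get_next()
        for row in batch:
            self.ht.setdefault(row[self.key], []).append(row)
    def get_next(self):
        # Probe phase: pull a batch from the left child and probe the table.
        batch, eos = self.left.get_next()
        out = [{**l, **r} for l in batch for r in self.ht.get(l[self.key], [])]
        return out, eos

left = ScanNode([{"id": 1, "a": "x"}, {"id": 2, "a": "y"}])
right = ScanNode([{"id": 1, "b": "p"}])
root = HashJoinNode(left, right, "id")
root.open()
rows, eos = root.get_next()
print(rows)  # [{'id': 1, 'a': 'x', 'b': 'p'}]
```

In the real engine the leaves are scan nodes reading tablets, and the fragment's root feeds a DataSink instead of returning rows to a caller.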

Finally, the FE coordinator pulls results from the top fragment using ResultReceiver, buffers rows in VMysqlResultWriter, and returns them to the client via MysqlChannel. The article concludes with a summary of the end‑to‑end flow and references to source code locations for deeper inspection.
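The result-pulling loop on the FE side follows a simple drain-until-eos pattern. The sketch below shows the assumed shape of that loop; the receiver interface here is hypothetical, standing in for the ResultReceiver described above.

```python
# Drain a result stream batch by batch until end-of-stream, buffering rows
# before they are written to the client. Illustrative sketch only.

def fetch_all(receiver):
    """Pull (rows, eos) pairs until eos, like a coordinator draining results."""
    buffered = []
    while True:
        rows, eos = receiver.get_next()
        buffered.extend(rows)
        if eos:
            return buffered

class FakeReceiver:
    """Stand-in for a result receiver, yielding pre-canned batches."""
    def __init__(self, batches):
        self.batches = list(batches)
    def get_next(self):
        rows = self.batches.pop(0)
        return rows, not self.batches

print(fetch_all(FakeReceiver([[1, 2], [3], [4, 5]])))  # [1, 2, 3, 4, 5]
```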

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: SQL, code analysis, doris, Query Optimizer, Distributed Execution, join strategies
Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
