Databases 15 min read

How StarRocks Transforms a SQL Query into Distributed Execution: A Deep Dive

This article explains how StarRocks converts a SQL statement into an optimal distributed physical execution plan, schedules the plan across compute nodes, and runs it using MPP, pipeline parallelism, and vectorized execution to achieve near‑linear performance scaling.

StarRocks

Apr 24, 2022

Overview

StarRocks processes a SQL query in three stages: (1) transform the SQL text into an optimal distributed physical execution plan, (2) schedule the plan to compute nodes, and (3) execute the plan on those nodes.

Frontend (FE) and Backend (BE) Roles

FE parses, optimizes, schedules queries and manages metadata. BE executes the plan and stores data.

From SQL Text to Execution Plan

The conversion consists of five steps:

SQL Parse : the SQL string is parsed by ANTLR4 into an abstract syntax tree (AST) represented by a QueryStmt containing SelectList, FromClause, WherePredicate, GroupByClause, HavingPredicate, OrderByElement, LimitElement, etc.

SQL Analyze : the AST is bound to catalog metadata (databases, tables, columns), legality checks are performed (e.g., no GROUPING in WHERE, HLL/Bitmap cannot be summed), aliases are resolved, function argument types are validated, and necessary type casts are inserted. The result is a hierarchical Relation tree.

SQL Logical Plan : the Relation tree is converted into a logical plan tree where each relational operation becomes a logical node.

SQL Optimize : a Cascades/ORCA‑style cost‑based optimizer rewrites the logical plan and selects the lowest‑cost distributed physical plan. Major rewrite rules include:

Expression simplification

Column pruning

Predicate push‑down

Limit merge / push‑down

Aggregation merge

Constant propagation and folding

Outer‑join to inner‑join conversion

Common sub‑expression reuse

Subquery rewrite

Lateral‑join simplification

Partition/bucket pruning

Empty‑node and empty‑set pruning (UNION/INTERSECT/EXCEPT)

Intersect reorder

Count‑distinct aggregation rewrite

Plan Fragment Generation : the chosen physical plan is mapped one‑to‑one into BE‑executable plan fragments.

CBO Transform Details

After logical rewrite, the optimizer applies additional transformations:

Multi‑stage aggregation (e.g., count, sum → two‑stage; count‑distinct → three‑ or four‑stage)

Join side adjustment: the smaller table is built as the hash side; the optimizer may swap join sides or convert LEFT JOIN to RIGHT JOIN based on cost.

Join reordering: for ≤5 tables, exhaustive exchange/associativity search; for >5 tables, greedy + dynamic programming.

Distributed join strategy selection (broadcast, shuffle, single‑side shuffle, colocate, replicated) using cost estimation and property enforcement.

Push‑down aggregate to join.

Materialized view selection and rewrite.

Statistics and Cost Estimation

Accurate table‑level and column‑level statistics (collected automatically or manually, full or sampled) feed the cost model. Cost estimation considers CPU, memory, network, and I/O with different weights per operator.

Execution Scheduling

FE creates execution instances for each fragment, assigns fragments to BE nodes, selects healthy tablet replicas for scans, and sends fragment parameters to BE via Thrift. Scheduling follows an “All‑At‑Once” top‑down traversal of the fragment tree.

MPP Distributed Execution

StarRocks splits a query into many fragment instances that run on separate BE nodes, achieving near‑linear scalability as the cluster grows.

Pipeline Parallelism

Within a fragment, pipelines are created. Each pipeline consists of a source operator, zero or more transform operators, and a sink operator. Multiple pipeline drivers run in parallel, and a user‑space coroutine scheduler minimizes thread overhead.

Vectorized Execution

Operators and expressions are rewritten to process data in columnar batches, reducing virtual‑function calls and improving CPU cache usage. The vectorized engine delivers 5‑10× performance gains over row‑wise execution.

Summary

StarRocks transforms a SQL query into an optimal distributed physical plan using a powerful CBO optimizer, schedules plan fragments to appropriate BE nodes, and executes them with MPP, pipeline, and vectorized engines, achieving high performance that scales with both cluster size and CPU cores.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

StarRocks distributed database MPP Vectorized Execution CBO optimizer SQL query processing

Written by

StarRocks

StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.