Doris Join Optimization, Runtime Filters, and Reorder Techniques
This article explains Doris's data partitioning, the four join methods (Broadcast, Shuffle, Bucket Shuffle, and Colocate), runtime filter mechanisms and types, join reorder strategies, and practical tuning recommendations to improve query performance in the Doris MPP database.
1. Doris Data Partitioning
Before discussing Doris's various join methods and optimization principles, it is useful to review Doris's two‑level data partitioning and tablet replication mechanism.
Doris supports a two‑layer partitioning scheme: the first layer is Range Partition, and the second layer is Hash Bucket (Tablet). Tables are horizontally split into multiple tablets (also called buckets) according to partition‑and‑bucket rules, and each tablet is stored on different BE nodes with multiple replicas (three by default).
Tablets are independent physical storage units; they contain disjoint data and are the smallest unit for data movement and replication.
The relationship among Table, Partition, and Bucket (Tablet) can be described as follows:
Assume a table is range‑partitioned by a date column, producing N partitions.
Each partition is further hashed into M buckets (tablets).
Logically, the same bucket index spans multiple partitions (e.g., bucket 1 maps to Tablet 11 in partition 1, Tablet 21 in partition 2, …, Tablet N1 in partition N), while each individual tablet holds data from only one partition.
Partition
Logical concept: partitions split data into distinct intervals, enabling partition pruning during query execution to reduce scanned data and improve performance.
Bucket
Physical concept: Doris typically uses a hash algorithm for bucketing. Within the same partition, rows with the same bucket key hash are placed into the same tablet, which is stored with multiple replicas and serves as the smallest unit for data balancing and recovery.
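As a concrete sketch of the two‑level scheme (table, column, and partition names here are hypothetical, not from the original): the table is range‑partitioned by a date column and each partition is hash‑bucketed on a key column, with three replicas per tablet.

```sql
-- Hypothetical example: range partition by order_date, hash bucket by user_id.
CREATE TABLE example_orders (
    order_date DATE,
    user_id BIGINT,
    amount DECIMAL(10, 2)
)
DUPLICATE KEY(order_date, user_id)
PARTITION BY RANGE(order_date) (
    PARTITION p202401 VALUES LESS THAN ('2024-02-01'),
    PARTITION p202402 VALUES LESS THAN ('2024-03-01')
)
DISTRIBUTED BY HASH(user_id) BUCKETS 16
PROPERTIES ("replication_num" = "3");
```

Each of the two partitions is split into 16 tablets, so the table has 32 tablets in total, each stored in three replicas across BE nodes.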
2. Join Methods
2.1 Overview
As a distributed MPP database, Doris performs a data shuffle during joins to ensure correct results.
Doris supports four join types: Broadcast Join, Shuffle Join, Bucket Shuffle Join, and Colocate Join. Their data‑distribution requirements grow stricter from Broadcast to Colocate, while network overhead decreases in the same order.
The planner prefers the order: Colocate Join → Bucket Shuffle Join → Broadcast Join → Shuffle Join. If Colocate or Bucket Shuffle cannot be used, Doris falls back to Broadcast Join (large table joins small table); if the small table is still too large, it switches to Shuffle Join.
Users can also force a specific join type with a hint, e.g.:
select k1 from t1 join [BUCKET] t2 on t1.k1 = t2.k2 group by t2.k2;
2.2 Broadcast Join
SELECT * FROM A, B WHERE A.column = B.column;
The entire B table is broadcast to every node that holds A's data, so A never moves. This is suitable when B is small: network cost is N × B (B is sent to N nodes), and each node holds a full copy of B in memory.
2.3 Shuffle Join
SELECT * FROM A, B WHERE A.column = B.column;
Rows of both tables are hashed on the join columns and redistributed so that matching rows land on the same node. Network cost equals the size of A plus B; memory cost equals the size of B.
2.4 Bucket Shuffle Join
SELECT * FROM A, B WHERE A.distributekey = B.anycolumn;
When the join condition covers A's distribution (bucketing) key, Doris shuffles only B according to A's existing data distribution. B is transferred only once globally while A stays local, drastically reducing network traffic compared with Broadcast and Shuffle.
2.5 Colocate Join
SELECT * FROM A, B WHERE A.colocatecolumn = B.colocatecolumn;
If A and B belong to the same colocate group, their data are already co‑located on the same nodes, allowing a purely local join with zero network cost.
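Colocation is declared when the tables are created, via the `colocate_with` table property. A minimal sketch (table and group names are hypothetical; both tables must use bucketing columns of the same types and the same bucket count to join locally):

```sql
-- Hypothetical tables placed in the same colocate group "group1".
CREATE TABLE t_left (
    k1 BIGINT,
    v1 INT
)
DISTRIBUTED BY HASH(k1) BUCKETS 8
PROPERTIES ("colocate_with" = "group1");

CREATE TABLE t_right (
    k1 BIGINT,
    v2 INT
)
DISTRIBUTED BY HASH(k1) BUCKETS 8
PROPERTIES ("colocate_with" = "group1");

-- A join on the bucketing column can then run entirely locally, no shuffle:
SELECT * FROM t_left JOIN t_right ON t_left.k1 = t_right.k1;
```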
2.6 Comparison of the Four Shuffle Strategies
Summarizing the sections above by network cost (with N nodes holding A's data): Broadcast transfers roughly N × B, Shuffle transfers A + B, Bucket Shuffle transfers B once, and Colocate transfers nothing. Distribution requirements tighten in the same order: Broadcast and Shuffle work for any equi‑join, Bucket Shuffle requires the join condition to cover A's bucketing key, and Colocate requires both tables to be in the same colocate group.
3. Runtime Filter Optimization
3.1 Overview
Beyond traditional indexes, Doris introduces a dynamic filtering mechanism called Runtime Filter. During a multi‑table join, the right table (BuildTable) builds a hash table while simultaneously generating a filter structure (e.g., Min/Max, IN) that is pushed down to the left table (ProbeTable) to prune data early.
This dramatically reduces the amount of data that the probe side needs to read, transfer, and compare, especially in large‑table joins.
3.2 Runtime Filter Join Principle
While constructing the hash table on the right side, Doris also creates a filter based on the hash table data and pushes it to the scan nodes of the left side, enabling runtime data filtering.
3.3 Runtime Filter Types
3.3.1 IN
Pushes a hash set to the scan node.
Pros: obvious and fast filtering.
Cons: only works with Broadcast Join; disabled when the right table exceeds a configured size (default 1024 rows).
3.3.2 Bloom Filter
Builds a Bloom filter from the hash table data and pushes it down.
Pros: generic and effective for many scenarios.
Cons: more complex configuration and higher computational cost.
3.3.3 MinMax
Derives a range (min/max) from the right table and pushes it to the scan node.
Pros: low overhead.
Cons: works well for numeric columns but offers little benefit for non‑numeric columns.
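Runtime filter behavior can be tuned per session. The variable names below exist in recent Doris releases, but defaults vary by version, so treat this as a sketch rather than a canonical configuration:

```sql
-- Choose which runtime filter types to generate (comma-separated).
SET runtime_filter_type = 'IN,BLOOM_FILTER,MIN_MAX';

-- Cap the build-side row count for IN filters;
-- beyond this limit the IN filter is disabled.
SET runtime_filter_max_in_num = 1024;

-- How long (ms) the probe-side scan waits for a filter
-- before proceeding without it.
SET runtime_filter_wait_time_ms = 1000;
```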
4. Join Reorder
4.1 Overview
When multiple tables are joined, the execution order significantly impacts performance. Join Reorder aims to place high‑selectivity joins early to reduce intermediate result sizes.
4.2 Algorithm
Typical rule‑based algorithm used by Doris:
Prefer joining large tables with small tables to keep intermediate results small.
Push joins with filter conditions forward.
Prioritize Hash Join over Nested Loop Join.
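The effect of these rules can also be achieved manually by writing joins in the desired order (table and column names below are hypothetical): joining the filtered small dimension table first keeps the intermediate result small before it reaches the large fact table.

```sql
-- Hypothetical three-way join: apply the selective filter on the small
-- dimension first, then join outward to progressively larger tables.
SELECT f.order_id, d.region
FROM dim_region d
JOIN dim_user u ON u.region_id = d.region_id
JOIN fact_orders f ON f.user_id = u.user_id
WHERE d.region = 'EU';
```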
5. Join Tuning Recommendations
5.1 General Advice
Use identical simple data types for join columns; avoid CAST operations.
Prefer key columns for joins to leverage Doris's delayed materialization.
For large‑table joins, enable Colocate to eliminate shuffle overhead.
Apply appropriate Runtime Filters (IN, Bloom, MinMax) based on filter selectivity.
Follow the “big‑table on the left, small‑table on the right” principle; Hash Join outperforms NLJ.
When necessary, rewrite SQL or add hints to adjust join order.
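A minimal illustration of the type‑matching advice above (table and column names hypothetical):

```sql
-- Bad: joining BIGINT to VARCHAR forces an implicit CAST on every row,
-- which also blocks some runtime filter and index optimizations.
-- SELECT * FROM t1 JOIN t2 ON t1.id_bigint = t2.id_varchar;

-- Good: both join columns declared with the same simple type, no CAST needed.
SELECT * FROM t1 JOIN t2 ON t1.id_bigint = t2.id_bigint;
```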
5.2 Practical Steps
1. Use Doris's query Profile to locate bottlenecks.
2. Understand Doris's join mechanisms to diagnose slowness.
3. Adjust session variables to influence join behavior.
4. Verify the effect by examining the Query Plan.
These four steps constitute a standard join‑optimization workflow; if they do not resolve the issue, consider deeper data redistribution or manual query rewriting.
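The workflow above maps to a few concrete statements. This is a sketch; `enable_profile` is a session variable in recent Doris versions, and the parallelism variable shown is one example of a join‑affecting setting, not the only one:

```sql
-- Step 1: turn on profiling so a query Profile is collected for analysis.
SET enable_profile = true;

-- Step 3: adjust session variables that influence join execution,
-- e.g. the pipeline parallelism per task.
SET parallel_pipeline_task_num = 8;

-- Step 4: inspect the chosen join types and join order in the plan.
EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.k1 = t2.k1;
```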
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
