Big Data Technology & Architecture
Apr 20, 2020 · Big Data
How Spark SQL Chooses Join Strategies: Broadcast, Shuffle Hash, and Sort Merge
The article explains the rules Spark SQL's Catalyst optimizer uses to select among broadcast hash join, shuffle hash join, and sort‑merge join, covering build‑side determination, size thresholds, broadcast hints, local hash‑map construction, and fallback strategies for non‑equi joins.
Big Data · Broadcast Join · Shuffle Hash Join
10 min read
