Tagged articles
5 articles
Page 1 of 1
Big Data Technology Architecture
Big Data Technology Architecture
Apr 1, 2021 · Big Data

Spark Adaptive Execution: Dynamic Shuffle Partition, Broadcast Join, and Skew Handling

The article explains the limitations of static shuffle partitions, execution‑plan estimation, and data skew in Spark SQL, and describes how Spark Adaptive Execution can automatically adjust shuffle partition numbers, switch join strategies, and mitigate skew through configurable parameters and code examples.

Adaptive ExecutionBroadcast JoinData Skew
0 likes · 11 min read
Spark Adaptive Execution: Dynamic Shuffle Partition, Broadcast Join, and Skew Handling
dbaplus Community
dbaplus Community
Mar 23, 2020 · Big Data

How to Detect and Resolve Data Skew in Spark and Hadoop

This article explains what data skew is in distributed big‑data systems like Spark and Hadoop, why it hurts performance, how to spot it using the Web UI or key statistics, and presents eight practical mitigation techniques ranging from filtering and shuffle parallelism to custom partitioners and broadcast joins.

Broadcast JoinData SkewHadoop
0 likes · 19 min read
How to Detect and Resolve Data Skew in Spark and Hadoop