Tagged articles
2 articles
Page 1 of 1
Big Data Technology & Architecture
Big Data Technology & Architecture
Dec 1, 2021 · Big Data

Understanding Spark Shuffle: Mechanisms, Evolution, and Optimization

This article provides a comprehensive overview of Spark's shuffle process, explaining its definition, internal mechanisms such as shuffle write and read, the evolution of shuffle managers, and practical optimization techniques including parameter tuning and broadcast variables, all aimed at improving performance in large‑scale data processing.

Big DataShuffleShuffle Reader
0 likes · 18 min read
Understanding Spark Shuffle: Mechanisms, Evolution, and Optimization
Sohu Tech Products
Sohu Tech Products
Feb 13, 2019 · Big Data

Evolution and Implementation Details of Spark Shuffle Mechanisms

This article examines the historical evolution of Spark's shuffle implementations—from early Hash‑Based Shuffle to modern SortShuffleWriter, BypassMergeSortShuffleWriter, and UnsafeShuffleWriter—explaining their design choices, selection criteria, and the corresponding shuffle reader architecture in a production‑grade Spark 2.1.1 environment.

Big DataShuffleShuffle Writer
0 likes · 13 min read
Evolution and Implementation Details of Spark Shuffle Mechanisms