Tag

Shuffle Service

1 views collected around this technical thread.

DataFunTalk
DataFunTalk
Dec 31, 2023 · Big Data

Apache Celeborn (Incubating): Addressing Traditional Shuffle Limitations in Big Data Processing

Apache Celeborn (Incubating) is a remote shuffle service designed to overcome the inefficiencies, high storage demands, network overhead, and limited fault tolerance of traditional Spark shuffle implementations by introducing push‑shuffle, partition splitting, columnar shuffle, multi‑layer storage, and elastic, stable, and scalable architectures.

Apache SparkCelebornPerformance Optimization
0 likes · 15 min read
Apache Celeborn (Incubating): Addressing Traditional Shuffle Limitations in Big Data Processing
DataFunTalk
DataFunTalk
Dec 2, 2023 · Big Data

Apache Celeborn: Overview, Architecture, Community, and Future Roadmap

This article introduces Apache Celeborn, explains the challenges of intermediate data in large‑scale compute engines, details its core architecture and design—including master, worker, lifecycle manager and shuffle client—covers its community history, version releases, performance comparisons with Spark ESS, real‑world deployment scenarios, and outlines future development plans.

Apache CelebornFlinkShuffle Service
0 likes · 14 min read
Apache Celeborn: Overview, Architecture, Community, and Future Roadmap
DataFunTalk
DataFunTalk
Aug 5, 2023 · Big Data

Apache Celeborn (Incubating): Design, Performance, Stability, and Elasticity of a Remote Shuffle Service

This article reviews the limitations of traditional Spark shuffle, introduces Apache Celeborn (Incubating) as a remote shuffle service, and details its design for performance, stability, and elasticity, including push shuffle, partition splitting, columnar shuffle, multi‑layer storage, congestion control, and real‑world evaluation.

Apache SparkCelebornShuffle Service
0 likes · 19 min read
Apache Celeborn (Incubating): Design, Performance, Stability, and Elasticity of a Remote Shuffle Service
DataFunTalk
DataFunTalk
May 3, 2023 · Big Data

Shuttle2.0: Enhancing Spark and Flink Shuffle with Distributed Sorting and Adaptive Broadcast

Shuttle2.0 extends OPPO's open‑source high‑availability Spark Remote Shuffle Service to support Flink, introduces a unified stream‑batch data model, pipelines shuffle with distributed sorting, and provides an Adaptive BroadcastJoin solution that dramatically improves performance and stability for large‑scale big‑data workloads.

Adaptive BroadcastDistributed SortingFlink
0 likes · 11 min read
Shuttle2.0: Enhancing Spark and Flink Shuffle with Distributed Sorting and Adaptive Broadcast
ByteDance Cloud Native
ByteDance Cloud Native
Sep 2, 2022 · Big Data

How ByteDance’s Cloud Shuffle Service Boosts Big Data Job Stability and Performance

ByteDance’s Cloud Shuffle Service (CSS) replaces the traditional Pull‑Based Sort Shuffle in Spark, FlinkBatch and MapReduce with a Push‑Based remote shuffle that improves stability, performance and elasticity, supports compute‑storage separation, and delivers significant speedups in large‑scale TPC‑DS benchmarks.

Performance OptimizationRemote ShuffleShuffle Service
0 likes · 11 min read
How ByteDance’s Cloud Shuffle Service Boosts Big Data Job Stability and Performance