Tagged articles

Remote Shuffle Service

8 articles · Page 1 of 1

Dec 10, 2025 · Big Data

Vivo’s 800‑Day Journey Optimizing Celeborn Remote Shuffle Service at PB Scale

This technical report details how Vivo’s big‑data platform adopted Celeborn as its remote shuffle service, evaluated alternatives, tuned hardware and software configurations, implemented performance and stability enhancements, and outlines future operational and community‑driven improvements for handling petabyte‑scale shuffle workloads.

Big DataKubernetesRemote Shuffle Service

0 likes · 20 min read

Vivo’s 800‑Day Journey Optimizing Celeborn Remote Shuffle Service at PB Scale

Bilibili Tech

Jan 3, 2025 · Big Data

Evolution and Production Practices of Apache Celeborn Remote Shuffle Service at Bilibili

Bilibili replaced Spark’s unstable External Shuffle Service with a push‑based approach, then deployed Apache Celeborn’s remote shuffle on Kubernetes using HA masters, tiered workers, extensive monitoring, history‑based routing, chaos testing, and seamless Spark, Flink, and MapReduce integration, while planning self‑healing, elastic scaling, and priority‑aware I/O enhancements.

Apache CelebornBig DataFlink

0 likes · 28 min read

Evolution and Production Practices of Apache Celeborn Remote Shuffle Service at Bilibili

360 Smart Cloud

Jul 9, 2024 · Big Data

Understanding Shuffle in Spark: From Native Shuffle to External and Remote Shuffle Services (Uniffle)

This article examines the critical role of shuffle in big‑data processing, compares Spark's native shuffle with the External Shuffle Service (ESS) and Remote Shuffle Service (RSS) solutions, introduces Uniffle's architecture and configuration, and shares practical deployment experiences and performance results within a 360 internal environment.

Big DataExternal Shuffle ServiceRemote Shuffle Service

0 likes · 15 min read

Understanding Shuffle in Spark: From Native Shuffle to External and Remote Shuffle Services (Uniffle)

Huolala Tech

Mar 7, 2024 · Big Data

Integrating Apache Tez with Remote Shuffle Service via Uniffle: HuoLala’s Experience

Facing exploding data volumes and rising cluster costs, HuoLala adopted Apache Tez’s Remote Shuffle Service built on Apache Uniffle, redesigning the Tez client to operate without source modifications, detailing architecture, implementation challenges, testing, stability measures, and future plans to enhance big‑data shuffle performance and cost efficiency.

Apache TezBig DataData Engineering

0 likes · 14 min read

Integrating Apache Tez with Remote Shuffle Service via Uniffle: HuoLala’s Experience

Zhongtong Tech

Dec 14, 2023 · Big Data

How Celeborn Transformed Spark Shuffle Performance at ZTO Express

Facing massive daily Spark shuffle volumes and unstable ETL performance, ZTO Express migrated from the community External Shuffle Service to Celeborn's Remote Shuffle Service, achieving higher disk I/O efficiency, better reliability, reduced network connections, and significant reductions in task failures and job latency.

Big DataRemote Shuffle ServiceShuffle

0 likes · 15 min read

How Celeborn Transformed Spark Shuffle Performance at ZTO Express

JD Tech

Feb 8, 2021 · Big Data

JD Remote Shuffle Service: Design, Implementation, and Performance Evaluation

This article presents JD's self‑developed Remote Shuffle Service for Spark, detailing its architecture, goals, implementation details, performance benchmarks, and real‑world production case studies that demonstrate its impact on shuffle efficiency and system stability in large‑scale data processing.

EtcdRemote Shuffle ServiceShuffle Optimization

0 likes · 17 min read

JD Remote Shuffle Service: Design, Implementation, and Performance Evaluation

JD Retail Technology

Jan 19, 2021 · Big Data

Design, Implementation, and Performance Evaluation of JD's Remote Shuffle Service for Spark

This article describes JD's research and production deployment of a self‑developed Remote Shuffle Service for Spark, covering its motivations, architectural design, cloud‑native features, monitoring, performance benchmarks against external shuffle solutions, and a real‑world promotion‑period case study that demonstrates improved stability and resource efficiency.

Cloud NativePerformanceRemote Shuffle Service

0 likes · 17 min read

Design, Implementation, and Performance Evaluation of JD's Remote Shuffle Service for Spark

Alibaba Cloud Developer

Sep 27, 2020 · Big Data

Why Spark on Kubernetes Needs a Remote Shuffle Service—and How It Boosts Performance

This article examines the challenges of running Spark on Kubernetes, introduces the Remote Shuffle Service architecture to overcome shuffle bottlenecks, details EMR on ACK integration, showcases performance gains with Terasort benchmarks, and outlines future cloud‑native big‑data strategies such as mixed‑cluster and serverless deployments.

EMRPerformanceRemote Shuffle Service

0 likes · 13 min read

Why Spark on Kubernetes Needs a Remote Shuffle Service—and How It Boosts Performance