Tagged articles
8 articles
vivo Internet Technology
Dec 10, 2025 · Big Data

Vivo’s 800‑Day Journey Optimizing Celeborn Remote Shuffle Service at PB Scale

This technical report details how Vivo’s big‑data platform adopted Celeborn as its remote shuffle service: evaluating alternatives, tuning hardware and software configurations, and implementing performance and stability enhancements. It also outlines planned operational and community‑driven improvements for handling petabyte‑scale shuffle workloads.

Big Data · Kubernetes · Remote Shuffle Service
20 min read
DataFunTalk
Feb 20, 2025 · Big Data

From Integrated Storage‑Compute to Decoupled Architecture: Practical Exploration of Kubernetes, Kyuubi, Celeborn, Blaze, and Hue in Big Data Platforms

This article analyzes the transition from a tightly coupled storage‑compute architecture to a decoupled model, detailing how Kubernetes, Kyuubi, Celeborn, Blaze, and Hue together solve resource inefficiencies, improve scalability, and boost query performance in modern big‑data environments.

Big Data · Blaze · Kubernetes
16 min read
dbaplus Community
Sep 4, 2024 · Big Data

How Ctrip Scaled Its Data Platform to Multi‑IDC Architecture with Spark 3, Kyuubi, and Celeborn

This article details how Ctrip’s data platform evolved from a single‑IDC design to a multi‑IDC, tiered storage and scheduling architecture, covering the challenges of rapid data growth, the migration to Spark 3 via Kyuubi, the introduction of Celeborn shuffle service, and the resulting performance and reliability gains.

Big Data · HDFS · Kyuubi
23 min read
DataFunTalk
Jun 22, 2024 · Big Data

Migrating Spark Shuffle Service from ESS to RSS (Celeborn) at Zhihu: Design, Implementation, and Benefits

This article details Zhihu's migration of massive Spark and MapReduce shuffle workloads from the External Shuffle Service (ESS) to a push‑based Remote Shuffle Service (RSS) powered by Celeborn, covering background problems, evaluation of open‑source implementations, deployment architecture, encountered issues, solutions, performance gains, and future plans.

Big Data · RSS · Shuffle
19 min read
DataFunTalk
Dec 31, 2023 · Big Data

Apache Celeborn (Incubating): Addressing Traditional Shuffle Limitations in Big Data Processing

Apache Celeborn (Incubating) is a remote shuffle service designed to overcome the inefficiencies, high storage demands, network overhead, and limited fault tolerance of traditional Spark shuffle implementations. It introduces push shuffle, partition splitting, columnar shuffle, and multi‑layer storage within an elastic, stable, and scalable architecture.

Apache Spark · Big Data · Performance Optimization
15 min read
Zhongtong Tech
Dec 14, 2023 · Big Data

How Celeborn Transformed Spark Shuffle Performance at ZTO Express

Facing massive daily Spark shuffle volumes and unstable ETL performance, ZTO Express migrated from the community External Shuffle Service to Celeborn's Remote Shuffle Service, achieving higher disk I/O efficiency, better reliability, reduced network connections, and significant reductions in task failures and job latency.

Big Data · Remote Shuffle Service · Shuffle
15 min read
DataFunSummit
Nov 25, 2023 · Big Data

Practical Experience with Apache Kyuubi and Celeborn on the DXY Big Data Platform

This article presents a comprehensive technical overview of how DXY's big data platform leverages Apache Kyuubi and Celeborn to unify Spark entry points, configure flexible task isolation, implement fine‑grained AuthZ, optimize small files and Z‑Order sorting, and accelerate large result set transmission with Arrow, while also discussing operational challenges and upcoming features.

Apache Kyuubi · Arrow · Big Data
17 min read
DataFunTalk
Aug 5, 2023 · Big Data

Apache Celeborn (Incubating): Design, Performance, Stability, and Elasticity of a Remote Shuffle Service

This article reviews the limitations of traditional Spark shuffle, introduces Apache Celeborn (Incubating) as a remote shuffle service, and details its design for performance, stability, and elasticity, including push shuffle, partition splitting, columnar shuffle, multi‑layer storage, congestion control, and real‑world evaluation.

Apache Spark · Big Data · Shuffle Service
19 min read