Tagged articles
3 articles
dbaplus Community
Feb 8, 2023 · Big Data

How Bilibili Scaled Offline Processing Across Multiple Data Centers

This article details Bilibili's multi‑datacenter offline architecture, explaining the capacity challenges, the chosen scale‑out design, and the implementation of job placement, data replication, routing, versioning, throttling, and traffic analysis to efficiently handle massive batch workloads across geographically distributed clusters.

HDFS · bandwidth optimization · data replication
26 min read
ITPUB
Jul 13, 2022 · Big Data

How Bilibili Scaled Offline Processing Across Multiple Data Centers

This article details Bilibili's multi‑datacenter solution for offline big‑data workloads, covering the challenges of capacity limits, the design of a unit‑based architecture, job placement, data replication, routing, versioning, bandwidth throttling, traffic analysis, and future directions.

HDFS · bandwidth optimization · job placement
29 min read
Bilibili Tech
Jul 5, 2022 · Big Data

Multi‑Datacenter Architecture for Offline Big Data Processing at Bilibili

To overcome rapid data growth and on‑premises capacity limits, Bilibili adopted a scale‑out, unit‑based multi‑datacenter architecture that isolates failures. The design places jobs intelligently, replicates data through an enhanced DistCp service, routes reads with an IP‑aware HDFS router, and throttles cross‑site traffic, keeping offline processing of hundreds of petabytes stable while preserving throughput.

HDFS · YARN · bandwidth optimization
28 min read