Bilibili Data Center Migration: Planning, Execution, and Lessons Learned
Bilibili’s 18‑month, multi‑regional data‑center migration moved tens of thousands of servers using a high‑frequency rolling strategy, combining meticulous planning, cross‑team coordination, automated rack placement and rigorous checklists to achieve significant cost savings, higher utilization, improved stability and greener operations.
The article presents a comprehensive case study of Bilibili's data‑center relocation project, which spanned 18 months across multiple regions in the Yangtze River Delta and involved moving tens of thousands of servers and networking devices to a new, more advanced facility.
Rapid business growth had left the original data center outdated, saturated, and costly. To support multi‑active deployment, improve resource utilization, and enhance operational stability, Bilibili adopted a high‑frequency, rolling migration strategy that balanced continuous service with efficient relocation.
The migration was the most complex and largest‑scale data‑center move in Bilibili's history, involving four source sites and covering nearly all online and offline services. Major challenges included large scale, long project duration, intricate scheduling, extensive coordination among dozens of teams (system, resource operations, infrastructure, procurement, business units, vendors), and strict efficiency requirements.
The main project‑management difficulties were coordinating external processes such as customs clearance and logistics, aligning internal business shutdown and startup windows, and ensuring seamless handover of equipment. Weekly migration batches averaged over 500 devices, with peak batches exceeding 1,700 units; each batch demanded meticulous technical planning and had to be handed over within three days of the move.
The overall migration plan comprised three major components: project evaluation, a detailed overall schedule, and pre‑migration preparations. Evaluation covered current infrastructure status, cost‑benefit analysis, and risk assessment, concluding that the move would yield significant cost savings, improved PUE, and higher server utilization.
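For readers unfamiliar with the metric, PUE (power usage effectiveness) is total facility energy divided by the energy delivered to IT equipment, so a value closer to 1.0 means less overhead for cooling and power distribution. The sketch below shows the arithmetic; the energy figures are hypothetical and are not taken from the article, which gives no PUE numbers.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_kwh(it_equipment_kwh: float, pue_value: float) -> float:
    """Energy spent on everything except IT equipment (cooling, power loss, lighting)."""
    return it_equipment_kwh * (pue_value - 1.0)


# Hypothetical comparison: the same 1000 kWh IT load in two facilities.
old_overhead = overhead_kwh(1000.0, 1.5)   # assumed PUE of an older site
new_overhead = overhead_kwh(1000.0, 1.25)  # assumed PUE of a newer site
print(old_overhead, new_overhead)
```

Under these assumed values, the same IT load carries half the non‑IT overhead in the newer facility, which is the kind of saving a cost‑benefit evaluation would weigh against migration cost.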
Pre‑migration steps included inventorying all equipment, selecting qualified moving vendors, designing the new data‑center layout (rack placement, power density, network port allocation), and establishing dedicated inter‑site links to meet bandwidth and latency requirements for both online and offline workloads.
During execution, the team implemented automated rack‑placement algorithms, standardized IP configuration, and automated workflows to reduce manual errors. Physical migration emphasized safety, logistics planning, labeling, and insurance for high‑value assets. Post‑migration, consistency checks verified BIOS/BMC settings, OS configurations, and business‑specific customizations.
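The article does not describe the rack‑placement algorithm itself. As a rough illustration of what automated placement can look like, the sketch below uses a first‑fit‑decreasing heuristic that respects each rack's free rack units and power budget. The `Server` and `Rack` types, their fields, and the choice of heuristic are all assumptions made for this example, not Bilibili's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    height_u: int   # rack units occupied
    power_w: int    # rated power draw in watts


@dataclass
class Rack:
    name: str
    free_u: int     # remaining rack units
    free_w: int     # remaining power budget in watts
    servers: list = field(default_factory=list)

    def fits(self, server: Server) -> bool:
        return server.height_u <= self.free_u and server.power_w <= self.free_w

    def place(self, server: Server) -> None:
        self.free_u -= server.height_u
        self.free_w -= server.power_w
        self.servers.append(server.name)


def place_servers(servers: list, racks: list) -> list:
    """First-fit decreasing: place the most power-hungry servers first.

    Returns the servers that could not be placed in any rack.
    """
    unplaced = []
    for server in sorted(servers, key=lambda s: s.power_w, reverse=True):
        for rack in racks:
            if rack.fits(server):
                rack.place(server)
                break
        else:
            # No rack had enough space and power for this server.
            unplaced.append(server)
    return unplaced
```

A real planner would likely add further constraints, such as network port availability and spreading replicas of one service across racks, but the shape is the same: sort, try each rack, and report what does not fit so a human can review it.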
A detailed checklist guided each phase, from demand confirmation, vendor selection, and plan approval through on‑site inspection, labeling, packaging, rack‑up, and power‑on testing to final acceptance. It ensured that no step was omitted and made handover smoother.
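One way to make such a checklist enforceable rather than advisory is to encode it so that a batch cannot be marked ready for handover until every phase has been completed. The sketch below uses the phase names listed above; the `MigrationChecklist` class and the strict one‑at‑a‑time ordering are assumptions for illustration, since the article does not say how the checklist was tracked.

```python
PHASES = [
    "demand confirmation",
    "vendor selection",
    "plan approval",
    "on-site inspection",
    "labeling",
    "packaging",
    "rack-up",
    "power-on testing",
    "final acceptance",
]


class MigrationChecklist:
    """Tracks one migration batch through an ordered list of phases."""

    def __init__(self, phases=PHASES):
        self._phases = list(phases)
        self._done = 0  # number of phases completed so far

    def next_phase(self):
        """Return the next phase to complete, or None when all are done."""
        if self._done < len(self._phases):
            return self._phases[self._done]
        return None

    def complete(self, phase: str) -> None:
        """Mark a phase complete; reject skipped or out-of-order phases."""
        expected = self.next_phase()
        if expected is None:
            raise ValueError("checklist is already complete")
        if phase != expected:
            raise ValueError(f"expected {expected!r}, got {phase!r}")
        self._done += 1

    def remaining(self) -> list:
        return self._phases[self._done:]

    def ready_for_handover(self) -> bool:
        return self._done == len(self._phases)
```

Rejecting out‑of‑order completion is a deliberate choice here: it turns an omitted step into an immediate error instead of a surprise at acceptance time.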
Looking ahead, the article notes that the new data center aligns with China’s “dual‑carbon” goals: its greener, energy‑efficient design lowers overall PUE and reduces carbon emissions. Operational metrics improved markedly: CPU utilization rose from ~25% to >35%, network architecture upgrades improved transmission efficiency, and systematic hardware refreshes reduced failure rates. The migration also enabled a thorough cleanup of legacy services and data.
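To put the utilization figure in perspective, the sketch below estimates how many servers a fixed workload would need after such a gain. It assumes the workload is unchanged and that capacity scales linearly with CPU utilization; both are simplifications for illustration, and the article itself reports no server‑count reduction.

```python
def server_ratio(util_before: float, util_after: float) -> float:
    """Servers needed after the change, as a fraction of servers needed before.

    Assumes a fixed workload and capacity proportional to CPU utilization.
    """
    if not (0 < util_before <= 1 and 0 < util_after <= 1):
        raise ValueError("utilization must be in (0, 1]")
    return util_before / util_after


ratio = server_ratio(0.25, 0.35)   # the ~25% and 35% figures quoted above
saving = 1.0 - ratio               # fraction of servers no longer needed
print(f"{ratio:.3f} of the original fleet, {saving:.1%} fewer servers")
```

Because the article gives the new utilization as greater than 35%, the roughly 29% reduction computed here would be a lower bound under these assumptions.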
Overall, the project demonstrates how systematic planning, cross‑departmental coordination, automation, and rigorous risk management can successfully execute large‑scale infrastructure migrations while delivering cost savings, performance gains, and environmental benefits.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.