How ByteDance Scales EB-Level Data: Architecture, BP Model & Real-Time Insights

Over seven years, ByteDance built a data platform that now manages exabyte‑scale data and handles more than 100 million TPS at peak. This article covers its hybrid “middle‑platform + Business Partner (BP)” model, its OLAP engines (ClickHouse and the enterprise version, ByteHouse), its agile approach to governance, and the product suite that serves internal and external businesses.

Volcano Engine Developer Services

Jan 4, 2022

Faced with personalized, diverse data needs and with silos in both data and business lines, ByteDance’s data platform team spent seven years building a platform that now manages exabyte‑level data, handles more than 100 million TPS at peak, and runs tasks that require tens of thousands of cores across thousands of machines.

Q: How was the data platform built and how has it evolved?

Since 2014, the platform has progressed through several stages: the original Hive‑based reporting stage, a rapid in‑house product development phase replacing commercial tools, a productization and organizational formation phase with a data BP mechanism, a ToB service phase with the “0987” quantitative service standard, and ongoing architecture upgrades.

Q: What architectural challenges and pivots have you encountered?

The team initially chose Kylin for fast queries. As product needs grew, it moved to Spark and eventually to ClickHouse, and it now operates the largest ClickHouse deployment in China: more than 15,000 nodes in total, more than 600 PB of data, and more than 2,400 nodes in the largest single cluster. ByteHouse, the enterprise version, adds features such as self‑developed table engines and hot‑cold data separation.

Q: How does the platform handle massive scale and performance?

The platform processes millions of tasks per day, many with complex dependencies. A self‑developed distributed scheduler provides scheduling at second‑level granularity, task tagging, SLA‑based resource control, and automated optimization suggestions to meet latency and reliability requirements.
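To make the idea of dependency-aware, SLA-driven scheduling concrete, here is a minimal single-process sketch. It is not ByteDance's scheduler: the `Task` class, the `sla_minutes` field, and the task names are all hypothetical, and the technique shown is simply Kahn's topological sort with a priority queue that prefers the tightest SLA among runnable tasks.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    sla_minutes: int  # hypothetical SLA field: smaller means more urgent
    depends_on: list[str] = field(default_factory=list)


def schedule(tasks: list[Task]) -> list[str]:
    """Return a run order that respects dependencies and, among runnable
    tasks, picks the tightest SLA first (Kahn's algorithm + min-heap)."""
    by_name = {t.name: t for t in tasks}
    pending = {t.name: len(t.depends_on) for t in tasks}
    children: dict[str, list[str]] = {t.name: [] for t in tasks}
    for t in tasks:
        for dep in t.depends_on:
            children[dep].append(t.name)

    # Tasks with no unmet dependencies are runnable immediately.
    ready = [(t.sla_minutes, t.name) for t in tasks if pending[t.name] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        for child in children[name]:
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, (by_name[child].sla_minutes, child))

    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical task graph, for illustration only.
tasks = [
    Task("ingest_logs", sla_minutes=30),
    Task("ingest_orders", sla_minutes=10),
    Task("daily_report", sla_minutes=60, depends_on=["ingest_logs", "ingest_orders"]),
    Task("realtime_dashboard", sla_minutes=5, depends_on=["ingest_orders"]),
]
print(schedule(tasks))
```

A production scheduler would also have to distribute this state across machines, handle retries and resource quotas, and react to tasks finishing at runtime; the sketch only shows how dependency order and SLA priority can be combined in one ordering.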

Q: How do you address data governance at this scale?

Data governance is tackled in two phases: establishing a governance committee for core business, then extending governance capabilities to empower innovative business units, using reusable architectures, processes, and products to lower the governance barrier.

Q: What does “agility” mean for your data platform?

Organizational agility: The data BP model aligns platform teams closely with business needs.

Consumption agility: Real‑time engines return query results within seconds.

Decision agility: Extensive A/B testing enables rapid, data‑driven decisions.

Service agility: New business lines can be onboarded within a week.

Implementation agility: Small teams can quickly deploy solutions with minimal disruption.

Iteration agility: Continuous product iteration keeps pace with fast‑changing business demands.
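As an illustration of the kind of calculation behind "decision agility", the sketch below runs a two-proportion z-test on the conversion counts of a control group and a variant. The article does not describe which statistical methods ByteDance's A/B platform uses, so treat this as a generic textbook example; the function name is hypothetical.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z statistic, two-sided p-value) for the difference in
    conversion rate between variant B and control A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 10,000 users per group, 1,000 vs 1,100 conversions.
z, p = two_proportion_z_test(1000, 10_000, 1100, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the observed difference is unlikely under equal conversion rates. Real experimentation platforms typically add safeguards this sketch omits, such as corrections for multiple metrics and for repeatedly peeking at running experiments.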

Q: Can you summarize the current architecture?

The platform consists of two layers: a capability layer (including the LAS lake‑warehouse engine, the ByteHouse OLAP engine, the DataLeap data development and governance platform, and various BI and analytics products) and a solution layer (the data BP model that delivers tailored data solutions to internal and external customers).

Q: What extreme challenges have you faced?

During the 2021 Douyin Spring Festival Gala, traffic peaked at several times the normal level, and the team had a 27‑day development window to deliver real‑time metrics for internal decision‑making and external reporting. It relied on a unified traffic platform, multi‑datacenter disaster recovery, Flink‑based real‑time computation, and ByteHouse storage to meet stringent latency, stability, and accuracy requirements.
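Real-time metrics of this kind are usually built on windowed aggregation. The sketch below shows the core idea of a tumbling (fixed, non-overlapping) window count in plain Python. It deliberately does not use the Flink API, and the event shape and function name are assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, key) for fixed, non-overlapping windows.

    `events` is an iterable of (epoch_seconds, key) pairs.
    """
    counts = defaultdict(int)
    for ts, key in events:
        # Align the timestamp to the start of the window that contains it.
        window_start = ts - (ts % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical events: (timestamp in seconds, event type).
events = [(100, "play"), (104, "play"), (109, "like"), (110, "play"), (119, "like")]
result = tumbling_window_counts(events, window_seconds=10)
for (window_start, key), count in sorted(result.items()):
    print(window_start, key, count)
```

A streaming engine adds what this batch-style loop lacks: incremental state, handling of late and out-of-order events, and fault-tolerant checkpoints, which is where the latency, stability, and accuracy requirements mentioned above become hard.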

Q: How do you approach big‑data technology development?

The evolution follows three stages: (1) using open‑source components as they are, (2) extending them with custom development, and (3) fully self‑developing when necessary. ByteDance now operates a hybrid “2+3” model that combines the second and third stages, and contributes back to the open‑source community where possible.

Q: What future directions should big‑data developers focus on?

Key areas include real‑time data warehousing, intelligent materialized views with machine‑learning‑enhanced query optimization, and privacy‑preserving technologies such as sensitive data discovery, multi‑party computation, data localization, and permission optimization. Strong computer‑science fundamentals and hands‑on experience with large‑scale data environments are essential.
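Of the privacy topics listed, sensitive data discovery is the easiest to show in a few lines. The sketch below samples a column's values and labels the column when most values match a known pattern. The patterns, the threshold, and the function name are hypothetical; the article does not describe how ByteDance implements discovery, and production systems generally combine many more rules with validation and model-based classification.

```python
import re

# Hypothetical patterns for illustration; real rule sets are much larger.
PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"),
    "ipv4": re.compile(r"^(\d{1,3}\.){3}\d{1,3}$"),
}


def classify_column(values, threshold=0.8):
    """Return the label of the first pattern that matches at least
    `threshold` of the non-empty sample values, or None."""
    sample = [v for v in values if v]
    if not sample:
        return None
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in sample if pattern.match(v))
        if hits / len(sample) >= threshold:
            return label
    return None


print(classify_column(["a@example.com", "b.c@test.org", ""]))
print(classify_column(["hello", "world"]))
```

Using a match ratio rather than requiring every value to match lets the check tolerate dirty data, at the cost of occasional false positives that a human or a second-stage classifier would need to review.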
