
Interview on Kuaishou's Trillion‑Scale Big Data Architecture Evolution and Practices

The interview with Kuaishou senior architect Zhao Jianbo details the three‑phase evolution of its trillion‑scale big data platform, covering foundational Hadoop services, real‑time and OLAP extensions, deep customizations, Spring Festival Gala challenges, scheduling innovations, Hadoop usage, and the relationship between big data and cloud architectures.

Java Architect Essentials

Sep 21, 2021


Kuaishou's big data architecture team, established in 2017, has built a trillion‑scale data platform; senior architect Zhao Jianbo discusses its evolution, motivations, and technology selections.

The platform progressed through three stages: (1) foundational services built on CDH Hadoop 2.6, Kafka, HDFS, and HBase; (2) extension to real‑time processing with Flink and OLAP analysis with Druid and Superset, plus an object‑storage service (blobstore) using HDFS/HBase; (3) deep customization of open‑source components, integrating a self‑developed bitmap DB (BitBase), advanced Kafka features, and a new YARN scheduler called KwaiScheduler.

In the deep‑customization phase, the storage services gained fast failure recovery, performance tuning, QPS limiting, and hierarchical protection. Kafka was enhanced with seamless scaling, isolation, and a novel Kafka‑on‑HDFS design that separates storage from compute. KwaiScheduler achieved 20‑30× higher scheduling throughput than Apache Hadoop 3.0 and supports plug‑in scheduling policies.

For the 2020 Spring Festival Gala red‑packet activity, Kuaishou implemented overload protection, rate limiting, and flexible availability for HDFS, HBase, and Kafka, built redundant physical links for real‑time pipelines, and introduced job‑priority and YARN‑based resource guarantees to ensure stable offline data production.
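The interview does not describe how the rate limiting was implemented. As an illustration only, a token bucket is one common way to cap QPS while still allowing short bursts. The class name `TokenBucket` and its parameters are hypothetical and are not taken from Kuaishou's code; timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical QPS limiter: refills `rate` tokens per second up to `capacity`."""
    rate: float         # tokens added per second (the sustained QPS limit)
    capacity: float     # maximum burst size
    tokens: float = 0.0
    last: float = 0.0   # timestamp of the last refill, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # over the limit: the caller rejects or sheds the request


# Limit of 2 requests per second with a burst of 2, starting with a full bucket.
bucket = TokenBucket(rate=2.0, capacity=2.0, tokens=2.0)
decisions = [bucket.allow(now=t) for t in (0.0, 0.0, 0.0, 0.5, 0.5)]
print(decisions)
```

In this sketch the third request at t=0.0 is rejected because the burst is used up, and half a second later exactly one more request is admitted. Rejecting early in this way is what keeps an overloaded service responsive for the traffic it does accept.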

The scheduling system combines queue isolation with priorities, per‑business‑line quotas, label‑based physical isolation, fair sharing among ad hoc users, and resource preemption mechanisms such as App Slot, providing fine‑grained control over diverse workloads.
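The fair sharing mentioned above can be illustrated with max-min fairness, a textbook allocation rule. This is a minimal sketch under the assumption that every user has a single scalar demand; it is not KwaiScheduler's algorithm, and the function name `max_min_fair_share` is hypothetical.

```python
def max_min_fair_share(capacity, demands):
    """Split `capacity` so nobody gets more than they ask for, and capacity
    left over by small users is shared equally among the remaining users."""
    allocation = {user: 0.0 for user in demands}
    remaining = float(capacity)
    unsatisfied = {u: float(d) for u, d in demands.items() if d > 0}
    while unsatisfied and remaining > 1e-9:
        share = remaining / len(unsatisfied)
        # Users whose demand fits within the equal share are served in full.
        done = {u: need for u, need in unsatisfied.items() if need <= share}
        if not done:
            # Everyone wants more than the equal share: split evenly and stop.
            for u in unsatisfied:
                allocation[u] += share
            remaining = 0.0
            break
        for u, need in done.items():
            allocation[u] += need
            remaining -= need
            del unsatisfied[u]
    return allocation


# Three ad hoc users competing for 100 resource units.
result = max_min_fair_share(100, {"a": 20, "b": 50, "c": 70})
print(result)
```

Here user `a` asks for less than an equal third and is fully served, and the 80 units left are split evenly between `b` and `c`. A production scheduler would layer priorities, quotas, and preemption on top of a rule like this.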

Hadoop remains a core backbone: HDFS stores petabyte‑scale data with erasure coding, YARN manages resources for MR, Spark, and Flink jobs, and the custom KwaiScheduler dramatically improves scheduling performance; Spark is gradually replacing MR where suitable.
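To see why erasure coding matters at petabyte scale, the short calculation below compares raw storage consumed per byte of user data. The interview does not say which coding scheme Kuaishou uses; Reed-Solomon with 6 data units and 3 parity units is assumed here purely as an example, and the function names are hypothetical.

```python
def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per byte of user data under N-way replication."""
    return float(replicas)


def reed_solomon_overhead(data_units: int, parity_units: int) -> float:
    """Raw bytes stored per byte of user data under RS(data, parity) coding."""
    return (data_units + parity_units) / data_units


# Assumed example schemes, not confirmed by the source.
three_way = replication_overhead(3)      # classic HDFS default replication
rs_6_3 = reed_solomon_overhead(6, 3)     # tolerates the loss of any 3 units
saving = 1 - rs_6_3 / three_way          # fraction of raw storage saved
print(three_way, rs_6_3, saving)
```

Under these assumptions the coded layout needs 1.5 bytes of raw storage per byte of data instead of 3, so half of the raw capacity is freed, at the cost of extra computation and network traffic when reconstructing lost blocks.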

Regarding Hadoop's future, the interviewee argues that while newer engines excel in real‑time and performance, HDFS and YARN continue to dominate offline analytics, and integration with emerging platforms like Kubernetes will likely evolve rather than replace them.

Finally, big data architecture is positioned as the PaaS layer within cloud architecture; large enterprises still build and heavily customize their own data platforms for performance, security, and specialized needs, leading to continued growth and cloud‑native integration.
