Big Data 12 min read

Kuaishou's HDFS Architecture, Scale, Challenges, and Practices

This article presents an in‑depth technical overview of Kuaishou's massive HDFS deployment, detailing its architecture, petabyte‑scale data and thousands‑of‑node clusters, the key scalability challenges faced, and the custom solutions—including FixedOrder, RBF balancer, observer read, slow‑node mitigation, and tiered protection—implemented to keep the system performant and reliable.

DataFunTalk

Mar 27, 2021

Kuaishou's HDFS Architecture, Scale, Challenges, and Practices

Kuaishou operates the largest internal distributed file system based on Hadoop Distributed File System (HDFS), which has grown alongside its rapidly expanding business. The system now spans tens of thousands of servers, stores several exabytes of data, and handles hundreds of petabytes of daily throughput.

HDFS Architecture Overview

HDFS follows a classic master‑slave design with a NameNode (metadata service) and multiple DataNodes (block storage). The NameNode stores directory trees, file block mappings, and block locations, while DataNodes hold the actual data blocks.

Scale at Kuaishou

Thousands of servers (tens of thousands in total)

Multiple exabytes of stored data and hundreds of petabytes of daily throughput

Hundreds of NameService groups and millions of QPS

Key Challenges and Practices

Master‑node scalability : Kuaishou introduced a FixedOrder mount point and an RBF‑Balancer mechanism to quickly add new NameService instances and distribute load without data migration.

Single‑NameService performance bottleneck : A custom observer‑read architecture built on the latest RBF framework provides transparent read‑write separation, dynamic routing, and supports standby/active/observer nodes for read requests.

Slow‑node issues : Implemented pre‑emptive avoidance (ranking DataNodes by load) and mid‑operation circuit‑breakers, along with DN‑side optimizations such as directory hierarchy flattening, finer‑grained locking, DU operation improvements, and load‑aware disk selection.

Tiered protection : Introduced priority queues in NameNode RPC handling and per‑disk throttlers in DataNodes, allowing high‑priority tasks to pre‑empt resources when the cluster is saturated.

Final Architecture

The production architecture consists of three layers: a stateless routing layer (Routers sharing mount‑point info via ZooKeeper), a metadata layer (multiple independent NameService groups with Active, Standby, and Observer NameNodes), and a data layer (large pools of DataNodes reporting heartbeats and block reports).

Conclusion

Over three years, Kuaishou's HDFS evolved from a few hundred nodes handling petabyte‑scale data to a massive cluster of tens of thousands of nodes supporting exabyte‑scale storage. The team continuously refines the system to meet explosive data growth, sharing many bug fixes and performance improvements with the open‑source community.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Data Engineering Performance Optimization Big Data scalability HDFS Kuaishou

Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.