
How Kuaishou Cut Object Storage Costs by 50% with LRC Erasure Coding

Kuaishou halved its massive object storage costs by redesigning its architecture around HBase indexing, HDFS large-file storage, a MemoryCache layer, and a cross-IDC LRC erasure-coded warm tier that preserves disaster recovery while moving data dynamically from hot to warm to cold storage.


1. Kuaishou Object Storage Architecture

Kuaishou stores billions of video files, resulting in several exabytes of data and petabytes of daily growth, which drives extremely high storage costs. The architecture uses HBase for object indexes, stores merged object data as large files on HDFS, and employs a MemoryCache layer for hot objects. Cross‑AZ disaster recovery is achieved with a double‑AZ four‑replica scheme and AZ‑affinity read routing.
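The read path this architecture implies can be sketched as follows. `IndexEntry`, `ObjectReader`, and the in-memory stand-ins for HBase, HDFS, and MemoryCache are illustrative names for this sketch, not Kuaishou's actual classes:

```python
from dataclasses import dataclass

@dataclass
class IndexEntry:
    """Index record (kept in HBase in the real system): where an object
    lives inside a merged large file on HDFS."""
    hdfs_path: str
    offset: int
    length: int

class ObjectReader:
    """Illustrative read path: MemoryCache first, then the index lookup,
    then a ranged read from the merged HDFS file."""
    def __init__(self, index, cache, files):
        self.index = index   # stand-in for the HBase index table
        self.cache = cache   # stand-in for the MemoryCache layer
        self.files = files   # stand-in for HDFS: path -> file bytes

    def read(self, key):
        if key in self.cache:                  # hot object: served from cache
            return self.cache[key]
        entry = self.index[key]                # index lookup: key -> (file, offset, length)
        blob = self.files[entry.hdfs_path]     # in HDFS this is a ranged read
        data = blob[entry.offset:entry.offset + entry.length]
        self.cache[key] = data                 # warm the cache for later reads
        return data
```

Merging billions of small objects into large HDFS files keeps NameNode metadata pressure low; the HBase index carries the per-object (file, offset, length) mapping instead.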

2. Overall Solution Selection

To cut storage costs while preserving cross-IDC resilience, three approaches were considered: deleting unused objects, reducing replica redundancy, and moving data to low-cost storage media. The team chose to reduce redundancy with erasure coding (EC), combined with an archival warm storage layer, enabling a hot → warm → cold data lifecycle.

3. Cross‑IDC EC Design Phase

EC Algorithm Choice: LRC was selected for its balance of reconstruction cost and fault tolerance. Parameters were derived from RS(n+m) formulas, leading to the adoption of LRC(6+3+3) – six source blocks, three global parity blocks, and three local parity blocks – achieving redundancy ≤200% and supporting cross-IDC recovery.
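As a quick sanity check on the stated bound, the redundancy of an LRC(k+g+l) layout is simply total stripe blocks over source blocks. The helper below is illustrative, not Kuaishou's code:

```python
def lrc_redundancy(k, g, l):
    """Storage redundancy of LRC(k+g+l): total stripe blocks per source block.
    k = source blocks, g = global parity blocks, l = local parity blocks."""
    return (k + g + l) / k
```

For LRC(6+3+3) this gives 12/6 = 2.0, exactly the 200% ceiling, versus 400% for the double-AZ four-replica scheme it replaces.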

Block Layout: A continuous block layout was chosen to suit warm data, large HDFS files, high IOPS requirements, and the CDH HDFS version.

Cross-IDC Resilience: ZooKeeper, JournalNode, DataNode, Active NameNode, and Standby NameNode roles are distributed across IDC boundaries, and a Router + ObserverNN combination improves read throughput.

4. EC Warm Architecture Deployment Phase

Data EC Process: LRC encoding is scheduled on the hot cluster to avoid cross-IDC traffic, a unified cross-IDC bandwidth control module limits impact, and both source and parity blocks are written according to a fixed 12-block stripe placement strategy. Global and local parity are stored together with matching file prefixes.

Data Fixer Process: Reconstruction follows a "local first, then global, then local" approach, dividing the 12 blocks into seven layers. The goal is to complete repair within the first local round, ensuring that any single-block loss in an IDC can be locally recovered.
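The repair ordering can be sketched as below. The local-group membership, the assumption that one local parity rebuilds a single missing block in its group, and the assumption that global parity rebuilds up to three missing blocks stripe-wide are simplifications for illustration; the real fixer's seven-layer schedule is more involved:

```python
def plan_repair(missing, groups, g=3):
    """Choose a repair strategy for lost blocks: local-group repair first,
    global-parity repair for whatever remains.

    missing: set of lost block ids.
    groups:  list of sets; each set is the block ids covered by one local
             parity, assumed able to rebuild a single missing member.
    g:       number of global parity blocks (max stripe-wide losses).
    """
    plan = []
    remaining = set(missing)
    for grp in groups:                       # cheap local repairs first
        lost_here = remaining & grp
        if len(lost_here) == 1:              # one loss per group is locally fixable
            blk = lost_here.pop()
            plan.append(("local", blk))
            remaining.discard(blk)
    if remaining:                            # fall back to global parity
        if len(remaining) > g:
            raise ValueError("stripe unrecoverable: too many blocks lost")
        plan.extend(("global", blk) for blk in sorted(remaining))
    return plan
```

Local repair reads only the surviving blocks of one group, so the common case (a single block lost inside an IDC) never generates cross-IDC reconstruction traffic.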

Block Placement Strategy: NameNode supports logical UpgradeDomain concepts and an LRC BlockPlacement policy that groups the 12 blocks into four groups, each spread across different domains, TORs, racks, DN, and disks, guaranteeing that any single device failure results in only one missing block, which can be locally repaired.
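The placement invariant can be illustrated with the hypothetical sketch below: each block of a stripe lands on a distinct device, and the blocks within each group land in distinct domains. The domain layout and function are illustrative assumptions, not the actual NameNode BlockPlacement policy:

```python
def place_stripe(block_ids, devices_by_domain, group_size=3):
    """Place stripe blocks so each block lands on a distinct device and the
    blocks within each group land in distinct domains.

    block_ids:         the 12 block ids of one LRC(6+3+3) stripe.
    devices_by_domain: dict mapping domain name -> list of device names.
    """
    placement = {}
    used = set()
    groups = [block_ids[i:i + group_size]
              for i in range(0, len(block_ids), group_size)]
    for grp in groups:
        # pair each block in the group with a different domain
        for blk, dom in zip(grp, devices_by_domain):
            dev = next(d for d in devices_by_domain[dom] if d not in used)
            used.add(dev)
            placement[blk] = (dom, dev)
    return placement
```

Because no device ever holds two blocks of the same stripe, any single disk, DN, or TOR failure costs at most one block, which the cheap local-parity path can then rebuild.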

Data Correctness Assurance: During EC, intermediate data are written to HDFS and later concatenated into final source and parity files. CRCs for each block are computed and stored in HBase. A checker simulates local and global repairs to verify parity correctness. During fixing, temporary data are written to HDFS, transferred end-to-end to the target DataNode, and finalized after CRC validation.
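The per-block checksum step can be sketched with `zlib.crc32`; the function names are illustrative, and the real system may use a different CRC variant:

```python
import zlib

def record_crcs(blocks):
    """Compute the per-block CRC32s that the article stores in HBase.
    blocks: dict of block id -> block bytes."""
    return {blk: zlib.crc32(data) for blk, data in blocks.items()}

def finalize_repair(repaired, stored_crc):
    """Accept a reconstructed block only if its CRC matches the stored value,
    mirroring the end-to-end CRC validation before a fixed block is finalized."""
    if zlib.crc32(repaired) != stored_crc:
        raise ValueError("CRC mismatch: repaired block rejected")
    return repaired
```

Storing the CRCs in the index at encode time gives the fixer an independent ground truth, so a buggy repair path cannot silently replace a block with corrupt bytes.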

Index Consistency: Parity and source files share a naming prefix; NN bundles operations (rename, delete, settime) to keep them in sync. Parity directories have admin-only permissions, and source file appends automatically increase replica count and delete the corresponding parity file.

5. Business I/O Impact

After the warm layer is introduced, data transitions from hot to warm are transparent to applications; HBase indexes are updated automatically. Reads are directed to the warm cluster, and any missing data triggers immediate synchronous repair and asynchronous notification to the reconstruction service.

6. Conclusion

Since deployment, the architecture has saved hundreds of petabytes of storage space, achieved a 50% cost reduction, and maintained data reliability without incidents, while continuing to scale for future growth.

Big Data, cross-IDC, Erasure Coding, Object Storage, Kuaishou, LRC, storage cost reduction
Written by

Kuaishou Big Data

Technology sharing on Kuaishou Big Data, covering big‑data architectures (Hadoop, Spark, Flink, ClickHouse, etc.), data middle‑platform (development, management, services, analytics tools) and data warehouses. Also includes the latest tech updates, big‑data job listings, and information on meetups, talks, and conferences.
