
Building a High‑Performance Cloud‑Native KV Storage System at Baidu

This article describes how Baidu designed and implemented UNDB, a cloud-native, high-performance KV storage platform. It covers the performance and cloud-native challenges, the engine optimizations, dynamic management, the multi-model architecture, and the resulting cost and reliability improvements for massive search and feed workloads.

High Availability Architecture

Since 2016 Baidu has been handling search and information‑feed traffic that generates billions of requests per day, requiring KV storage services to scale to hundreds of petabytes and thousands of servers while keeping operational costs low.

1. Problems and challenges. The team had to balance read/write performance against space amplification while supporting diverse workloads; adapt the monolithic KV architecture to cloud-native requirements such as micro-service granularity, containerization, and dynamic management; and meet business-specific needs without sacrificing efficiency.

2. Engine optimization. UNDB replaces RocksDB with a custom key-value-separated engine, introduces adaptive compaction, hot/cold tiering, and global zstd-dict compression, and uses Open-Channel SSDs for fully user-space I/O. Together these changes yield performance gains of more than 30% and bring overall write amplification below 1.1×.
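The idea behind a key-value-separated engine can be illustrated with a minimal in-memory sketch: values go into an append-only log, and a small index stores only each key's location in that log. This is not UNDB's implementation; the class names, the record layout, and the `write_amplification` helper are all assumptions made for illustration.

```python
import struct
from typing import Dict, Optional, Tuple


class ValueLog:
    """Append-only log of records; stands in for an on-disk value log."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def append(self, key: bytes, value: bytes) -> Tuple[int, int]:
        # Hypothetical record layout: 4-byte key length, 4-byte value length,
        # then the key bytes and the value bytes.
        header = struct.pack("<II", len(key), len(value))
        start = len(self._buf)
        self._buf += header + key + value
        value_offset = start + len(header) + len(key)
        return value_offset, len(value)

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self._buf[offset:offset + length])

    def size(self) -> int:
        return len(self._buf)


class KVSeparatedStore:
    """Keeps keys in a small index and values in the log (illustrative only)."""

    def __init__(self) -> None:
        self._log = ValueLog()
        # key -> (value offset, value length); a real engine would keep this
        # in a sorted, persistent structure rather than a dict.
        self._index: Dict[bytes, Tuple[int, int]] = {}
        self.user_bytes = 0

    def put(self, key: bytes, value: bytes) -> None:
        # Each value is written once to the log; only the small pointer in
        # the index is updated on overwrite.
        self._index[key] = self._log.append(key, value)
        self.user_bytes += len(key) + len(value)

    def get(self, key: bytes) -> Optional[bytes]:
        location = self._index.get(key)
        if location is None:
            return None
        return self._log.read(*location)

    def write_amplification(self) -> float:
        # Bytes appended to the log divided by bytes the caller asked to write.
        # Ignores index persistence and garbage collection of stale values.
        if self.user_bytes == 0:
            return 0.0
        return self._log.size() / self.user_bytes
```

Because large values are never rewritten when the index is reorganized, the ratio in this toy model stays close to 1 for large values. It omits index writes and garbage collection, so it should not be compared directly with the production figure quoted above.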

3. Cloud-native practice. UNDB implements a three-layer framework (Operator, control plane, data plane) that provides global data scheduling, multi-datacenter migration, and dynamic scaling. The system uses Multi-Raft for consistency and 3-2-1 replication for safety, and it separates the control services so that management operations can scale horizontally.
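In a Multi-Raft design the keyspace is split into shards, and each shard is served by its own Raft group of replicas. The sketch below shows only the routing and reassignment bookkeeping that a control plane might maintain; it does not implement Raft itself. The names `RaftGroup`, `ShardRouter`, and `migrate`, and the hash-based sharding scheme, are assumptions for illustration rather than details from the article.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RaftGroup:
    group_id: int
    replicas: List[str]  # hypothetical node names hosting this group's replicas


class ShardRouter:
    """Maps keys to shards and shards to Raft groups (illustrative only)."""

    def __init__(self, num_shards: int, groups: List[RaftGroup]) -> None:
        if num_shards <= 0 or not groups:
            raise ValueError("need at least one shard and one group")
        self.num_shards = num_shards
        self.groups: Dict[int, RaftGroup] = {g.group_id: g for g in groups}
        group_ids = sorted(self.groups)
        # Initial placement: spread shards round-robin across groups.
        self.shard_to_group: Dict[int, int] = {
            shard: group_ids[shard % len(group_ids)]
            for shard in range(num_shards)
        }

    def shard_of(self, key: bytes) -> int:
        # A stable hash keeps routing identical across processes and restarts.
        digest = hashlib.sha256(key).digest()
        return int.from_bytes(digest[:8], "big") % self.num_shards

    def route(self, key: bytes) -> RaftGroup:
        return self.groups[self.shard_to_group[self.shard_of(key)]]

    def migrate(self, shard: int, target_group_id: int) -> None:
        # Only the routing entry changes here; moving the data itself is
        # outside the scope of this sketch.
        if shard not in self.shard_to_group:
            raise KeyError(f"unknown shard {shard}")
        if target_group_id not in self.groups:
            raise KeyError(f"unknown group {target_group_id}")
        self.shard_to_group[shard] = target_group_id
```

Keeping the shard-to-group table separate from the key-to-shard hash means a shard can be reassigned during scaling or datacenter migration without rehashing any keys.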

4. Multi-model storage architecture. A unified UNDB framework is organized into three layers: a DbProxy layer for load distribution and compute-storage separation; a libNode layer that handles distributed management, protocol, data model, and synchronization; and a pluggable engine layer that supports various storage models.
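A pluggable engine layer usually comes down to a narrow interface that every engine implements, plus a registry that lets the layer above select an engine by name. The following is a minimal sketch of that pattern; `StorageEngine`, `register_engine`, `open_engine`, and `MemoryEngine` are hypothetical names and do not reflect UNDB's actual interfaces.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Optional


class StorageEngine(ABC):
    """Minimal contract that the node layer would program against."""

    @abstractmethod
    def put(self, key: bytes, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: bytes) -> Optional[bytes]: ...

    @abstractmethod
    def delete(self, key: bytes) -> None: ...


# Registry of engine factories, keyed by a configuration name.
_ENGINES: Dict[str, Callable[[], StorageEngine]] = {}


def register_engine(name: str):
    """Class decorator that makes an engine selectable by name."""
    def decorator(factory):
        _ENGINES[name] = factory
        return factory
    return decorator


@register_engine("memory")
class MemoryEngine(StorageEngine):
    """Dict-backed engine, standing in for a real storage model."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)


def open_engine(name: str) -> StorageEngine:
    """Instantiate the engine registered under `name`."""
    try:
        factory = _ENGINES[name]
    except KeyError:
        raise ValueError(f"unknown engine: {name}") from None
    return factory()
```

With this shape, adding a new storage model means registering one more class; callers of `open_engine` do not change.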

5. Summary. UNDB now serves Baidu's core search and feed services with tens of thousands of instances, handling over a trillion daily requests while cutting infrastructure cost by nearly 50% and achieving fully automated, developer-managed operations.

Tags: distributed systems, cloud-native, performance optimization, NoSQL, KV storage, Baidu
Written by

High Availability Architecture

Official account for High Availability Architecture.
