How Baidu’s PegaDB Redefines Redis with Low‑Cost, High‑Capacity Storage
This article details Baidu Cloud's PegaDB—a Redis‑compatible, high‑capacity, low‑cost distributed KV store—covering its design choices, architecture, performance and replication optimizations, multi‑region active‑active support, native JSON model, community contributions, and future roadmap.
PegaDB Overview
PegaDB is a fully Redis‑compatible, high‑capacity, low‑cost distributed key‑value database developed by Baidu Cloud to address the high memory cost and limited capacity of traditional Redis deployments. It delivers roughly 70% of Redis's performance at less than 20% of the per‑GB cost.
Key Features
Complete Redis protocol compatibility for seamless migration.
Horizontal scaling to petabyte‑level storage using SSDs.
Cost reduction of over 80% per GB compared with in‑memory Redis.
Millisecond‑level online data processing.
Active‑active multi‑region architecture with disaster‑recovery capabilities.
Enterprise‑grade features such as tunable consistency, hot/cold data separation, and native JSON support.
Typical Use Cases
Large‑scale data scenarios where Redis storage costs are prohibitive.
Workloads where existing open‑source KV databases cannot meet performance or functionality requirements.
Hot/cold data separation patterns that complicate traditional Cache + DB architectures.
PegaDB is already deployed in core Baidu services such as Fengchao, Feed, Shoubei, Map, and Dumi.
Design and Implementation
Background
Redis’s in‑memory nature leads to high storage costs and a per‑cluster capacity ceiling of about 4 TB, which cannot satisfy Baidu’s massive data needs.
Industry Solutions
Three main categories of Redis‑compatible KV solutions exist: disk‑based systems like Pika/Kvrocks, TiKV‑based systems like Meitu Titan/Tedis, and hybrid approaches like Redis On Flash. Each suffers from scalability, compatibility, or performance limitations.
Design Choice
Baidu selected Kvrocks as the upstream project for further development due to its code simplicity and alignment with Baidu's requirements.
Kvrocks Introduction
Kvrocks is a distributed KV store built on RocksDB that fully implements the Redis protocol, aiming to solve Redis’s memory cost and capacity constraints.
Cluster Design
PegaDB adopts a Redis‑Cluster‑like slot allocation strategy with a centralized MetaServer managing cluster metadata, enabling elastic scaling and supporting both fixed‑slot and dynamic topology changes.
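The article does not spell out PegaDB's exact hashing function, but a Redis‑Cluster‑like slot scheme conventionally maps each key to one of 16384 slots via CRC16, with `{hash tag}` support so related keys co‑locate. A minimal sketch of that convention (not PegaDB's actual code), using Python's `binascii.crc_hqx`, which implements the CRC‑16/XMODEM checksum Redis Cluster uses:

```python
import binascii

SLOT_COUNT = 16384  # Redis Cluster's fixed slot count

def key_slot(key: bytes) -> int:
    """Map a key to a slot, honoring Redis-style {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag only
            key = key[start + 1:end]
    # binascii.crc_hqx computes CRC-16/XMODEM, the checksum Redis Cluster uses
    return binascii.crc_hqx(key, 0) % SLOT_COUNT

# Keys sharing a hash tag land on the same slot, so they can be co-located
# on one node and moved together during rebalancing.
assert key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers")
```

With a centralized MetaServer owning the slot‑to‑node map, moving a slot only requires updating that map and migrating the slot's data.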
Scaling and Rebalancing
Data migration uses RocksDB snapshots for full‑copy and WAL logs for incremental copy, moving slots between nodes while minimizing service disruption with short, millisecond‑level write pauses.
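The two‑phase flow above can be sketched with a toy in‑memory model (the names `SourceNode` and `migrate_slot` are illustrative, not PegaDB's API): a full copy from a consistent snapshot at some sequence number, followed by replaying WAL entries written after that sequence.

```python
# Hypothetical in-memory model of snapshot + WAL-based slot migration.
class SourceNode:
    def __init__(self):
        self.data = {}   # key -> value
        self.wal = []    # list of (seq, key, value)
        self.seq = 0

    def put(self, key, value):
        self.seq += 1
        self.data[key] = value
        self.wal.append((self.seq, key, value))

    def snapshot(self):
        # A RocksDB snapshot gives a consistent view at a sequence number.
        return self.seq, dict(self.data)

def migrate_slot(src, dst_data):
    # Phase 1: full copy from a consistent snapshot.
    snap_seq, snap = src.snapshot()
    dst_data.update(snap)
    # Phase 2: incremental copy — replay WAL entries after the snapshot.
    # In PegaDB this loops until the lag is small, then writes pause for
    # milliseconds while the final WAL tail is drained.
    for seq, key, value in src.wal:
        if seq > snap_seq:
            dst_data[key] = value
    return dst_data
```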
Replication Optimizations
PegaDB introduces a Replication ID and monotonic Sequence ID stored in the WAL, enabling partial resynchronization after failover and supporting half‑sync replication with configurable sync replica counts.
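The partial‑resync decision this enables can be sketched as follows, assuming (as in Redis's PSYNC) the replica reports the Replication ID and last applied Sequence ID; function and parameter names here are illustrative:

```python
# Sketch of the decision a primary makes when a replica reconnects.
def resync_plan(primary_repl_id, wal_first_seq, wal_last_seq,
                replica_repl_id, replica_seq):
    """Return ('partial', start_seq) or ('full', None)."""
    same_history = replica_repl_id == primary_repl_id
    # The replica needs entries starting at replica_seq + 1; they must still
    # be present in (or adjacent to the end of) the primary's WAL.
    in_wal = wal_first_seq <= replica_seq + 1 <= wal_last_seq + 1
    if same_history and in_wal:
        return ("partial", replica_seq + 1)
    # Divergent history, or the needed WAL prefix was already garbage-
    # collected: fall back to a full resynchronization.
    return ("full", None)
```

After a failover, a replica whose history matches the new primary only streams the missing WAL suffix instead of recopying the whole dataset.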
Performance Tuning
Extensive engine optimizations include rate‑limited compaction, partitioned indexes, per‑column‑family block caches, enable_pipelined_write, and GC pre‑read. Server‑side hot‑key caching sustains million‑level hot‑key access per node while avoiding the consistency overhead of maintaining a separate cache tier in front of the database.
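Because the hot‑key cache lives inside the database node itself, invalidation on the write path is local and there is no external cache to keep consistent. A minimal LRU sketch of the idea (the class and its capacity are assumptions for illustration, not PegaDB internals):

```python
from collections import OrderedDict

class HotKeyCache:
    """Tiny LRU sketch of server-side hot-key caching in front of the engine."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key, load_from_engine):
        if key in self.entries:
            self.entries.move_to_end(key)     # mark as recently used
            return self.entries[key]
        value = load_from_engine(key)         # miss: read from RocksDB
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the coldest key
        return value

    def invalidate(self, key):
        # Called on the local write path, so cache and engine cannot
        # diverge the way an external Cache + DB pair can.
        self.entries.pop(key, None)
```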
Active‑Active Multi‑Region Architecture
SyncAgent components co‑located with PegaDB instances replicate data across regions using ShardID to prevent loops and OpID for resumable transfers; conflicts are resolved with a simple Last‑Write‑Wins policy.
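The three mechanisms named above fit together naturally in the apply path on the receiving region. A sketch under stated assumptions (field and function names are illustrative; the article does not give PegaDB's record format):

```python
from dataclasses import dataclass

@dataclass
class ReplRecord:
    shard_id: str  # originating region/shard; used to break replication loops
    op_id: int     # position in the origin's log; acknowledged for resume
    ts: int        # write timestamp consulted by Last-Write-Wins
    key: str
    value: str

def apply_remote(store, local_shard_id, record):
    """Apply a record received from a peer region; return True if applied."""
    if record.shard_id == local_shard_id:
        # The record originated here and came back around: drop it,
        # otherwise writes would circulate between regions forever.
        return False
    current = store.get(record.key)
    if current is not None and current[0] > record.ts:
        return False  # local write is newer: Last-Write-Wins keeps it
    store[record.key] = (record.ts, record.value)
    return True
```

After a SyncAgent restart, replication resumes from the last acknowledged `op_id` rather than from the beginning of the log.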
PJSON Data Model
PegaDB natively supports a JSON data model compatible with RedisJSON, offering JSONPath queries, atomic operations on all JSON value types, and compact encoding that benefits hot‑key caching.
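To illustrate what path‑addressed, server‑side atomic operations buy you, here is a deliberately minimal sketch of RedisJSON‑style dotted paths (e.g. `$.user.visits`) over plain objects; real JSONPath is far richer, and these helpers are illustrative, not PJSON's implementation:

```python
# Minimal sketch of RedisJSON-style path addressing over nested objects.
def json_get(doc, path):
    """Resolve a dotted path like '$.user.name' against a nested dict."""
    node = doc
    for part in path.lstrip("$").strip(".").split("."):
        node = node[part]
    return node

def json_numincrby(doc, path, delta):
    """Increment a numeric leaf in place and return the new value.

    On the server this read-modify-write runs inside one command on one
    node, so no client round-trip can interleave — that is what makes
    operations like JSON.NUMINCRBY atomic.
    """
    *parents, leaf = path.lstrip("$").strip(".").split(".")
    node = doc
    for part in parents:
        node = node[part]
    node[leaf] += delta
    return node[leaf]
```

Because a whole document is stored under one key with a compact encoding, frequently read documents also benefit directly from the hot‑key cache.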
ZSET & HASH Enhancements
Additional commands provide aggregation and result filtering for ZSETs and range operations for HASHes.
Open‑Source Collaboration
The Baidu team actively contributes to the Kvrocks project, delivering PRs for replication, transactions, storage engine, and clustering, and helped Kvrocks become an Apache incubating project.
Future Roadmap
Release a serverless offering to improve elasticity.
Integrate more Redis modules for richer data models.
Provide connectors for seamless big‑data ecosystem integration.
Continue performance enhancements via io_uring and thread‑model optimizations.
Baidu Intelligent Cloud Tech Hub
We share the cloud tech topics you care about. Feel free to leave a message and tell us what you'd like to learn.