How BaikalDB Redefines Cloud‑Native Distributed Databases for Modern Business Needs
This article examines the evolving data‑storage demands of large‑scale commercial advertising systems, traces the development of BaikalDB from early MySQL sharding to a heterogeneous, cloud‑native distributed database, and explains its storage, compute, and scheduling designs that deliver high reliability, low cost, and millisecond‑level performance.
1. Business Product System Data Storage Requirements
Commercial advertising platforms need a storage layer that satisfies several distinct workloads:
Transactional OLTP workloads for ad delivery and bidding.
Analytical OLAP workloads for ad‑effect analysis.
High‑QPS point‑lookup queries such as account structures and permission checks.
Exact KV look‑ups for keyword ↔ ID mapping.
Fuzzy searches for material lists.
To meet these diverse needs, a traditional stack would require MySQL, Redis, an OLAP warehouse, Elasticsearch, and custom in‑memory stores.
Key expectations from the storage facility are:
Stability and reliability – downtime directly harms user experience and revenue.
Strong consistency – inconsistent data leads to misunderstandings and erroneous ad placements.
Low cost – capacity must grow without massive hardware purchases.
High read/write performance – millisecond‑level latency is essential.
Developers also desire:
Simple, uniform APIs with low learning and migration cost.
Predictable data‑change behavior without loss or anomalies.
Scalable architecture that grows from 1 to N without service impact.
Built‑in high availability and fault tolerance.
Cheap schema evolution.
Consistently good performance for all read/write patterns.
2. Evolution of BaikalDB
2.1 Sharded MySQL Cluster
The earliest ad library ran on a single MySQL instance backed by high‑performance disks. As data volume and traffic grew, the team adopted sharding (splitting databases and tables), expanding from a single node to 33 shards, each with 1 master and 11 replicas, storing tens of terabytes and serving billions of daily page views (PVs).
2.2 Heterogeneous Composite Storage Cluster
Read‑heavy, write‑light workloads exposed limitations of pure MySQL. BaikalDB introduced a real‑time sync pipeline that mirrors MySQL data into memory‑optimized indexes tailored for specific query patterns. A SQL‑proxy routes queries to the appropriate store while preserving the MySQL protocol, creating a heterogeneous composite storage architecture.
Drawbacks include increased operational complexity, potential sync latency causing data inconsistency, and resource redundancy due to duplicated indexes.
2.3 The 2017 Decision
By 2017 the system had reached 33 shards on NVMe SSDs. Further sharding would be costly, prompting evaluation of alternatives such as deep MySQL customizations (Aurora, PolarDB) and emerging distributed databases (CockroachDB, TiDB). The team decided to build a new HTAP‑style database capable of both OLTP and OLAP workloads.
2.3.1 Deep MySQL Customization
Modifying MySQL internals to achieve F1/Spanner‑level capabilities proved impractical.
2.3.2 Building a New Distributed Database
Leveraging Baidu’s mature brpc (RPC framework) and braft (Raft library), together with the open‑source RocksDB KV engine, the team designed a distributed system with strong consistency, automatic replication, and multi‑tenant support.
2.4 BaikalDB – The Next‑Generation Storage System
BaikalDB is a MySQL‑compatible, cloud‑native distributed database offering:
Flexible cloud deployment – container‑friendly, linear scaling, low cost, no special hardware.
One‑stop storage‑compute – primarily OLTP, with OLAP, full‑text, and high‑performance KV capabilities.
MySQL protocol compatibility – low learning curve for developers.
Its name derives from Lake Baikal, the world’s largest freshwater lake by volume, symbolizing massive, orderly data storage.
3. Key Design Considerations and Practices
3.1 Storage Layer Design
BaikalDB stores data on disk using RocksDB (a KV engine). Tables are mapped to KV pairs: primary keys become clustered indexes, secondary indexes (local or global) and full‑text indexes are also represented as KV entries. Data is partitioned into Regions (the smallest management unit) which are sharded by range to avoid hotspot issues.
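The range-based partitioning described above can be illustrated with a minimal routing sketch. This is not BaikalDB's actual code: the Region fields, the RangeRouter class, and the convention that an empty byte string marks an open bound are all assumptions made for illustration.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Region:
    region_id: int
    start_key: bytes  # inclusive; b"" stands for an open lower bound (assumption)
    end_key: bytes    # exclusive; b"" stands for an open upper bound (assumption)


class RangeRouter:
    """Hypothetical helper: finds the Region whose [start_key, end_key) holds a key."""

    def __init__(self, regions):
        # Keep Regions ordered by start key so a binary search can locate them.
        self.regions = sorted(regions, key=lambda r: r.start_key)
        self.starts = [r.start_key for r in self.regions]

    def locate(self, key: bytes) -> Region:
        # Pick the rightmost Region whose start_key is <= key.
        i = bisect.bisect_right(self.starts, key) - 1
        region = self.regions[i]
        assert region.end_key == b"" or key < region.end_key
        return region


router = RangeRouter([
    Region(1, b"", b"g"),
    Region(2, b"g", b"p"),
    Region(3, b"p", b""),
])
```

Because Regions cover contiguous, non-overlapping key ranges, one binary search over the start keys is enough to route a point lookup, and a range scan only has to visit Regions whose ranges intersect it.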
Each Region keeps three replicas that are kept strongly consistent through Multi‑Raft, with one Raft group per Region. Primary‑index keys embed region_id and index_id; the values hold protobuf‑encoded rows. Secondary indexes may be local (per‑Region) or global (across Regions), with trade‑offs in query routing and transaction support.
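A rough sketch of how a row might be turned into a KV pair under this scheme follows. The exact byte layout is an assumption: the 8-byte big-endian fields, the function names, and the use of JSON as a stand-in for the protobuf-encoded value are illustrative only and are not taken from the BaikalDB source.

```python
import json
import struct


def encode_primary_key(region_id: int, index_id: int, pk: int) -> bytes:
    # Fixed-width big-endian unsigned integers compare bytewise in numeric
    # order, so an ordered KV engine keeps rows sorted by primary key.
    return struct.pack(">QQQ", region_id, index_id, pk)


def decode_primary_key(key: bytes):
    # Returns (region_id, index_id, pk).
    return struct.unpack(">QQQ", key)


def encode_row(row: dict) -> bytes:
    # Stand-in for the protobuf-encoded row value (JSON keeps this stdlib-only).
    return json.dumps(row, sort_keys=True).encode("utf-8")


# A tiny in-memory "KV store" holding two rows of one table.
kv = {
    encode_primary_key(7, 1, 42): encode_row({"id": 42, "keyword": "shoes"}),
    encode_primary_key(7, 1, 5): encode_row({"id": 5, "keyword": "hats"}),
}

# Iterating in key order yields rows in primary-key order.
ordered_pks = [decode_primary_key(k)[2] for k in sorted(kv)]
```

Prefixing every key with the Region and index identifiers keeps all entries of one index inside one Region physically adjacent, which is what makes the primary key behave like a clustered index.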
Full‑text indexing splits text into terms, builds ordered inverted lists, and stores term → sorted primary keys as KV entries.
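The term → sorted primary keys layout can be sketched in a few lines. The tokenizer (lowercased alphanumeric runs) and the function names are assumptions for illustration; a production system would use a proper word segmenter and persist the posting lists as KV entries rather than in a Python dict.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Simplistic tokenizer (assumption): lowercase alphanumeric runs.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """docs maps primary key -> text; returns term -> sorted list of primary keys."""
    postings = defaultdict(set)
    for pk, text in docs.items():
        for term in tokenize(text):
            postings[term].add(pk)
    return {term: sorted(pks) for term, pks in postings.items()}


def intersect(a, b):
    # Two-pointer merge; works because both posting lists are sorted.
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search_all(index, query):
    """Primary keys of documents that contain every term of the query."""
    lists = [index.get(term, []) for term in tokenize(query)]
    if not lists:
        return []
    result = lists[0]
    for posting in lists[1:]:
        result = intersect(result, posting)
    return result


docs = {3: "Summer shoe sale", 1: "Winter shoe clearance", 2: "Summer hat sale"}
index = build_inverted_index(docs)
```

Keeping each posting list sorted is what allows a multi-term query to be answered with a linear merge instead of hashing every list.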
3.2 Compute Layer Design
The SQL layer parses queries into a distributed execution plan using a volcano‑style operator model (open/next/close). BaikalDB pushes filters down to storage nodes, performs join and aggregation locally when possible, and combines results at a BaikalDB node. Query optimization combines rule‑based (RBO) and cost‑based (CBO) approaches, leveraging statistics and a cost model.
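The open/next/close contract of the volcano model can be shown with a toy operator tree. The Scan, Filter, and Limit classes below are hypothetical and run in a single process; they only illustrate the pull-based iteration style, not BaikalDB's distributed execution or filter pushdown.

```python
class Scan:
    """Leaf operator: yields rows from an in-memory list."""

    def __init__(self, rows):
        self.rows = rows

    def open(self):
        self._it = iter(self.rows)

    def next(self):
        # None signals that the operator is exhausted.
        return next(self._it, None)

    def close(self):
        self._it = None


class Filter:
    """Passes through only the child rows that satisfy the predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def next(self):
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None

    def close(self):
        self.child.close()


class Limit:
    """Stops pulling from the child after n rows."""

    def __init__(self, child, n):
        self.child = child
        self.n = n

    def open(self):
        self.child.open()
        self.emitted = 0

    def next(self):
        if self.emitted >= self.n:
            return None
        row = self.child.next()
        if row is not None:
            self.emitted += 1
        return row

    def close(self):
        self.child.close()


def execute(root):
    # The root pulls rows one at a time; each operator pulls from its child.
    root.open()
    try:
        out = []
        while (row := root.next()) is not None:
            out.append(row)
        return out
    finally:
        root.close()


rows = [{"id": 1, "cost": 5}, {"id": 2, "cost": 50},
        {"id": 3, "cost": 70}, {"id": 4, "cost": 90}]
plan = Limit(Filter(Scan(rows), lambda r: r["cost"] > 10), 2)
result = execute(plan)
```

Because every operator exposes the same three methods, the planner can compose them freely, and a Limit near the root stops the scan early without any special coordination.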
3.3 Scheduling Layer Design
A Master component (BaikalMeta) collects heartbeats from storage nodes (BaikalStore), evaluates load, and issues balancing decisions for Leader and Peer distribution. Leader balancing evens read/write pressure; Peer balancing spreads replicas across nodes and zones to improve fault tolerance.
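Leader balancing can be approximated with a greedy loop over the state the scheduler has gathered from heartbeats. The sketch below is an assumption about how such a policy could look, not BaikalMeta's actual algorithm; the function name, the input shape, and the stopping rule are hypothetical.

```python
from collections import Counter


def plan_leader_transfers(regions, stores):
    """Greedy sketch: move leaders off the busiest store until no move helps.

    regions maps region_id -> {"leader": store, "peers": [stores holding a replica]}.
    Returns (transfers, leader_counts), where each transfer is
    (region_id, from_store, to_store).
    """
    leaders = {rid: info["leader"] for rid, info in regions.items()}
    counts = Counter({store: 0 for store in stores})
    counts.update(leaders.values())
    transfers = []
    while True:
        busiest = max(stores, key=lambda s: counts[s])
        move = None
        for rid in sorted(regions):
            if leaders[rid] != busiest:
                continue
            # Only a store that already holds a replica can become leader,
            # and the move must strictly narrow the gap between the two stores.
            candidates = [p for p in regions[rid]["peers"]
                          if p != busiest and counts[p] + 1 < counts[busiest]]
            if candidates:
                move = (rid, busiest, min(candidates, key=lambda s: counts[s]))
                break
        if move is None:
            return transfers, dict(counts)
        rid, src, dst = move
        leaders[rid] = dst
        counts[src] -= 1
        counts[dst] += 1
        transfers.append(move)


# Six Regions, all currently led by store-1, each replicated on all three stores.
stores = ["store-1", "store-2", "store-3"]
regions = {rid: {"leader": "store-1", "peers": list(stores)} for rid in range(1, 7)}
transfers, final_counts = plan_leader_transfers(regions, stores)
```

Each transfer only changes which existing replica acts as leader, so it moves no data. Peer balancing, by contrast, has to copy a replica to another node and is therefore the more expensive operation.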
Region splitting is triggered when a Region exceeds a size threshold: the Region is divided into new range‑based Regions, which are then rebalanced automatically.
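A size-triggered split can be sketched as follows. The article does not say how the split point is chosen, so picking the median key is an assumption, as are the Region fields, the byte-count size estimate, and the maybe_split helper.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    region_id: int
    start_key: bytes                 # inclusive
    end_key: bytes                   # exclusive; b"" means unbounded (assumption)
    rows: dict = field(default_factory=dict)  # key bytes -> value bytes

    def size(self) -> int:
        # Approximate on-disk size as the total bytes of keys and values.
        return sum(len(k) + len(v) for k, v in self.rows.items())


def maybe_split(region: Region, threshold: int, new_region_id: int):
    """Returns [region] if it is small enough, else two Regions split at the median key."""
    if region.size() <= threshold or len(region.rows) < 2:
        return [region]
    keys = sorted(region.rows)
    split_key = keys[len(keys) // 2]  # assumption: split at the median key
    left = Region(region.region_id, region.start_key, split_key,
                  {k: region.rows[k] for k in keys if k < split_key})
    right = Region(new_region_id, split_key, region.end_key,
                   {k: region.rows[k] for k in keys if k >= split_key})
    return [left, right]


big = Region(1, b"", b"", {k: b"x" * 10 for k in (b"a", b"b", b"c", b"d")})
parts = maybe_split(big, threshold=20, new_region_id=2)
```

The left half keeps the original Region ID and the right half receives a new one, so the two ranges stay contiguous: the end key of the first equals the start key of the second.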
4. Conclusion
Starting from the concrete storage demands of a massive advertising platform, BaikalDB evolved through four major stages to become a unified, cloud‑native, MySQL‑compatible distributed database. It integrates the functions of multiple legacy storage systems, supports diverse workloads, and continues to be refined for new commercial scenarios such as landing pages and e‑commerce.
BaikalDB demonstrates that a well‑designed storage, compute, and scheduling architecture can deliver high reliability, low cost, and strong performance, positioning databases as foundational system software alongside operating systems and compilers.
For more details, see the open‑source project at github.com/baidu/BaikalDB.
21CTO