
Design and Challenges of CB‑SQL Changefeed for Distributed Cloud‑Native Databases

The article explains CB‑SQL’s distributed changefeed architecture, its CDC implementation, the challenges of horizontal scalability and transactional ordering, and the innovative RangeFeed mechanism that enables ordered row‑level streams, resolved timestamps, and seamless integration with external systems like Kafka.

JD Retail Technology

CB‑SQL is a distributed cloud‑native database that aims to support use cases such as full‑text search, big‑data analytics, and push‑notification triggers without requiring users to manage bookkeeping.

The industry‑standard solution for these scenarios is Change Data Capture (CDC), a continuous message stream that records data changes; CB‑SQL calls its CDC implementation a changefeed.

A CB‑SQL CHANGEFEED streams real‑time changes from one or more tables. When an SQL statement modifies data, a message is sent to an external system (a “receiver”). For example, executing INSERT INTO users VALUES (1, 'Carl'), (2, 'Petee') may produce the JSON messages {"id": 1, "name": "Carl"} and {"id": 2, "name": "Petee"}. Rather than sending changes directly to downstream services, CB‑SQL typically forwards them to a message broker such as Kafka, where downstream applications can consume them asynchronously.

Challenge: The primary challenge is to make the changefeed horizontally scalable while preserving strong transactional semantics.

In single‑node databases like MySQL, a binlog records changes and the changefeed simply exposes that log. CB‑SQL, however, stores data in ~64 MB “ranges” that are replicated across multiple nodes. Transactions may involve any combination of ranges, allowing them to span the entire cluster, unlike sharded SQL databases where each shard is an independent replica group and transactions cannot cross shards. Consequently, a changefeed for a sharded system can be parallelized per shard, but CB‑SQL must handle cross‑range ordering.
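The key structural fact is that a table may span many ranges, but any given row lives in exactly one range, determined by its key. A toy Python sketch of that routing (the split keys and the `range_for_key` helper are invented for illustration; real CB‑SQL ranges split by size, around 64 MB, not at these keys):

```python
import bisect

# Hypothetical range boundaries: range i holds keys in [starts[i], starts[i+1]).
range_start_keys = ["a", "g", "p"]  # three ranges: [a, g), [g, p), [p, +inf)

def range_for_key(key: str) -> int:
    """Return the index of the single range that owns this row key."""
    return bisect.bisect_right(range_start_keys, key) - 1

# A transaction touching rows "carl" and "petee" spans two different ranges,
# so its changes land in two independent per-range write-ahead logs.
assert range_for_key("carl") == 0
assert range_for_key("petee") == 2
```

Because one transaction's rows can land in different ranges (and therefore different replica groups on different nodes), no single per‑range log sees the whole transaction, which is exactly the cross‑range ordering problem described above.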

Figure 1: In a sharded SQL database, transactions cannot cross shard boundaries.

Because CB‑SQL transactions can involve any set of ranges, ordering becomes far more complex. Simply placing each transaction in its own stream is unsatisfactory; the system must still allow the changefeed to scale horizontally.

Figure 2: Transactions in CB‑SQL can span multiple nodes.

Innovation: Although a SQL table may span many ranges, each row always resides within a single range. Each range is an independent Raft group with its own write‑ahead log (WAL), which can be followed to generate an ordered changefeed per row. CB‑SQL introduced an internal mechanism called RangeFeed that pushes changes directly from the Raft group instead of polling it.

Because each per‑range stream is independent, the changefeed scales horizontally. Processors are colocated with the data they observe, eliminating unnecessary network hops, and because no single node must handle every watch, there is no single point of failure.
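A minimal sketch of the push model described above, using invented names (`Range`, `register_feed`) rather than CB‑SQL's actual internal API: committing a WAL entry notifies every registered RangeFeed listener instead of waiting to be polled.

```python
from typing import Callable, Dict, List

class Range:
    """Toy model of one Raft group: a commit appends to the range's WAL
    and pushes the entry to every registered RangeFeed listener."""
    def __init__(self) -> None:
        self.wal: List[dict] = []                      # ordered per-range log
        self.listeners: List[Callable[[dict], None]] = []

    def register_feed(self, listener: Callable[[dict], None]) -> None:
        self.listeners.append(listener)

    def commit(self, entry: dict) -> None:
        self.wal.append(entry)          # Raft commit (heavily simplified)
        for notify in self.listeners:   # push, don't poll
            notify(entry)

received: List[dict] = []
r = Range()
r.register_feed(received.append)
r.commit({"key": "carl", "ts": 100, "value": {"id": 1, "name": "Carl"}})
r.commit({"key": "petee", "ts": 101, "value": {"id": 2, "name": "Petee"}})
assert [e["key"] for e in received] == ["carl", "petee"]  # per-range order preserved
```

The design choice this illustrates: ordering is guaranteed only within a range, which is why the later sections need HLC timestamps to reassemble cross‑range transactions.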

Figure 3: Range leaders emit changefeed messages directly to Kafka (or other receivers).

For many use cases, this ordered row‑level stream is sufficient: each message can trigger a push notification, and the ordering is useful for reconstructing transaction state. However, for analytical workloads that require full transaction semantics, additional processing is needed.

Every row written by a CB‑SQL transaction carries the same Hybrid Logical Clock (HLC) timestamp. Exposing this timestamp in each change message therefore provides both transaction grouping (rows with equal timestamps belong together) and a total order across transactions (sorting by timestamp).
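A sketch of how a consumer could exploit the shared HLC timestamp, assuming a simplified message shape (`ts` plus a row payload):

```python
from itertools import groupby

# Hypothetical change messages arriving from several ranges; rows written by
# the same transaction share one HLC timestamp.
msgs = [
    {"ts": 100, "row": {"id": 1, "name": "Carl"}},
    {"ts": 102, "row": {"id": 3, "name": "Ada"}},
    {"ts": 100, "row": {"id": 2, "name": "Petee"}},
]

# Sorting by timestamp yields a total order; grouping by it recovers transactions.
ordered = sorted(msgs, key=lambda m: m["ts"])
txns = {ts: [m["row"] for m in grp]
        for ts, grp in groupby(ordered, key=lambda m: m["ts"])}

assert len(txns[100]) == 2   # both rows of the ts=100 transaction regrouped
assert len(txns[102]) == 1
```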

The remaining problem is determining when a group of rows belonging to the same transaction is complete. CB‑SQL solves this with a “resolved” timestamp message, which promises that no further rows with a timestamp ≤ the resolved value will be emitted.
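One way a consumer might act on resolved timestamps is to buffer rows per timestamp and flush only once the resolved watermark passes. This is a sketch under assumed names (`TxnAssembler` and its methods are invented for illustration):

```python
class TxnAssembler:
    """Buffer row messages until a resolved timestamp promises that no
    further rows at or below it will arrive; then emit whole transactions."""
    def __init__(self) -> None:
        self.pending = {}   # ts -> rows seen so far for that transaction

    def on_row(self, ts: int, row: dict) -> None:
        self.pending.setdefault(ts, []).append(row)

    def on_resolved(self, resolved_ts: int):
        # Every ts <= resolved_ts is now complete and safe to emit in order.
        done = sorted(t for t in self.pending if t <= resolved_ts)
        return [(t, self.pending.pop(t)) for t in done]

a = TxnAssembler()
a.on_row(100, {"id": 1})
a.on_row(102, {"id": 3})
a.on_row(100, {"id": 2})   # late-arriving row of the ts=100 transaction
# resolved=101 closes ts=100 but keeps ts=102 open:
assert a.on_resolved(101) == [(100, [{"id": 1}, {"id": 2}])]
```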

Figure 4: A possible ordering of the first few messages for a transaction.

The article walks through scenarios where streams become resolved at different times, how to regroup rows by timestamp to reconstruct transactions, and how to rebuild the database state at any point, even across range splits and node failures.
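The state‑reconstruction idea can be sketched as replaying all changes up to a chosen timestamp, last write per key winning. The message shape here is an assumption, and a real implementation must also cope with the range splits and node failures mentioned above:

```python
def state_at(changes, ts):
    """Rebuild table state as of timestamp ts by replaying, in timestamp
    order, every change with a timestamp <= ts (last write per key wins)."""
    state = {}
    for c in sorted(changes, key=lambda c: c["ts"]):
        if c["ts"] > ts:
            break
        if c["value"] is None:            # a deletion tombstone
            state.pop(c["key"], None)
        else:
            state[c["key"]] = c["value"]
    return state

changes = [
    {"ts": 100, "key": 1, "value": "Carl"},
    {"ts": 102, "key": 1, "value": "Carlos"},   # later update of the same row
    {"ts": 103, "key": 2, "value": None},       # delete of key 2
]
assert state_at(changes, 101) == {1: "Carl"}
assert state_at(changes, 103) == {1: "Carlos"}
```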

The design draws inspiration from the paper “Naiad: A Timely Dataflow System,” whose core idea resembles TCP’s receive‑window mechanism.

Our Work

Practical challenges remain, including updating resolved timestamps, handling task failures, managing range splits/merges, incremental catch‑up, and especially online DDL. Online DDL must provide continuous read/write service while handling large data volumes, making consistency guarantees more complex.

To be compatible with the MySQL ecosystem, CB‑SQL must emulate MySQL binlog behavior. This introduces three major challenges:

Automatic CDC startup: Many tables default to binlog enabled; manually starting CDC for each table is labor‑intensive and error‑prone. The system must configure CDC at the DB level while balancing ordering guarantees against message queues that preserve order only within a single partition.

Transaction capture: CB‑SQL’s CDC emits row‑level changes without strict global ordering, resulting in out‑of‑order timestamps and possible duplicates. Consumers must regroup rows into transactions using HLC timestamps, filter duplicates, and reorder by timestamp.
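The consumer‑side work described here can be sketched as follows; the dedup key `(ts, key)` and the message shape are assumptions, not CB‑SQL's wire format:

```python
def to_binlog_events(msgs):
    """Regroup out-of-order, possibly duplicated row messages into
    timestamp-ordered transactions, dropping duplicates by (ts, key)."""
    seen = set()
    txns = {}
    for m in msgs:
        ident = (m["ts"], m["key"])
        if ident in seen:                # duplicate delivery: skip it
            continue
        seen.add(ident)
        txns.setdefault(m["ts"], []).append(m)
    # Emit transactions in HLC-timestamp order, as a binlog consumer expects.
    return [txns[ts] for ts in sorted(txns)]

msgs = [
    {"ts": 102, "key": 3, "name": "Ada"},
    {"ts": 100, "key": 1, "name": "Carl"},
    {"ts": 100, "key": 1, "name": "Carl"},    # duplicate delivery
    {"ts": 100, "key": 2, "name": "Petee"},
]
events = to_binlog_events(msgs)
assert [len(t) for t in events] == [2, 1]     # txn@100 has 2 rows, txn@102 has 1
```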

Online DDL changes: During DDL, data back‑fill must be captured and sent to the message queue. If DDL fails, back‑filled data must be retracted to maintain binlog‑compatible strong consistency.

Design solutions for these challenges are in progress, and a MySQL‑compatible binlog feature is expected to be released soon.

Contact: [email protected]

Tags: Kafka, CDC, CB-SQL, Changefeed, RangeFeed
Written by JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.