
Unlocking TiCDC: Efficient Incremental Data Sync for TiDB in Real‑World Scenarios

This article explains how TiCDC, a change‑data‑capture tool for TiDB, addresses incremental extraction, cross‑region hot‑standby, and stream processing needs, outlines its architecture, discusses early‑version issues, and provides best‑practice recommendations for stable, high‑performance data synchronization.

Xiaolei Talks DB

TiCDC is an incremental data synchronization tool for TiDB that pulls TiKV change logs, can restore data to any upstream TSO state, and offers an open data protocol for other systems to subscribe to data changes.
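Since TiCDC positions everything by TSO, it helps to know how a TSO maps to wall-clock time. A minimal sketch of the conversion (the physical/logical bit split is TiDB's documented TSO layout; the helper names are mine):

```python
# Decode a TiDB TSO (Timestamp Oracle value) into wall-clock time.
# A TSO is a 64-bit integer: the high bits hold a physical Unix
# timestamp in milliseconds, the low 18 bits a logical counter.
from datetime import datetime, timezone

def tso_to_datetime(tso: int) -> datetime:
    physical_ms = tso >> 18  # strip the 18-bit logical part
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)

def datetime_to_tso(dt: datetime) -> int:
    # Logical part set to zero: the earliest TSO at that millisecond.
    return int(dt.timestamp() * 1000) << 18
```

This is handy when choosing a `--start-ts` for a changefeed, or when translating a checkpoint TSO from monitoring back into a human-readable time.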

TiCDC Business Use Cases

1) Incremental Data Extraction

Data warehouses run nightly ETL jobs that extract incremental or full data based on timestamps. Extracting large dimension tables (200 billion rows, more than 5 TB each) from a 70-node TiDB cluster overloads CPU, memory, and network, causing failed jobs and delayed reports. Using TiCDC to write changes to Kafka provides the incremental extraction path these jobs need.

2) Cross‑Region Dual‑Cluster Hot‑Standby

When the primary data center fails, a standby TiDB cluster can take over. TiCDC can sync data to the standby, offload read traffic, and enable rapid failover.

3) Stream Processing

Previously, MySQL binlog tools (Maxwell, Canal) fed Kafka for real‑time jobs and Flink table joins. After migrating MySQL to TiDB, TiCDC must reliably support this workflow, handling DDL compatibility and sync interruptions.
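For the streaming consumers above, a downstream job mostly folds row-change events into state. A minimal sketch, assuming a canal-json-like payload (the exact field layout here is illustrative, not TiCDC's authoritative schema):

```python
import json

# Illustrative canal-json style change event as it might arrive from
# Kafka; the database/table/type/data/old fields follow the Canal JSON
# convention, but treat this exact payload shape as an assumption.
raw = json.dumps({
    "database": "ad_monitor",
    "table": "clicks",
    "type": "UPDATE",
    "data": [{"id": "1", "status": "done"}],
    "old": [{"status": "pending"}],
})

def apply_event(event: dict, state: dict) -> None:
    """Fold one row-change event into an in-memory table image keyed by id."""
    if event["type"] in ("INSERT", "UPDATE"):
        for row in event["data"]:
            state[row["id"]] = row
    elif event["type"] == "DELETE":
        for row in event["data"]:
            state.pop(row["id"], None)

state: dict = {}
apply_event(json.loads(raw), state)
```

A real Flink or consumer-group job adds ordering and exactly-once concerns, but the core fold is this simple.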

TiCDC Architecture

The architecture consists of four main modules: TiKV cluster, PD, TiCDC cluster, and downstream sink components.

1. TiCDC Cluster

The cluster runs multiple capture processes, each handling change‑log pulling, sorting, and forwarding.

Puller: pulls TiKV change logs and sorts them per table.

Processor: each processor thread syncs one or more tables; a capture node can run many processors.

High availability: captures have owner/non-owner roles; the owner schedules tasks and registers with PD. If the owner fails, a new owner is elected and tasks are reassigned. Check capture status with:

<code>tiup ctl cdc capture list --pd=http://xxxxx:2379</code>

Changefeed: a sync task that can target a TiDB instance or filter specific databases/tables. Create a changefeed:

<code>tiup ctl:v4.0.14 cdc changefeed create --pd=http://pd-ip:2379 --sink-uri="mysql://User:password@vip:4000/" --changefeed-id="sync-name" --start-ts=0 --config=./ticdc.yaml</code>

List changefeeds:

<code>tiup ctl:v4.0.14 cdc changefeed list --pd=http://pd-ip:2379</code>
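A minimal sketch of what the <code>--config</code> file might contain, assuming the TOML syntax used by TiCDC changefeed config files (the database name is illustrative):

```toml
# Sketch of a changefeed config: sync only what you need.
case-sensitive = true

[filter]
# Table-filter rules; 'ad_monitor.*' is an illustrative database name.
rules = ['ad_monitor.*']
```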

2. TiKV Cluster

Writes are first persisted to a WAL (Write‑Ahead Log), which serves as the KV change log. TiCDC pulls row‑change events from this log.

3. PD Cluster

PD manages global TSO allocation, metadata, and scheduling. For TiCDC it stores changefeed configurations, capture metadata, owner election info, and processor sync status.

4. Sink Components

Downstream destinations include:

MySQL‑compatible databases (MySQL, TiDB)

Kafka/Pulsar for consumption by Flink or other stream processors

Incremental backup storage such as S3

Issues in Early Versions

OOM under high write spikes

OOM after restart when large writes occurred during interruption

Write‑write conflicts when the downstream is TiDB

Sort‑dir defaulted to system disk, causing disk‑full failures

Unbounded concurrency in log pulling overloaded TiKV nodes

Large DDL operations blocked change log sync (fixed in 4.0.14+)

Best Practices

Use the latest TiCDC version (4.0.14/5.1.1) to benefit from bug fixes.

Monitor TiCDC via its Grafana dashboard, which shows change-log pull speed and sink status. Docs: https://docs.pingcap.com/zh/tidb/stable/monitor-ticdc

When writing to Kafka, choose compatible protocols (canal, canal‑json, avro, maxwell) for seamless migration.

Split high‑write tables into multiple changefeeds; query STATEMENTS_SUMMARY_HISTORY to identify hot tables:

<code>SELECT digest, DIGEST_TEXT, TABLE_NAMES, SUM(EXEC_COUNT), MIN(SUMMARY_BEGIN_TIME), MAX(SUMMARY_END_TIME) FROM STATEMENTS_SUMMARY_HISTORY WHERE stmt_type='insert' AND schema_name='ad_monitor' GROUP BY digest ORDER BY SUM(EXEC_COUNT) DESC LIMIT 20;</code>
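Once hot tables are identified, the split can be expressed with filter rules in each changefeed's config, e.g. (table names illustrative; the '!' exclusion follows TiDB's table-filter syntax):

```toml
# changefeed-hot.toml -- dedicated changefeed for the hot table
[filter]
rules = ['ad_monitor.click_log']
```

```toml
# changefeed-rest.toml -- everything else in the database
[filter]
rules = ['ad_monitor.*', '!ad_monitor.click_log']
```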

Set Kafka partition‑num to 1 to preserve order across consumers.

Configure per-table-memory-quota (6 MiB = 6291456 bytes) to avoid goroutine OOM. Via <code>tiup edit-config tidb-cluster-name</code>, set:

<code>cdc:
  per-table-memory-quota: 6291456</code>

Limit the TiKV incremental scan speed (e.g., 20 MB/s) to reduce impact on the cluster. Via <code>tiup edit-config tidb-cluster-name</code>, set:

<code>tikv:
  cdc.incremental-scan-speed-limit: 20MB</code>

Adjust gc-ttl (default 24 h) for long‑running syncs, and remove stale changefeeds so they do not hold back GC:

<code>tiup ctl:v4.0.14 cdc changefeed remove --pd=http://pd-ip:2379 --changefeed-id=TICDC-XXX</code>

When using TiCDC for cross‑region sync, increase GC lifetime to avoid log loss.
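One way to inspect and raise the GC lifetime on TiDB 4.0.x, where it is stored in the mysql.tidb table (the 72h value is illustrative; size it to your longest expected sync outage):

```sql
-- Check the current GC lifetime.
SELECT VARIABLE_NAME, VARIABLE_VALUE FROM mysql.tidb
 WHERE VARIABLE_NAME = 'tikv_gc_life_time';

-- Extend it so change logs survive a prolonged cross-region sync lag.
UPDATE mysql.tidb SET VARIABLE_VALUE = '72h'
 WHERE VARIABLE_NAME = 'tikv_gc_life_time';
```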

Large transactions (>5 GB) may cause TiCDC OOM; also keep TiDB's 10 GB single‑transaction limit in mind.

On sync interruption, query error details:

<code>tiup ctl:v4.0.14 cdc changefeed query -s --pd=http://pd-ip:2379 --changefeed-id=TICDC-XXX</code>

Conclusion

TiCDC is now reliable for most scenarios, having resolved early OOM and load‑impact issues through collaboration between 360 and PingCAP. Users with similar high‑write requirements are encouraged to adopt TiCDC and provide feedback to further improve its stability and efficiency.

Kafka · TiDB · Data Synchronization · TiCDC · Change Data Capture
Written by

Xiaolei Talks DB

Sharing daily database operations insights, from distributed databases to cloud migration. Author: Dai Xiaolei, with 10+ years of DB ops and development experience. Your support is appreciated.
