
Introducing DTLE: An Open‑Source MySQL Data Transfer Middleware for CDC, Replication, and Cloud Synchronization

The article presents DTLE, an open‑source MySQL data‑transfer middleware that extends replication capabilities with high‑performance CDC, multi‑topology support, cloud‑to‑cloud synchronization, and robust cluster management, while comparing it with other open‑source solutions and showcasing real‑world demos.

Aikesheng Open Source Community

Overview

This talk, originally delivered by Hong Bin at the 3306π technical meetup in Wuhan, introduces DTLE (Data‑Transformation‑le), an open‑source CDC tool released on Programmer's Day (October 24). DTLE aims to address the limitations of MySQL replication in environments with heterogeneous data stores.

MySQL Replication Recap

MySQL replication works by streaming binlog events from a primary instance to a replica: the replica's I/O thread writes the events to a relay log, and the SQL thread replays them. Although it is widely used for high availability and read‑write splitting, it has several drawbacks: filtering is limited to the database and table level, storage overhead is high, and topology options are limited. It is designed primarily for HA rather than for complex data‑migration scenarios.
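To make the filtering limitation concrete, the following minimal sketch contrasts a table‑level filter with a row‑level one. The event shape `(schema, table, row)`, the function names, and the sample data are all hypothetical and are not taken from MySQL or DTLE.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical event shape: (schema, table, row values).
RowEvent = Tuple[str, str, Dict[str, object]]
TableKey = Tuple[str, str]


def table_level_filter(events: List[RowEvent], allowed: Set[TableKey]) -> List[RowEvent]:
    """Keep every event of an allowed table (database/table granularity only)."""
    return [e for e in events if (e[0], e[1]) in allowed]


def row_level_filter(
    events: List[RowEvent],
    predicates: Dict[TableKey, Callable[[Dict[str, object]], bool]],
) -> List[RowEvent]:
    """Keep an event only if its table has a predicate and the row satisfies it."""
    kept = []
    for schema, table, row in events:
        predicate = predicates.get((schema, table))
        if predicate is not None and predicate(row):
            kept.append((schema, table, row))
    return kept


events = [
    ("shop", "orders", {"id": 1, "region": "wuhan"}),
    ("shop", "orders", {"id": 2, "region": "beijing"}),
    ("shop", "audit_log", {"id": 9, "region": "wuhan"}),
]

by_table = table_level_filter(events, {("shop", "orders")})
by_row = row_level_filter(
    events, {("shop", "orders"): lambda row: row["region"] == "wuhan"}
)
```

Here `by_table` keeps both `orders` rows, while `by_row` keeps only the row whose `region` matches the predicate.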

DTLE Core Scenarios

DTLE targets several use cases that go beyond traditional replication: remote multi‑active deployments, data aggregation and distribution across databases, real‑time data subscription via Kafka, and online data migration with minimal downtime.

Design Principles

DTLE is built around two key principles: ease of use (simple deployment without external dependencies) and reliability (distributed architecture with automatic failover and metadata consistency).

Architecture

DTLE consists of two process roles: a Manager that stores metadata, receives and dispatches jobs, and monitors agents; and Agents that handle binlog extraction, filtering, compression, transmission, and replay. Jobs are defined in JSON and submitted via HTTP to the Manager, which assigns them to available agents.
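As an illustration of the "JSON job over HTTP" flow, the sketch below assembles a job description and prepares a POST request without sending it. The JSON keys, the `/jobs` path, and the host names are hypothetical placeholders; DTLE's actual job schema and endpoints are not shown in this article and should be taken from its documentation.

```python
import json
import urllib.request


def build_job(name, src_dsn, dest_dsn, tables):
    """Assemble a job description; every key here is a hypothetical placeholder."""
    return {
        "name": name,
        "source": {"dsn": src_dsn, "tables": tables},
        "destination": {"dsn": dest_dsn},
    }


def build_submit_request(manager_url, job):
    """Prepare (but do not send) an HTTP POST carrying the job as JSON."""
    body = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        manager_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


job = build_job(
    "orders-sync",
    "mysql://src.example:3306",
    "mysql://dst.example:3306",
    ["shop.orders"],
)
request = build_submit_request("http://manager.example/jobs", job)
# Passing `request` to urllib.request.urlopen() would perform the submission.
```

Keeping the job as plain data makes it easy to version, review, and resubmit, which is presumably part of why a declarative JSON format was chosen.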

Cluster Mechanism

Multiple Manager nodes form a Raft‑based consensus group for metadata replication and leader election. Worker agents report health status, and the leader reassigns tasks if an agent becomes unresponsive.
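The reassignment step can be pictured with a small sketch. It assumes a heartbeat timeout and a "least‑loaded healthy agent" placement rule; both are illustrative choices, not DTLE's documented scheduling policy, and the function and agent names are hypothetical.

```python
from typing import Dict, List


def reassign_jobs(
    assignments: Dict[str, List[str]],
    last_heartbeat: Dict[str, float],
    now: float,
    timeout: float,
) -> Dict[str, List[str]]:
    """Move jobs from agents with a stale heartbeat to the least-loaded healthy agent."""
    healthy = [
        agent
        for agent in assignments
        if now - last_heartbeat.get(agent, float("-inf")) <= timeout
    ]
    if not healthy:
        raise RuntimeError("no healthy agent available")

    result = {agent: list(assignments[agent]) for agent in healthy}
    for agent, jobs in assignments.items():
        if agent in result:
            continue  # healthy agents keep their jobs
        for job in jobs:
            # Ties are broken by agent name so the outcome is deterministic.
            target = min(result, key=lambda a: (len(result[a]), a))
            result[target].append(job)
    return result


assignments = {"agent-1": ["job-a"], "agent-2": ["job-b", "job-c"], "agent-3": []}
last_heartbeat = {"agent-1": 100.0, "agent-2": 80.0, "agent-3": 99.0}
plan = reassign_jobs(assignments, last_heartbeat, now=101.0, timeout=10.0)
```

In this example `agent-2` has been silent for longer than the timeout, so its two jobs are spread across the two remaining agents.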

Supported Topologies

DTLE supports various topologies, including simple 1‑to‑1 sync, n‑to‑1 aggregation, and 1‑to‑n distribution, as well as cross‑data‑center bidirectional sync with link compression to reduce bandwidth usage.
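Bidirectional sync has to avoid echoing a change back to the side it came from. One common way to do this, sketched below, is to look at the server UUID embedded in each GTID (`server_uuid:transaction_id`) and skip transactions that originated on the target. This assumes the original GTID is preserved when a transaction is replayed; whether DTLE works exactly this way is not stated here, and the function names and UUIDs are illustrative.

```python
def origin_uuid(gtid: str) -> str:
    """Return the server UUID part of a GTID of the form 'uuid:transaction_id'."""
    uuid, _, txn = gtid.rpartition(":")
    if not uuid or not txn.isdigit():
        raise ValueError(f"not a single-transaction GTID: {gtid!r}")
    return uuid.lower()


def should_replay(gtid: str, target_uuid: str) -> bool:
    """Skip transactions that originated on the target, so they are not echoed back."""
    return origin_uuid(gtid) != target_uuid.lower()


# Illustrative server UUIDs for the two data centers.
DC_A = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
DC_B = "7c1de0ca-0000-11e1-9e33-c80aa9429562"

echoed = should_replay(f"{DC_A}:23", target_uuid=DC_A)  # originated on A, target is A
fresh = should_replay(f"{DC_B}:7", target_uuid=DC_A)    # originated on B, target is A
```

With these inputs `echoed` is `False` and `fresh` is `True`, so only the change that truly originated on the other side is applied.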

Technology Stack

The system is implemented in Go and leverages open‑source components such as HashiCorp Nomad (cluster scheduling), Consul (distributed KV store), Serf (gossip‑based node health detection), and NATS (lightweight messaging).

Key Features

Clustered deployment with automatic failover

Binlog and SQL replay modes

Parallel replay using MySQL 5.7 logical timestamps

Incremental checkpointing

Full and incremental sync

Database, table, and row‑level filtering

Link compression and cross‑network support

Automatic table creation and DDL handling
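For the parallel‑replay feature, recall that with MySQL 5.7's logical clock each transaction in the binlog carries a `sequence_number` and a `last_committed` value. The sketch below shows one simplified way to turn those values into batches that can be applied concurrently: a transaction joins the current batch only if its `last_committed` is lower than the first `sequence_number` in that batch. This is an illustration of the idea, not DTLE's actual scheduler, and `plan_batches` is a hypothetical name.

```python
from typing import List, Tuple


def plan_batches(txns: List[Tuple[int, int]]) -> List[List[int]]:
    """Group transactions, given as (sequence_number, last_committed) in binlog
    order, into batches whose members do not depend on each other."""
    batches: List[List[int]] = []
    current: List[int] = []
    for sequence_number, last_committed in txns:
        # A dependency on a transaction inside the current batch closes the batch.
        if current and last_committed >= current[0]:
            batches.append(current)
            current = []
        current.append(sequence_number)
    if current:
        batches.append(current)
    return batches


binlog = [(1, 0), (2, 0), (3, 0), (4, 2), (5, 3), (6, 3)]
batches = plan_batches(binlog)
```

For this input the plan is `[[1, 2, 3], [4, 5, 6]]`: transaction 4 depends on transaction 2, so it must wait for the first batch, while 5 and 6 only depend on transaction 3 and can run alongside 4.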

Limitations

Supports only MySQL 5.6/5.7 (InnoDB)

Requires GTID and specific binlog settings

Limited character set support

No trigger or custom authentication support
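The GTID and binlog prerequisites lend themselves to a pre‑flight check. The article does not list the exact settings, so the values below are assumptions typical of binlog‑based CDC tools and should be verified against the DTLE documentation. The sketch works on a plain dictionary, such as one built from `SHOW GLOBAL VARIABLES` output, and does not connect to a server.

```python
from typing import Dict, List

# Assumed requirements (not confirmed by this article).
ASSUMED_REQUIREMENTS = {
    "gtid_mode": "ON",
    "enforce_gtid_consistency": "ON",
    "log_bin": "ON",
    "binlog_format": "ROW",
}


def find_violations(
    variables: Dict[str, str],
    requirements: Dict[str, str] = ASSUMED_REQUIREMENTS,
) -> List[str]:
    """Return one message per variable that is missing or has an unexpected value."""
    violations = []
    for name, expected in requirements.items():
        actual = str(variables.get(name, "<unset>"))
        if actual.upper() != expected:
            violations.append(f"{name}: expected {expected}, found {actual}")
    return violations


server_variables = {
    "gtid_mode": "ON",
    "enforce_gtid_consistency": "ON",
    "log_bin": "ON",
    "binlog_format": "MIXED",
}
problems = find_violations(server_variables)
```

Running such a check before submitting a job surfaces configuration problems early, instead of as a failed sync later.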

Comparison with Similar Tools

Compared with Debezium, StreamSets, and Otter, DTLE offers full‑load plus incremental sync, global metadata consistency without global locks, multi‑level data filtering, bidirectional GTID tracking, and both single‑node and clustered deployment options.

Demo and Cloud‑Sync Case

Demo scripts showcase one‑way sync, table‑level aggregation, data distribution, and cross‑IDC bidirectional replication. A cloud‑sync benchmark synchronizes ~1 billion rows between Alibaba Cloud RDS and JD Cloud RDS across regions, achieving >1000 rows/s after a 5‑hour full load.

Resources

GitHub repositories for DTLE source code, demo scripts, and PPT slides are provided, along with an invitation to join the DTLE technical community for support.

Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
