Databases 13 min read

Sync MySQL to ClickHouse Using QingCloud MySQL Plus & MaterializeMySQL

This article explains how to break the barrier between heterogeneous databases by synchronizing MySQL to ClickHouse using QingCloud MySQL Plus and the MaterializeMySQL engine, covering MySQL replication history, architecture details, HTAP scenarios, and providing practical code examples for setup and operation.

Qingyun Technology Community
Qingyun Technology Community
Qingyun Technology Community
Sync MySQL to ClickHouse Using QingCloud MySQL Plus & MaterializeMySQL

Background

MySQL has been the world’s most popular OLTP database since 2000, while ClickHouse has recently attracted great attention as an OLAP database. This article shows how to synchronize MySQL data to ClickHouse.

1. MySQL Replication Development

MySQL 3.23 (2001) introduced asynchronous replication, which was not suitable for production. MySQL 5.7.2 (2013) added enhanced semi‑sync replication, making it marginally enterprise‑ready. MySQL 5.7.17 (2016) introduced MGR, evolving into a financially‑grade solution.

2. QingCloud MySQL Plus

QingCloud MySQL Plus provides strong consistency and high availability. Its architecture uses a Xenon component based on a Raft‑like algorithm to manage MySQL election and health checks, while data synchronization still relies on Semi‑Sync Replication or MGR, achieving strong consistency, automatic leader election, and cross‑region disaster recovery.

Key features:

Multi‑replica synchronization ensures at least one replica stays fully consistent with the primary, providing financial‑grade strong consistency.

Automatic, decentralized leader election with sub‑second failover.

Cross‑zone deployment for disaster recovery.

3. ClickHouse Synchronizing MySQL Data

To accelerate OLAP queries, QingCloud MySQL Plus leverages ClickHouse for data synchronization.

3.1 ClickHouse Overview

ClickHouse is a column‑oriented DBMS for online analytical processing (OLAP). Developed in 2008 for Yandex Metrica, it became open‑source in 2016 and has seen rapid adoption.

3.2 MySQL Table Engine

The original ClickHouse MySQL Table Engine maps individual MySQL tables, fetching schema from information_schema and pulling data on query execution.

Mapping to MySQL table

Fetch table structure from MySQL

Fetch data from MySQL during query

Limitations: only table‑level mapping and potential network/read pressure on MySQL during queries.

3.3 MySQL Database Engine

The MySQL Database Engine provides database‑level mapping, fetching all table structures from information_schema . It still suffers from the same data‑transfer drawbacks as the Table Engine.

3.4 Third‑Party Software Synchronization

Tools like Canal or Kafka can parse MySQL binlogs and write to ClickHouse, offering controllable sync flows but increasing complexity and operational overhead.

4. MaterializeMySQL Engine

To address the remaining issues, QingCloud ClickHouse team developed the MaterializeMySQL engine, an experimental feature merged into ClickHouse 20.8. It maps MySQL databases and consumes binlog events into ClickHouse’s MergeTree.

4.1 Creating a MaterializeMySQL Database

CREATE DATABASE test ENGINE = MaterializeMySQL('172.17.0.3:3306', 'demo', 'root', '123');

Parameters:

172.17.0.3:3306 – MySQL address and port

demo – MySQL database name

root – MySQL sync user

123 – User password

4.2 Design Workflow

Check MySQL variables

Select historical data

Consume new data

4.3 Function Process

CheckMySQLVars validates required MySQL settings (e.g., log_bin=ON, binlog_format=ROW, etc.).

SHOW VARIABLES WHERE (Variable_name = 'log_bin' AND upper(Value) = 'ON')
OR (Variable_name = 'binlog_format' AND upper(Value) = 'ROW')
OR (Variable_name = 'binlog_row_image' AND upper(Value) = 'FULL')
OR (Variable_name = 'default_authentication_plugin' AND upper(Value) = 'MYSQL_NATIVE_PASSWORD')
OR (Variable_name = 'log_bin_use_v1_row_events' AND upper(Value) = 'OFF');

prepareSynchronized initializes GTID, cleans outdated tables, pulls historical data, and establishes the binlog channel.

std::optional<MaterializeMetadata> MaterializeMySQLSyncThread::prepareSynchronized() {
    connection = pool.get();
    MaterializeMetadata metadata(connection, ...);
    if (!metadata.need_dumping_tables.empty()) {
        Position position;
        position.update(metadata.binlog_position, metadata.binlog_file, metadata.executed_gtid_set);
        metadata.transaction(position, [&]() {
            cleanOutdatedTables(database_name, global_context);
            dumpDataForTables(connection, metadata, ...);
        });
    }
    connection->query("COMMIT").execute();
}

Synchronized reads binlog events and converts them to ClickHouse statements, flushing buffers to disk.

client.connect();
client.startBinlogDumpGTID(...);
while (!isCancelled()) {
    BinlogEventPtr binlog_event = client.readOneBinlogEvent(...);
    if (binlog_event) onEvent(buffers, binlog_event, *metadata);
    if (!buffers.data.empty()) flushBuffersData(buffers, *metadata);
}

5. HTAP Application Scenario

By linking MySQL replication with ClickHouse analysis, MySQL Plus + ClickHouse can realize Hybrid Transactional/Analytical Processing (HTAP). ClickHouse can serve as an analytical instance within the MySQL replication topology, reducing reliance on read‑only MySQL replicas.

Current support includes MySQL 5.7 and 8.0; MySQL 5.6 is not supported. MaterializeMySQL is experimental and welcomes community contributions.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

ClickHouseMySQLHTAPData SynchronizationDatabase ReplicationMaterializeMySQL
Qingyun Technology Community
Written by

Qingyun Technology Community

Official account of the Qingyun Technology Community, focusing on tech innovation, supporting developers, and sharing knowledge. Born to Learn and Share!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.