Databases 13 min read

Mastering MySQL‑to‑Elasticsearch Sync: 4 Strategies & Top Migration Tools

This guide compares four MySQL‑to‑Elasticsearch synchronization methods—synchronous dual‑write, asynchronous MQ‑based dual‑write, timer‑driven SQL extraction, and real‑time Binlog replication—and reviews popular CDC tools such as Canal, Alibaba Cloud DTS, Databus, and others to help you choose the right solution.

Su San Talks Tech
Su San Talks Tech
Su San Talks Tech
Mastering MySQL‑to‑Elasticsearch Sync: 4 Strategies & Top Migration Tools

1. Introduction

In many projects MySQL serves as the primary business database while Elasticsearch (ES) is used for query workloads to achieve read‑write separation and handle massive, complex queries.

The key challenge is keeping MySQL and ES data synchronized. Below are four common synchronization approaches.

2. Synchronization Approaches

2.1 Synchronous Dual‑Write

Write to MySQL and ES simultaneously.

Advantages:

Simple business logic

High real‑time performance

Disadvantages:

Hard‑coded; every MySQL write must also contain ES code

Strong coupling between business and search layers

Risk of data loss if one write fails

Performance degradation due to extra ES writes

2.2 Asynchronous Dual‑Write (via MQ)

Use a message queue to decouple writes from multiple data sources.

Advantages:

High performance

Reduced data‑loss risk thanks to MQ consumption guarantees

Isolated source writes, easier to add new sources

Disadvantages:

Hard‑coded integration for new sources

Increased system complexity with a message broker

Potential latency because of asynchronous consumption

2.3 SQL‑Based Extraction (Timer)

For scenarios with lower real‑time requirements, a timer extracts changed rows based on a timestamp column.

Add a timestamp column that updates on any CRUD operation.

Leave existing CRUD code unchanged.

Run a scheduled job that scans the table for rows changed within the interval.

Write each changed row to ES.

Advantages:

No code intrusion, no hard‑coding.

No strong business coupling; original performance unchanged.

Simple worker implementation.

Disadvantages:

Lower timeliness; fixed polling interval introduces delay.

Additional load on the database; can be mitigated by polling a replica.

Classic solution: use Logstash to periodically query MySQL via SQL and write incremental changes to ES.

2.4 Real‑Time Binlog Synchronization

Leverages MySQL’s binary log to achieve real‑time, non‑intrusive sync.

Typical steps:

Read Binlog entries for the target tables.

Convert entries to MQ messages.

Implement an MQ consumer.

Consume each message and write it to ES.

Advantages:

No code intrusion or hard‑coding.

Existing systems remain unchanged.

High performance and business decoupling.

Disadvantages:

Complexity of building a Binlog‑based system.

If MQ is used, it inherits the same latency risk as the asynchronous approach.

3. Data Migration Tool Selection

The Binlog real‑time method is the most widely adopted, spawning several migration tools that subscribe to Binlog changes (CDC).

3.1 Canal

Parses database incremental logs and provides subscription/consumption, primarily for MySQL.

Canal pretends to be a MySQL slave, receives the master’s Binlog via the dump protocol, parses it into JSON, and forwards it to ES via TCP or MQ.

Core workflow: Binlog Parser extracts and pushes events; EventSink filters, routes, and processes data.

3.2 Alibaba Cloud DTS

Data Transmission Service supports migration, real‑time subscription, and synchronization across RDBMS, NoSQL, and OLAP sources.

Features include multi‑source support, high performance (up to 70 MB/s, 200 k TPS), high availability, and a visual management console. It is a paid service.

3.3 Databus

LinkedIn’s open‑source CDC system captures changes from MySQL and Oracle with low latency, supports transactional consistency, and offers unlimited replay for lagging consumers.

3.4 Other Tools

Flink : Distributed stream processing engine for bounded and unbounded data streams.

CloudCanal : Commercial data sync/migration product.

Maxwell : Emits row‑level changes as JSON without client code.

DRD : Alibaba’s distributed database middleware.

yugong : Migrates data from Oracle to MySQL.

4. Conclusion

This article presented four MySQL‑to‑ES synchronization strategies and introduced several popular migration tools to help you choose the most suitable solution.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

ElasticsearchmysqlBinlogCanaldata synchronizationDTSCDC
Su San Talks Tech
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.