How to Keep MySQL and Elasticsearch in Sync: 4 Practical Strategies

This article examines four common approaches for synchronizing product data from MySQL to Elasticsearch—synchronous dual write, asynchronous dual write with message queues, scheduled batch jobs, and binlog‑based data subscription—detailing their advantages, drawbacks, and implementation considerations.

ITPUB

Jan 30, 2023

1. Synchronous Dual Write

The most straightforward method has the application write each change to MySQL and, in the same request, write the same data to Elasticsearch.

Pros: Simple to implement.

Cons: Synchronization logic is coupled to business code, every request pays for two writes, and the approach is hard to extend when search requirements become more advanced.
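
As a rough illustration of the pattern and its coupling, here is a minimal sketch. InMemoryStore and create_product are hypothetical names, and the in-memory dictionaries only stand in for a MySQL table and an Elasticsearch index; no real client library is used.

```python
class InMemoryStore:
    """Hypothetical stand-in for a MySQL table or an Elasticsearch index."""

    def __init__(self, fail=False):
        self.rows = {}
        self.fail = fail  # set to True to simulate an unavailable store

    def save(self, key, doc):
        if self.fail:
            raise RuntimeError("write failed")
        self.rows[key] = dict(doc)


def create_product(db, index, product):
    """Synchronous dual write: both writes happen inside the same request."""
    db.save(product["id"], product)
    # If this second write fails, the caller sees an error even though the
    # database row already exists. This is the coupling listed under Cons.
    index.save(product["id"], product)
```

In this sketch the request latency is the sum of both writes, and a search outage turns into a failed product request.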

2. Asynchronous Dual Write

When a product is created, the product service publishes the change to a message queue (MQ); a dedicated search service consumes the messages and writes to Elasticsearch, which decouples the product service from the synchronization logic.

Pros: Decouples the services; the MQ keeps synchronization near real time (a delay of seconds).

Cons: Introduces additional components and complexity.
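
The producer and consumer roles described above can be sketched as follows. This is only an illustration under assumptions: queue.Queue stands in for a real MQ topic, the dictionaries stand in for MySQL and Elasticsearch, and the message shape ("op", "doc") is invented for the example.

```python
import json
import queue
import threading

product_events = queue.Queue()  # stand-in for an MQ topic (assumption)
mysql_rows = {}                 # stand-in for the MySQL product table
search_index = {}               # stand-in for the Elasticsearch index


def create_product(product):
    """Product service: write to the database, then publish an event."""
    mysql_rows[product["id"]] = product
    product_events.put(json.dumps({"op": "upsert", "doc": product}))


def run_search_consumer():
    """Search service: consume events and apply them to the index."""
    while True:
        message = product_events.get()
        if message is None:  # sentinel used here to stop the worker
            break
        event = json.loads(message)
        doc = event["doc"]
        if event["op"] == "delete":
            search_index.pop(doc["id"], None)
        else:
            search_index[doc["id"]] = doc
```

The product service never touches the index; running run_search_consumer in its own thread or process is what a separate search service would do.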

3. Scheduled Tasks

A periodic job reads changes from MySQL and pushes them to Elasticsearch. Choosing the right frequency is critical: high frequency can cause resource spikes, while low frequency reduces timeliness.

Pros: Easy to implement.

Cons: Hard to guarantee real-time freshness; each run can cause CPU and memory load spikes on the database.
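
One common way to "read changes" is to poll by a modification timestamp and remember a watermark between runs. The sketch below assumes a hypothetical product table with an updated_at column, and uses sqlite3 and a dictionary purely as stand-ins for MySQL and Elasticsearch.

```python
import sqlite3


def sync_changes(conn, index, last_synced):
    """Copy rows modified after `last_synced` into `index`.

    Returns the new watermark to pass to the next scheduled run.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM product "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_synced,),
    ).fetchall()
    for product_id, name, updated_at in rows:
        index[product_id] = {"id": product_id, "name": name}
        # Rows written later with the same timestamp as the watermark would
        # be skipped by the strict ">" comparison; this sketch ignores that.
        last_synced = updated_at
    return last_synced
```

A scheduler (cron or similar) would call sync_changes at the chosen interval; the interval is the trade-off between load and freshness described above.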

4. Data Subscription (Canal)

MySQL's binlog, the same log that drives master-slave replication, can also be subscribed to by other consumers. Frameworks like Canal emulate a slave client, capture the changes, and forward them to downstream systems.

Canal‑adapter provides ready‑made adapters, including an Elasticsearch adapter that can sync data with zero custom code. However, complex aggregations (e.g., wide tables) often still require custom client logic to aggregate data before indexing.

Pros: Low business intrusion and good real‑time performance.
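
When custom client logic is needed, its core is usually a mapping from a row change event to an index operation. The sketch below is illustrative only: the event fields ("type", "pkNames", "data") are loosely modelled on Canal's flat JSON messages and should be treated as assumptions, and to_bulk_actions is a hypothetical helper that produces lines in the shape of Elasticsearch bulk requests without sending them.

```python
def to_bulk_actions(event, index_name="products"):
    """Translate one row change event into Elasticsearch bulk request lines."""
    op = event["type"]
    if op not in ("INSERT", "UPDATE", "DELETE"):
        raise ValueError(f"unsupported event type: {op}")

    key = event["pkNames"][0]  # assume a single-column primary key
    lines = []
    for row in event["data"]:
        meta = {"_index": index_name, "_id": str(row[key])}
        if op == "DELETE":
            lines.append({"delete": meta})
        else:
            # Inserts and updates both become full-document index operations,
            # so replaying the same event twice gives the same result.
            lines.append({"index": meta})
            lines.append(row)
    return lines
```

Wide-table aggregation would sit between the event and this mapping, for example by looking up related rows before building the document.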

Popular open-source binlog subscription tools:

Canal – Alibaba, Java, supports Kafka/RocketMQ, high availability.

Maxwell – Zendesk, Java, supports Kafka/RabbitMQ/Redis, high availability.

python-mysql-replication – Community, Python, custom message format.

The same synchronization patterns apply when replicating MySQL to other stores such as HBase.
