
Mastering Elasticsearch Data Sync and Cluster Architecture: 3 Strategies Explained

This article explains three Elasticsearch data‑synchronization methods, compares their pros and cons, and then dives into ES cluster structure, node roles, shard allocation, distributed queries, split‑brain handling, and fault‑tolerance mechanisms, providing a comprehensive guide for developers and ops engineers.

Raymond Ops

This chapter shares Elasticsearch (ES) data synchronization solutions and ES cluster knowledge.

1. Elasticsearch Data Synchronization

1.1 Data sync issue

Elasticsearch stores hotel data that originates in a MySQL database, so any change in MySQL must be reflected in Elasticsearch. In a microservice architecture, the hotel-management service (which operates MySQL) and the hotel-search service (which operates Elasticsearch) may be deployed separately, raising the question of how to keep the two data stores synchronized.

1.2 Synchronization approaches

Solution 1: Synchronous call

After writing to MySQL, the hotel-management service directly calls the search service's update interface and waits until Elasticsearch has been updated before returning.

Solution 2: Asynchronous notification

After writing to MySQL, the hotel-management service publishes an event to a message queue; the search service consumes the event and updates Elasticsearch accordingly.

Solution 3: Binlog listener

A binlog-listening middleware (such as Canal) monitors the MySQL binlog; whenever data changes, it notifies the search service to update Elasticsearch. Neither service needs to know about the other.

1.3 Comparison of the three solutions

Solution 1 – Synchronous call : Pros : simple and straightforward. Cons : high service coupling.

Solution 2 – Asynchronous notification : Pros : low coupling, moderate implementation difficulty. Cons : depends on the reliability of the message queue.

Solution 3 – Binlog listener : Pros : completely decouples the services. Cons : enabling the binlog adds load to the database, and the implementation is complex.
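As an illustration of Solution 2, here is a minimal sketch of the asynchronous pattern. An in-memory queue stands in for a real message broker, and the event name and data shapes are assumptions for illustration only:

```python
import queue

message_bus = queue.Queue()  # stands in for a real broker (RabbitMQ, Kafka, ...)
es_index = {}                # stands in for the Elasticsearch hotel index

def hotel_service_update(hotel_id, data, db):
    """Hotel-management service: write MySQL, then publish an event."""
    db[hotel_id] = data                                  # 1. update MySQL
    message_bus.put(("hotel.updated", hotel_id, data))   # 2. notify via MQ

def search_service_consume():
    """Hotel-search service: consume events and sync Elasticsearch."""
    while not message_bus.empty():
        event, hotel_id, data = message_bus.get()
        if event == "hotel.updated":
            es_index[hotel_id] = data                    # 3. update the index

mysql = {}
hotel_service_update(1, {"name": "Seaside Inn", "price": 120}, mysql)
search_service_consume()
assert es_index[1] == mysql[1]
```

The two services share only the message contract, which is where the low coupling (and the dependency on MQ reliability) comes from.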

2. Elasticsearch Cluster

2.1 Cluster structure

A single‑node Elasticsearch faces two problems: massive data storage and a single point of failure. Elasticsearch solves these by splitting an index into N shards stored across multiple nodes and replicating each shard on other nodes.


2.2 Building a cluster

The number of shards is defined when creating an index and cannot be changed later; the number of replicas, by contrast, can be adjusted at any time. Example syntax:

PUT /itcast
{
  "settings": {
    "number_of_shards": 3, // shard count
    "number_of_replicas": 1 // replica count
  },
  "mappings": {
    "properties": {
      // mapping definitions ...
    }
  }
}

Detailed cluster setup is omitted.
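Although the shard count is fixed at index creation, the replica count of a live index can be changed through the `_settings` endpoint:

```
PUT /itcast/_settings
{
  "number_of_replicas": 2
}
```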

2.3 Node roles

master eligible : can become the master node, managing cluster state, shard allocation, and handling index creation/deletion.

data : stores data and handles search, aggregation, and CRUD operations.

ingest : preprocesses documents before they are indexed.

coordinating : routes requests to appropriate nodes, merges results, and returns them to the client.
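For reference, node roles are assigned in elasticsearch.yml. A minimal sketch using the ES 7.9+ syntax (older versions use boolean settings such as node.master and node.data instead):

```yaml
# elasticsearch.yml — a dedicated master-eligible node (ES 7.9+)
node.roles: [ master ]

# a dedicated data node would instead use:
# node.roles: [ data ]

# an empty list yields a pure coordinating node:
# node.roles: [ ]
```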

2.4 Cluster role division

Each node role has distinct responsibilities; in a production cluster it is recommended to give each node a single dedicated role.


2.5 Split‑brain scenario

When all nodes are master‑eligible and the master fails, a new election takes place. If a network partition occurs, each side may elect its own master, producing a split‑brain in which two masters accept conflicting cluster‑state changes. To avoid this, electing a master requires a quorum of (eligible nodes / 2) + 1 votes, so at most one partition can ever elect. In Elasticsearch 7.x and later this quorum is managed automatically; in earlier versions it must be configured via

discovery.zen.minimum_master_nodes
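The quorum rule can be illustrated with a tiny sketch: however N eligible nodes are split into two partitions, at most one side can hold a majority, so at most one master is elected.

```python
def quorum(eligible_nodes: int) -> int:
    """Minimum votes required to elect a master."""
    return eligible_nodes // 2 + 1

def can_elect(partition_size: int, eligible_nodes: int) -> bool:
    """A partition can elect a master only if it holds a quorum."""
    return partition_size >= quorum(eligible_nodes)

# With 3 master-eligible nodes split 2/1 by a network partition,
# only the 2-node side can elect a master:
assert quorum(3) == 2
assert can_elect(2, 3)
assert not can_elect(1, 3)

# No matter how N nodes are split in two, both sides can never elect at once:
for n in range(1, 20):
    for left in range(n + 1):
        assert not (can_elect(left, n) and can_elect(n - left, n))
```

This is also why an odd number of master-eligible nodes is preferred: with 3 nodes the cluster tolerates one failure, while 4 nodes tolerate no more.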


2.6 Quick recap

Master‑eligible nodes participate in master election and manage cluster state.

Data nodes store and query data.

Coordinating nodes route requests and merge results.

2.7 Distributed storage

When a new document is added, the coordinating node determines the target shard by hashing the routing value, which defaults to the document ID:

shard = hash(_routing) % number_of_shards

Because this algorithm depends on the shard count, the shard count cannot be changed after index creation; otherwise previously indexed documents could no longer be located.
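The routing rule can be sketched as follows. Elasticsearch internally hashes with murmur3; zlib.crc32 stands in here as a stable hash, so this is an illustration rather than the exact function:

```python
import zlib

def route(doc_id: str, number_of_shards: int) -> int:
    """Pick the shard for a document: hash(_routing) % number_of_shards.
    The _routing value defaults to the document ID."""
    return zlib.crc32(doc_id.encode()) % number_of_shards

# The same ID always maps to the same shard...
assert route("hotel-42", 3) == route("hotel-42", 3)

# ...but changing the shard count changes the mapping for many documents,
# which is why the shard count is fixed at index creation:
ids = [f"hotel-{i}" for i in range(100)]
moved = sum(route(i, 3) != route(i, 5) for i in ids)
assert moved > 0
```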


2.8 Distributed query phases

Scatter phase : the coordinating node distributes the query to each shard.

Gather phase : the coordinating node collects results from data nodes, merges them, and returns the final result set.
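The two phases can be sketched with toy data (documents, scores, and the substring "match" are all made up for illustration; real shards run full Lucene queries):

```python
import heapq

def search_shard(shard_docs, query, size):
    """Scatter: each shard independently returns its own top-`size` hits."""
    hits = [(score, doc) for doc, score in shard_docs.items() if query in doc]
    return heapq.nlargest(size, hits)

def coordinate(shards, query, size):
    """Gather: the coordinating node merges per-shard results
    into a single global top-`size` list."""
    all_hits = []
    for shard_docs in shards:
        all_hits.extend(search_shard(shard_docs, query, size))
    return [doc for score, doc in heapq.nlargest(size, all_hits)]

shards = [
    {"hotel-a": 3.0, "inn-b": 1.0},   # shard 0
    {"hotel-c": 2.5},                 # shard 1
    {"hotel-d": 0.5, "hotel-e": 4.0}, # shard 2
]
assert coordinate(shards, "hotel", 2) == ["hotel-e", "hotel-a"]
```

Note that each shard must return `size` hits even though only `size` survive globally, which is why deep pagination is expensive on large clusters.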


2.9 Fault tolerance

The master node monitors node health; if a node crashes, the shard copies it hosted are rebuilt on the surviving nodes (replicas elsewhere keep the data available in the meantime), ensuring data safety.
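A simplified sketch of the rebalancing idea, assuming every lost shard copy can be rebuilt from a replica on a surviving node (this toy round-robin stands in for the real allocator, which also weighs disk usage and allocation rules):

```python
from itertools import cycle

def reallocate(assignments, failed_node):
    """Redistribute the failed node's shard copies round-robin
    across the surviving nodes."""
    survivors = [n for n in assignments if n != failed_node]
    orphaned = assignments.get(failed_node, [])
    new = {n: list(shards) for n, shards in assignments.items()
           if n != failed_node}
    for shard, node in zip(orphaned, cycle(survivors)):
        new[node].append(shard)
    return new

# P = primary shard, R = replica shard
before = {"node1": ["P0", "R1"], "node2": ["P1", "R2"], "node3": ["P2", "R0"]}
after = reallocate(before, "node1")
assert "node1" not in after
# every shard copy that lived on node1 is now hosted elsewhere:
assert sorted(after["node2"] + after["node3"]) == ["P0", "P1", "P2", "R0", "R1", "R2"]
```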


After a master failure, an eligible master is elected, and the cluster rebalances shard allocation.
