
Elasticsearch Cluster and Data Layer Architecture Overview

This article explains Elasticsearch’s cluster and data layer architecture, covering nodes, indices, shards, replicas, mixed and tiered deployment models, storage options, and the trade‑offs of different distributed system designs for scalable systems.

Top Architect

Elasticsearch is a widely used open‑source search and analytics engine. This article introduces its overall architecture, focusing on the cluster layer and the data layer.

Cluster layer concepts: a node is a running Elasticsearch process; an index is a logical collection of documents; a shard is a partition of an index that holds a subset of its documents; and a replica is a copy of a shard that provides redundancy and read scalability.
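As a concrete illustration, shard and replica counts are set per index at creation time; a sketch of index settings, with the index name and counts as illustrative examples:

```json
PUT /products
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}
```

Note that number_of_replicas can be changed dynamically later, while number_of_shards is fixed once the index is created.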

[Diagram omitted: a typical cluster layout showing nodes, primary shards, and replica shards.]

Indexing process: when a document is indexed, it is routed to a primary shard (by hashing its routing value, which defaults to the document ID), indexed there, and then replicated to that shard's replicas; the operation is acknowledged only after all in‑sync replica copies have responded.
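The routing step can be sketched in Python. This is a simplified model: real Elasticsearch applies murmur3 hashing to the routing value, while the sketch below uses MD5 just to get a deterministic hash.

```python
import hashlib

def route_to_shard(routing_value: str, num_primary_shards: int) -> int:
    """Pick a primary shard for a document (simplified model of
    Elasticsearch routing: shard = hash(_routing) % number_of_shards)."""
    # MD5 gives a deterministic hash here; Elasticsearch itself uses murmur3.
    digest = hashlib.md5(routing_value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_primary_shards

# Documents with the same routing value always land on the same shard,
# which is why number_of_shards cannot change after index creation:
# changing it would invalidate the placement of every existing document.
shard = route_to_shard("doc-42", 5)
```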

Role deployment methods:

Mixed deployment (default): the master, data, and coordinating ("transport") roles coexist on every node, which simplifies setup but causes resource contention and limits connection scaling.

Tiered deployment: dedicated coordinating ("transport") nodes handle request routing and result aggregation, while dedicated data nodes store and process shards. This improves isolation and scalability, and allows rolling (hot) updates without downtime.
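In a tiered deployment, roles are restricted per node in elasticsearch.yml. A sketch using the modern node.roles syntax (one file per node; the comments mark which tier each setting belongs to):

```yaml
# elasticsearch.yml on a dedicated coordinating ("transport") node:
# an empty roles list means the node only routes requests and
# aggregates results, holding no data and never becoming master.
node.roles: []

# elasticsearch.yml on a dedicated data node:
# node.roles: [ data ]

# elasticsearch.yml on a dedicated master-eligible node:
# node.roles: [ master ]
```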

Data layer architecture: Elasticsearch stores index data and metadata on the local file system and supports several store types (e.g. niofs, mmapfs, simplefs, and SMB‑based variants). Replicas protect against node failures and increase query throughput.
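The store type can be chosen per index at creation time; a sketch of the setting, with the index name as an illustrative example (in recent versions the default store type is usually best left alone):

```json
PUT /logs
{
  "settings": {
    "index.store.type": "niofs"
  }
}
```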

Replica benefits:

High availability – traffic can be redirected to remaining replicas if one fails.

Data reliability – prevents data loss when the primary node crashes.

Improved query capacity – adding replicas increases read throughput, since a search can be served by any copy of a shard.

Replica drawbacks include the extra storage and memory cost of each copy, write overhead (every indexing operation must be replicated to all copies), and slower elasticity, since raising the replica count triggers a full copy of shard data.
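The resource cost is easy to quantify: each replica multiplies both the shard count and the raw storage. A small Python sketch (the shard sizes are illustrative assumptions):

```python
def cluster_footprint(num_primaries: int, num_replicas: int,
                      shard_size_gb: float) -> tuple:
    """Total shard copies and raw storage for one index.

    Every primary shard is duplicated num_replicas times, so both the
    shard count and the storage grow linearly with the replica count.
    """
    total_shards = num_primaries * (1 + num_replicas)
    total_storage_gb = total_shards * shard_size_gb
    return total_shards, total_storage_gb

# 5 primaries of 20 GB with 1 replica: 10 shard copies, 200 GB raw storage.
shards, storage = cluster_footprint(5, 1, 20.0)
```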

Two distributed system designs:

1. Local file‑system based – each shard stores its data on the node's local disk; recovering from a node failure requires copying entire shards to a replacement node, which can be time‑consuming.

2. Shared storage based – compute nodes access a distributed file system (e.g., HDFS); shards contain only computation logic, enabling rapid node replacement and independent scaling of storage and compute.

Both approaches have trade‑offs: local storage offers lower latency but higher recovery cost, while shared storage provides elasticity at the expense of potential I/O overhead.
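The recovery‑cost difference can be made concrete with back‑of‑the‑envelope arithmetic; a Python sketch in which the shard size and link speed are illustrative assumptions:

```python
def local_recovery_seconds(shard_size_gb: float, net_gbps: float) -> float:
    """Local-storage design: a failed node's shards must be re-copied
    over the network to a replacement node before it can serve them."""
    # Convert gigabytes to gigabits, then divide by the link speed.
    return shard_size_gb * 8 / net_gbps

# A 100 GB shard over a 10 Gbit/s link needs ~80 seconds per shard just
# for the raw copy; with shared storage (e.g. HDFS) a replacement node
# only reattaches to existing data, so recovery is near-instant.
recovery_time = local_recovery_seconds(100, 10)
```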

Conclusion: Elasticsearch's architecture exemplifies a flexible distributed data system; the choice between mixed and tiered roles, and between local and shared storage, depends on the specific reliability, performance, and scalability requirements.

Tags: distributed systems, Search Engine, Elasticsearch, Sharding, Replica, Cluster Architecture
Written by Top Architect

Top Architect shares practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, as well as architecture evolution with internet technologies. Architects who like to think and share are welcome to exchange ideas and learn together.
