How Elasticsearch’s Cluster Architecture Powers Scalable Search and Analytics
This article explains Elasticsearch’s distributed cluster design, covering nodes, indices, shards, replicas, deployment models, data storage options, and the trade‑offs of different distributed system architectures for search and analytics workloads.
Elasticsearch Cluster Architecture
Elasticsearch is a widely used open‑source search and analytics system that excels in search, JSON document storage, and time‑series data analysis.
Key Concepts
Node: a running Elasticsearch instance, typically a process on a machine.
Index: a logical collection that includes mapping and inverted/forward index files; data may be spread across one or many machines.
Shard: a partition of an index managed by a node; each index is divided into primary and replica shards for scalability and reliability.
Replica: a copy of a shard that provides redundancy, can be kept strongly or eventually consistent with its primary, and helps balance read traffic.
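As a rough illustration of how these concepts relate, the sketch below builds the kind of settings body that is supplied when an index is created and counts the shard copies it implies. `number_of_shards` and `number_of_replicas` are real Elasticsearch index settings; the helper name `total_shard_copies` and the chosen values are assumptions made for this example.

```python
# Settings body of the kind sent when creating an index.
# The values (3 primaries, 1 replica each) are illustrative assumptions.
index_settings = {
    "settings": {
        "number_of_shards": 3,     # primary shards, chosen when the index is created
        "number_of_replicas": 1,   # replica copies per primary, adjustable later
    }
}


def total_shard_copies(body: dict) -> int:
    """Return primaries plus all replica copies for one index (hypothetical helper)."""
    settings = body["settings"]
    primaries = settings["number_of_shards"]
    replicas_per_primary = settings["number_of_replicas"]
    return primaries * (1 + replicas_per_primary)


print(total_shard_copies(index_settings))  # 3 primaries + 3 replicas = 6
```

The cluster spreads these six shard copies across the available nodes, which is what lets one index span many machines.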
Index Process
When a document is indexed, it is routed to a primary shard, indexed there, and then replicated to that shard's replicas before the operation is acknowledged.
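The write path can be sketched in a few lines. This is a simplified in-memory model, not Elasticsearch code: the `Shard` class and the function names are hypothetical, and `zlib.crc32` stands in for the Murmur3 hash that Elasticsearch applies to the routing value (by default the document ID).

```python
import zlib


class Shard:
    """In-memory stand-in for one shard copy (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.docs = {}

    def index(self, doc_id: str, doc: dict) -> None:
        self.docs[doc_id] = doc


def route(doc_id: str, num_primaries: int) -> int:
    # Stand-in hash: crc32 is used only because it is deterministic and in
    # the standard library; the real system hashes the routing value differently.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primaries


def index_document(doc_id, doc, primaries, replicas):
    """Write to the primary first, then to each of its replicas, then acknowledge."""
    shard_no = route(doc_id, len(primaries))
    primaries[shard_no].index(doc_id, doc)
    for replica in replicas[shard_no]:
        replica.index(doc_id, doc)
    return {"shard": shard_no, "acknowledged": True}


primaries = [Shard() for _ in range(3)]
replicas = [[Shard()] for _ in range(3)]  # one replica per primary
result = index_document("user-42", {"name": "Ada"}, primaries, replicas)
n = result["shard"]
print(result["acknowledged"], primaries[n].docs == replicas[n][0].docs)
```

Because the shard number depends on the number of primaries, changing that number would send existing document IDs to different shards, which is one reason the primary count is chosen up front.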
If a shard copy fails, the cluster recovers from the remaining copies: a lost primary is replaced by promoting one of its replicas, and a lost replica is rebuilt from the primary. Until recovery completes, the system operates in a degraded state.
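A minimal sketch of that recovery logic follows. The `assignment` layout, the node names, and the function `handle_node_failure` are assumptions made for this example, and the spare node is assumed to hold no copies beforehand.

```python
def handle_node_failure(assignment: dict, failed_node: str, spare_node: str) -> dict:
    """Promote a surviving replica if a primary was lost, then re-create a replica.

    `assignment` maps shard name -> {"primary": node, "replicas": [nodes]}.
    `spare_node` is assumed to hold no copy of any shard yet.
    """
    updated = {}
    for shard, copies in assignment.items():
        primary = copies["primary"]
        replicas = [n for n in copies["replicas"] if n != failed_node]
        lost_copy = primary == failed_node or len(replicas) < len(copies["replicas"])
        if primary == failed_node:
            if not replicas:
                raise RuntimeError(f"no surviving copy of {shard}")
            primary = replicas.pop(0)    # promotion: no data needs to move
        if lost_copy:
            replicas.append(spare_node)  # new replica: shard data must be copied
        updated[shard] = {"primary": primary, "replicas": replicas}
    return updated


before = {
    "s0": {"primary": "node-a", "replicas": ["node-b"]},
    "s1": {"primary": "node-b", "replicas": ["node-a"]},
}
after = handle_node_failure(before, failed_node="node-a", spare_node="node-c")
print(after["s0"])  # {'primary': 'node-b', 'replicas': ['node-c']}
print(after["s1"])  # {'primary': 'node-b', 'replicas': ['node-c']}
```

The degraded window corresponds to the time between the promotion and the moment the new replica on the spare node has finished copying.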
Role Deployment Methods
Elasticsearch supports two deployment styles:
Mixed deployment (default): Data and transport roles run on the same node, simplifying setup but causing resource contention and connection‑scaling limits.
Tiered deployment: Separate transport nodes handle request routing and result merging, while dedicated data nodes store and process data. This improves isolation and scalability and enables hot upgrades.
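The result-merging step performed by a transport node can be sketched as a k-way merge. This is a simplified model assuming each data node returns its hits already sorted by descending score; the function name and hit layout are hypothetical.

```python
import heapq
from itertools import islice


def merge_shard_results(per_shard_hits, size):
    """Merge per-shard hit lists (each sorted by descending score) into a global top `size`."""
    merged = heapq.merge(*per_shard_hits, key=lambda hit: hit["score"], reverse=True)
    return list(islice(merged, size))


shard_0 = [{"id": "a", "score": 9.1}, {"id": "b", "score": 3.0}]
shard_1 = [{"id": "c", "score": 7.5}, {"id": "d", "score": 6.2}]

top = merge_shard_results([shard_0, shard_1], size=3)
print([hit["id"] for hit in top])  # ['a', 'c', 'd']
```

Running this step on dedicated nodes keeps the memory and CPU it consumes away from the nodes that hold the data, which is the isolation benefit of the tiered model.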
Elasticsearch Data Layer Architecture
Data Storage
Indices and metadata are stored on the local file system, with file-access options such as niofs, mmapfs (memory-mapped files), simplefs, and SMB; of these, mmap typically offers the best performance.
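To make the difference between conventional reads and memory-mapped access concrete, the sketch below reads the same bytes both ways using Python's standard `mmap` module. It only illustrates the access pattern; the file contents are made up, and nothing here reflects how Elasticsearch or Lucene lay out their files.

```python
import mmap
import os
import tempfile

# Write a small file that stands in for an index segment (illustrative data only).
payload = b"term:elasticsearch -> doc ids [1, 5, 9]\n" * 100
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

# Conventional read: seek, then copy the bytes into a process buffer.
with open(path, "rb") as f:
    f.seek(5)
    via_read = f.read(13)

# Memory-mapped read: the file is mapped into the address space and sliced like bytes.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        via_mmap = mapped[5:18]

os.remove(path)
print(via_read == via_mmap)  # True
```

With a mapping, repeated random reads are served from the operating system's page cache without an explicit read call for each access, which is where the performance advantage usually comes from.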
Replica
Each index can be configured with a replica count. For example, a replica count of 2 means that every primary shard has two replica shards, and the master tries to place the three copies on different machines or racks.
Replicas serve three purposes: ensuring service availability, guaranteeing data reliability, and increasing query capacity.
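The placement preference can be sketched as a greedy choice that favors racks not yet holding a copy. This is an illustrative model under stated assumptions, not the actual allocation algorithm; `place_copies` and the node and rack names are hypothetical.

```python
def place_copies(nodes: dict, copies: int) -> list:
    """Pick `copies` nodes for one shard, preferring racks not used yet.

    `nodes` maps node name -> rack name. Two copies never share a node.
    """
    if copies > len(nodes):
        raise ValueError("more copies than nodes")
    chosen, used_racks = [], set()
    remaining = dict(nodes)
    while len(chosen) < copies:
        # Prefer a node on an unused rack; fall back to any remaining node.
        candidates = [n for n, r in remaining.items() if r not in used_racks] or list(remaining)
        node = candidates[0]
        chosen.append(node)
        used_racks.add(remaining.pop(node))
    return chosen


nodes = {"n1": "rack-a", "n2": "rack-a", "n3": "rack-b", "n4": "rack-c"}
print(place_copies(nodes, 3))  # ['n1', 'n3', 'n4']
```

With one primary and two replicas on three different racks, the loss of any single rack still leaves two copies available.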
Issues
Replicas introduce additional resource cost; unused replicas waste capacity.
Write performance suffers because each write must be propagated to every replica.
Scaling replicas dynamically can be slow because each new replica requires a full copy of the shard data.
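A back-of-the-envelope model makes these costs easier to compare. The function below is a rough estimate under simple assumptions (uniform shard size, a constant copy rate, no compression); the name `replica_costs` and the input figures are made up for illustration.

```python
def replica_costs(primary_gb: float, replicas: int, copy_mb_per_s: float) -> dict:
    """Estimate storage, write fan-out, and time to seed one new replica."""
    return {
        # Every replica stores a full copy of the primary data.
        "total_storage_gb": primary_gb * (1 + replicas),
        # One write on the primary plus one on each replica.
        "writes_per_document": 1 + replicas,
        # A new replica starts empty and must copy all primary data.
        "seed_seconds": primary_gb * 1024 / copy_mb_per_s,
    }


print(replica_costs(primary_gb=500, replicas=2, copy_mb_per_s=100))
# {'total_storage_gb': 1500, 'writes_per_document': 3, 'seed_seconds': 5120.0}
```

In this example, adding one more replica of 500 GB at 100 MB/s takes roughly 85 minutes, which is why replicas cannot absorb a sudden traffic spike quickly.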
Distributed System Types
Type 1: Local File‑System Based Distributed System
Data resides on each node’s local disks. When a node fails, a replica of each primary shard it hosted is promoted to primary, and a new replica is created on another node, which requires a full copy of the shard data.
Type 2: Distributed File‑System (Shared Storage) Based System
Storage and compute are separated: shards contain only computation logic, while the actual data lives in a shared distributed file system (e.g., HDFS). When a node fails, a replacement node can quickly attach to the shared storage without large data transfers.
Advantages include elastic resource scaling, finer‑grained management, and better hotspot handling; the main drawback is potentially lower I/O performance compared to local disks.
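The separation of compute and storage can be sketched with a toy model in which a shard holds only a node name and a path into shared storage. Everything here is hypothetical: `SharedStore` is an in-memory stand-in for a distributed file system, and `ComputeShard` is not an Elasticsearch class.

```python
class SharedStore:
    """Stand-in for a shared distributed file system (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self._files = {}

    def write(self, path: str, doc_id: str, doc: dict) -> None:
        self._files.setdefault(path, {})[doc_id] = doc

    def read(self, path: str, doc_id: str) -> dict:
        return self._files[path][doc_id]


class ComputeShard:
    """Holds no data of its own, only a node name and a path into the shared store."""

    def __init__(self, node: str, store: SharedStore, path: str) -> None:
        self.node, self.store, self.path = node, store, path

    def index(self, doc_id: str, doc: dict) -> None:
        self.store.write(self.path, doc_id, doc)

    def get(self, doc_id: str) -> dict:
        return self.store.read(self.path, doc_id)


store = SharedStore()
shard = ComputeShard("node-a", store, "logs/shard-0")
shard.index("1", {"msg": "hello"})

# node-a fails: a new compute shard on node-b attaches to the same path.
# No shard data is copied between nodes.
replacement = ComputeShard("node-b", store, "logs/shard-0")
print(replacement.get("1"))  # {'msg': 'hello'}
```

Every read and write in this model goes through the store, which mirrors the drawback noted above: access to shared storage is usually slower than access to a local disk.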
Conclusion
Both architectures have distinct strengths and weaknesses; choosing the right model depends on workload characteristics, reliability requirements, and operational constraints.
Java High-Performance Architecture