Master/Slave Architecture vs P2P Ring Structure and an Overview of Elasticsearch
This article explains the differences between Master‑Slave and P2P ring architectures, introduces Elasticsearch’s core concepts, internal components, master election, shard routing, indexing and search processes, and discusses how the system avoids split‑brain scenarios and ensures high availability.
The article begins by comparing the traditional Master‑Slave model, where a single master node manages slaves and can become a bottleneck and single point of failure, with a Peer‑to‑Peer (P2P) ring structure that has no central node, offering better scalability at the cost of weaker central control.
It then provides an overview of Elasticsearch (ES), describing it as a distributed, real‑time search and analytics engine built on Lucene, supporting full‑text, structured, and analytical queries.
Key ES concepts are listed, including clusters, nodes, indices, primary and replica shards, types, mappings, documents, fields, allocation, and gateways, explaining their analogues to relational database components.
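To make the relational analogy concrete, here is a minimal sketch of an index definition. The field names (`title`, `published`) and the 5‑shard/1‑replica settings are hypothetical values chosen for illustration: the index plays the role of a database/table, the mapping plays the role of a schema, and each document corresponds to a row of typed fields.

```python
# Hypothetical index definition illustrating the relational analogues:
# index ~ database/table, mapping ~ schema, document ~ row, field ~ column.
index_definition = {
    "settings": {
        "number_of_shards": 5,    # primary shards, fixed at index creation
        "number_of_replicas": 1,  # replica copies per primary shard
    },
    "mappings": {                 # analogous to a table schema
        "properties": {
            "title":     {"type": "text"},  # full-text field
            "published": {"type": "date"},  # structured field
        }
    },
}

# A document (analogous to a row) that would be indexed into this index.
doc = {"title": "Master/Slave vs P2P", "published": "2020-01-01"}
```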
The internal architecture is detailed: a gateway stores index files, a distributed Lucene layer handles indexing and searching, modules such as discovery, scripting, and plugins sit above, followed by transport and JMX layers, and finally RESTful and Java APIs for client interaction.
Service discovery and master election are handled by the ZenDiscovery module, which uses ping (multicast or unicast) to find nodes, selects a master based on lexical ordering of node IDs, and relies on the discovery.zen.minimum_master_nodes setting to prevent split‑brain situations.
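The election step described above can be sketched in a few lines. This is a simplified model, not ZenDiscovery's actual implementation: among the master‑eligible nodes discovered by ping, the node that sorts first by ID wins, but only if enough nodes are visible to satisfy `minimum_master_nodes`.

```python
def elect_master(node_ids, minimum_master_nodes):
    """Sketch of a ZenDiscovery-style election: the lexically smallest
    node ID among the visible master-eligible nodes becomes master,
    provided a quorum of nodes is visible."""
    if len(node_ids) < minimum_master_nodes:
        return None  # no quorum visible: refuse to elect, avoiding split-brain
    return min(node_ids)  # lexical ordering decides the winner
```

For example, `elect_master(["node-b", "node-a", "node-c"], 2)` elects `"node-a"`, while a partitioned node that can only see itself, `elect_master(["node-c"], 2)`, elects no one.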
Master election, split‑brain avoidance, and the role of quorum are explained, along with the impact of network partitions and the recommended quorum formula (N/2 + 1).
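The quorum formula is simple integer arithmetic, shown here as a helper for computing the recommended `discovery.zen.minimum_master_nodes` value from the count of master‑eligible nodes:

```python
def minimum_master_nodes(n_master_eligible):
    """Recommended quorum: floor(N/2) + 1 master-eligible nodes.
    A strict majority guarantees that two disjoint network partitions
    can never both elect a master."""
    return n_master_eligible // 2 + 1
```

With 3 master‑eligible nodes the setting should be 2; with 4 it should be 3, so that after a symmetric 2/2 split neither side has a quorum.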
The article describes the default "Query Then Fetch" search flow, where the query is broadcast to all shard copies, a global ranking is built, and the top documents are fetched from the relevant shards.
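The two phases can be modeled in miniature. In this sketch the "shards" are plain dictionaries invented for illustration: the query phase gathers only `(score, doc_id)` pairs from every shard, a global ranking keeps the top `size` hits, and the fetch phase retrieves full documents for those winners alone.

```python
import heapq

# Hypothetical in-memory "shards": each maps doc_id -> (score, document).
shard_0 = {"d1": (0.9, {"title": "alpha"}), "d2": (0.4, {"title": "beta"})}
shard_1 = {"d3": (0.7, {"title": "gamma"}), "d4": (0.2, {"title": "delta"})}
shards = [shard_0, shard_1]

def query_then_fetch(shards, size):
    """Sketch of Query Then Fetch: broadcast the query, rank globally by
    score, then fetch full documents only for the global top hits."""
    # Query phase: collect lightweight (score, doc_id, shard) candidates.
    candidates = [(score, doc_id, shard)
                  for shard in shards
                  for doc_id, (score, _) in shard.items()]
    top = heapq.nlargest(size, candidates, key=lambda t: t[0])
    # Fetch phase: pull the full document for each winner from its shard.
    return [shard[doc_id][1] for _, doc_id, shard in top]
```

Deferring document retrieval to the second phase keeps the broadcast phase cheap: only ids and scores cross the network until the global top‑k is known.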
Routing and replica consistency are covered: routing uses `hash(routing) % number_of_primary_shards`, and replicas provide fault tolerance. The write path involves a client request to a node, routing to the primary shard, indexing, and asynchronous replication to replica shards.
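The routing formula above can be sketched directly. Elasticsearch actually hashes the routing value (the document `_id` by default) with Murmur3; `zlib.crc32` stands in here as a deterministic substitute, since Python's built‑in `hash()` varies between runs:

```python
import zlib

def route(routing_value, number_of_primary_shards):
    """Sketch of shard routing: shard = hash(routing) % number_of_primary_shards.
    crc32 stands in for the Murmur3 hash ES uses internally."""
    return zlib.crc32(routing_value.encode()) % number_of_primary_shards
```

This formula is also why the number of primary shards cannot change after index creation: altering the modulus would send existing documents to the wrong shard.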
Shard allocation triggers include index creation, deletion, replica addition, and node changes. The relationship between shards and Lucene segments is explained, noting that segments are immutable, merged over time, and stored in various file types (.cfs, .fdt, .tim, .dvd, etc.).
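Segment immutability can be illustrated with a toy merge. Because segments are never modified in place, deletions are recorded as markers, and a merge writes a fresh combined segment that simply skips the marked documents. The segment/document structure here is invented for illustration:

```python
def merge_segments(segments):
    """Sketch of a Lucene-style merge: immutable input segments are
    rewritten into one new segment; documents marked deleted are
    dropped during the rewrite rather than erased in place."""
    live_docs = [doc
                 for seg in segments
                 for doc in seg["docs"]
                 if doc["id"] not in seg["deleted"]]
    return {"docs": live_docs, "deleted": set()}
```

Merging is how the space held by "deleted" documents is eventually reclaimed, which is why segment counts shrink over time even though individual segments never change.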
Finally, the article lists reference links for further reading.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.