Mastering Elasticsearch: Core Concepts, Architecture, and Performance Tips

This comprehensive guide explains Elasticsearch’s fundamentals, including its distributed architecture, indexing process, shard and replica mechanisms, query execution, near‑real‑time search, segment management, and practical optimization techniques, providing developers and engineers with the knowledge needed to design, operate, and troubleshoot large‑scale search clusters.

Intelligent Backend & Architecture

Apr 23, 2021

1. Introduction to Elasticsearch

Elasticsearch is an open-source, distributed search and analytics engine built on Apache Lucene. It provides a simple RESTful API for full-text search, near-real-time indexing, and horizontal scalability across multiple nodes.

2. Core Architecture

An Elasticsearch cluster consists of one or more nodes that share a common cluster name. A node can be master-eligible (node.master: true), a data node (node.data: true), or both; newer releases express the same roles through the node.roles setting. The elected master node manages the cluster state, index creation and deletion, and shard allocation, and tracks which nodes join or leave the cluster.

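In versions before 7.0, discovery is performed by the built-in Zen Discovery module, which contacts a configured list of unicast hosts to find other nodes (multicast discovery existed in early releases but was later removed). Among the master-eligible candidates, Zen prefers the node with the most recent cluster state and breaks ties by the smallest node ID in lexical order. Version 7.0 replaced Zen Discovery with a new cluster coordination subsystem.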

3. Shards and Replicas

Each index is divided into a number of primary shards that is fixed when the index is created. Every primary shard can have zero or more replica shards that provide redundancy and increase read throughput. A replica is never allocated to the same node as its primary, so the loss of a single node cannot remove every copy of a shard.

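Document routing follows the formula shard = hash(routing) % number_of_primary_shards. By default, the document ID is used as the routing value. Because the shard count appears in the modulo, changing it would send existing documents to different shards, which is why the number of primary shards cannot simply be changed after index creation.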
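The routing formula can be tried out with a few lines of Python. This is only a sketch: Elasticsearch hashes the routing value with Murmur3 internally, whereas the example below swaps in CRC32 from the standard library, so the shard numbers it prints will not match a real cluster. The function name route_to_shard and the sample IDs are made up for illustration.

```python
import zlib


def route_to_shard(routing: str, number_of_primary_shards: int) -> int:
    """Return the primary shard number for a routing value."""
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash (assumption): CRC32 instead of Elasticsearch's Murmur3.
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % number_of_primary_shards


doc_ids = ["1", "2", "alice", "order-42"]

# The same routing value always lands on the same shard.
placement = {doc_id: route_to_shard(doc_id, 5) for doc_id in doc_ids}
print(placement)

# With a different shard count the modulo changes, so documents may move.
moved = [d for d in doc_ids if route_to_shard(d, 5) != route_to_shard(d, 6)]
print("IDs that would map to another shard with 6 shards:", moved)
```

The second print shows why resizing is not a simple setting change: any ID in that list would be looked up on a shard that never received it.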

4. Indexing Process

When a document is indexed, it is written to an in-memory buffer and appended to the transaction log (translog). Periodically (every 1 s by default) the buffer is refreshed: its contents are written to a new immutable segment that initially lives in the OS page cache and becomes searchable without an fsync. When the translog grows beyond a threshold (512 MB by default) or after a configurable time, a flush performs a Lucene commit: it fsyncs the segments, writes a commit point to disk, and clears the translog, so the data no longer depends on translog replay for recovery.

PUT /my_index/_doc/1
{
  "user": "alice",
  "message": "Hello Elasticsearch",
  "@timestamp": "2023-01-01T12:00:00Z"
}

5. Near Real‑Time Search (NRT)

A refresh runs every second by default, so newly indexed documents become visible to search within about one second of being indexed. The _refresh API can be called manually to make changes visible immediately, at the cost of creating additional small segments. The combination of in-memory buffers, OS cache, and segment files enables NRT behavior without a costly fsync on every write.
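The write path from sections 4 and 5 can be modeled with a toy class. This is a deliberately simplified sketch, not how Lucene is implemented: ToyShard and all of its fields are hypothetical names, segments are plain tuples, and the model assumes that a flush also refreshes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    buffer: list = field(default_factory=list)    # in-memory indexing buffer
    translog: list = field(default_factory=list)  # operations since the last flush
    segments: list = field(default_factory=list)  # immutable, searchable segments
    committed: int = 0                            # segments covered by the last commit point

    def index(self, doc: dict) -> None:
        # A write goes to both the buffer and the translog.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self) -> None:
        # Turn the buffer into a new immutable segment; only now is it searchable.
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer.clear()

    def flush(self) -> None:
        # Commit all segments and clear the translog.
        self.refresh()
        self.committed = len(self.segments)
        self.translog.clear()

    def search(self, user: str) -> list:
        # Searches only see segments, never the in-memory buffer.
        return [d for seg in self.segments for d in seg if d.get("user") == user]


shard = ToyShard()
shard.index({"user": "alice", "message": "Hello Elasticsearch"})
before_refresh = shard.search("alice")  # the document is still in the buffer
shard.refresh()
after_refresh = shard.search("alice")   # now served from a segment
shard.flush()
print(len(before_refresh), len(after_refresh), shard.committed, len(shard.translog))
```

The gap between before_refresh and after_refresh is the "near" in near real-time: the document is safely in the translog right away, but searchable only after the next refresh.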

6. Query Execution

Search requests are handled in two phases:

Query phase: each shard executes the query locally, builds a priority queue of the top from + size hits, and returns document IDs and scores to the coordinating node.

Fetch phase: the coordinating node retrieves the full source of the selected documents from the relevant shards.
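The two phases can be sketched as a scatter-gather over in-memory dictionaries. Everything here is illustrative: shards are plain dicts, the score function is a naive term count rather than Lucene's relevance scoring, and the names query_phase and search are hypothetical.

```python
import heapq


def query_phase(shard_docs, score, from_, size):
    """Per shard: return the top from_ + size (score, doc_id) pairs, best first."""
    scored = []
    for doc_id, doc in shard_docs.items():
        s = score(doc)
        if s > 0:  # non-matching documents are not candidates
            scored.append((s, doc_id))
    return heapq.nlargest(from_ + size, scored)


def search(shards, score, from_=0, size=10):
    # Query phase: every shard sends back only scores and IDs.
    candidates = []
    for shard_no, shard_docs in enumerate(shards):
        for s, doc_id in query_phase(shard_docs, score, from_, size):
            candidates.append((s, doc_id, shard_no))
    # The coordinating node merges the per-shard lists and cuts out the page.
    candidates.sort(key=lambda c: c[0], reverse=True)
    page = candidates[from_:from_ + size]
    # Fetch phase: load the full source only for the hits on the page.
    return [(doc_id, shards[shard_no][doc_id]) for _, doc_id, shard_no in page]


shards = [
    {"1": {"message": "hello elasticsearch"}, "2": {"message": "hello hello world"}},
    {"3": {"message": "goodbye"}, "4": {"message": "hello hello hello"}},
]


def count_hello(doc):
    return doc["message"].split().count("hello")


hits = search(shards, count_hello, from_=0, size=2)
print([doc_id for doc_id, _ in hits])
```

Splitting the work this way keeps the first round trip small: full documents cross the network only for the final page, not for every candidate on every shard.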

Elasticsearch supports various query types, including match, term, range, and bool. For sorting and aggregations, doc values (column-oriented storage kept on disk and served through the OS cache) are used instead of heap-resident fielddata to avoid heap pressure.

7. Segment Merging

Because each refresh creates a new segment, Elasticsearch periodically merges small segments into larger ones in the background. Merges reclaim space from deleted documents and improve search performance by reducing the number of segments each shard must query.
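A merge can be pictured as copying the live documents of several small segments into one new segment. The sketch below makes several assumptions: segments are dicts with a "docs" map and a "deleted" set, and the size threshold max_docs is a hypothetical stand-in for the real merge policy, which is considerably more elaborate.

```python
def merge_small_segments(segments, max_docs=3):
    """Merge all segments with at most max_docs documents into one new segment."""
    small = [s for s in segments if len(s["docs"]) <= max_docs]
    large = [s for s in segments if len(s["docs"]) > max_docs]
    if len(small) < 2:
        return segments  # nothing worth merging
    merged_docs = {}
    for seg in small:
        for doc_id, doc in seg["docs"].items():
            # Documents marked as deleted are not copied, which reclaims space.
            if doc_id not in seg["deleted"]:
                merged_docs[doc_id] = doc
    return large + [{"docs": merged_docs, "deleted": set()}]


segments = [
    {"docs": {"1": "v1", "2": "old"}, "deleted": {"2"}},  # "2" was updated later
    {"docs": {"2": "new"}, "deleted": set()},
    {"docs": {"3": "v1"}, "deleted": set()},
]

merged = merge_small_segments(segments)
print(len(segments), "segments before,", len(merged), "after")
print(merged[0]["docs"])
```

After the merge, a search touches one segment instead of three, and the stale copy of document "2" no longer occupies space.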

8. Consistency and Write Guarantees

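In versions before 5.0, the consistency parameter (one, quorum, all) controls how many shard copies must be active before a write operation proceeds. The default, quorum, requires a majority of the shard copies (the primary plus its replicas) to be available. Since 5.0 this parameter has been replaced by wait_for_active_shards, which defaults to 1, meaning only the primary must be active.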
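The majority rule behind quorum is simple arithmetic. The sketch below follows the formula given in older Elasticsearch documentation, int((primary + number_of_replicas) / 2) + 1, together with the documented exception that the quorum is only enforced when an index has more than one replica; the function names are hypothetical.

```python
def quorum(number_of_replicas: int) -> int:
    """Majority of all copies of a shard (one primary plus its replicas)."""
    copies = 1 + number_of_replicas
    return copies // 2 + 1


def required_active_copies(number_of_replicas: int) -> int:
    """Copies that must be active for a write under consistency=quorum."""
    # Assumption taken from older docs: with 0 or 1 replicas the quorum is not
    # enforced, so a single-node development cluster can still accept writes.
    if number_of_replicas <= 1:
        return 1
    return quorum(number_of_replicas)


for replicas in range(0, 5):
    print(replicas, "replicas ->", required_active_copies(replicas), "active copies required")
```

With two replicas, for example, there are three copies of each shard and two of them must be active before a write is attempted.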

9. Common Pitfalls

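Avoid deep pagination with from/size: every shard has to build a queue of from + size hits, and by default requests where from + size exceeds index.max_result_window (10,000) are rejected. Use scroll or search_after for large result sets.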

Beware of fielddata memory usage on sorting or aggregations; prefer doc_values.

Set appropriate refresh_interval and bulk size to balance indexing throughput and search latency.
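The cost of deep pagination follows directly from the query phase described in section 6: each shard collects from + size candidates and the coordinating node has to merge all of them. A small calculation makes the growth visible; the function name and the sample numbers are illustrative only.

```python
def deep_pagination_cost(from_: int, size: int, number_of_shards: int) -> dict:
    """Count the candidates handled to return a single page of results."""
    per_shard = from_ + size  # priority queue length on every shard
    return {
        "per_shard_queue": per_shard,
        "coordinator_entries": per_shard * number_of_shards,
        "returned_hits": size,
    }


first_page = deep_pagination_cost(from_=0, size=10, number_of_shards=5)
deep_page = deep_pagination_cost(from_=10_000, size=10, number_of_shards=5)
print(first_page)
print(deep_page)
```

Both requests return ten hits, but the deep page makes the coordinating node sort 50,050 entries instead of 50. search_after avoids this by continuing from the sort values of the last hit rather than skipping over earlier results.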

10. Interview Highlights

Typical interview questions cover Elasticsearch's indexing pipeline, shard routing, master election, near-real-time behavior, and underlying data structures such as Lucene's inverted index, Block k-d (BKD) trees for numeric fields, the FST-based term dictionary used for keyword and text fields, and the LSM-tree-like design of immutable segments that are merged in the background.

11. Sample Configuration

# Index-level settings, applied through the index settings API
# Disable refresh during bulk load
index.refresh_interval: -1
# Set number of replicas to 0 while loading
index.number_of_replicas: 0

After loading, restore the settings and perform a _flush to commit data.
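The before and after settings can be prepared as request bodies for the index settings endpoint (PUT /my_index/_settings). The sketch below only builds and prints the JSON and does not contact a cluster; the helper names and the replica count to restore are assumptions. Setting refresh_interval to null resets it to its default.

```python
import json


def bulk_load_settings() -> dict:
    """Settings to apply before a large bulk load."""
    return {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}


def restore_settings(number_of_replicas: int = 1) -> dict:
    """Settings to apply once the load has finished."""
    # None is serialized as JSON null, which resets refresh_interval to its default.
    # The replica count is a hypothetical value; use whatever the index had before.
    return {"index": {"refresh_interval": None, "number_of_replicas": number_of_replicas}}


for label, body in (("before load", bulk_load_settings()), ("after load", restore_settings())):
    print(f"# {label}")
    print("PUT /my_index/_settings")
    print(json.dumps(body, indent=2))
```

Keeping both bodies next to each other makes it harder to forget the restore step, which would otherwise leave the index without replicas and without automatic refreshes.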
