How Elasticsearch Powers Real-Time Search: Inverted Index, Sharding, and Write Mechanics

This article explains Elasticsearch’s core concepts—including inverted indexes, shard architecture, node roles, and the detailed write‑read‑search workflow—so readers can grasp how the system achieves near‑real‑time search and reliable data storage.

Java High-Performance Architecture

Understanding Inverted Index

Elasticsearch is built on Apache Lucene and stores searchable data in an inverted index, the same structure that full-text search engines use to map each term to the documents containing it.

Forward Index vs. Inverted Index

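A forward index maps each document to the terms it contains; an inverted index maps each term to the documents that contain it, so a term lookup does not require scanning every document.

[Figure: Forward vs. Inverted Index]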
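To make the contrast concrete, here is a minimal Python sketch that builds both structures from the same data. The sample documents and the whitespace tokenizer are assumptions made for illustration; Elasticsearch's analyzers are considerably more involved.

```python
from collections import defaultdict

# Hypothetical sample documents, invented for illustration.
docs = {
    1: "elasticsearch is a search engine",
    2: "lucene is a search library",
    3: "elasticsearch is built on lucene",
}

# Forward index: document ID -> terms in that document.
forward_index = {doc_id: text.split() for doc_id, text in docs.items()}

# Inverted index: term -> sorted list of document IDs containing it.
inverted = defaultdict(set)
for doc_id, terms in forward_index.items():
    for term in terms:
        inverted[term].add(doc_id)
inverted_index = {term: sorted(ids) for term, ids in inverted.items()}


def search(term):
    """Look up a single term directly, without scanning documents."""
    return inverted_index.get(term, [])


print(search("lucene"))  # document IDs that contain "lucene"
```

With only the forward index, answering the same question would mean checking the term list of every document.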

Inverted Index Components

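Term Dictionary: records all terms and maps each term to its posting list. General-purpose implementations often use B+ trees or hash chaining for fast insert and lookup; Lucene, which Elasticsearch builds on, uses a finite state transducer (FST) as its term index.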

Posting List: for each term, a list of entries, each of which stores:

Document ID

Term Frequency (TF)

Position (for phrase queries)

Offset (for highlighting)
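The four posting fields listed above can be sketched as a small data structure. This is an illustrative model only: the `Posting` class, the `build_index` helper, and the regex tokenizer are hypothetical names chosen for this example, not Lucene's on-disk format.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Posting:
    doc_id: int
    term_frequency: int = 0
    positions: list = field(default_factory=list)  # token positions, for phrase queries
    offsets: list = field(default_factory=list)    # (start, end) character offsets, for highlighting


def build_index(docs):
    """Return term -> {doc_id: Posting} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, match in enumerate(re.finditer(r"\w+", text.lower())):
            posting = index[match.group()].setdefault(doc_id, Posting(doc_id))
            posting.term_frequency += 1
            posting.positions.append(position)
            posting.offsets.append((match.start(), match.end()))
    return index


index = build_index({1: "Quick brown fox, quick fox"})
fox = index["fox"][1]
print(fox.term_frequency, fox.positions, fox.offsets)
```

Positions let a phrase query check that two terms are adjacent, and offsets let a highlighter wrap the exact characters that matched.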

[Figure: Inverted Index Structure]

Elasticsearch’s Inverted Index

Each JSON field in a document has its own inverted index.

Indexing can be disabled for specific fields, saving storage but making the field unsearchable.
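As a sketch of how indexing might be disabled for one field, the snippet below builds a mapping body in Python and prints it as JSON. The field names `title` and `internal_note` are hypothetical, and the layout assumes the typeless mapping format used by Elasticsearch 7 and later.

```python
import json

# Hypothetical mapping: "title" gets an inverted index, "internal_note" does not.
mapping_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            # "index": false skips building the inverted index for this field.
            "internal_note": {"type": "keyword", "index": False},
        }
    }
}

# This JSON would be sent as the body of an index-creation request.
print(json.dumps(mapping_body, indent=2))
```

The field value is still returned as part of the stored document; only the per-field index is skipped.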

Distributed Architecture Principles

Sharding

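Primary shard: an index is split into one or more shards, and each shard has exactly one primary copy that receives writes first.

Replica shard: a copy of a primary shard, kept on a different node for redundancy and to serve reads.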

Deploy an ES cluster on three machines (esnode1, esnode2, esnode3).

Create an Index with 3 Shards and 1 Replica

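PUT /sku_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

Response:
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "sku_index"
}

Note that number_of_shards is a static setting: it must be supplied when the index is created and cannot be changed later through the _settings endpoint, whereas number_of_replicas can be updated at any time.

[Figure: Cluster Nodes]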

The cluster elects a master node (e.g., esnode2) to manage metadata and shard allocation.

Master node: handles metadata, shard promotion, and replica management.

Data node: stores actual shard data.

Node Failure Scenarios

If the master node fails, a new master is elected.

If a data node fails, the master promotes replicas on the surviving nodes to primary for every primary shard the failed node held, then allocates new replicas to restore redundancy.
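The failover behavior can be illustrated with a toy model of the three-node example. The round-robin placement in `allocate` is an assumption made for this sketch; in a real cluster the master decides allocation using its own balancing rules, and it would also rebuild the missing replicas afterwards.

```python
def allocate(nodes, num_shards):
    """Place each primary on one node and its single replica on the next node."""
    table = {}
    for shard in range(num_shards):
        primary = nodes[shard % len(nodes)]
        replica = nodes[(shard + 1) % len(nodes)]  # never the same node as the primary
        table[shard] = {"primary": primary, "replica": replica}
    return table


def fail_node(table, failed):
    """Promote replicas whose primary was on the failed node; drop lost replicas."""
    for copies in table.values():
        if copies["primary"] == failed:
            copies["primary"], copies["replica"] = copies["replica"], None
        elif copies["replica"] == failed:
            copies["replica"] = None
    return table


table = allocate(["esnode1", "esnode2", "esnode3"], num_shards=3)
fail_node(table, "esnode1")
for shard, copies in table.items():
    print(shard, copies)
```

After `esnode1` fails, every shard still has a primary on a surviving node, which is why one replica per shard is enough to tolerate the loss of a single node.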

Write Process

Steps to Write a Single Document

The client sends the request to a coordinating node.

The coordinating node routes the request to the primary shard based on the document ID.

The primary shard writes the document and forwards the operation in parallel to its in-sync replica shards.

Once all in-sync replicas acknowledge, the primary reports success to the coordinating node, which returns a successful response to the client. At that point the write is durable on both the primary and its replicas.
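The routing in step 2 amounts to hashing a routing value (the document ID by default) and taking it modulo the number of primary shards. The sketch below illustrates the idea only: `route_to_shard` is a hypothetical helper, and `zlib.crc32` stands in for the Murmur3 hash that Elasticsearch uses.

```python
import zlib


def route_to_shard(doc_id: str, number_of_primary_shards: int) -> int:
    """Pick a primary shard from the document ID (stand-in hash, not Murmur3)."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


for doc_id in ("sku-1", "sku-2", "sku-3"):
    print(doc_id, "-> shard", route_to_shard(doc_id, 3))
```

Because the result depends on the shard count, changing the number of primary shards would send existing IDs to different shards. This is why the shard count is fixed when the index is created.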

[Figure: Write Flow]

Tip: The client's success response means the write has been completed on the primary shard and on all of its in-sync replicas.

Underlying Write Mechanics

[Figure: Write Internals]

Writing involves three main operations:

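Write New Document: the document is added to an in-memory indexing buffer and the operation is appended to the translog file. At this point the document is not yet searchable.

Refresh: by default every second, the contents of the in-memory buffer are written to a new segment in the filesystem cache and opened for search. This is what makes search near real-time rather than real-time.

Flush: every 30 minutes or when the translog reaches 512 MB (defaults vary by version), segments in the filesystem cache are fsynced to disk and the translog is cleared. The translog records all operations since the last flush, so they can be replayed to recover data after a failure.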
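The interaction between the buffer, the translog, refresh, and flush can be modeled in a few lines. `ToyShard` is a hypothetical class written for this article's explanation; it tracks only which list a document lives in and is not how Elasticsearch is implemented.

```python
class ToyShard:
    """Toy model of one shard's write path; not Elasticsearch's actual code."""

    def __init__(self):
        self.buffer = []              # in-memory indexing buffer (not searchable)
        self.translog = []            # operations recorded since the last flush
        self.cached_segments = []     # segments in the filesystem cache (searchable)
        self.committed_segments = []  # segments fsynced to disk (searchable)

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(("index", doc))

    def refresh(self):
        # Turn buffered documents into a new searchable segment.
        if self.buffer:
            self.cached_segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # Persist all segments, then clear the translog.
        self.refresh()
        self.committed_segments.extend(self.cached_segments)
        self.cached_segments.clear()
        self.translog.clear()

    def searchable_docs(self):
        segments = self.committed_segments + self.cached_segments
        return [doc for segment in segments for doc in segment]

    def recover(self):
        # Simulate a crash: memory and cache are lost, the translog is replayed.
        self.buffer, self.cached_segments = [], []
        for op, doc in self.translog:
            if op == "index":
                self.buffer.append(doc)
        self.refresh()


shard = ToyShard()
shard.index({"id": 1})
print(shard.searchable_docs())  # nothing is searchable before a refresh
shard.refresh()
print(shard.searchable_docs())  # the document is visible after the refresh
```

Calling `recover()` before any flush rebuilds the lost segments from the translog, which is the role the translog plays between flushes.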

Read Process

Steps to Read a Document

The client contacts a coordinating node.

The coordinating node routes the request to a shard (primary or replica) that holds the document.

The shard returns the document to the coordinating node, which forwards it to the client.

[Figure: Read Flow]

If a replica has not yet applied the latest write (for example, while the indexing request is still in flight), a read served by that replica may report the document as missing, while a read served by the primary succeeds.

Search Process

Search Data Flow

The client sends a query to a coordinating node.

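The coordinating node forwards the query to one copy (primary or replica) of every shard that the search targets.

In the query phase, each shard runs the query locally and returns only the document IDs and sort values (such as scores) of its top matches.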

The coordinating node merges, sorts, and paginates the results.

In the fetch phase, the coordinating node retrieves the full documents from the shards based on the IDs.

[Figure: Search Flow]

Example: with three shards and a page size of 10, each shard returns its top 10 hits; the coordinating node merges the 30 candidates and returns the final top 10.
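The two-phase flow can be sketched as a scatter-gather merge. The shard contents and scores below are invented for illustration, and `query_phase` and `search` are hypothetical helper names, not Elasticsearch APIs.

```python
import heapq


def query_phase(shard_docs, size):
    """Each shard returns only (score, doc_id) pairs for its local top hits."""
    pairs = ((score, doc_id) for doc_id, score in shard_docs.items())
    return heapq.nlargest(size, pairs)


def search(shards, size):
    candidates = []
    for shard_docs in shards:  # scatter: every shard answers independently
        candidates.extend(query_phase(shard_docs, size))
    top = heapq.nlargest(size, candidates)  # gather: global merge and sort
    # The fetch phase would now load full documents for these IDs only.
    return [doc_id for score, doc_id in top]


# Hypothetical per-shard scores, keyed by document ID.
shards = [
    {"a1": 0.9, "a2": 0.4},
    {"b1": 0.8, "b2": 0.7},
    {"c1": 0.1, "c2": 0.95},
]
print(search(shards, 2))
```

Each shard must return a full page of candidates because the coordinating node cannot know in advance which shard holds the best hits. For deeper pages every shard has to return from + size candidates, which is why deep pagination becomes expensive.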

Delete/Update Mechanics

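Delete: segments are immutable, so a document is not removed in place. Instead it is marked as deleted in a per-segment file (a .del file in older Lucene versions), and searches consult these marks to filter deleted documents out of the results.

Update: the old version of the document is marked as deleted and the new version is written to a new segment.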

Underlying Logic

Each refresh (default interval: 1 second) writes the buffered documents to a new segment file, so the number of small segments grows over time.

Merge operations combine multiple segment files, physically remove deleted docs, write a new segment, and record a commit point.
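Delete marks, updates, and merging fit together as in the toy model below. `Segment`, `delete`, `update`, and `merge` are hypothetical names for this sketch; Lucene's real merge policy also decides which segments to merge and when, which is not modeled here.

```python
class Segment:
    """Documents that never change after creation, plus a set of deletion marks."""

    def __init__(self, docs):
        self.docs = dict(docs)  # doc_id -> document
        self.deleted = set()    # doc IDs marked as deleted

    def live_docs(self):
        return {i: d for i, d in self.docs.items() if i not in self.deleted}


def delete(segments, doc_id):
    # Mark the document as deleted wherever it appears; nothing is removed yet.
    for segment in segments:
        if doc_id in segment.docs:
            segment.deleted.add(doc_id)


def update(segments, doc_id, new_doc):
    # An update is a delete of the old version plus a write of the new one.
    delete(segments, doc_id)
    segments.append(Segment({doc_id: new_doc}))


def merge(segments):
    # Copy only live documents into one new segment; deleted docs disappear here.
    merged = {}
    for segment in segments:
        merged.update(segment.live_docs())
    return [Segment(merged)]


segments = [Segment({1: "old", 2: "two"}), Segment({3: "three"})]
update(segments, 1, "new")
delete(segments, 3)
segments = merge(segments)
print(segments[0].docs)
```

Before the merge, the deleted documents still occupy space in their segments; only the merge reclaims it.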

Source: juejin.cn/post/7110610301669605383
