
How Elasticsearch Achieves Lightning‑Fast Search with Inverted Indexes

This article explains how Elasticsearch uses inverted indexes, term dictionaries, and compression techniques such as Frame‑of‑Reference and Roaring Bitmaps to deliver rapid full‑text search, efficient storage, and fast union queries, while also offering practical indexing tips for production use.

Efficient Ops

I recently worked on several projects that use Elasticsearch (ES) for data storage and search analysis, and I put together this write-up from what I learned along the way.

The focus is on "How ES achieves fast retrieval" rather than its distributed architecture or API usage.

About Search

Imagine searching for ancient poems containing the character "前". Using a traditional relational database, you would write a SQL query like:

<code>SELECT name FROM poems WHERE content LIKE '%前%';</code>

Because the pattern starts with a wildcard, the database cannot use an ordinary index and has to scan every record, which is inefficient for search scenarios.

Search engines like ES address this by building inverted indexes.

Search Engine Principles

Content crawling and stop‑word filtering

Tokenization to extract keywords

Building an inverted index from keywords

User query processing

The core concept introduced here is the inverted index, which ES implements via Lucene.

Inverted Index

After building the inverted index, a query for "前" can directly locate matching poems.
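As a rough illustration of the idea, here is a minimal sketch of an inverted index in Python. The sample poems, the one-character-per-term tokenizer, and the function name `build_inverted_index` are assumptions made for this example; Lucene's real analysis and index format are far more involved.

```python
from collections import defaultdict

# Hypothetical sample data: doc ID -> poem text (not from the article).
poems = {
    0: "床前明月光",
    1: "前不见古人",
    2: "春眠不觉晓",
}

def build_inverted_index(docs):
    """Map each term (here: one character) to a sorted list of doc IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text:  # naive tokenizer: one character per term
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

index = build_inverted_index(poems)
print(index["前"])  # [0, 1]
```

The lookup touches only the entry for "前" instead of scanning every poem, which is the whole point of the structure.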

Key concepts:

term : a keyword produced by tokenization; the unit that ES indexes and searches on

postings list : list of document IDs containing the term

Document IDs are stored as integers for efficient compression.

Each shard is a Lucene index made up of multiple segments. Within a segment, every document is assigned a unique integer ID; because these IDs are 32-bit signed integers, the document count is capped at just under 2^31.

To quickly locate a term among millions, ES uses a term dictionary and a term index (a trie‑like structure). The term dictionary is stored on disk in blocks with prefix compression, while the term index resides in memory as a Finite State Transducer (FST), offering small space usage and O(len(str)) lookup time.
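The two-level lookup can be sketched as follows. This is a simplified stand-in, not an FST: the "term index" here is just a sorted list of the first term of each block, searched with binary search, and there is no prefix compression. The block size, sample terms, and function names are all assumptions for illustration.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # assumed tiny block size, for illustration only

def build_term_dictionary(terms, block_size=BLOCK_SIZE):
    """Split sorted terms into blocks; index each block by its first term."""
    ordered = sorted(terms)
    blocks = [ordered[i:i + block_size]
              for i in range(0, len(ordered), block_size)]
    term_index = [block[0] for block in blocks]  # small, kept in memory
    return term_index, blocks

def lookup(term, term_index, blocks):
    """Return (block number, position in block), or None if absent."""
    block_no = bisect_right(term_index, term) - 1
    if block_no < 0:
        return None  # term sorts before every indexed block
    block = blocks[block_no]  # in a real engine, this is the disk read
    if term in block:
        return block_no, block.index(term)
    return None

terms = ["ada", "adam", "bob", "carol", "dave", "eve", "zed"]
term_index, blocks = build_term_dictionary(terms)
print(lookup("dave", term_index, blocks))  # (1, 1)
print(lookup("zzz", term_index, blocks))   # None
```

The in-memory structure narrows the search to a single block, so at most one block of the on-disk dictionary has to be read per lookup.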

Techniques for Postings List

Two main challenges are storage size and fast union/intersection queries.

Compression (FOR – Frame of Reference) : Postings lists are sorted integer arrays, allowing delta encoding. ES groups documents into blocks (e.g., 256 docs per block) and stores each block with a header indicating the bit width needed for IDs, dramatically reducing space.
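A small sketch of the idea, assuming a block size of 3 so the numbers stay readable. The function names and the tuple layout `(bit width, count, packed bits)` are made up for this example and do not reflect Lucene's on-disk format.

```python
def bits_needed(values):
    """Smallest bit width that can hold every value in the block."""
    return max(max(values).bit_length(), 1)

def for_encode(doc_ids, block_size=3):
    """Delta-encode sorted doc IDs, then pack each block at its own width."""
    if not doc_ids:
        return []
    deltas = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]
    encoded = []
    for i in range(0, len(deltas), block_size):
        block = deltas[i:i + block_size]
        width = bits_needed(block)  # plays the role of the block header
        packed = 0
        for value in block:
            packed = (packed << width) | value
        encoded.append((width, len(block), packed))
    return encoded

def for_decode(encoded):
    """Unpack each block and undo the delta encoding."""
    doc_ids, current = [], 0
    for width, count, packed in encoded:
        mask = (1 << width) - 1
        for shift in range((count - 1) * width, -1, -width):
            current += (packed >> shift) & mask
            doc_ids.append(current)
    return doc_ids

postings = [73, 300, 302, 332, 343, 372]
encoded = for_encode(postings)
print([width for width, _, _ in encoded])  # [8, 5]
print(for_decode(encoded) == postings)     # True
```

The deltas are 73, 227, 2, 30, 11, 29. The first block needs 8 bits per value and the second only 5, so the six IDs fit in 39 bits of payload instead of 192 bits as plain 32-bit integers.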

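Roaring Bitmaps (filter cache) : For filter queries, ES caches results as Roaring Bitmaps. Document IDs are grouped by their upper 16 bits, and each group stores the lower 16 bits either as a sorted array (sparse groups) or as a bitmap (dense groups). Both sparse and dense sets stay compact, and fast AND/OR operations remain possible.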
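The container layout can be sketched like this. It is a toy model that assumes the commonly described 4096-element threshold between array and bitmap containers, and it uses a Python integer as the bitmap. The function names are hypothetical, and only construction and membership checks are shown.

```python
ARRAY_LIMIT = 4096  # containers larger than this switch to a bitmap

def build_roaring(doc_ids):
    """Group IDs by their high 16 bits; store the low 16 bits per container."""
    buckets = {}
    for doc_id in doc_ids:
        buckets.setdefault(doc_id >> 16, []).append(doc_id & 0xFFFF)
    containers = {}
    for key, lows in buckets.items():
        lows = sorted(set(lows))
        if len(lows) > ARRAY_LIMIT:
            bitmap = 0
            for low in lows:
                bitmap |= 1 << low  # Python int used as a 65536-bit bitmap
            containers[key] = ("bitmap", bitmap)
        else:
            containers[key] = ("array", lows)
    return containers

def contains(containers, doc_id):
    """Check membership by locating the container, then the low 16 bits."""
    entry = containers.get(doc_id >> 16)
    if entry is None:
        return False
    kind, data = entry
    low = doc_id & 0xFFFF
    if kind == "bitmap":
        return (data >> low) & 1 == 1
    return low in data  # a real implementation would binary-search here

# Hypothetical IDs: two sparse groups and one dense run of 5000 IDs.
ids = [1, 2, 70000] + list(range(131072, 131072 + 5000))
roaring = build_roaring(ids)
print({key: kind for key, (kind, _) in roaring.items()})
# {0: 'array', 1: 'array', 2: 'bitmap'}
```

Sparse groups cost about two bytes per ID, while a dense group never costs more than a fixed 8 KB bitmap, which is why the structure works well across very different filter selectivities.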

Union Queries

If the filter results are cached, the union or intersection is computed directly on the bitmaps with bitwise OR/AND. Otherwise, ES walks the postings lists on disk using skip lists: when intersecting, it advances each list to the next candidate document ID, skipping irrelevant blocks without decompressing them.
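The skipping behavior for an intersection can be approximated with the sketch below. It uses binary search over in-memory lists as a stand-in for skip pointers over compressed blocks, so it illustrates the "advance to the next candidate" pattern rather than Lucene's actual skip-list layout. The function names and sample lists are assumptions.

```python
from bisect import bisect_left

def advance(postings, start, target):
    """Index of the first doc ID >= target, searching from `start`."""
    return bisect_left(postings, target, start)

def intersect(a, b):
    """Leapfrog intersection of two sorted postings lists."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i = advance(a, i, b[j])  # jump ahead in a, skipping non-matches
        else:
            j = advance(b, j, a[i])  # jump ahead in b, skipping non-matches
    return result

print(intersect([2, 13, 17, 20, 98], [1, 13, 22, 35, 98, 99]))  # [13, 98]
```

Each jump skips a run of IDs that cannot match, which in the on-disk case means whole blocks never need to be decoded.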

Summary

ES uses inverted indexes to locate target documents quickly, trading higher space consumption for significantly improved search performance.

Term index, term dictionary, and postings list together enable fast term lookup while minimizing memory and disk I/O.

FOR compression reduces postings list storage size and speeds up queries.

Filter queries leverage Roaring Bitmaps for cache efficiency and low memory usage.

Union queries use bitmaps when cached; otherwise, skip‑list traversal reduces CPU cost.

Elasticsearch Indexing Tips

Explicitly disable indexing for fields that do not need it, as ES indexes fields by default.

For string fields that do not require analysis, also disable analysis explicitly.

Prefer monotonically increasing IDs over highly random IDs (e.g., UUIDs) to improve query performance.
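As an illustration of the first two tips, here is a hedged sketch of an index mapping, written as a Python dictionary and serialized to JSON. It assumes the typeless mapping syntax of recent ES versions, where `"index": false` disables indexing and the `keyword` type stores a string without analysis; older versions used different settings (for example `not_analyzed`). The field names are made up.

```python
import json

# Hypothetical index mapping; field names are invented for illustration.
mapping = {
    "mappings": {
        "properties": {
            # Full-text field: analyzed and indexed (the default behavior).
            "title": {"type": "text"},
            # Exact-match string: indexed as-is, no analysis.
            "status": {"type": "keyword"},
            # Stored in _source but not searchable: indexing disabled.
            "raw_payload": {"type": "keyword", "index": False},
        }
    }
}

body = json.dumps(mapping, indent=2)
print(body)
```

The resulting JSON would be sent as the request body when creating the index; check the mapping documentation for the ES version in use before relying on these settings.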
