Backend Development 9 min read

Mastering Elasticsearch Distributed Search: Performance Tips & Pagination Strategies

This article examines Elasticsearch’s distributed search architecture, explains the two‑phase query and fetch process, identifies performance and relevance scoring challenges, and presents optimization techniques such as Search After with point‑in‑time, Scroll API usage, and DFS query‑then‑fetch for accurate scoring.

MaGe Linux Operations

Mar 20, 2021

Mastering Elasticsearch Distributed Search: Performance Tips & Pagination Strategies

Elasticsearch is a Lucene‑based search server that provides a distributed, multi‑user full‑text search engine capable of analyzing and mining massive data sets, offering scalability and near‑real‑time search.

The article analyzes Elasticsearch’s distributed search mechanism, compares different search types, and proposes optimization solutions.

During a distributed search, a request is sent to all relevant shards and the results are aggregated. The process consists of two stages: Query and Fetch. For example, a cluster with 2 primary shards and 1 replica results in 4 shard copies.

Query Phase

1. The user’s request reaches a node, which acts as the coordinating node and randomly selects two shard copies to form a complete data set, forwarding the request to the corresponding data nodes.

2. Each shard searches its documents, computes relevance scores based on term and document frequencies, sorts the results, and returns only the from+size top documents (without full content) to the coordinating node.

Fetch Phase

The coordinating node merges the sorted results from all shards, selects the final from+size documents, and then retrieves the full document data from each shard by ID before returning them to the client.

Performance Issues

Each shard must return from+size documents, so the coordinating node processes number_of_shards × (from+size) documents. Deep pagination (large from+size) increases memory usage and can exceed the default 10,000‑document limit, putting pressure on the coordinating node.

Scoring Issues

Relevance scores are calculated per shard using local term and document frequencies, then merged globally. Inconsistent scoring across shards, especially with many primary shards, leads to inaccurate relevance ranking.

Optimization for Performance (Deep Pagination)

Elasticsearch offers two methods:

Search After : Uses the sort value of the last hit to fetch the next page. Requires consistent query and sort parameters across pages. To avoid snapshot changes during refreshes, use a Point‑In‑Time (PIT) to keep the index state stable.

PIT Example : POST /my-index-000001/_pit?keep_alive=1m The API returns a PIT ID, which must be included in subsequent queries along with a unique sort field (e.g., document ID). Each response includes a sort value that is used for the next page. After completing the pagination, delete the PIT with DELETE /_pit.

Limitations: Search After does not support the from parameter, cannot jump to arbitrary pages, and only allows forward navigation.

Scroll API (Large Result Sets)

The Scroll API creates a snapshot of the initial query and uses a _scroll_id for subsequent requests. Specify a scroll parameter (e.g., scroll=1m) to keep the snapshot alive. GET /_search?scroll=1m&... When only document IDs are needed, set sort=_doc for faster retrieval. Update the scroll’s keep‑alive time with POST /_search/scroll. After use, clear the scroll with DELETE /_search/scroll. The default maximum number of open scroll contexts is 500, configurable via search.max_open_scroll_context.

Limitation: New documents added after the snapshot are not visible to the scroll.

Choosing the Right Search Type

Ordinary from+size query: suitable for retrieving only the top‑ranked documents.

Search After: ideal for deep pagination scenarios.

Scroll: best when a single request needs to retrieve a large volume of documents.

Scoring Improvements

1. Adjust the number of primary shards: use a single primary shard for small datasets; distribute documents evenly across shards for large datasets.

2. Use DFS Query Then Fetch : shards first collect term and document frequencies, then perform a unified relevance scoring, yielding the most accurate results, though at higher CPU and memory cost.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Performance Optimization Elasticsearch pagination Distributed Search search_after scroll API

Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.