
Understanding Deep Pagination Performance Issues in Elasticsearch and Its Distributed Architecture

This article explains why deep pagination in Elasticsearch leads to high CPU usage, introduces core concepts such as indices, shards, nodes, and documents, and details the distributed CRUD operations, query and fetch phases, and segment management that affect search performance.

Top Architect

Deep Pagination Performance Issue

When a deep page is requested from an Elasticsearch cluster, every shard must produce its own top from + size results. The coordinating node then sorts all of these partial results and discards everything except the final page, which drives CPU and memory consumption up sharply as from grows.

Why Deep Pagination Is Problematic

For example, requesting page 1 (results 1-10) from an index with 5 primary shards requires each shard to return its top 10 results; the coordinating node sorts those 50 entries and keeps the final 10. Requesting results 10,001-10,010 (from=10000, size=10) forces each shard to produce its top 10,010 results, so the coordinating node sorts 50,050 entries and discards 50,040. The cost grows linearly with page depth, multiplied across every shard in the index.
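The arithmetic above can be captured in a small sketch (illustrative only, not Elasticsearch code) that counts how many entries the coordinating node must sort and discard for a given page:

```python
def pagination_cost(num_shards: int, from_: int, size: int):
    """Entries the coordinating node sorts and discards for one page."""
    per_shard = from_ + size           # each shard returns its top from+size hits
    sorted_total = num_shards * per_shard
    discarded = sorted_total - size    # only `size` hits survive the merge
    return sorted_total, discarded

# Page 1 (from=0, size=10) on 5 shards: sort 50 entries, discard 40.
# Deep page (from=10000, size=10): sort 50,050 entries, discard 50,040.
```

This is why Elasticsearch caps from + size with the index.max_result_window setting (10,000 by default).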

Basic Concepts

Index

An index in Elasticsearch is analogous to a database; it stores documents and can be referenced by a lowercase name.

Type

Types act as logical partitions within an index, similar to tables in relational databases. (Note that mapping types were deprecated in Elasticsearch 7.x and removed in 8.0; modern indices hold a single implicit type.)

Document

Documents are JSON objects representing the atomic unit of data stored and searchable in an index.

Node

A running Elasticsearch instance is a node; a cluster consists of one or more nodes sharing the same cluster.name.

Master node: manages cluster-wide changes.

Data node: stores data and inverted indexes.

Coordinating node: routes client requests and balances load.

Shard

Data in an index is divided into primary and replica shards; each shard is a Lucene instance that handles its own indexing and search.

Distributed Document CRUD

Create (Index New Document)

The coordinating node determines the target shard for the new document using the routing formula shard = hash(routing) % number_of_primary_shards (the routing value defaults to the document ID), forwards the request to that primary shard, which then replicates the operation to its replicas.
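The routing step can be sketched as follows. This is a simplified stand-in: Elasticsearch hashes the routing value with murmur3, while this sketch uses md5 purely to obtain a stable integer hash.

```python
import hashlib

def route_to_shard(doc_id: str, num_primary_shards: int) -> int:
    """Pick the primary shard for a document: hash(routing) % num_primary_shards.
    md5 stands in for Elasticsearch's murmur3 here."""
    h = int(hashlib.md5(doc_id.encode("utf-8")).hexdigest(), 16)
    return h % num_primary_shards
```

Because the result depends on the number of primary shards, that number is fixed at index creation time; changing it would re-route every existing document.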

Update and Delete

Updates are performed by indexing a new version of the document and marking the old version as deleted; deletions are recorded in a .del file, and the marked documents are only physically removed during segment merges.

Read (Search)

Search consists of a query phase, where each shard builds a priority queue of from + size results, and a fetch phase, where the coordinating node retrieves the actual documents from the relevant shards.
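The two phases can be sketched in miniature. In this toy model (not Elasticsearch internals), each shard holds (score, doc_id, document) tuples; the query phase moves only scores and IDs, and full documents are fetched only for the final page:

```python
import heapq

def two_phase_search(shards, from_, size):
    """Toy query-then-fetch: `shards` maps shard id -> list of
    (score, doc_id, document) tuples stored on that shard."""
    # Query phase: each shard returns (score, doc_id) for its top
    # from_+size hits -- lightweight metadata, not documents.
    per_shard_hits = [
        heapq.nlargest(from_ + size, [(s, d) for s, d, _ in docs])
        for docs in shards.values()
    ]
    # The coordinating node merges all partial results and keeps the page.
    merged = heapq.nlargest(from_ + size,
                            (h for hits in per_shard_hits for h in hits))
    page_ids = [doc_id for _, doc_id in merged[from_:from_ + size]]
    # Fetch phase: retrieve full documents only for the surviving hits.
    lookup = {d: doc for docs in shards.values() for _, d, doc in docs}
    return [lookup[d] for d in page_ids]
```

Note how from_ inflates the priority queue on every shard even though those leading entries are discarded after the merge, which is exactly the deep-pagination cost described earlier.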

Segment Management

New documents are first written to an in-memory buffer and a transaction log (translog); every second a refresh creates a new searchable segment in the filesystem cache. A periodic flush fsyncs segments to disk and clears the translog. Over time, many small segments are merged into larger ones to keep searches efficient.
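The lifecycle above can be modeled with a toy class (a sketch of the refresh/flush/merge sequence, not Lucene's actual implementation):

```python
class ShardSketch:
    """Toy model of the indexing buffer, translog, and segment lifecycle."""
    def __init__(self):
        self.buffer = []      # in-memory indexing buffer (not yet searchable)
        self.translog = []    # transaction log for durability
        self.segments = []    # each segment modeled as a list of docs

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # ~every second: the buffer becomes a new searchable segment
        # (in the filesystem cache; not yet fsynced to disk).
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # fsync segments to disk and clear the translog.
        self.refresh()
        self.translog.clear()

    def merge(self):
        # Background merge: many small segments -> one larger segment.
        # This is also when documents marked deleted are physically dropped.
        self.segments = [[d for seg in self.segments for d in seg]]
```

Each refresh adds a segment, which is why frequent refreshes produce many small segments that merges later consolidate.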

Tags: Indexing, Distributed Architecture, Elasticsearch, Sharding, search performance, deep pagination
Written by Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large-scale distributed, and high-availability architectures, plus architecture adjustments using internet technologies. We welcome idea-driven, sharing-oriented architects to exchange and learn together.