Mastering Elasticsearch Pagination: From/Size, Scroll, and Search After Explained

This article examines Elasticsearch's deep pagination challenges and presents three practical solutions—basic from/size, scroll API, and search after—detailing their execution phases, performance trade‑offs, code examples, and guidance on when to choose each method for efficient data retrieval.

Programmer DD

Elasticsearch is a near-real-time distributed search and analytics engine that handles large volumes of unstructured data well, but, like relational databases, it becomes expensive when a query pages deep into a result set.

From + Size Pagination

The simplest pagination method mirrors SQL's LIMIT, using from (start offset) and size (page size). Example DSL:

GET /wms_order_sku/_search
{
  "query": {"match_all": {}},
  "from": 10,
  "size": 20
}

Elasticsearch processes a search in two phases:

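Query phase – the coordinating node broadcasts the request to one copy of every relevant shard. Each shard builds a local priority queue of from + size entries and returns only the document IDs and sort values. The coordinating node merges these queues into a global sorted list and discards the first from entries. Because every shard must produce from + size entries, the work grows with the offset, which is why deep pages are expensive.

Fetch phase – the coordinating node requests the full documents for the remaining size IDs from the shards that hold them and returns them to the client.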
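The cost of the query phase can be illustrated with a small in-memory simulation. This is only a sketch: plain integers stand in for document scores, and the function names shard_top_hits and coordinate are hypothetical, not part of any Elasticsearch API.

```python
import heapq
from itertools import islice


def shard_top_hits(scores, from_, size):
    """Query phase on one shard: keep only the best from_ + size entries."""
    return heapq.nlargest(from_ + size, scores)


def coordinate(shards, from_, size):
    """Merge the per-shard queues, then drop the first from_ entries."""
    per_shard = [shard_top_hits(scores, from_, size) for scores in shards]
    # The coordinating node receives up to len(shards) * (from_ + size) candidates.
    candidates_received = sum(len(hits) for hits in per_shard)
    # Each per-shard list is already sorted in descending order.
    merged = heapq.merge(*per_shard, reverse=True)
    page = list(islice(merged, from_, from_ + size))
    return page, candidates_received


# Three shards with 100 documents each; scores 0..299 spread round-robin.
shards = [list(range(i, 300, 3)) for i in range(3)]
page, received = coordinate(shards, from_=10, size=5)
print(page)      # [289, 288, 287, 286, 285]
print(received)  # 45
```

Only 5 hits are returned, yet 45 candidates had to be sorted and shipped. With from set to 10,000 the same request would move over 30,000 entries across three shards.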

Scroll Pagination

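Scroll works like a database cursor: the first request creates a snapshot of the matching results and returns a scroll_id. Subsequent requests pass this ID to fetch the next batch, so the query is not re-executed from the first hit for every page. The scroll parameter (1m in the example below) sets how long the snapshot is kept alive between requests. Documents indexed or changed after the snapshot was taken are not visible to the scroll.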

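Execution follows the same query and fetch phases as from/size, but the search context (the frozen view of the index) is kept open on each shard for the duration given by the scroll parameter. Open scrolls therefore consume resources until they expire or are cleared.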

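# First request: open the scroll and keep its context alive for 1 minute
GET /wms_order_sku2021_10/_search?scroll=1m
{
  "query": {"bool": {"must": [{"range": {"shipmentOrderCreateTime": {"gte": "2021-10-04 00:00:00", "lt": "2021-10-15 00:00:00"}}}]}},
  "size": 20
}

# Following requests: pass the scroll_id from the previous response
GET /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "DnF1ZXJ5VGhlbkZldGNo..."
}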
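The two requests above are normally wrapped in a loop. The following is a minimal sketch of such a loop, assuming the official Python client (elasticsearch-py 7.15 or later, where search accepts query, size and scroll as keyword arguments). The helper name scroll_all and the default page size are illustrative choices, not taken from the article.

```python
def scroll_all(client, index, query, page_size=1000, keep_alive="1m"):
    """Yield every hit that matches query, one scroll batch at a time."""
    response = client.search(
        index=index, query=query, size=page_size, scroll=keep_alive
    )
    scroll_id = response["_scroll_id"]
    try:
        # An empty batch means the scroll is exhausted.
        while response["hits"]["hits"]:
            yield from response["hits"]["hits"]
            response = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            # Always use the most recent ID; it is allowed to change.
            scroll_id = response["_scroll_id"]
    finally:
        # Release the search context instead of waiting for it to expire.
        client.clear_scroll(scroll_id=scroll_id)
```

A caller would pass an Elasticsearch client instance, for example `for hit in scroll_all(es, "wms_order_sku2021_10", {"match_all": {}}): ...`. Clearing the scroll in the finally block frees the context even if the consumer stops early or raises.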

Search After Pagination

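Introduced in ES 5, Search After takes the sort values of the last hit on the current page and uses them as a cursor for the next request. Each request is an ordinary stateless search, so it reflects documents indexed or updated since the previous page and avoids keeping a scroll context open.

It does not support arbitrary page jumps; each request fetches the next page based on the previous page's last document. The request must specify a sort, search_after must supply one value per sort clause in the same order, and the sort should include a field that is unique per document so that the cursor never skips or repeats hits.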

GET /wms_order_sku2021_10/_search
{
  "query": {"bool": {"must": [{"range": {"shipmentOrderCreateTime": {"gte": "2021-10-12 00:00:00", "lt": "2021-10-15 00:00:00"}}}]}},
  "size": 20,
  "sort": [{"_id": {"order": "desc"}}, {"shipmentOrderCreateTime": {"order": "desc"}}],
  "search_after": ["SO-460_152-1447931043809128448-100017918838", 1634077436000]
}
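Carrying the cursor forward from page to page can be sketched as a small generator. This is an illustration under assumptions: search stands for any callable that takes a request body and returns the response as a dict (for example a thin wrapper around an HTTP call), and the name search_after_pages is hypothetical.

```python
import copy


def search_after_pages(search, body):
    """Yield pages of hits, advancing the cursor with each page's last sort values."""
    body = copy.deepcopy(body)  # leave the caller's request untouched
    if "sort" not in body:
        raise ValueError("search_after requires an explicit sort")
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # no more results
        yield hits
        # Every hit in a sorted search carries its sort values under "sort";
        # the last hit's values become the cursor for the next request.
        body["search_after"] = hits[-1]["sort"]
```

Because the only state is the last hit's sort values, a client can also store that array between user requests and resume later, which is what makes this approach fit "next page" style interfaces.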

Comparison and Recommendations

When to use each method:

From/Size – suitable when from + size stays within index.max_result_window (10,000 by default), or when only the top‑N results are needed.

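Scroll – ideal for large data migrations or batch processing where deep pagination is required and a fixed snapshot of the data is acceptable. From Elasticsearch 7.10 onward, the official documentation recommends search_after combined with a point in time (PIT) for this purpose instead.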

Search After – best for real‑time, high‑concurrency user queries that need deep pagination without sacrificing freshness.

In practice, keep from + size under 10,000 to maintain performance, and use Scroll or Search After for deeper paging. Raise index.max_result_window only as a last resort, because a larger window increases the memory and sorting work on every shard for every deep request.
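The guidance above can be condensed into a small decision helper. This is a sketch of the article's rules of thumb, not an official algorithm; the function name choose_pagination and the batch_export flag are hypothetical.

```python
# Elasticsearch's default value for index.max_result_window.
DEFAULT_MAX_RESULT_WINDOW = 10_000


def choose_pagination(from_, size, batch_export=False,
                      max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return the pagination method suggested by the rules of thumb above."""
    if batch_export:
        # Migrations and batch jobs read everything from a stable snapshot.
        return "scroll"
    if from_ + size <= max_result_window:
        # Shallow pages: plain offset paging is simple and cheap enough.
        return "from_size"
    # Deep, user-facing paging that should reflect fresh data.
    return "search_after"
```

For example, choose_pagination(10, 20) selects from/size, while choose_pagination(9991, 10) crosses the default window and selects search_after.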


Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
