Mastering Elasticsearch Pagination: From/Size, Scroll, and Search After Explained
This article examines Elasticsearch's deep pagination challenges and presents three practical solutions—basic from/size, scroll API, and search after—detailing their execution phases, performance trade‑offs, code examples, and guidance on when to choose each method for efficient data retrieval.
Elasticsearch is a real‑time distributed search and analytics engine that excels with large volumes of unstructured data, but it suffers from deep pagination issues similar to relational databases.
From + Size Pagination
The simplest pagination method mirrors SQL's LIMIT, using from (start offset) and size (page size). Example DSL:
GET /wms_order_sku/_search
{
  "query": {"match_all": {}},
  "from": 10,
  "size": 20
}
Elasticsearch processes a search in two phases:
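Query phase – the coordinating node broadcasts the request to the relevant shards. Each shard builds its own priority queue of from+size entries and returns only document IDs and sort values; the coordinating node merges these into a global queue of from+size entries.
Fetch phase – the coordinating node discards the first from entries and retrieves the full documents for the remaining size IDs.
The cost grows with depth: with N shards, the coordinating node has to sort N × (from+size) entries to return a single page, which is why deep pages are slow and memory-hungry.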
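The two phases can be imitated in plain Python to see how much work a deep page costs. This is only a sketch: the three shards, the numeric scores, and the function names are invented for illustration and are not part of Elasticsearch.

```python
import heapq


def shard_top_hits(shard_docs, from_, size):
    """Query phase on one shard: return its best from_ + size (score, doc_id) pairs."""
    return heapq.nlargest(from_ + size, shard_docs)


def coordinate(shards, from_, size):
    """Merge the per-shard candidates, then keep only the requested page."""
    candidates = []
    for shard_docs in shards:
        candidates.extend(shard_top_hits(shard_docs, from_, size))
    merged = heapq.nlargest(from_ + size, candidates)
    page = merged[from_:from_ + size]  # everything before from_ is thrown away
    return page, len(candidates)


# Three hypothetical shards with 1,000 documents each; scores are made-up integers.
shards = [[(score, f"s{s}-d{score}") for score in range(s, 3000, 3)] for s in range(3)]

page, examined = coordinate(shards, from_=900, size=10)
print(len(page), "hits returned,", examined, "candidates sorted")
```

In this toy setup the coordinating step sorts 2,730 candidates to return 10 hits, and that number keeps growing as from increases.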
Scroll Pagination
Scroll works like a database cursor: the first request creates a snapshot and returns a scroll_id. Subsequent requests use this ID to fetch the next batch, reducing repeated sorting and query overhead.
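Execution is similar to a from/size query, but the search context (the point-in-time view of the index) is kept alive on the shards for the duration given by the scroll parameter, so documents indexed or updated after the first request are not visible to the scroll.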
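On the client side, the cursor behaviour comes down to a loop that keeps passing the most recent scroll ID until a batch comes back empty. The sketch below assumes two hypothetical callables, first_page() and next_page(scroll_id), that return the parsed JSON response; the in-memory fake_page stand-in exists only so the loop runs without a cluster, and real scroll IDs are opaque strings rather than offsets.

```python
def scroll_all(first_page, next_page):
    """Yield every hit by following scroll IDs until a page comes back empty.

    first_page() and next_page(scroll_id) are hypothetical callables that
    return the parsed JSON of the initial search and of each follow-up
    scroll request.
    """
    response = first_page()
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        # Always use the ID from the latest response; it may change between calls.
        response = next_page(response["_scroll_id"])


# In-memory stand-in for a cluster (assumption: the fake ID encodes the next offset).
docs = [{"_id": str(i)} for i in range(45)]
batch = 20


def fake_page(offset):
    return {
        "_scroll_id": str(offset + batch),
        "hits": {"hits": docs[offset:offset + batch]},
    }


hits = list(scroll_all(lambda: fake_page(0), lambda sid: fake_page(int(sid))))
print(len(hits))
```

Against a real cluster it is also worth clearing the scroll (DELETE /_search/scroll) once the loop finishes, so the search context is released before the timeout expires.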
GET /wms_order_sku2021_10/_search?scroll=1m
{
  "query": {"bool": {"must": [{"range": {"shipmentOrderCreateTime": {"gte": "2021-10-04 00:00:00", "lt": "2021-10-15 00:00:00"}}}]}},
  "size": 20
}
Each subsequent request passes the scroll_id returned by the previous response:
GET /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "DnF1ZXJ5VGhlbkZldGNo..."
}
Search After Pagination
Introduced in ES 5, Search After records the sort values of the last hit and uses them as a cursor for the next request, allowing real‑time data changes to be reflected while avoiding the snapshot overhead of scroll.
It does not support arbitrary page jumps; each request fetches the next page based on the previous page's last document.
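The cursor logic can be illustrated without a cluster. The sketch below is an in-memory imitation, not the Elasticsearch implementation: the field names doc_id and created are invented, and it sorts by timestamp first with the ID as a tiebreaker, which is an assumption that differs from the sort order in the request that follows.

```python
def search_after_page(docs, size, search_after=None):
    """Return the next `size` docs ordered by (created, doc_id) descending.

    `search_after` holds the sort values of the previous page's last hit;
    only docs that sort strictly after it are returned.
    """
    ordered = sorted(docs, key=lambda d: (d["created"], d["doc_id"]), reverse=True)
    if search_after is not None:
        cursor = tuple(search_after)
        # Descending order, so "after the cursor" means a smaller sort key.
        ordered = [d for d in ordered if (d["created"], d["doc_id"]) < cursor]
    return ordered[:size]


# Hypothetical documents; pairs share a timestamp so the tiebreaker matters.
docs = [
    {"doc_id": f"SO-{i:03d}", "created": 1634000000000 + (i // 2) * 1000}
    for i in range(7)
]

cursor, pages = None, []
while True:
    page = search_after_page(docs, size=3, search_after=cursor)
    if not page:
        break
    pages.append([d["doc_id"] for d in page])
    # The last hit's sort values become the cursor for the next request.
    cursor = [page[-1]["created"], page[-1]["doc_id"]]

print(pages)
```

Because each call filters on the cursor values rather than on an offset, no request has to rank and discard the earlier pages, and a unique tiebreaker keeps documents with equal timestamps from being skipped or repeated.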
GET /wms_order_sku2021_10/_search
{
  "query": {"bool": {"must": [{"range": {"shipmentOrderCreateTime": {"gte": "2021-10-12 00:00:00", "lt": "2021-10-15 00:00:00"}}}]}},
  "size": 20,
  "sort": [{"_id": {"order": "desc"}}, {"shipmentOrderCreateTime": {"order": "desc"}}],
  "search_after": ["SO-460_152-1447931043809128448-100017918838", 1634077436000]
}
Comparison and Recommendations
When to use each method:
From/Size – suitable for small result windows (from + size up to 10,000, the default value of index.max_result_window) or when only the top‑N results are needed.
Scroll – ideal for large data migrations or batch processing where deep pagination is required.
Search After – best for real‑time, high‑concurrency user queries that need deep pagination without sacrificing freshness.
In practice, keep result windows under 10,000 to maintain performance, and prefer Scroll or Search After for deeper paging. Adjust index.max_result_window only as a last resort.
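The guidance above can be condensed into a small helper. This is a sketch with a hypothetical function name and a deliberately simplified rule; the 10,000 default mirrors index.max_result_window, and the limit applies to from + size rather than to from alone.

```python
def choose_pagination(from_, size, needs_fresh_data, max_result_window=10_000):
    """Suggest a pagination method for a request (simplified rule of thumb)."""
    if from_ + size <= max_result_window:
        return "from/size"
    # Beyond the window: live user queries want fresh data, batch jobs want a snapshot.
    return "search_after" if needs_fresh_data else "scroll"


print(choose_pagination(0, 20, needs_fresh_data=True))
print(choose_pagination(50_000, 20, needs_fresh_data=False))
```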
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
