How to Supercharge Elasticsearch Queries on Billions of Records

This article explains why Elasticsearch can be slow on massive datasets, then details practical techniques—leveraging filesystem cache, pre‑heating hot data, separating hot and cold indices, designing lean document models, and avoiding deep pagination—to achieve sub‑second query performance at billions‑scale.

dbaplus Community

May 21, 2019

What the Interviewer Is Looking For

Interviewers often ask how you would speed up Elasticsearch queries once the data reaches billions of records. The question mainly tests whether you have real-world experience running ES.

Question Analysis

Elasticsearch is not inherently fast at this scale. With hundreds of millions of documents, the first query can take 5 to 10 seconds; later queries drop to a few hundred milliseconds once the caches are warm.

1. Filesystem Cache – The Key Lever

All data written to Elasticsearch resides on disk. The operating system automatically caches file data in the filesystem cache, so if the cache can hold the index segments, queries run almost entirely in memory and become extremely fast.

A query that has to read from disk typically takes seconds; one served from the filesystem cache takes milliseconds, an order of magnitude or more faster.

Real-world case: a three-node cluster had 64 GB of RAM per node, with 32 GB allocated to the JVM heap. That left only 32 GB per node for the filesystem cache, or 96 GB across the cluster. The index was 1 TB, so less than 10% of it could be cached; most queries hit disk and were slow. As a rule of thumb, the memory available for the filesystem cache should be at least half of the total data size.
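The arithmetic in this case can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, 1 TB is taken as 1024 GB, and it assumes all RAM outside the JVM heap is available to the filesystem cache, which is optimistic.

```python
def cache_coverage(nodes: int, ram_gb: float, heap_gb: float, index_gb: float) -> float:
    """Fraction of the on-disk index that the filesystem cache could hold."""
    # RAM not claimed by the JVM heap is, at best, available to the OS cache.
    cache_gb = nodes * (ram_gb - heap_gb)
    return min(1.0, cache_gb / index_gb)


def meets_rule_of_thumb(nodes: int, ram_gb: float, heap_gb: float, index_gb: float) -> bool:
    """Rule of thumb from the text: cache memory should be at least half the data size."""
    return cache_coverage(nodes, ram_gb, heap_gb, index_gb) >= 0.5


# The case from the text: 3 nodes, 64 GB RAM each, 32 GB heap each, 1 TB of index data.
coverage = cache_coverage(nodes=3, ram_gb=64, heap_gb=32, index_gb=1024)
print(f"{coverage:.1%} of the index fits in the filesystem cache")
print("rule of thumb met:", meets_rule_of_thumb(3, 64, 32, 1024))
```

Under these assumptions the cluster covers a little over 9% of the index, far short of the 50% target, so either the index has to shrink or the cluster needs more memory.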

Best practice: index only the fields needed for search in Elasticsearch (e.g., id, name, age) and keep the rest in a secondary store such as MySQL or HBase. A query then searches Elasticsearch for the matching ids and fetches the full records from the secondary store by id. This shrinks the index, so the cache can hold a larger share of the searchable data.
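One way to picture this split is sketched below. It is illustrative only: `split_record` and `hydrate` are hypothetical helpers, and a plain dict stands in for the secondary store (MySQL or HBase in the text), so no real client library is involved.

```python
from typing import Any

# Fields named in the text as examples of what to keep searchable.
SEARCH_FIELDS = ("id", "name", "age")


def split_record(record: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (slim document to index in ES, full row for the secondary store)."""
    es_doc = {k: record[k] for k in SEARCH_FIELDS if k in record}
    return es_doc, dict(record)


def hydrate(hit_ids: list[Any], secondary_store: dict[Any, dict[str, Any]]) -> list[dict[str, Any]]:
    """After ES returns matching ids, fetch the full rows by primary key."""
    return [secondary_store[i] for i in hit_ids if i in secondary_store]


record = {"id": 7, "name": "Li Lei", "age": 30, "bio": "long text ...", "address": "..."}
es_doc, row = split_record(record)

secondary_store = {row["id"]: row}      # stand-in for MySQL / HBase
full_rows = hydrate([7], secondary_store)  # ids would come from the ES search response
print(es_doc)
print(full_rows[0]["bio"])
```

The index only ever sees the three small fields; the bulky ones cost one key lookup per hit in the secondary store, which is usually cheap compared with a disk-bound search.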

2. Data Preheating

Periodically run background searches on hot data (e.g., popular users on Weibo or best‑selling products on an e‑commerce site) so that the corresponding index segments are loaded into the filesystem cache before real users request them.
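A background warm-up job along these lines might look like the sketch below. The `search` argument is a hypothetical callable (for example a thin wrapper around whatever Elasticsearch client is in use), the `id` field and the index name are assumptions, and the list of hot ids would come from your own access statistics.

```python
from typing import Any, Callable, Iterable


def build_warmup_queries(hot_ids: Iterable[Any], batch_size: int = 100) -> list[dict]:
    """Build one `terms` query body per batch of hot ids."""
    ids = list(hot_ids)
    return [
        {"size": len(ids[i:i + batch_size]), "query": {"terms": {"id": ids[i:i + batch_size]}}}
        for i in range(0, len(ids), batch_size)
    ]


def warm_cache(search: Callable[[str, dict], Any], index: str,
               hot_ids: Iterable[Any], batch_size: int = 100) -> int:
    """Run the warm-up queries and return how many were issued."""
    issued = 0
    for body in build_warmup_queries(hot_ids, batch_size):
        search(index, body)  # result is discarded; the point is to touch the data
        issued += 1
    return issued


# Demo with a stand-in search function that only records the calls.
calls = []
issued = warm_cache(lambda index, body: calls.append((index, body)), "users", range(250))
print(issued, "warm-up queries issued")
```

In practice this would run on a schedule (cron or a job scheduler), with the interval chosen so hot data is touched again before the operating system evicts it.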

3. Hot‑Cold Separation

Create separate indices for hot (frequently accessed) and cold (rarely accessed) data, and place the hot indices on one set of nodes and the cold indices on another. Queries against cold data then cannot evict hot data from the filesystem cache on the hot nodes.
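One common way to pin an index to a group of nodes is shard allocation filtering on a custom node attribute. The sketch below builds such index settings; the attribute name `box_type`, the index naming scheme, and the access-count threshold are all assumptions for illustration, and the nodes themselves would need a matching `node.attr.box_type` value in their configuration.

```python
def allocation_settings(tier: str) -> dict:
    """Index settings that require shards to live on nodes tagged with the given tier."""
    if tier not in ("hot", "cold"):
        raise ValueError(f"unknown tier: {tier!r}")
    # Assumes nodes were started with node.attr.box_type set to "hot" or "cold".
    return {"settings": {"index.routing.allocation.require.box_type": tier}}


def target_index(base: str, accesses_last_30d: int, hot_threshold: int = 100) -> str:
    """Pick the hot or cold index for a record based on a (hypothetical) access count."""
    tier = "hot" if accesses_last_30d >= hot_threshold else "cold"
    return f"{base}-{tier}"


print(allocation_settings("hot"))
print(target_index("products", accesses_last_30d=5000))
print(target_index("products", accesses_last_30d=3))
```

The settings dict would be passed as the body when creating each index; the routing helper decides at write time which of the two indices a document belongs to.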

4. Document Model Design

Avoid join-style queries, nested objects, and parent-child relationships in Elasticsearch; they are expensive at query time. Perform the necessary joins in the application layer before indexing, and keep the document schema simple so that most of the work happens at ingest time rather than at query time.
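As a small illustration of joining before indexing, the sketch below flattens an order and its user into one document. The `orders` and `users` shapes and all field names are hypothetical; the point is only that the relationship is resolved once at ingest instead of on every query.

```python
from typing import Any


def denormalize_order(order: dict[str, Any], users_by_id: dict[int, dict[str, Any]]) -> dict[str, Any]:
    """Join an order with its user in application code and return one flat document."""
    user = users_by_id[order["user_id"]]
    return {
        "order_id": order["order_id"],
        "amount": order["amount"],
        # User attributes are copied in, so no parent-child or nested lookup is needed later.
        "user_id": user["id"],
        "user_name": user["name"],
        "user_city": user["city"],
    }


users_by_id = {1: {"id": 1, "name": "Han Meimei", "city": "Shanghai"}}
order = {"order_id": 1001, "user_id": 1, "amount": 59.9}

flat_doc = denormalize_order(order, users_by_id)
print(flat_doc)  # this flat document is what gets indexed
```

The trade-off is duplication: if a user's city changes, every affected order document has to be re-indexed, which is usually acceptable when reads far outnumber such updates.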

5. Pagination Optimization

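Deep pagination is expensive because each shard must return its top from + size records to the coordinating node (e.g., 1,000 records per shard for page 100 at 10 results per page). The coordinating node then merges and sorts all of them and keeps only the requested page. The deeper the page, the more data each shard sends and the coordinating node has to process, so latency grows with page depth.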
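The merge-and-slice behavior can be simulated in plain Python to see how the transferred volume grows. This is a toy model, not Elasticsearch code: shards are lists of integer scores, and the shard count and page size are arbitrary choices.

```python
import heapq


def fetch_page(shards: list[list[int]], page: int, size: int) -> tuple[list[int], int]:
    """Return (scores on the requested page, number of candidates sent to the coordinator)."""
    n = page * size  # equals from + size: every shard must return this many candidates
    candidates: list[int] = []
    for scores in shards:
        # Each shard can only rank its own documents, so it returns its local top n.
        candidates.extend(heapq.nlargest(n, scores))
    # The coordinating node merges all candidates, then keeps only one page.
    merged = heapq.nlargest(n, candidates)
    start = (page - 1) * size
    return merged[start:start + size], len(candidates)


# 5 shards with 2,000 documents each; scores 0..9999 spread round-robin across shards.
shards = [list(range(s, 10_000, 5)) for s in range(5)]

for page in (1, 10, 100):
    results, transferred = fetch_page(shards, page=page, size=10)
    print(f"page {page}: {transferred} candidates transferred for {len(results)} results")
```

Page 1 moves 50 candidates to the coordinator, while page 100 moves 5,000 to return the same 10 results, which is the cost the text describes.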

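Disallow deep pagination in the product design; cap the maximum page depth.

Use the scroll API for scenarios where users only move forward (e.g., endless-scroll social-media feeds). Scroll works from a snapshot of the index and cannot jump to an arbitrary page.

Alternatively, use search_after with a sort that includes a unique field as a tiebreaker, passing the sort values of the last hit to fetch the next page. Recent Elasticsearch documentation recommends search_after (with a point in time) over scroll for deep pagination.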
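A sequential paging loop with search_after could be sketched as follows. The request bodies use the `sort` and `search_after` parameters of the search API; the `created_at` and `id` field names are assumptions, and `search` is a hypothetical callable that takes a request body and returns a response shaped like an Elasticsearch search result.

```python
from typing import Any, Callable, Iterator

# Sort on a timestamp plus a unique field so that no two documents tie.
SORT = [{"created_at": "desc"}, {"id": "asc"}]


def first_page_body(query: dict, size: int) -> dict:
    """Request body for the first page."""
    return {"size": size, "query": query, "sort": list(SORT)}


def next_page_body(query: dict, size: int, last_hit: dict) -> dict:
    """Request body for the page that follows `last_hit`."""
    body = first_page_body(query, size)
    # Each hit carries the sort values it was ranked by; they act as the cursor.
    body["search_after"] = last_hit["sort"]
    return body


def iterate_hits(search: Callable[[dict], dict], query: dict, size: int) -> Iterator[dict[str, Any]]:
    """Yield every hit, one page at a time, until a page comes back empty."""
    body = first_page_body(query, size)
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        body = next_page_body(query, size, hits[-1])
```

Because each request only asks for `size` documents after a known position, every shard returns at most `size` candidates no matter how far the reader has scrolled.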
