Elasticsearch Indexing and Retrieval Optimization for Billion‑Scale Data
This article describes how Elasticsearch was optimized to handle billions of records, covering Lucene fundamentals, index and shard design, DocValues, query performance tuning, bulk indexing strategies, hardware considerations, and testing methods used to achieve second‑level query responses across multi‑year data ranges.
Project Background
In a business system, some tables generate over a hundred million rows per day. Data is partitioned by day, but only three months of data can be retained due to hardware limits, making cross‑month queries and long‑term data access challenging.
Improvement Goals
Enable cross‑month queries and support more than one year of historical data export.
Achieve second‑level response times for conditional queries.
Elasticsearch Retrieval Principles
3.1 Elasticsearch and Lucene Architecture
Understanding the underlying components is essential for optimization. An Elasticsearch index consists of one or more shards, each of which is a self‑contained Lucene index storing a subset of the data.
Key concepts include:
Cluster – a group of nodes.
Node – a single server in the cluster.
Index – a logical namespace for one or more physical shards.
Shard – the smallest unit of data distribution, implemented as a complete Lucene index; each shard is in turn made up of Lucene segments.
DocValues – column‑oriented storage for fast sorting, faceting, and aggregations.
3.2 Lucene Index Implementation
Lucene stores data in segments, each containing multiple documents and fields. The index files consist of dictionaries, inverted lists, forward files, and DocValues. Optimizing these structures directly improves Elasticsearch performance.
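The inverted list is the heart of this layout: it maps each term to the documents that contain it. A minimal in‑memory sketch (illustrative only — Lucene's on‑disk format uses compressed term dictionaries and posting lists, not Python dicts):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of doc IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    # Sorted posting lists allow efficient intersection for AND queries.
    return {term: sorted(ids) for term, ids in index.items()}

docs = ["Elasticsearch stores data", "Lucene stores segments"]
idx = build_inverted_index(docs)
# A term lookup returns the posting list directly: idx["stores"] == [0, 1]
```

A term query then becomes a dictionary lookup rather than a scan over documents, which is why term-level filters stay fast at billion-document scale.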
3.3 Elasticsearch Sharding
Data is routed to a primary shard using shard = hash(routing) % number_of_primary_shards. By controlling the _routing parameter, related documents can be co‑located on the same shard, reducing query overhead.
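The routing formula can be illustrated in a few lines. This is a sketch: Elasticsearch actually hashes _routing with murmur3 (and newer versions add a routing factor for shard splitting); crc32 below is just a deterministic stand‑in.

```python
import zlib

def route_to_shard(routing: str, num_primary_shards: int) -> int:
    """shard = hash(routing) % number_of_primary_shards.
    crc32 stands in for Elasticsearch's murmur3 hash."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards

# Documents sharing a routing key always land on the same shard,
# so a query sent with that routing value touches one shard
# instead of fanning out to all of them.
shard = route_to_shard("user42", 5)
assert shard == route_to_shard("user42", 5)  # deterministic
```

This also shows why number_of_primary_shards cannot be changed in place: the modulus is baked into every document's placement.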
Optimization Cases
4.1 Index Performance
Batch writes with appropriate document size (hundreds to thousands of records).
Multi‑threaded ingestion, scaling thread count with the number of machines.
Increase refresh_interval (e.g., set it to -1 to disable automatic refreshes during bulk loading) and refresh manually once loading completes.
Reserve ~50% of system memory for the OS file‑system cache that Lucene depends on; use nodes with 64 GB+ RAM.
Prefer SSDs over HDDs for random I/O.
Use custom keys (e.g., HBase row keys) instead of auto‑generated IDs when appropriate.
Configure merge throttling and thread counts based on storage type (e.g., higher threads for SSD).
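The batch‑size guidance above can be sketched as a generator that groups records into actions of the shape the Elasticsearch bulk API expects. The (doc_id, source) input shape and index name are assumptions for illustration:

```python
def bulk_actions(records, index_name, batch_size=1000):
    """Yield batches of bulk-index actions.

    Batches of hundreds to thousands of documents amortize per-request
    overhead without producing oversized HTTP payloads. Before loading,
    set "index.refresh_interval" to "-1"; afterwards restore it and
    issue a manual _refresh.
    """
    batch = []
    for doc_id, source in records:
        # A custom _id (e.g., an HBase row key) replaces the
        # auto-generated ID when appropriate.
        batch.append({"_index": index_name, "_id": doc_id, "_source": source})
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

records = ((str(i), {"n": i}) for i in range(2500))
batches = list(bulk_actions(records, "events", batch_size=1000))
# 2500 records -> batches of 1000, 1000, 500
```

Each yielded batch maps directly to one bulk request; multiple ingest threads can consume batches in parallel, scaling thread count with the number of nodes as described above.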
4.2 Search Performance
Disable DocValues for fields that do not require sorting or aggregations.
Prefer keyword fields over numeric types for term queries.
Limit _source to the fields actually needed in search results (e.g., via includes), or disable it entirely when full documents are never returned.
Use filters or constant_score queries to avoid scoring overhead.
Pagination strategies:
from + size – standard paging, limited by index.max_result_window (default 10,000).
search_after – efficient deep pagination using the sort values of the last hit as a cursor.
scroll – suited to exporting large result sets, at the cost of maintaining a scroll ID.
Introduce a combined long field (timestamp + ID) to support fast sorting.
Allocate CPUs with ≥16 cores for sorting‑intensive workloads.
Set index.merge.policy.expunge_deletes_allowed to 0 so that expunge merges can reclaim space from any segment containing deleted documents, rather than only segments above the default 10% deletion threshold.
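The search_after strategy can be illustrated by simulating a cursor over an already‑sorted result list. This is a pure‑Python sketch of the semantics, not the Elasticsearch client API; the sort key must be unique per document, e.g. the combined timestamp+ID long described above:

```python
def search_after_page(hits, sort_key, after=None, size=100):
    """Return (page, cursor) over hits pre-sorted by sort_key.

    Pass the returned cursor back as `after` to fetch the next page.
    Unlike from+size, the cost does not grow with page depth, because
    the engine seeks past the cursor instead of collecting and
    discarding all preceding hits.
    """
    if after is not None:
        hits = [h for h in hits if sort_key(h) > after]
    page = hits[:size]
    cursor = sort_key(page[-1]) if page else None
    return page, cursor

docs = [{"id": i} for i in range(250)]
key = lambda h: h["id"]
page1, cur = search_after_page(docs, key, size=100)
page2, cur = search_after_page(docs, key, after=cur, size=100)
# page2 starts at id 100
```

In a real query, the cursor is the "sort" array of the last hit, echoed back in the "search_after" parameter of the next request.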
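One way to build the combined long field is Snowflake‑style bit packing. The bit widths below are illustrative assumptions, not values from the original system:

```python
SEQ_BITS = 22  # assumption: up to ~4M distinct IDs per millisecond

def pack_sort_key(timestamp_ms: int, seq_id: int) -> int:
    """Pack (timestamp, ID) into one long so a single numeric sort
    replaces a two-field sort. Higher bits hold the timestamp, so
    numeric order matches (timestamp, id) lexicographic order."""
    return (timestamp_ms << SEQ_BITS) | (seq_id & ((1 << SEQ_BITS) - 1))

a = pack_sort_key(1_700_000_000_000, 7)
b = pack_sort_key(1_700_000_000_000, 8)
c = pack_sort_key(1_700_000_000_001, 0)
assert a < b < c  # ordering is preserved across both components
```

Sorting on one long with DocValues is cheaper than a composite sort, and the same value doubles as a unique search_after cursor.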
Performance Testing
Single‑node tests with 50 M–100 M documents to assess node capacity.
Cluster tests ranging from 100 M to 3 B documents, monitoring disk I/O, memory, CPU, and network usage.
Benchmark various query combinations across data volumes.
Compare SSD vs. HDD performance.
Production Results
The platform now handles tens of billions of records, returning 100 rows within 3 seconds, with fast pagination. Future bottlenecks can be addressed by scaling out additional nodes.
Configuration Example
```json
{
  "mappings": {
    "data": {
      "dynamic": "false",
      "_source": {
        "includes": ["XXX"]
      },
      "properties": {
        "state": {
          "type": "keyword",
          "doc_values": false
        },
        "b": {
          "type": "long"
        }
      }
    }
  },
  "settings": {......}
}
```

Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.