Elasticsearch Optimization Practices for Large-Scale Data Queries
This article explains how to optimize Elasticsearch for cross‑month and multi‑year queries on billions of records, covering Lucene fundamentals, index and search performance tweaks, configuration settings, and practical testing results to achieve sub‑second response times.
1. Introduction
The data platform has evolved through three versions, encountering many common challenges; this article shares refined documentation focusing on Elasticsearch (ES) optimization, while referring readers to other sources for HBase and Hadoop design improvements.
2. Requirement Description
Project Background
In a business system, certain tables generate over a hundred million rows per day; tables are partitioned by day, but queries are limited to daily granularity and the database can retain only three months of data due to hardware constraints, making sharding costly.
Improvement Goals
Enable cross‑month queries and support more than one year of historical data export.
Return query results within seconds based on conditions.
3. Elasticsearch Retrieval Principles
3.1 Basic Structure of ES and Lucene
Understanding component fundamentals helps locate bottlenecks. The diagram below shows the ES architecture.
Key concepts:
Cluster: a group of Nodes.
Node: a service unit within the cluster.
Index: logical namespace containing one or more physical shards.
Type: classification within an index (single type after ES 6.x).
Document: the smallest indexable JSON unit.
Shard: a Lucene instance holding a subset of data.
Replica: copy of a shard for safety and load sharing.
Lucene powers ES; its index consists of dictionaries, inverted lists, forward files, and DocValues.
A Lucene index contains multiple segments; each segment holds documents, each document contains fields that are tokenized into terms.
3.2 Lucene Index Implementation
Lucene index files are divided into dictionary, inverted table, forward file, and DocValues (see diagrams).
Note: Information sourced from the Lucene official documentation.
Random disk reads in Lucene are costly; .fdt files consume space, .tim and .doc benefit from SSDs. Scoring processes also add overhead and can be disabled when not needed.
About DocValues
Inverted indexes quickly locate document IDs, but sorting, grouping, or aggregating requires retrieving field values; DocValues provide column‑store structures for fast look‑ups. Disabling unnecessary DocValues reduces resource consumption.
3.3 ES Index and Search Sharding
An ES index consists of one or more Lucene indexes, each made of segments; the smallest searchable unit is a segment.
Data placement on a shard follows:
shard = hash(routing) % number_of_primary_shards
By default, routing uses the document ID (MurmurHash3). Supplying a consistent _routing value groups related data on the same shard, improving performance.
4. Optimization Cases
In our case, queries are limited to fixed fields (no full‑text search), enabling sub‑second responses on billions of rows.
ES stores only the HBase RowKey, not the full data.
Actual data resides in HBase and is retrieved via RowKey.
Refer to the official ES tuning guide for additional indexing performance tips.
Key optimization items include:
4.1 Index Performance Optimizations
Bulk writes (hundreds to thousands of records per batch).
Multi‑threaded writes, matching thread count to machine count; monitor via Kibana.
Increase segment refresh interval (e.g., set "refresh_interval": "-1" and manually refresh after bulk load).
Allocate ~50% of system memory to Lucene file cache; use nodes with ≥64 GB RAM.
Prefer SSDs over HDD RAID for random I/O.
Use custom IDs aligned with HBase RowKey to enable efficient updates/deletes.
Adjust merge throttling and thread counts based on disk type (e.g., indices.store.throttle.max_bytes_per_sec and index.merge.scheduler.max_thread_count).
4.2 Search Performance Optimizations
Disable DocValues for fields that do not require sorting or aggregation.
Prefer keyword type over numeric types for term queries.
Disable _source storage for fields not needed in query results.
Use filters or constant_score queries to eliminate scoring overhead.
Pagination strategies:
From+size: limited by index.max_result_window (default 10 000). search_after for deep pagination. scroll for large result sets (requires maintaining scroll_id).
Introduce a combined long field (timestamp + ID) for efficient sorting.
Allocate ≥16 CPU cores for sorting‑heavy workloads.
Force deletion of marked records during merge by setting "merge.policy.expunge_deletes_allowed": "0".
{
"mappings": {
"data": {
"dynamic": "false",
"_source": {
"includes": ["XXX"] -- only store needed fields in _source
},
"properties": {
"state": {
"type": "keyword",
"doc_values": false -- disable unnecessary doc values
},
"b": {
"type": "long" -- use long/int for range queries
}
}
}
},
"settings": {......}
}5. Performance Testing
Benchmark before changes to gauge improvements:
Single‑node test with 50 M–100 M records to assess point‑load capacity.
Cluster test with 100 M–3 B records to evaluate disk I/O, memory, CPU, and network usage.
Random query combinations across data volumes to measure response times.
Compare SSD vs. HDD performance.
Extensive testing is essential; without it, performance regressions are hard to detect.
6. Production Results
The platform now runs stably; queries over billions of rows return 100 records within 3 seconds, and pagination is fast. Future bottlenecks can be addressed by scaling nodes.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
