How to Supercharge Elasticsearch: Practical Index & Search Optimizations
This article shares practical lessons from three iterations of a data platform, focusing on Elasticsearch and Lucene optimizations that enable cross‑month queries, export of more than a year of data, and query responses within seconds for tables that receive over a hundred million rows per day.
Introduction
The data platform has gone through three versions and run into many common challenges along the way. This article documents the lessons learned, especially around Elasticsearch (ES) optimization. For HBase and Hadoop design, please refer to other resources.
Project Background
In a business system, some tables store over a hundred million rows daily. Data is partitioned by day, but queries often need to span months. Because of hardware limits the database can retain only three months of data, and sharding it further is costly.
Improvement Goals
Enable cross‑month queries and support exporting more than one year of historical data.
Achieve response times on the order of seconds for conditional queries.
Deep Principles
3.1 ES and Lucene Basic Architecture
Understanding the core components helps locate bottlenecks. Key ES concepts include:
Cluster: a group of nodes.
Node: a single ES service instance.
Index: a logical namespace containing one or more physical shards.
Type: a classification within an index (limited to a single type per index since ES 6.x and deprecated in later versions).
Document: the smallest indexable unit, typically a JSON object.
Shard: a low‑level work unit that stores a subset of data; each shard is a Lucene instance.
Replica: a copy of a shard for fault tolerance and load balancing.
ES relies heavily on Lucene, whose index consists of multiple segments, each containing many documents and fields. Lucene’s file structure includes a term dictionary, inverted lists (postings), stored fields (the forward file), and DocValues.
3.2 Lucene Index Implementation
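Lucene stores data in files such as .tim (term dictionary), .doc (postings), and .fdt (stored field data). Random disk reads are costly, especially for .fdt, which is read whenever stored fields are fetched for a hit. Scoring also consumes resources and can be skipped when relevance ranking is not needed.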
3.3 ES Index and Search Sharding
ES routes documents to shards using hash(routing) % number_of_primary_shards. By default the routing key is the document ID (MurmurHash3). Supplying a custom _routing value can co‑locate related data on the same shard, reducing search overhead.
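The routing formula above can be sketched in a few lines of Python. This is only an illustration: zlib.crc32 stands in for the MurmurHash3 function that ES uses, so the shard numbers will not match a real cluster, and the "acct42" identifiers and the shard count of 5 are hypothetical.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    """Pick a shard as hash(routing) % number_of_primary_shards.

    zlib.crc32 is a stand-in hash for illustration only; ES itself uses
    MurmurHash3, so real shard assignments will differ.
    """
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


# Default behaviour: the routing key is the document ID, so documents that
# belong to the same (hypothetical) account may land on several shards.
doc_ids = [f"acct42-{n}" for n in range(8)]
shards_by_id = {shard_for(doc_id, 5) for doc_id in doc_ids}

# Custom routing: every document of the account shares one routing value,
# so all of them map to the same shard and a search can target just it.
shards_by_account = {shard_for("acct42", 5) for _ in doc_ids}
```

With a shared routing value the set of target shards always has exactly one element, which is what lets a routed search skip the other shards.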
3.4 DocValues Overview
DocValues provide a column‑oriented structure that maps doc_id → field_value, which makes sorting, faceting, and aggregation fast. ES enables DocValues by default for all fields except analyzed string (text) fields. Disabling DocValues on fields that are never sorted or aggregated saves disk space, along with the memory and CPU spent building and caching them.
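To make the difference between the two structures concrete, here is a toy sketch in plain Python. It is not how Lucene lays data out on disk; the three sample documents and their values are made up, and the field names state and b are borrowed from the mapping shown later in this article.

```python
from collections import defaultdict

# Three made-up documents keyed by doc_id.
docs = {
    0: {"state": "open", "b": 30},
    1: {"state": "closed", "b": 10},
    2: {"state": "open", "b": 20},
}

# Inverted index: term -> doc_ids. Answers "which documents match?".
inverted = defaultdict(list)
for doc_id, doc in docs.items():
    inverted[doc["state"]].append(doc_id)

# DocValues-style column: doc_id -> field_value. Answers "what is the value
# of field b for this document?", which sorting and aggregation need.
doc_values_b = [docs[doc_id]["b"] for doc_id in sorted(docs)]

matching = inverted["open"]
sorted_hits = sorted(matching, key=lambda doc_id: doc_values_b[doc_id])
total_b = sum(doc_values_b[doc_id] for doc_id in matching)
```

The search step uses only the inverted index, while the sort and the sum read only the column, so a field that is never sorted or aggregated has no use for the column.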
Optimization Cases
In our scenario, ES only indexes fields and stores HBase rowkeys; the actual data resides in HBase. Queries retrieve data via rowkey lookups.
4.1 Index Performance Optimizations
Bulk write with batch sizes of a few hundred to a few thousand records.
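Write from multiple threads, with the thread count roughly matching the number of machines.
Increase refresh_interval, or set it to -1 to disable automatic refresh during bulk loads, and refresh manually once the load finishes.
Leave about 50% of system memory to the OS file cache that Lucene relies on; use nodes with 64 GB+ RAM.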
Prefer SSDs over HDDs for random I/O.
Use custom IDs aligned with HBase rowkeys to simplify updates and deletions.
Tune segment merging: limit merge throughput, adjust thread count based on disk type, and set merge.policy.expunge_deletes_allowed to 0 for aggressive cleanup.
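The bulk-write and refresh items above can be sketched as follows. This is a minimal illustration that only builds request bodies and sends nothing: the index name, the row contents, and the batch size of 1000 are assumptions, and on ES versions that still use mapping types the type must also be supplied, either in the URL or as _type in the action line.

```python
import json
from typing import Any, Dict, Iterator, List, Tuple

Row = Tuple[str, Dict[str, Any]]  # (HBase rowkey, indexed fields)


def chunked(rows: List[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive batches of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def bulk_body(index_name: str, batch: List[Row]) -> str:
    """Build a newline-delimited _bulk body: one action line, then one
    source line per document, with the HBase rowkey reused as _id."""
    lines = []
    for rowkey, fields in batch:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": rowkey}}))
        lines.append(json.dumps(fields))
    return "\n".join(lines) + "\n"  # the body must end with a newline


# Index settings bodies to apply before and after a large load.
SETTINGS_BEFORE_LOAD = {"index": {"refresh_interval": "-1"}}  # no auto refresh
SETTINGS_AFTER_LOAD = {"index": {"refresh_interval": "1s"}}   # ES default

# Hypothetical rows; real ones would come from the source tables.
rows = [(f"rowkey-{n:06d}", {"state": "open", "b": n}) for n in range(2500)]
bodies = [bulk_body("orders-2024-01", batch) for batch in chunked(rows, 1000)]
```

Each element of bodies would be posted to the _bulk endpoint, ideally from several writer threads, with the two settings bodies applied to the index before the first batch and after the last one.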
4.2 Search Performance Optimizations
Disable DocValues for fields that are not sorted or aggregated.
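Prefer keyword fields over numeric types for values that are only matched exactly with term queries (such as status codes or IDs); numeric types are optimized for range queries.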
Turn off unnecessary _source storage.
Use filters or constant_score queries to avoid scoring when not required.
Pagination strategies:
from + size: limited by index.max_result_window (default 10 000).
search_after: uses the sort values of the last hit of the previous page.
scroll: for deep scrolling of large result sets.
Store combined time‑ID fields as long for efficient sorting.
Allocate CPUs with 16 cores or more for heavy sorting workloads.
Set merge.policy.expunge_deletes_allowed to 0 to force deletion of marked records during merges.
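As a rough sketch of how the constant_score and search_after items combine, the helper below builds a search request body as a Python dict. The function name and the example values are hypothetical, and the sketch assumes that field b holds the combined time‑ID value as a unique long, so it can act as the only sort key.

```python
from typing import Any, Dict, List, Optional


def build_page_query(
    state: str,
    page_size: int,
    last_sort_values: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Build a search body that filters without scoring and pages with
    search_after instead of from + size."""
    body: Dict[str, Any] = {
        "size": page_size,
        # constant_score wraps a filter, so no relevance score is computed.
        "query": {"constant_score": {"filter": {"term": {"state": state}}}},
        # Assumed: b is a unique long, so it gives a stable sort order.
        "sort": [{"b": "asc"}],
    }
    if last_sort_values is not None:
        # The "sort" array of the last hit on the previous page.
        body["search_after"] = last_sort_values
    return body


first_page = build_page_query("open", 100)
next_page = build_page_query("open", 100, last_sort_values=[1700000000123])
```

The first request omits search_after; every later request passes the sort values returned with the last hit of the previous page, so the cost of a page does not grow with its depth.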
An example mapping that applies several of these settings:
{
  "mappings": {
    "data": {
      "dynamic": "false",
      "_source": {"includes": ["XXX"]},
      "properties": {
        "state": {"type": "keyword", "doc_values": false},
        "b": {"type": "long"}
      }
    }
  },
  "settings": {......}
}
Performance Testing
Single‑node test with 50 M–100 M records to assess single‑node capacity.
Cluster test with 1 B–3 B records to evaluate disk I/O, memory, CPU, and network usage.
Random condition queries across various data volumes.
Compare SSD vs. HDD performance.
Production Results
The platform now runs stably and handles tens of billions of records. Queries returning 100 rows complete in under 3 seconds, and pagination is fast. Future bottlenecks can be addressed by adding nodes.