How to Supercharge Elasticsearch: Practical Index & Search Optimizations

This article shares practical lessons from three iterations of a data platform, focusing on Elasticsearch and Lucene optimizations that enable cross‑month queries, year‑long data export, and sub‑second query responses for tables handling billions of rows per day.

Java Interview Crash Guide
Java Interview Crash Guide
Java Interview Crash Guide
How to Supercharge Elasticsearch: Practical Index & Search Optimizations

Introduction

The data platform has gone through three versions, encountering many common challenges. This article documents the lessons learned, especially around Elasticsearch (ES) optimization, while referring to other resources for HBase and Hadoop design.

Project Background

In a business system, some tables store over a hundred million rows daily. Data is partitioned by day, but queries often need to span months, and the database can only retain three months of data due to hardware limits, making sharding costly.

Improvement Goals

Enable cross‑month queries and support exporting more than one year of historical data.

Achieve second‑level response times for conditional queries.

Deep Principles

3.1 ES and Lucene Basic Architecture

Understanding the core components helps locate bottlenecks. Key ES concepts include:

Cluster: a group of nodes.

Node: a single ES service instance.

Index: a logical namespace containing one or more physical shards.

Type: a classification within an index (single type after ES 6.x).

Document: the smallest indexable unit, typically a JSON object.

Shard: a low‑level work unit that stores a subset of data; each shard is a Lucene instance.

Replica: a copy of a shard for fault tolerance and load balancing.

ES relies heavily on Lucene, whose index consists of multiple segments , each containing many documents and fields. Lucene’s file structure includes a dictionary, inverted list, forward file, and DocValues.

3.2 Lucene Index Implementation

Lucene stores data in files such as .tim, .doc, .fdt, etc. Random disk reads are costly, especially for .fdt. Scoring also consumes resources and can be disabled when not needed.

3.3 ES Index and Search Sharding

ES routes documents to shards using hash(routing) % number_of_primary_shards. By default the routing key is the document ID (MurmurHash3). Supplying a custom _routing value can co‑locate related data on the same shard, reducing search overhead.

DocValues Overview

DocValues provide a column‑store structure that enables fast sorting, faceting, and aggregation by mapping doc_id → field_value. ES enables DocValues for all fields except analyzed strings. Disabling unnecessary DocValues saves memory and CPU.

Optimization Cases

In our scenario, ES only indexes fields and stores HBase rowkeys; the actual data resides in HBase. Queries retrieve data via rowkey lookups.

4.1 Index Performance Optimizations

Bulk write with batch sizes of a few hundred to a few thousand records.

Multi‑threaded writes matching the number of machines.

Increase refresh_interval (e.g., set to -1) and manually refresh after bulk loads.

Allocate ~50% of system memory to Lucene file cache; use nodes with 64 GB+ RAM.

Prefer SSDs over HDDs for random I/O.

Use custom IDs aligned with HBase rowkeys to simplify updates and deletions.

Tune segment merging: limit merge throughput, adjust thread count based on disk type, and set merge.policy.expunge_deletes_allowed to 0 for aggressive cleanup.

4.2 Search Performance Optimizations

Disable DocValues for fields that are not sorted or aggregated.

Prefer keyword fields over numeric types for term queries.

Turn off unnecessary _source storage.

Use filters or constant_score queries to avoid scoring when not required.

Pagination strategies: from + size – limited by index.max_result_window (default 10 000). search_after – use the last hit of the previous page. scroll – for deep scrolling of large result sets.

Store combined time‑ID fields as long for efficient sorting.

Allocate CPUs with 16 cores or more for heavy sorting workloads.

Set merge.policy.expunge_deletes_allowed to 0 to force deletion of marked records during merges.

{
    "mappings": {
        "data": {
            "dynamic": "false",
            "_source": {"includes": ["XXX"]},
            "properties": {
                "state": {"type": "keyword", "doc_values": false},
                "b": {"type": "long"}
            }
        }
    },
    "settings": {......}
}

Performance Testing

Single‑node test with 50 M–100 M records to assess point‑node capacity.

Cluster test with 1 B–3 B records to evaluate disk I/O, memory, CPU, and network usage.

Random condition queries across various data volumes.

Compare SSD vs. HDD performance.

Production Results

The platform now runs stably, handling tens of billions of records with 100‑row queries returning in under 3 seconds and fast pagination. Future bottlenecks can be addressed by scaling nodes.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Performance OptimizationindexingElasticsearchluceneSearch
Java Interview Crash Guide
Written by

Java Interview Crash Guide

Dedicated to sharing Java interview Q&A; follow and reply "java" to receive a free premium Java interview guide.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.