Big Data 13 min read

How to Achieve Sub‑Second Queries on Billions of Records with Elasticsearch

This article explains how a data platform handling billions of daily records can be optimized for cross‑month queries and sub‑second response times by tuning Elasticsearch indexing, shard routing, Lucene structures, and hardware configurations.

ITPUB
ITPUB
ITPUB
How to Achieve Sub‑Second Queries on Billions of Records with Elasticsearch

1. Introduction

The data platform has evolved through three versions and initially faced common challenges; the author now shares a refined documentation focusing on Elasticsearch (ES) optimization, while HBase and Hadoop design optimizations are referenced elsewhere.

2. Requirements

Project background: A business system stores over a hundred million rows per day in sharded tables, retains only three months of data, and incurs high costs for additional shards.

Improvement goals:

Enable cross‑month queries and support over a year of historical data export.

Achieve second‑level query response times for conditional searches.

3. Elasticsearch Retrieval Principles

3.1 Basic ES and Lucene Architecture

Understanding component fundamentals helps locate bottlenecks. The ES cluster consists of clusters, nodes, indices, types, documents, shards, and replicas, as illustrated below.

Cluster – a group of nodes. Node – a single ES service instance. Index – logical namespace containing one or more physical shards. Type – classification within an index (single type after ES 6.x). Document – basic indexable unit, e.g., a JSON object. Shards – low‑level work units, each a Lucene instance. Replicas – shard copies for fault tolerance and query load distribution.

ES relies on Lucene; optimizing data structures essentially means optimizing Lucene. Lucene stores data in segments, each containing multiple documents and fields, which are tokenized into terms.

Key Lucene files include dictionaries, inverted lists, forward files, and DocValues. Random disk reads in Lucene are costly; SSDs are recommended for .tim and .doc files, while .fdt files consume significant space.

3.2 Lucene Index Implementation

Lucene’s index files consist of dictionaries, inverted tables, forward files, and DocValues, as shown in the diagrams.

http://lucene.apache.org/core/7_2_1/core/org/apache/lucene/codecs/lucene70/package-summary.html#package.description

DocValues provide column‑store access for fast sorting and aggregation. ES enables DocValues by default for all fields except analyzed strings; disabling unnecessary DocValues reduces resource usage.

3.3 ES Index and Search Sharding

ES indices are composed of one or more Lucene indices, each containing multiple segments. Data placement follows the formula: shard = hash(routing) % number_of_primary_shards By default, the routing parameter is the document ID (MurmurHash3). Supplying a consistent _routing value groups related data on the same shard, reducing search overhead.

4. Optimization Cases

In the presented case, queries are limited to fixed fields (no full‑text search), enabling sub‑second responses for billions of records.

ES stores only HBase row keys, not full data.

Actual data resides in HBase and is retrieved via row‑key lookups.

Indexing and search performance improvements follow official ES guidelines.

4.1 Indexing Performance

Batch writes – each batch contains hundreds to thousands of records.

Multi‑threaded writes – thread count matches the number of machines; monitor performance via Kibana.

Increase segment refresh interval – set "refresh_interval": "-1" and manually refresh after bulk ingestion.

Memory allocation – allocate ~50% of system memory to Lucene file cache; nodes often require >64 GB RAM.

SSD storage – SSDs outperform RAID5/10 for random I/O.

Use auto‑generated IDs – custom keys (e.g., HBase row keys) are acceptable when update/delete operations are needed.

Segment merging – limit merge throughput (default 20 MB/s) and configure thread count:

Math.max(1, Math.min(4, Runtime.getRuntime().availableProcessors() / 2))

. For SSDs, six merge threads were used.

4.2 Search Performance

Disable DocValues for fields that do not require sorting or aggregation.

Prefer keyword over numeric types for term queries.

Disable _source storage for fields not needed in query results.

Turn off scoring when unnecessary; use filter or constant_score queries.

Pagination strategies: from + size – limited by index.max_result_window (default 10 000). search_after – uses the last hit of the previous page. scroll – for large result sets, but requires maintaining a scroll_id.

Combine time and ID into a single long field to improve sorting performance.

Allocate CPUs with ≥16 cores for sorting‑intensive workloads.

Set "merge.policy.expunge_deletes_allowed": "0" to force deletion of marked records during merges.

5. Performance Testing

Baseline benchmarks are essential to measure improvements. Tests conducted include:

Single‑node load of 50 M–100 M records to assess point‑node capacity.

Cluster tests with 1 B–3 B records to evaluate disk I/O, memory, CPU, and network consumption.

Random query combinations across data volumes.

SSD vs. HDD performance comparison.

Comprehensive testing helps identify bottlenecks and validates optimization effectiveness.

6. Production Results

After applying the optimizations, the platform runs stably, handling tens of billions of records with 100‑record queries returning within 3 seconds, and pagination remains fast. Future bottlenecks can be addressed by scaling nodes.

Source: https://www.cnblogs.com/mikevictor07/p/10006553.html
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Big Dataindexingperformance tuninglucenesearch optimization
ITPUB
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.