Why Does Elasticsearch Aggregate Faster with Fewer Terms? Uncover the Secrets

This article examines a real‑world Elasticsearch cluster handling hundreds of terabytes, explains why aggregations over small result sets can run slower than aggregations over large ones, and shows how setting execution_hint=map and understanding doc_values improve aggregation performance for ultra‑high‑concurrency workloads.

dbaplus Community

Jan 26, 2022

1. Case Description

The author describes an Elasticsearch cluster deployed on dozens of physical machines, each running multiple ES instances. The cluster stores several hundred terabytes of data across thousands of indices. Each index has a single shard of 50‑70 GB, which the author considers within Elasticsearch’s recommended limits (Elastic’s general sizing guidance suggests shards of roughly 10‑50 GB, so these sit at or slightly above the upper end). Queries often target up to ten indices at once, filter a massive dataset down to only a few dozen matching documents, and then run nested bucket aggregations on those documents. The workload amounts to tens of millions of aggregation queries per day at very high concurrency, yet the observed performance is erratic: aggregations over small result sets sometimes run slower than aggregations over larger ones.
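
To make the workload concrete, the following is a minimal sketch of what such a request might look like: a filter over up to ten indices followed by a nested terms aggregation. The index names, field names, and filter values are hypothetical, since the source does not give them; only the overall shape (multi‑index path, filter, nested buckets) comes from the description above.

```python
import json

# Hypothetical index names; the source only says "up to ten indices".
indices = [f"logs-2022.01.{day:02d}" for day in range(1, 11)]

# Hypothetical fields and values, chosen only to illustrate the shape.
body = {
    "size": 0,  # return aggregation results only, no hits
    "query": {
        "bool": {
            "filter": [
                {"term": {"tenant_id": "t-42"}},
                {"range": {"@timestamp": {"gte": "now-10d/d"}}},
            ]
        }
    },
    "aggs": {
        "by_status": {
            "terms": {"field": "status_code"},
            # Nested bucket aggregation inside each status bucket.
            "aggs": {"by_tag": {"terms": {"field": "tags"}}},
        }
    },
}

# Multi-index search: comma-separated index names in the request path.
path = "/" + ",".join(indices) + "/_search"

print("POST", path)
print(json.dumps(body, indent=2))
```

The filter narrows the data to a handful of documents, so the aggregation itself has very little to count; the rest of the article is about the fixed costs that remain.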

2. Optimization Settings

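After detailed investigation, the key optimization turned out to be a single attribute on the terms bucket aggregation: execution_hint=map. This asks Elasticsearch to aggregate directly on the field values of the matching documents, using a map keyed by term, instead of first building global ordinals for the field. Because it is only a hint, Elasticsearch may ignore it where it does not apply. The map strategy pays off when a query matches very few documents, as in this case, because it avoids a global‑ordinals cost that scales with the number of unique terms in the shard rather than with the number of hits.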

{
    "aggs": {
        "tags": {
            "terms": {
                "field": "tags",
                "execution_hint": "map"
            }
        }
    }
}
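
Since the case involves nested bucket aggregations, the hint would need to be set on each terms level. The helper below is a small sketch of one way to do that on a request body before sending it; the function name with_map_hint and the sample field names are illustrative and not from the source.

```python
import copy


def with_map_hint(aggs):
    """Return a copy of an aggs tree with execution_hint=map on every terms aggregation."""
    result = copy.deepcopy(aggs)
    for definition in result.values():
        if "terms" in definition:
            # Keep an explicit hint if the caller already set one.
            definition["terms"].setdefault("execution_hint", "map")
        # Sub-aggregations may be declared under either key.
        for key in ("aggs", "aggregations"):
            if key in definition:
                definition[key] = with_map_hint(definition[key])
    return result


# Hypothetical two-level aggregation.
nested = {
    "tags": {
        "terms": {"field": "tags"},
        "aggs": {"status": {"terms": {"field": "status_code"}}},
    }
}
hinted = with_map_hint(nested)
print(hinted)
```

The input is left unmodified, so the same base query can be sent with or without the hint when comparing latencies.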

Additionally, the author highlights the importance of the doc_values (column‑store) setting. Doc values are enabled by default on keyword fields. They can be disabled on fields that are never aggregated, sorted, or accessed from scripts, which saves disk space.

{
    "mappings": {
        "properties": {
            "status_code": {
                "type": "keyword",
                "doc_values": true
            },
            "session_id": {
                "type": "keyword",
                "doc_values": false
            }
        }
    }
}
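
When many keyword fields are involved, a mapping like the one above can be generated from a list of the fields that are actually aggregated or sorted. This is only a sketch: the helper name keyword_mapping is hypothetical, and which fields need doc values is a decision that has to come from the real query workload.

```python
def keyword_mapping(fields, aggregated):
    """Build a keyword mapping that keeps doc_values only on aggregated fields."""
    properties = {
        name: {"type": "keyword", "doc_values": name in aggregated}
        for name in fields
    }
    return {"mappings": {"properties": properties}}


# Reproduces the example mapping: status_code is aggregated, session_id is not.
mapping = keyword_mapping(
    ["status_code", "session_id"],
    aggregated={"status_code"},
)
print(mapping)
```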

3. ES Technical Principles

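Global ordinals are a per‑shard data structure, built on top of doc values, that assigns each unique term of a field an incrementing number. Aggregating on these compact numbers instead of on the term strings speeds up terms aggregations over large result sets, and the ordinals are resolved back to terms only for the final buckets. The cost is that global ordinals must be built for the whole shard, by default lazily on the first search that needs them after a refresh, and the build time grows with the field's cardinality. When a query matches only a few dozen documents, that build can outweigh the aggregation itself, which explains the paradoxical slowdown observed in the case study. Using execution_hint=map bypasses global ordinals in such scenarios.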
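
The trade‑off can be illustrated with a toy model. This is not how Lucene or Elasticsearch implement either strategy; it is a simplified sketch, with made‑up data, that only shows where the work goes: the ordinal path touches every unique term in the shard before counting, while the map path touches only the matching documents.

```python
from collections import Counter

# Toy shard: each segment stores one keyword value per document.
segments = [
    ["red", "blue", "red", "green"],
    ["blue", "yellow", "blue"],
]
# (segment, doc) pairs that survived the query filter: just two documents.
matching = [(0, 0), (1, 2)]


def count_with_global_ordinals(segments, matching):
    # Per-segment dictionaries: a term's index is its segment ordinal.
    dictionaries = [sorted(set(seg)) for seg in segments]
    # The global dictionary covers every unique term in the shard,
    # no matter how few documents matched.
    global_terms = sorted(set().union(*dictionaries))
    global_ord = {term: i for i, term in enumerate(global_terms)}
    seg_to_global = [[global_ord[t] for t in d] for d in dictionaries]
    terms_visited = sum(len(d) for d in dictionaries)

    counts = [0] * len(global_terms)
    for seg, doc in matching:
        # A real segment stores the ordinal per document; looked up here for brevity.
        seg_ord = dictionaries[seg].index(segments[seg][doc])
        counts[seg_to_global[seg][seg_ord]] += 1
    result = {global_terms[i]: c for i, c in enumerate(counts) if c}
    return result, terms_visited


def count_with_map(segments, matching):
    # Count term values directly; work is proportional to the matching documents.
    counts = Counter(segments[seg][doc] for seg, doc in matching)
    return dict(counts), len(matching)


ord_result, ord_work = count_with_global_ordinals(segments, matching)
map_result, map_work = count_with_map(segments, matching)
print(ord_result, ord_work)
print(map_result, map_work)
```

Both paths produce the same buckets. In this toy shard the difference is five dictionary entries versus two documents; on a 50‑70 GB shard with a high‑cardinality field and a few dozen hits, the gap is what the case study ran into.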

Doc values provide a column‑oriented, on‑disk storage layer that excels at aggregations and sorting. They are enabled by default for most field types, including keyword (text fields are the exception), and can be turned off on fields that never need them, which reduces disk usage.
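
The difference between the inverted index and a column store can be shown with a few lines of plain Python. The data and names below are invented for illustration, and the structures are deliberately naive compared with the real on‑disk formats.

```python
# Three toy documents with one keyword field.
docs = [
    {"status_code": "200"},
    {"status_code": "404"},
    {"status_code": "200"},
]

# Inverted index: term -> document IDs. Answers "which docs contain 404?".
inverted = {}
for doc_id, doc in enumerate(docs):
    inverted.setdefault(doc["status_code"], []).append(doc_id)

# Doc-values-style column: document ID -> value. Answers "what is doc 2's value?".
column = [doc["status_code"] for doc in docs]


def bucket_counts(column, matching_doc_ids):
    """Count values for the matching documents by reading the column."""
    counts = {}
    for doc_id in matching_doc_ids:
        value = column[doc_id]
        counts[value] = counts.get(value, 0) + 1
    return counts


print(inverted)
print(bucket_counts(column, [0, 2]))
```

Searching uses the first structure to find matching document IDs; aggregating then uses the second to read each matching document's value, which is why a field without doc values cannot be bucketed this way.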

Cross‑index search leverages Elasticsearch’s shard architecture: each index consists of one or more shards, and each shard is a self‑contained Lucene index made up of segments. A query spanning multiple indices is routed to the relevant shards, and the coordinating node merges the per‑shard results. This offers the flexibility of multi‑index aggregation without the complexity of traditional database sharding.
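
The merge step for a terms aggregation can be sketched as follows. The shard labels and bucket counts are made up, and the function reduce_buckets is a hypothetical stand‑in for the coordinating node's reduce phase, not Elasticsearch code.

```python
from collections import Counter

# Hypothetical partial bucket counts returned by three single-shard indices.
shard_results = {
    "logs-a[0]": {"200": 3, "404": 1},
    "logs-b[0]": {"200": 2, "500": 4},
    "logs-c[0]": {"404": 2},
}


def reduce_buckets(partials, size):
    """Sum per-shard bucket counts and keep the `size` largest buckets."""
    merged = Counter()
    for buckets in partials:
        merged.update(buckets)  # adds counts for keys seen on several shards
    return merged.most_common(size)


top = reduce_buckets(shard_results.values(), size=2)
print(top)
```

In a real cluster each shard returns only its own top terms, so merged counts for a terms aggregation can be approximate; the sketch ignores that and sums complete per‑shard counts.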

4. OLTP vs. OLAP

The author contrasts transactional (OLTP) and analytical (OLAP) workloads. Although the case involves massive data volumes, the actual filtered result set per query is tiny, making a pure OLAP engine unnecessary. Instead, Elasticsearch serves as a hybrid solution that satisfies both high‑throughput transactional queries and real‑time analytical aggregations.

Performance‑comparison articles that pit Elasticsearch against column‑store databases (e.g., ClickHouse, Doris) often overlook the need for real‑time writes, complex filters, and ultra‑high concurrency. The author argues that for the described scenario, Elasticsearch remains the most suitable choice.

Conclusion

By adjusting aggregation hints and understanding the role of global ordinals, doc values, and shard routing, the Elasticsearch cluster achieved the expected high‑concurrency performance. The discussion also emphasizes that choosing the right technology stack depends on the precise mix of transactional (OLTP) and analytical (OLAP) requirements rather than on raw data size alone.
