
Elasticsearch Best Practices: Query, Index, and Performance Optimizations

The guide outlines production‑ready Elasticsearch best practices, covering query tuning such as using shard request cache, filter context, size‑0 aggregations and composite aggregations; write strategies like auto‑generated IDs, bulk API sizing and refresh handling; optimal shard counts, explicit mappings with disabled unnecessary features, and general advice to use explicit index names and stored scripts.

DeWu Technology

This article shares practical suggestions for using Elasticsearch in production, explaining the rationale behind each recommendation.

Query-related optimizations

1. Leverage the shard request cache for aggregations. By default only requests with size=0 are cached; scroll and profiled requests, among others, are never cached. Cached entries are invalidated when the shard refreshes and its data has changed.

2. Use filter context instead of query context to enable caching and avoid scoring.

// Clauses added through filter() run in filter context: no scoring, and results can be cached.
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
boolQuery.filter(QueryBuilders.termQuery("field", "value"));

3. Set size=0 when only aggregation results are needed.

SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.aggregation(...);
// size(0) skips fetching hits, so only the aggregation results are returned.
sourceBuilder.size(0);

4. Prefer absolute time values over the relative expression now in range queries, so that identical requests can reuse cached results.
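One way to apply this on the client side is to compute the bounds yourself and round them, so that requests issued within the same minute carry identical values. The sketch below is illustrative only: the class and method names are hypothetical, the one-minute rounding and one-hour window are assumptions, and the JSON is assembled by hand rather than with the Elasticsearch client.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

final class AbsoluteRangeQuery {
    /** Truncates an instant to the start of its minute so nearby requests share bounds. */
    static Instant floorToMinute(Instant t) {
        return t.truncatedTo(ChronoUnit.MINUTES);
    }

    /**
     * Builds a range filter covering the hour before the rounded timestamp,
     * using absolute epoch-millisecond bounds instead of "now".
     * Assumes the field name needs no JSON escaping.
     */
    static String lastHourFilter(String field, Instant now) {
        Instant upper = floorToMinute(now);
        Instant lower = upper.minus(1, ChronoUnit.HOURS);
        return String.format(Locale.ROOT,
            "{\"range\":{\"%s\":{\"gte\":%d,\"lt\":%d,\"format\":\"epoch_millis\"}}}",
            field, lower.toEpochMilli(), upper.toEpochMilli());
    }
}
```

Two calls made a few seconds apart within the same minute produce byte-identical filters, which is what gives a cache the chance to match them.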

5. Avoid deeply nested aggregations; use a composite aggregation for multi-dimensional group-by.

CompositeAggregationBuilder compositeAggregationBuilder = AggregationBuilders.composite(...);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder()
    .query(QueryBuilders.matchAllQuery())
    .aggregation(compositeAggregationBuilder)
    .size(0);
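For readers working with the REST API directly, the request body that such a builder corresponds to can be sketched as follows. This is a hand-assembled approximation, not output captured from the client: the class name, the aggregation name group_by, and the choice of terms sources are assumptions, and the code assumes field names and key values need no JSON escaping.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class CompositeRequestBody {
    /**
     * Builds a size-0 search body with one composite aggregation over the given fields.
     * afterKey is null for the first page; for later pages it holds the after_key
     * returned with the previous page.
     */
    static String build(List<String> fields, int pageSize, Map<String, String> afterKey) {
        // One terms source per dimension of the group-by.
        String sources = fields.stream()
            .map(f -> "{\"" + f + "\":{\"terms\":{\"field\":\"" + f + "\"}}}")
            .collect(Collectors.joining(","));
        String after = "";
        if (afterKey != null) {
            after = ",\"after\":{" + fields.stream()
                .map(f -> "\"" + f + "\":\"" + afterKey.get(f) + "\"")
                .collect(Collectors.joining(",")) + "}";
        }
        return "{\"size\":0,\"aggs\":{\"group_by\":{\"composite\":{\"size\":" + pageSize
            + ",\"sources\":[" + sources + "]" + after + "}}}}";
    }
}
```

Calling build with a null afterKey produces the first page; passing the previous page's after_key on each later call walks the remaining buckets.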

6. Do not use bucket_sort for deep pagination; prefer a composite aggregation, or a point in time (PIT) combined with search_after.
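The cursor idea behind search_after can be illustrated without a cluster. The following is an in-memory simulation, not an Elasticsearch client call: the Hit type, the sort on a timestamp plus a unique id tiebreaker, and all names are assumptions made for the sketch. Each page is defined by the sort values of the last hit already seen rather than by an offset.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SearchAfterSketch {
    /** A hit reduced to its sort values: a timestamp plus a unique tiebreaker id. */
    static final class Hit {
        final long timestamp;
        final String id;
        Hit(long timestamp, String id) { this.timestamp = timestamp; this.id = id; }
    }

    static final Comparator<Hit> ORDER =
        Comparator.comparingLong((Hit h) -> h.timestamp).thenComparing(h -> h.id);

    /** Returns up to pageSize hits that sort strictly after the cursor (null means first page). */
    static List<Hit> page(List<Hit> sortedHits, Hit after, int pageSize) {
        List<Hit> out = new ArrayList<>();
        for (Hit h : sortedHits) {
            if (after != null && ORDER.compare(h, after) <= 0) continue;
            out.add(h);
            if (out.size() == pageSize) break;
        }
        return out;
    }

    /** Walks every page by feeding the last hit of each page back in as the next cursor. */
    static List<Hit> drain(List<Hit> hits, int pageSize) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(ORDER);
        List<Hit> all = new ArrayList<>();
        Hit cursor = null;
        while (true) {
            List<Hit> p = page(sorted, cursor, pageSize);
            if (p.isEmpty()) break;
            all.addAll(p);
            cursor = p.get(p.size() - 1);
        }
        return all;
    }
}
```

The unique tiebreaker matters: without it, two hits sharing a timestamp at a page boundary could be skipped or repeated.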

7. Use _doc sort for scroll when possible, and always clear scroll contexts.

Write‑related recommendations

• Let Elasticsearch generate document IDs instead of specifying them, so that indexing can skip the lookup for an existing document with the same ID.

• Use the Bulk API for large writes, and tune both the batch size (roughly 5-15 MB per request) and the index refresh interval.

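• Avoid manual refreshes during bulk loads. For a large initial load, temporarily setting the replica count to zero can speed up indexing, provided the original count is restored afterwards.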
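The batch-size guideline above can be enforced on the client by grouping actions by their encoded size rather than by document count. The sketch below is a minimal illustration under stated assumptions: the class and method names are hypothetical, each action is already serialized as a newline-delimited action and source pair, and the byte limit is whatever value you choose within the suggested range.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BulkBatcher {
    /**
     * Groups bulk actions into request bodies whose UTF-8 size stays at or
     * below maxBytes. A single action larger than maxBytes is placed in a
     * batch of its own.
     */
    static List<String> batches(List<String> actions, int maxBytes) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        for (String action : actions) {
            int size = action.getBytes(StandardCharsets.UTF_8).length;
            // Flush the current batch before it would exceed the limit.
            if (currentBytes > 0 && currentBytes + size > maxBytes) {
                out.add(current.toString());
                current.setLength(0);
                currentBytes = 0;
            }
            current.append(action);
            currentBytes += size;
        }
        if (currentBytes > 0) out.add(current.toString());
        return out;
    }

    /** Builds one index action with no _id, leaving ID generation to Elasticsearch. */
    static String indexAction(String index, String jsonSource) {
        return "{\"index\":{\"_index\":\"" + index + "\"}}\n" + jsonSource + "\n";
    }
}
```

Each returned string is one body for a bulk request; with a limit of 10 * 1024 * 1024, for example, every batch stays inside the suggested range unless a single document exceeds it.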

Index creation and mapping design

• Choose an appropriate number of primary shards and replicas (usually 1-2 replicas per primary), and keep each shard below roughly 30-50 GB.

• Define explicit mappings rather than relying on dynamic mapping; use keyword for fields that are not analyzed and numeric types for fields used in range queries.

• Disable what a field does not need: norms if it is not used for scoring, doc_values if it is not used for sorting or aggregations, and leave fielddata off for text fields.

• Consider eager global ordinals for high‑cardinality keyword fields used in aggregations.

PUT index
{
  "mappings": {
    "properties": {
      "foo": {"type": "keyword", "eager_global_ordinals": true}
    }
  }
}
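The shard-size guideline in this section reduces to simple arithmetic, sketched here. The class and method names are hypothetical, and the expected index size is an input you have to estimate yourself; the result is a starting point, not a substitute for measuring a real workload.

```java
final class ShardSizing {
    /** Smallest primary shard count that keeps each shard at or below targetShardGb. */
    static int primaryShards(double expectedIndexGb, double targetShardGb) {
        if (expectedIndexGb <= 0 || targetShardGb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (int) Math.max(1, Math.ceil(expectedIndexGb / targetShardGb));
    }

    /** Total shard copies the cluster must hold: primaries plus their replicas. */
    static int totalShardCopies(int primaries, int replicasPerPrimary) {
        return primaries * (1 + replicasPerPrimary);
    }
}
```

For an index expected to reach 200 GB with a 30 GB target, primaryShards returns 7; with one replica per primary, the cluster holds 14 shard copies.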

General advice

• Avoid querying all indices with wildcards; use explicit index names instead.

• Prefer stored scripts over inline scripts.

• Do not use the _all field; it was deprecated in 6.0 and removed in 7.0.
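For time-based indices, the advice on explicit index names can be followed by listing only the indices that cover the requested date range. The sketch below assumes a hypothetical daily naming scheme of prefix-yyyy.MM.dd; the class name, the prefix, and the naming pattern are illustrative and should be adapted to however your indices are actually named.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

final class DailyIndexNames {
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("uuuu.MM.dd");

    /** Lists prefix-yyyy.MM.dd index names for every day from first to last, inclusive. */
    static List<String> between(String prefix, LocalDate first, LocalDate last) {
        List<String> names = new ArrayList<>();
        for (LocalDate d = first; !d.isAfter(last); d = d.plusDays(1)) {
            names.add(prefix + "-" + d.format(DAY));
        }
        return names;
    }
}
```

Joining the result with commas, for example String.join(",", names), gives a search target limited to those indices.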

The article concludes with a summary of the most important practices and references.
