14 Essential Elasticsearch Best Practices for High‑Performance, Scalable Search

This guide presents fourteen practical Elasticsearch best‑practice recommendations—including index design, field‑type choices, query tuning, cluster layout, and architecture patterns—plus four detailed business‑scenario solutions for logging, e‑commerce, monitoring, and user‑behavior analytics.

Ray's Galactic Tech

1. Index Design

Use time‑based rolling indices (daily or monthly) for logs, monitoring data, user events, or order records. Create an index template that defines three primary shards, one replica, a 30‑second refresh interval, strict dynamic mapping, and appropriate field mappings for timestamps, levels, and messages.

PUT _index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "refresh_interval": "30s"
    },
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "timestamp": {"type": "date"},
        "level": {"type": "keyword"},
        "message": {"type": "text"}
      }
    }
  }
}

PUT logs-2025.02.15

Aim for shards of 10–50 GB (about 30 GB is a reasonable target) and a total shard count of roughly 1.5–3 times the number of data nodes. List shard sizes, states, and allocation with GET _cat/shards?v.
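
As a rough illustration of the sizing rule above, the following Python sketch estimates a primary shard count from an expected index size and checks a cluster-wide shard total against the nodes × 1.5–3 range. The function names and defaults are assumptions made for this example, not an Elasticsearch API, and it assumes replicas count toward the total.

```python
import math


def primary_shards_for(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Return how many primary shards keep each shard near target_shard_gb."""
    if index_size_gb <= 0:
        return 1
    return max(1, math.ceil(index_size_gb / target_shard_gb))


def shard_total_in_range(total_shards: int, data_nodes: int,
                         low: float = 1.5, high: float = 3.0) -> bool:
    """Check a total shard count against the nodes x 1.5-3 rule of thumb."""
    return data_nodes * low <= total_shards <= data_nodes * high
```

For example, primary_shards_for(85) gives 3 for an 85 GB daily index, which matches the three primary shards used in the template.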

2. Field‑Type Design

Choose keyword for exact‑match fields (e.g., user IDs, order numbers, city codes) and text with an analyzer for full‑text search (e.g., ik_max_word, which requires the IK analysis plugin). For fields that never need to be searched, set index to false, or drop them from the stored document with _source excludes. Note that _source is configured at the mapping level, alongside properties, not as a field inside it.

{
  "mappings": {
    "_source": {"excludes": ["payload.raw"]},
    "properties": {
      "username": {"type": "keyword"},
      "description": {"type": "text", "analyzer": "ik_max_word"},
      "userId": {"type": "keyword"},
      "raw_data": {"type": "keyword", "index": false}
    }
  }
}

3. Query Performance

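Route related documents to a single shard with a routing value (e.g., routing=user_123) so that queries for that user hit one shard instead of fanning out to all of them; this can yield 3‑10× speed gains, depending on shard count and data distribution. Use search_after instead of large from/size offsets for deep pagination, and define index.sort on a frequently sorted field (e.g., createdAt) to accelerate sorted queries.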

POST order/_doc?routing=user_123
{
  "userId": "user_123",
  "orderId": "A001"
}

GET order/_search?routing=user_123
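
To see why a routing value narrows the search, the sketch below maps a routing string to a shard number and lists which shards a query would touch. It is a simplified model: Elasticsearch hashes the routing value with Murmur3 and applies a routing factor, while this example substitutes CRC32 to stay dependency-free. The function names are hypothetical.

```python
import zlib
from typing import List, Optional


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number.

    Stand-in hash: Elasticsearch uses Murmur3 plus a routing factor;
    CRC32 is used here only to keep the sketch self-contained.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def shards_to_query(routing: Optional[str], num_primary_shards: int) -> List[int]:
    """With a routing value one shard is searched; without it, all of them."""
    if routing is None:
        return list(range(num_primary_shards))
    return [shard_for(routing, num_primary_shards)]
```

The same routing value always lands on the same shard, which is why both the write and the search must pass it.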

POST product/_search
{
  "size": 10,
  "search_after": [1685600000000, "product_899"],
  "sort": [{"createdAt": "desc"}, {"productId": "asc"}]
}
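
The search_after request above fetches a single page. A minimal sketch of the full loop follows, assuming a hypothetical `search` callable that sends a request body to the _search endpoint and returns the parsed JSON response; the sort values of the last hit on each page become the cursor for the next one.

```python
import copy
from typing import Callable, Iterator


def paginate(search: Callable[[dict], dict], body: dict) -> Iterator[dict]:
    """Yield hits page by page using search_after.

    `search` is a hypothetical callable (for example a thin wrapper around
    an HTTP client), not part of any Elasticsearch library.
    """
    body = copy.deepcopy(body)  # leave the caller's request untouched
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # The last hit's sort values are the cursor for the next page.
        body["search_after"] = hits[-1]["sort"]
```

The sort clause needs a unique tiebreaker (productId in the example above) so that no documents are skipped or repeated between pages.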

PUT product
{
  "settings": {
    "index.sort.field": ["createdAt"],
    "index.sort.order": ["desc"]
  },
  "mappings": {
    "properties": {
      "createdAt": {"type": "date"}
    }
  }
}

Avoid large terms aggregations on high‑cardinality fields; prefer composite aggregations, which return buckets in pages.

{
  "size": 0,
  "aggs": {
    "by_user": {
      "composite": {
        "size": 1000,
        "sources": [{"user": {"terms": {"field": "username"}}}]
      }
    }
  }
}
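
A composite aggregation returns one page of buckets at a time, together with an after_key to pass back as after in the next request. The helper below is a sketch of that hand-off; the function name and the agg_name parameter are assumptions for illustration, and the request and response are treated as plain dictionaries.

```python
import copy
from typing import Optional


def next_composite_body(body: dict, response: dict, agg_name: str) -> Optional[dict]:
    """Return the request body for the next composite page, or None when done."""
    agg = response["aggregations"][agg_name]
    after_key = agg.get("after_key")
    if not agg["buckets"] or after_key is None:
        return None  # no more buckets to fetch
    nxt = copy.deepcopy(body)
    # Resume after the last bucket returned by the previous page.
    nxt["aggs"][agg_name]["composite"]["after"] = after_key
    return nxt
```

Calling it in a loop until it returns None walks through every bucket without holding them all in memory at once.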

4. Cluster Management

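Separate master, data, and ingest nodes by giving each a dedicated role, for example node.roles: [master] or node.roles: [data]. Set the JVM heap minimum and maximum to the same value (e.g., -Xms16g -Xmx16g), keep it at no more than half of the machine's RAM, and stay below about 31 GB so the JVM keeps using compressed object pointers (compressed oops). Monitor circuit‑breaker stats via GET _nodes/stats/breaker. To tune write throughput, extend refresh_interval (e.g., 30s) and set index.translog.durability to async; this trades durability for speed, so writes acknowledged since the last translog sync can be lost if a node crashes. Load data in batches with the _bulk API.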

curl -XPOST localhost:9200/_bulk -H "Content-Type: application/x-ndjson" --data-binary '
{ "index": { "_index": "logs" } }
{ "timestamp": "2025-02-14T10:00:00", "level": "INFO" }
...
'
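
The _bulk body is newline-delimited JSON: one action line followed by one document line per record, with a newline after the last line. The sketch below builds such a payload from a list of dictionaries; the function name is hypothetical and sending the payload is left to whatever HTTP client is in use.

```python
import json
from typing import Iterable


def build_bulk_body(index: str, docs: Iterable[dict]) -> str:
    """Serialize documents into an NDJSON payload for the _bulk API."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # source line
    if not lines:
        return ""
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```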

5. Architecture Strategy

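Adopt a hot‑warm‑cold tiered architecture and use an ILM policy to roll indices over after 7 days or 30 GB (whichever comes first), move warm shards to SATA, cold shards to HDD, and delete data after 180 days. The allocate actions below rely on a custom node attribute named box_type being set on the warm and cold nodes. The warm and cold phases in this example omit min_age, which defaults to zero, so add a min_age to each to control how long an index stays in the previous tier.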

PUT _ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {"actions": {"rollover": {"max_age": "7d", "max_size": "30GB"}}},
      "warm": {"actions": {"allocate": {"include": {"box_type": "warm"}}}},
      "cold": {"actions": {"allocate": {"include": {"box_type": "cold"}}}},
      "delete": {"min_age": "180d", "actions": {"delete": {}}}
    }
  }
}
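
The timing logic of this policy can be summarized in a few lines. The sketch below models two rules as plain Python: rollover fires when either condition is met, and the delete phase's min_age is counted from the rollover time for indices that have rolled over. The function names are hypothetical and the defaults mirror the policy above; this models the decisions only and does not call Elasticsearch.

```python
def should_rollover(age_days: float, primary_size_gb: float,
                    max_age_days: float = 7, max_size_gb: float = 30) -> bool:
    """Rollover triggers as soon as either the age or the size limit is reached."""
    return age_days >= max_age_days or primary_size_gb >= max_size_gb


def should_delete(days_since_rollover: float, min_age_days: float = 180) -> bool:
    """The delete phase starts once min_age has elapsed since rollover."""
    return days_since_rollover >= min_age_days
```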

6. Runtime Fields (Bonus)

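Runtime fields are computed at query time rather than stored in the index, e.g., a fullName keyword built from the first and last fields. Because the script runs for each document it is evaluated against, they are not suitable for high‑QPS workloads.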

{
  "runtime": {
    "fullName": {
      "type": "keyword",
      "script": "emit(doc['first'].value + ' ' + doc['last'].value)"
    }
  }
}

7. Business‑Scenario Optimizations

Scenario 1 – Log System (ELK/EFK)

Daily index naming: logs-YYYY.MM.DD

Refresh interval = 30 s

Translog durability = async

Hot‑warm‑cold tiering (SSD / SATA / HDD)

{
  "query": {"range": {"@timestamp": {"gte": "now-1h/h"}}},
  "sort": [{"@timestamp": "desc"}]
}
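
For the daily naming scheme in this scenario, a writer needs to derive the target index from each event's timestamp. A minimal sketch follows, assuming timezone-aware timestamps and that days are cut in UTC; the function name and the prefix parameter are illustrative.

```python
from datetime import datetime, timezone


def daily_index_name(ts: datetime, prefix: str = "logs") -> str:
    """Return the daily index name for a timezone-aware timestamp.

    Assumption: index days are cut in UTC.
    """
    return f"{prefix}-{ts.astimezone(timezone.utc):%Y.%m.%d}"
```

For example, daily_index_name(datetime(2025, 2, 15, 10, 0, tzinfo=timezone.utc)) returns "logs-2025.02.15".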

Scenario 2 – E‑commerce Search

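Field mapping: name (text, ik_max_word), brand (keyword), tags (keyword), price (integer), rating (float). The sample query matches the Chinese phrase "手机 大电池" ("phone, large battery") against name, applies brand and price as non-scoring filters, and sorts by rating.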

{
  "query": {
    "bool": {
      "must": [{"match": {"name": "手机 大电池"}}],
      "filter": [
        {"term": {"brand": "HUAWEI"}},
        {"range": {"price": {"lte": 3000}}}
      ]
    }
  },
  "sort": [{"rating": "desc"}]
}
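
In an application, the filters in this query usually come from optional user selections. The sketch below assembles the same request body from optional arguments, keeping the full-text clause in must and the exact-match conditions in filter so they are not scored. The function and parameter names are hypothetical.

```python
from typing import Optional


def build_product_query(keywords: str, brand: Optional[str] = None,
                        max_price: Optional[int] = None) -> dict:
    """Build a bool query: scored match on name, unscored filters for the rest."""
    filters = []
    if brand is not None:
        filters.append({"term": {"brand": brand}})
    if max_price is not None:
        filters.append({"range": {"price": {"lte": max_price}}})
    return {
        "query": {
            "bool": {
                "must": [{"match": {"name": keywords}}],
                "filter": filters,
            }
        },
        "sort": [{"rating": "desc"}],
    }
```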

Scenario 3 – Monitoring Metrics

Set index.sort = @timestamp

Disable indexing on unused fields

Use routing on hostId

{
  "query": {
    "bool": {
      "filter": [
        {"term": {"hostId": "server-1"}},
        {"range": {"@timestamp": {"gte": "now-10m"}}}
      ]
    }
  },
  "sort": [{"@timestamp": "desc"}]
}

Scenario 4 – User‑Behavior Analytics

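Run aggregations on keyword fields, which use doc_values, and leave fielddata disabled on text fields (the default); enabling fielddata on a high‑cardinality text field can consume a large amount of heap.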

Use date_histogram + composite for large‑scale aggregations.

Leverage rollup jobs for historical down‑sampling; on recent 8.x releases, where rollups are deprecated, consider the downsampling feature instead.

{
  "size": 0,
  "aggs": {
    "daily": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "day"
      }
    }
  }
}
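
When the daily histogram also needs to be split by a high-cardinality field, the date_histogram can be used as a source inside a composite aggregation so the buckets come back in pages. The sketch below builds such a request body; the aggregation name daily_by_user and the function name are assumptions, while the timestamp and username fields are the ones used earlier in this article.

```python
def daily_user_buckets_body(page_size: int = 1000) -> dict:
    """Build a composite request that pages over (day, user) buckets."""
    return {
        "size": 0,  # return aggregation buckets only, no hits
        "aggs": {
            "daily_by_user": {
                "composite": {
                    "size": page_size,
                    "sources": [
                        {"day": {"date_histogram": {
                            "field": "timestamp",
                            "calendar_interval": "day",
                        }}},
                        {"user": {"terms": {"field": "username"}}},
                    ],
                }
            }
        },
    }
```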

Final Summary

The article delivers fourteen core Elasticsearch best practices covering index design, field mapping, write and query tuning, and architectural patterns, plus concrete configurations for large‑scale systems and four specialized optimization blueprints for logging, e‑commerce, monitoring, and user‑behavior analysis.
