14 Essential Elasticsearch Best Practices for High‑Performance, Scalable Search
This guide presents fourteen practical Elasticsearch best‑practice recommendations—including index design, field‑type choices, query tuning, cluster layout, and architecture patterns—plus four detailed business‑scenario solutions for logging, e‑commerce, monitoring, and user‑behavior analytics.
1. Index Design
Use time‑based rolling indices (daily or monthly) for logs, monitoring data, user events, or order records. Create an index template that defines three primary shards, one replica, a 30‑second refresh interval, strict dynamic mapping, and appropriate field mappings for timestamps, levels, and messages.
PUT _index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "refresh_interval": "30s"
    },
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "timestamp": {"type": "date"},
        "level": {"type": "keyword"},
        "message": {"type": "text"}
      }
    }
  }
}
PUT logs-2025.02.15
Recommended shard size: 10–50 GB (ideally about 30 GB), with a total shard count of roughly 1.5–3 × the number of data nodes. Inspect shard sizes and placement with GET _cat/shards?v.
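The sizing guidance above can be turned into a quick calculation. The following is a minimal sketch, not an official formula: the function names and the 30 GB default are assumptions taken from the rule of thumb in this section.

```python
import math

def plan_primary_shards(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Return how many primary shards keep each shard near target_shard_gb."""
    if index_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    # Round up so no shard exceeds the target; never go below one shard.
    return max(1, math.ceil(index_size_gb / target_shard_gb))

def shard_budget(data_nodes: int) -> tuple:
    """Rough (low, high) total shard count from the 1.5-3 x data-nodes rule."""
    return math.ceil(data_nodes * 1.5), data_nodes * 3

# An index expected to hold 80 GB needs 3 primaries at ~30 GB each.
print(plan_primary_shards(80))   # 3
print(shard_budget(4))           # (6, 12)
```

If the computed primary count pushes the cluster past the budget, a coarser rollover period (monthly instead of daily) is usually the first thing to try.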
2. Field‑Type Design
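Choose keyword for exact-match fields (e.g., user IDs, order numbers, city codes) and text with an analyzer for full-text search (e.g., ik_max_word, which requires the IK analysis plugin). For fields that never need to be searched, set index: false. For large payloads that never need to be returned, exclude them from _source, keeping in mind that excluded fields cannot be recovered later for reindexing or updates.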
{
  "mappings": {
    "_source": {"excludes": ["payload.raw"]},
    "properties": {
      "username": {"type": "keyword"},
      "description": {"type": "text", "analyzer": "ik_max_word"},
      "userId": {"type": "keyword"},
      "raw_data": {"type": "keyword", "index": false}
    }
  }
}
3. Query Performance
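Use custom routing so that all documents for one key (e.g., routing=user_123) land on the same shard. A search that passes the same routing value then queries one shard instead of fanning out to all of them, which the source reports as a 3–10× speedup. Use search_after for deep pagination, with a sort that ends in a unique tiebreaker field so that pages never overlap. Define index.sort on a hot field (e.g., createdAt) to accelerate sorted queries; index sorting can only be set when the index is created, and the sort field must be mapped at that time.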
POST order/_doc?routing=user_123
{
  "userId": "user_123",
  "orderId": "A001"
}
GET order/_search?routing=user_123
POST products/_search
{
  "size": 10,
  "search_after": [1685600000, "product_899"],
  "sort": [{"createdAt": "desc"}, {"productId": "asc"}]
}
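The search_after values in the request above are not computed by hand: they are the sort array of the last hit on the previous page. A minimal sketch of that step follows; next_page_body is a hypothetical helper name, and the hits are assumed to be the hits.hits list of a search response.

```python
from typing import Optional

def next_page_body(body: dict, hits: list) -> Optional[dict]:
    """Build the request body for the next page, or None when paging is done.

    Each hit in `hits` carries a `sort` array holding its values for the
    fields listed in the request's `sort` clause.
    """
    if not hits:
        return None
    nxt = dict(body)                        # shallow copy; caller's body is untouched
    nxt["search_after"] = hits[-1]["sort"]  # resume right after the last hit returned
    return nxt

first = {"size": 2, "sort": [{"createdAt": "desc"}, {"productId": "asc"}]}
page = [
    {"_id": "a", "sort": [1685600100, "product_900"]},
    {"_id": "b", "sort": [1685600000, "product_899"]},
]
print(next_page_body(first, page)["search_after"])  # [1685600000, 'product_899']
```

The sort clause must stay identical across pages; only search_after changes.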
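PUT products
{
  "settings": {
    "index.sort.field": ["createdAt"],
    "index.sort.order": ["desc"]
  },
  "mappings": {
    "properties": {
      "createdAt": {"type": "date"}
    }
  }
}
Avoid high-cardinality terms aggregations that try to return every bucket in one response; page through the buckets with a composite aggregation instead.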
{
  "size": 0,
  "aggs": {
    "by_user": {
      "composite": {
        "size": 1000,
        "sources": [{"user": {"terms": {"field": "username"}}}]
      }
    }
  }
}
4. Cluster Management
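Give master, data, and ingest work to dedicated nodes by setting node.roles, for example node.roles: [master], node.roles: [data], or node.roles: [ingest]. Set the JVM heap minimum and maximum to the same value (e.g., -Xms16g -Xmx16g), keep it at no more than half of the machine's RAM, and stay below about 31 GB so the JVM keeps using compressed object pointers (compressed oops). Monitor circuit-breaker stats via GET _nodes/stats/breaker. Tune write performance by extending refresh_interval (e.g., 30s) and setting index.translog.durability to async; the latter trades durability for throughput, because operations acknowledged since the last translog fsync can be lost if a node crashes. Bulk load data with the _bulk API.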
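The _bulk API expects newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end of the body. The sketch below only builds that body as a string; build_bulk_body is a hypothetical helper name, and sending the body over HTTP is left out.

```python
import json

def build_bulk_body(index: str, docs: list) -> str:
    """Serialize docs into the newline-delimited body expected by _bulk."""
    if not docs:
        return ""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # source line
    return "\n".join(lines) + "\n"  # the final line must end with a newline

body = build_bulk_body("logs", [
    {"timestamp": "2025-02-14T10:00:00", "level": "INFO"},
])
print(body, end="")
```

json.dumps never emits raw newlines inside a document, which keeps each action and source on exactly one line.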
# --data-binary preserves the newlines that _bulk requires; -d would strip them
curl -XPOST localhost:9200/_bulk -H "Content-Type: application/x-ndjson" --data-binary '
{ "index": { "_index": "logs" } }
{ "timestamp": "2025-02-14T10:00:00", "level": "INFO" }
...
'
5. Architecture Strategy
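Adopt a hot-warm-cold tiered architecture and use an ILM policy to roll over the write index after 7 days or 30 GB, move warm shards to SATA, move cold shards to HDD, and delete indices after 180 days. The allocate actions below assume the nodes are tagged with a custom box_type attribute. As written, the warm and cold phases have no min_age, so they begin immediately after rollover; set a min_age on each to control when data actually moves.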
PUT _ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {"actions": {"rollover": {"max_age": "7d", "max_size": "30GB"}}},
      "warm": {"actions": {"allocate": {"include": {"box_type": "warm"}}}},
      "cold": {"actions": {"allocate": {"include": {"box_type": "cold"}}}},
      "delete": {"min_age": "180d", "actions": {"delete": {}}}
    }
  }
}
6. Runtime Fields (Bonus)
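Runtime fields are computed at query time by a script instead of being indexed, e.g., a fullName keyword built from the first and last fields (which need doc values, as keyword fields have, for doc[...] access to work). Because the script runs for every matching document on every query, runtime fields are not suitable for high-QPS workloads.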
{
  "runtime": {
    "fullName": {
      "type": "keyword",
      "script": "emit(doc['first'].value + ' ' + doc['last'].value)"
    }
  }
}
7. Business‑Scenario Optimizations
Scenario 1 – Log System (ELK/EFK)
Daily index naming: logs-YYYY.MM.DD
Refresh interval = 30 s
Translog durability = async
Hot‑warm‑cold tiering (SSD / SATA / HDD)
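With daily indices, a client can target only the indices that cover the time range it needs rather than searching logs-*. The sketch below generates those names using the logs-YYYY.MM.DD convention listed above; daily_index and indices_for_range are hypothetical helper names.

```python
from datetime import date, timedelta

def daily_index(day: date, prefix: str = "logs") -> str:
    """Return the daily index name for one calendar day."""
    return f"{prefix}-{day:%Y.%m.%d}"

def indices_for_range(start: date, end: date, prefix: str = "logs") -> list:
    """Return the daily index names covering start..end, both inclusive."""
    days = (end - start).days
    return [daily_index(start + timedelta(days=i), prefix) for i in range(days + 1)]

print(daily_index(date(2025, 2, 15)))
# logs-2025.02.15
print(",".join(indices_for_range(date(2025, 2, 27), date(2025, 3, 1))))
# logs-2025.02.27,logs-2025.02.28,logs-2025.03.01
```

The comma-joined form can be used directly as the index part of a search URL.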
{
  "query": {"range": {"@timestamp": {"gte": "now-1h/h"}}},
  "sort": [{"@timestamp": "desc"}]
}
Scenario 2 – E‑commerce Search
Field mapping: name (text, ik_max_word), brand (keyword), tags (keyword), price (integer), rating (float)
{
  "query": {
    "bool": {
      "must": [{"match": {"name": "手机 大电池"}}],
      "filter": [
        {"term": {"brand": "HUAWEI"}},
        {"range": {"price": {"lte": 3000}}}
      ]
    }
  },
  "sort": [{"rating": "desc"}]
}
Scenario 3 – Monitoring Metrics
Set index.sort = @timestamp
Disable indexing on unused fields
Use routing on hostId
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"hostId": "server-1"}},
        {"range": {"@timestamp": {"gte": "now-10m"}}}
      ]
    }
  },
  "sort": [{"@timestamp": "desc"}]
}
Scenario 4 – User‑Behavior Analytics
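Map high-cardinality fields that you aggregate on as keyword, which uses on-disk doc values by default, and leave fielddata disabled on text fields (the default) so aggregations do not load them into heap.
Use date_histogram + composite for large-scale aggregations.
Leverage rollup jobs for historical down-sampling (note that newer Elasticsearch releases deprecate rollups in favor of downsampling).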
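A composite aggregation returns its buckets in pages: each response includes an after_key, which is passed back as after in the next request. The sketch below shows only that hand-off on plain dictionaries; next_composite_body and the aggregation name by_user are hypothetical, and the response shape is assumed to be a standard search response.

```python
import copy
from typing import Optional

def next_composite_body(body: dict, response: dict, agg_name: str) -> Optional[dict]:
    """Return the request body for the next page of buckets, or None when done."""
    agg = response["aggregations"][agg_name]
    after_key = agg.get("after_key")
    if not agg["buckets"] or after_key is None:
        return None  # an empty page means every bucket has been read
    nxt = copy.deepcopy(body)  # deep copy so the caller's nested body is untouched
    nxt["aggs"][agg_name]["composite"]["after"] = after_key
    return nxt

body = {
    "size": 0,
    "aggs": {"by_user": {"composite": {
        "size": 2,
        "sources": [{"user": {"terms": {"field": "username"}}}],
    }}},
}
response = {"aggregations": {"by_user": {
    "after_key": {"user": "bob"},
    "buckets": [
        {"key": {"user": "alice"}, "doc_count": 3},
        {"key": {"user": "bob"}, "doc_count": 1},
    ],
}}}
print(next_composite_body(body, response, "by_user")["aggs"]["by_user"]["composite"]["after"])
# {'user': 'bob'}
```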
{
  "size": 0,
  "aggs": {
    "daily": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "day"
      }
    }
  }
}
Final Summary
The article delivers fourteen core Elasticsearch best practices covering index design, field mapping, write and query tuning, and architectural patterns, plus concrete configurations for large‑scale systems and four specialized optimization blueprints for logging, e‑commerce, monitoring, and user‑behavior analysis.
Ray's Galactic Tech
Practice together, never alone. We cover programming languages, development tools, learning methods, and pitfall notes. We simplify complex topics, guiding you from beginner to advanced. Weekly practical content—let's grow together!
