Elasticsearch Performance Tuning Guide: Configuration, System, and Usage Optimizations
This article provides a comprehensive guide to improving Elasticsearch performance and stability by covering configuration file tweaks, system‑level settings, and usage‑level optimizations such as hot‑thread analysis, pending tasks, field storage, translog handling, refresh intervals, shard management, and best practices for routing and alias usage.
Because many users complain about Elasticsearch performance and cluster stability, this guide consolidates practical tuning tips based on daily experience, covering configuration, system, and usage aspects.
1. Configuration File Tuning
elasticsearch.yml
Memory Lock
bootstrap.memory_lock: true allows the JVM to lock memory, preventing the OS from swapping it out.
zen.discovery
Elasticsearch defaults to unicast discovery to avoid accidental node joins; multicast should never be used in production. Unicast is configured via discovery.zen.ping.unicast.hosts with all node addresses. Important zen settings include discovery.zen.ping_timeout, discovery.zen.join_timeout, and discovery.zen.minimum_master_nodes.
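The usual rule for discovery.zen.minimum_master_nodes is a strict majority of the master‑eligible nodes, that is (N / 2) + 1 with integer division. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of any Elasticsearch API.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Return the quorum size for a given number of master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    # Strict majority: integer division, then add one.
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "master-eligible nodes -> quorum of", minimum_master_nodes(n))
```

With two master‑eligible nodes the quorum is also two, so losing either node blocks master election; three is the smallest count that tolerates one failure.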
Fault Detection
Two fault‑detection mechanisms exist: the master pings all other nodes, and each node pings the master. Settings include discovery.zen.fd.ping_interval, discovery.zen.fd.ping_timeout, and discovery.zen.fd.ping_retries.
https://www.elastic.co/guide/en/elasticsearch/reference/6.x/modules-discovery-zen.html
Thread Pool Queue Size
Do not blindly increase queue size; monitor GET /_cat/thread_pool for queue and rejected counts and adjust only when sustained blockage is observed.
https://www.elastic.co/guide/en/elasticsearch/reference/6.x/modules-threadpool.html
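As a rough illustration of the monitoring step, the sketch below parses the text returned by GET /_cat/thread_pool?v&h=node_name,name,active,queue,rejected and lists the pools that have rejected work. The sample output, node names, and helper name are made up for the example, and pool names vary between Elasticsearch versions.

```python
def rejected_pools(cat_output: str) -> list:
    """Parse `_cat/thread_pool` text (with a header row) and return
    (node, pool, queue, rejected) tuples for pools with rejections."""
    lines = [ln.split() for ln in cat_output.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    col = {name: i for i, name in enumerate(header)}
    flagged = []
    for row in rows:
        rejected = int(row[col["rejected"]])
        if rejected > 0:
            flagged.append((row[col["node_name"]], row[col["name"]],
                            int(row[col["queue"]]), rejected))
    return flagged


# Hypothetical sample, not captured from a real cluster.
SAMPLE = """
node_name name   active queue rejected
node-1    write       4   180       37
node-1    search      2     0        0
node-2    write       1     0        0
"""

if __name__ == "__main__":
    for node, pool, queue, rejected in rejected_pools(SAMPLE):
        print(f"{node} {pool}: queue={queue} rejected={rejected}")
```

A single snapshot says little; comparing the rejected counter across several samples shows whether the blockage is sustained.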
Memory Usage
Adjust circuit‑breaker limits according to workload: indices.breaker.total.limit (default 70% of JVM heap), indices.breaker.request.limit (default 60%), and indices.breaker.fielddata.limit (default 60%). Also tune query cache size via indices.queries.cache.size (default 10%, often set to 5%).
https://www.elastic.co/guide/en/elasticsearch/reference/6.x/circuit-breaker.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.x/query-cache.html
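To make the percentages concrete, this sketch converts them into megabytes for a given heap size. The defaults mirror the values quoted above; the function name and the 16 GB heap in the usage example are assumptions for illustration.

```python
def memory_budgets_mb(heap_mb: int, total_pct: int = 70, request_pct: int = 60,
                      fielddata_pct: int = 60, query_cache_pct: int = 10) -> dict:
    """Translate percentage-based limits into megabytes of JVM heap."""
    def share(pct: int) -> int:
        # Integer arithmetic keeps the result exact and easy to verify.
        return heap_mb * pct // 100

    return {
        "indices.breaker.total.limit": share(total_pct),
        "indices.breaker.request.limit": share(request_pct),
        "indices.breaker.fielddata.limit": share(fielddata_pct),
        "indices.queries.cache.size": share(query_cache_pct),
    }


if __name__ == "__main__":
    for setting, mb in memory_budgets_mb(16 * 1024).items():
        print(f"{setting}: {mb} MB")
```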
Shard Creation
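For large clusters, shard creation can be sped up by setting cluster.routing.allocation.disk.include_relocations: false (default true). With the default, the disk allocator also counts shards that are currently relocating to a node when it computes that node's disk usage; disabling the setting skips that accounting, so disk‑usage estimates ignore in‑flight relocations.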
https://www.elastic.co/guide/en/elasticsearch/reference/6.x/disk-allocator.html
2. System‑Level Tuning
JDK Version
Use the JDK version recommended by the official Elasticsearch documentation.
JDK Memory Configuration
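Set -Xms and -Xmx to the same value. On machines with less than 64 GB of RAM, allocate slightly less than half to the JVM and leave the rest to the OS file‑system cache, which Lucene relies on. Keep the heap below 32 GB so that the JVM can keep using compressed object pointers (compressed oops); above that threshold pointer compression is disabled and the larger pointers waste memory.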
https://www.elastic.co/guide/cn/elasticsearch/guide/current/heap-sizing.html
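The sizing rule can be sketched as a small calculation. The 31 GB cap is an assumption chosen to stay safely under the compressed‑oops threshold (the exact cutoff depends on the JVM), and both function names are illustrative.

```python
def suggested_heap_gb(ram_gb: int, compressed_oops_cap_gb: int = 31) -> int:
    """Half of physical RAM, capped below the ~32 GB compressed-oops limit."""
    half = ram_gb // 2
    return max(1, min(half, compressed_oops_cap_gb))


def jvm_heap_options(heap_gb: int) -> list:
    """Return matching -Xms/-Xmx flags so the heap never resizes at runtime."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


if __name__ == "__main__":
    for ram in (16, 64, 128):
        print(ram, "GB RAM ->", " ".join(jvm_heap_options(suggested_heap_gb(ram))))
```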
Swap Partition
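Disable swapping to prevent performance degradation: swapoff -a. This only lasts until the next reboot; to make it permanent, comment out the swap entries in /etc/fstab.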
File Descriptors
Increase the limit (e.g., ulimit -n 65536) because both Lucene and Elasticsearch require many descriptors.
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/setting-system-settings.html
mmap Settings
Ensure sufficient virtual memory for mmap files, e.g., sysctl -w vm.max_map_count=262144, or set vm.max_map_count permanently in /etc/sysctl.conf.
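A quick way to verify the file‑descriptor and mmap settings on a Linux host is sketched below. The thresholds are the example values from this section, the helper names are hypothetical, and the script only inspects the limits of the process it runs in, which may differ from those of the Elasticsearch service.

```python
def limit_problems(nofile: float, max_map_count: int,
                   min_nofile: int = 65536, min_map_count: int = 262144) -> list:
    """Return human-readable warnings for limits below the suggested values."""
    problems = []
    if nofile < min_nofile:
        problems.append(f"open files limit {nofile} is below {min_nofile}")
    if max_map_count < min_map_count:
        problems.append(f"vm.max_map_count {max_map_count} is below {min_map_count}")
    return problems


def read_current_limits() -> tuple:
    """Read the soft open-files limit and vm.max_map_count (Linux only)."""
    import resource  # Unix-only standard-library module
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        soft = float("inf")
    with open("/proc/sys/vm/max_map_count") as fh:
        max_map_count = int(fh.read().strip())
    return soft, max_map_count


if __name__ == "__main__":
    try:
        issues = limit_problems(*read_current_limits())
        print("\n".join(issues) if issues else "limits look fine")
    except (ImportError, OSError) as exc:
        print("could not read limits on this platform:", exc)
```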
Disk
For SSDs, use an I/O scheduler optimized for flash (e.g., echo noop > /sys/block/sd/queue/scheduler) and avoid CFQ, which is tuned for rotating media.
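Before changing the scheduler it helps to see which one is active. The kernel lists the available schedulers in the scheduler file and marks the current one with square brackets, for example "noop deadline [cfq]". The sketch below parses that format; the device name sda in the usage part is a placeholder.

```python
from pathlib import Path


def active_scheduler(scheduler_file_contents: str) -> str:
    """Return the scheduler the kernel marks as active with [brackets]."""
    for token in scheduler_file_contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active scheduler marked with [brackets]")


if __name__ == "__main__":
    # "sda" is a placeholder device name; adjust it to the data disk.
    path = Path("/sys/block/sda/queue/scheduler")
    if path.exists():
        print("active scheduler:", active_scheduler(path.read_text()))
    else:
        print(path, "not found on this machine")
```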
Disk Mount Options
Mount with noatime,data=writeback,barrier=0,nobh to reduce journaling and buffering overhead.
Disk Best Practices
Prefer RAID 0 for maximum throughput, avoid remote mounts (NFS/SMB), and distribute data across multiple path.data directories.
3. Elasticsearch Usage‑Level Tuning
hot_threads
GET /_nodes/hot_threads?interval=30s samples the most resource‑intensive threads over a 30‑second interval, helping identify bottlenecks such as bulk, search, or merge threads.
https://www.elastic.co/guide/en/elasticsearch/reference/6.x/cluster-nodes-hot-threads.html
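The query parameters follow a ? and are joined with &. A small sketch that builds the request path with the standard library avoids getting the separators wrong; the helper name is illustrative, while interval, threads, and type are parameters of the hot threads API.

```python
from urllib.parse import urlencode


def hot_threads_path(interval: str = "30s", threads: int = 3,
                     thread_type: str = "cpu") -> str:
    """Build the request path for the nodes hot threads API."""
    query = urlencode({"interval": interval, "threads": threads, "type": thread_type})
    return "/_nodes/hot_threads?" + query


if __name__ == "__main__":
    print("GET", hot_threads_path())
    print("GET", hot_threads_path(interval="10s", threads=5, thread_type="wait"))
```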
pending_tasks
GET /_cluster/pending_tasks shows tasks waiting for the master node, useful for spotting metadata‑change backlogs.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-pending.html
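One way to spot a backlog is to filter the response for tasks that have waited longer than some threshold. The sketch below works on the JSON body of the response; the sample body and the one‑second threshold are assumptions for illustration.

```python
import json


def slow_pending_tasks(response_body: str, max_wait_ms: int = 1000) -> list:
    """Return (priority, source, wait_ms) for tasks queued longer than max_wait_ms."""
    tasks = json.loads(response_body).get("tasks", [])
    return [
        (task["priority"], task["source"], task["time_in_queue_millis"])
        for task in tasks
        if task["time_in_queue_millis"] > max_wait_ms
    ]


# Hypothetical response body, shaped like the pending tasks API output.
SAMPLE_BODY = json.dumps({
    "tasks": [
        {"insert_order": 101, "priority": "URGENT",
         "source": "create-index [foo_9], cause [api]",
         "time_in_queue_millis": 86, "time_in_queue": "86ms"},
        {"insert_order": 46, "priority": "HIGH",
         "source": "shard-started ([foo_2][1])",
         "time_in_queue_millis": 4200, "time_in_queue": "4.2s"},
    ]
})

if __name__ == "__main__":
    for priority, source, waited in slow_pending_tasks(SAMPLE_BODY):
        print(f"{priority} waited {waited} ms: {source}")
```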
Field Storage
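Choose between doc_values, fielddata, and store based on query needs: doc_values serve sorting and aggregations from disk, fielddata does the same for text fields but lives on the JVM heap, and store keeps a field retrievable separately from _source. Disable unnecessary fielddata and _source storage when possible.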
translog
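Since ES 2.0, translog durability defaults to request, which fsyncs the translog after every index, delete, update, or bulk request. For higher throughput, at the risk of losing the most recent operations in a crash, set index.translog.durability: "async" and tune index.translog.sync_interval, index.translog.flush_threshold_size, etc.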
{
  "index.translog.durability": "async"
}
refresh_interval
index.refresh_interval controls how often newly indexed documents become visible to search. The default is 1 s; it can be raised, or set to -1 to disable refreshes during write‑heavy workloads.
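The translog and refresh settings are often changed together around a large import and restored afterwards. A minimal sketch of the two settings bodies, meant to be sent with PUT /<index>/_settings, follows. The function names are hypothetical, and the restore values simply reapply the defaults named in this guide.

```python
import json


def bulk_load_settings() -> dict:
    """Settings that trade durability and search freshness for write throughput."""
    return {
        "index.translog.durability": "async",
        "index.refresh_interval": "-1",
    }


def restore_settings() -> dict:
    """Reapply the defaults described above once the import has finished."""
    return {
        "index.translog.durability": "request",
        "index.refresh_interval": "1s",
    }


if __name__ == "__main__":
    print("before import:", json.dumps(bulk_load_settings()))
    print("after import: ", json.dumps(restore_settings()))
```

While refresh is disabled, newly indexed documents stay invisible to search until the interval is restored or a refresh is triggered manually.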
Disable Dynamic Mapping
Dynamic mapping can cause metadata churn and type mismatches; set dynamic: false or strict when schema stability is required.
Bulk Indexing
Recommended bulk size is 5–15 MB of raw bytes; monitor resource usage and adjust concurrency to avoid EsRejectedExecutionException.
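Batching by bytes rather than by document count keeps request sizes predictable. The sketch below groups documents into newline‑delimited bulk bodies capped at a byte budget. The 10 MB default is one point within the range above, the function name is illustrative, and doc_type is optional because 6.x clusters expect a _type in the action line while later versions do not.

```python
import json
from typing import Iterable, Iterator, Optional


def bulk_bodies(docs: Iterable[dict], index: str, doc_type: Optional[str] = None,
                max_bytes: int = 10 * 1024 * 1024) -> Iterator[bytes]:
    """Yield NDJSON bulk request bodies, each at most max_bytes where possible."""
    meta = {"_index": index}
    if doc_type:
        meta["_type"] = doc_type
    action_line = json.dumps({"index": meta})
    batch, size = [], 0
    for doc in docs:
        chunk = (action_line + "\n" + json.dumps(doc) + "\n").encode("utf-8")
        # Flush before the batch would exceed the budget; a single oversized
        # document still goes out alone in its own request.
        if batch and size + len(chunk) > max_bytes:
            yield b"".join(batch)
            batch, size = [], 0
        batch.append(chunk)
        size += len(chunk)
    if batch:
        yield b"".join(batch)


if __name__ == "__main__":
    docs = ({"n": i, "message": "x" * 50} for i in range(1000))
    for body in bulk_bodies(docs, "logs", max_bytes=20_000):
        print(len(body), "bytes,", body.count(b"\n") // 2, "documents")
```

Each yielded body can be sent to the _bulk endpoint with the Content-Type application/x-ndjson.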
Index and Shard Management
Use shrink and rollover APIs, keep shard size under 20–50 GB depending on workload, and manage index naming (daily or monthly) to control shard count.
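The shard‑size guideline translates into a simple estimate of how many primary shards an index needs. The 30 GB target is an assumption picked from within the 20–50 GB range above, and the function name is illustrative.

```python
import math


def primary_shards_for(expected_index_gb: float, target_shard_gb: float = 30.0) -> int:
    """Estimate the primary shard count that keeps shards near the target size."""
    if target_shard_gb <= 0:
        raise ValueError("target_shard_gb must be positive")
    return max(1, math.ceil(expected_index_gb / target_shard_gb))


if __name__ == "__main__":
    for size_gb in (10, 200, 1500):
        print(f"{size_gb} GB index -> {primary_shards_for(size_gb)} primary shards")
```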
Segment Merge
Merge operations are I/O intensive; limit concurrent merge threads with index.merge.scheduler.max_thread_count: 1 on HDDs, while SSDs can use the default.
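For reference, the 6.x documentation gives the default for index.merge.scheduler.max_thread_count as max(1, min(4, processors / 2)) with integer division. The sketch below reproduces that formula so the SSD default can be compared with the value of 1 suggested for HDDs; the function name is illustrative.

```python
def default_merge_threads(available_processors: int) -> int:
    """Documented default: max(1, min(4, processors // 2))."""
    return max(1, min(4, available_processors // 2))


if __name__ == "__main__":
    for cores in (1, 2, 4, 8, 32):
        print(f"{cores} cores -> {default_merge_threads(cores)} merge threads")
```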
Auto‑generated _id
Prefer Elasticsearch‑generated IDs to avoid costly existence checks on user‑provided IDs.
Routing
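Specify a routing key at index time and pass the same key at search time so that a query hits a single shard instead of all of them, reducing the overhead of dispatching requests and merging per‑shard results.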
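Routing works because the target shard is derived from the routing value: shard = hash(routing) % number_of_primary_shards. The sketch below illustrates the idea with CRC32 as a stand‑in hash. Elasticsearch itself uses Murmur3, so the shard numbers printed here will not match a real cluster.

```python
import zlib


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a shard number (CRC32 stands in for Murmur3)."""
    if num_primary_shards < 1:
        raise ValueError("num_primary_shards must be at least 1")
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


if __name__ == "__main__":
    for key in ("user-1", "user-2", "user-42"):
        print(key, "-> shard", shard_for(key, 5))
```

The same key always lands on the same shard, which is also why the number of primary shards cannot be changed freely after index creation.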
Alias Usage
Expose services via index aliases instead of raw index names to enable seamless reindexing and zero‑downtime migrations.
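Switching an alias from an old index to a new one can be done atomically with a single POST /_aliases request that contains both a remove and an add action. A minimal sketch of that request body follows; the index and alias names are placeholders.

```python
import json


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Build a POST /_aliases body that moves an alias in one atomic step."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # "products", "products_v1" and "products_v2" are placeholder names.
    print(json.dumps(alias_swap_actions("products", "products_v1", "products_v2"), indent=2))
```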
Avoid Wide Tables
Limit total fields per index with index.mapping.total_fields.limit (default 1000) to prevent mapping explosion.
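To see how close a mapping is to the limit, the fields can be counted from the mapping's properties. The sketch below counts fields, object fields, and multi‑fields recursively; it only approximates how Elasticsearch counts, and the sample mapping is hypothetical.

```python
def count_fields(properties: dict) -> int:
    """Approximate the number of mapped fields under a `properties` block."""
    total = 0
    for field in properties.values():
        total += 1                                          # the field or object itself
        total += count_fields(field.get("properties", {}))  # sub-fields of an object
        total += len(field.get("fields", {}))               # multi-fields such as .raw
    return total


# Hypothetical mapping fragment for illustration.
SAMPLE_PROPERTIES = {
    "title": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
    "user": {
        "properties": {
            "name": {"type": "keyword"},
            "age": {"type": "integer"},
        }
    },
}

if __name__ == "__main__":
    used = count_fields(SAMPLE_PROPERTIES)
    print(f"{used} of 1000 fields used")
```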
Avoid Sparse Indexes
Sparse fields increase storage and reduce compression efficiency; keep field definitions lean.
---
For more detailed explanations, refer to the official Elasticsearch documentation links provided throughout the guide.