Elasticsearch Write, Read, Search Processes and Performance Tuning Guide
This article explains Elasticsearch's data ingestion, retrieval, and search workflows, details the underlying indexing mechanisms, and provides comprehensive system‑level, shard‑level, and query‑level tuning recommendations—including configuration snippets and best‑practice strategies for high‑throughput and low‑latency deployments.
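Elasticsearch handles a write as follows: the client sends the request to any node, which then acts as the coordinating node. The coordinating node hashes the document's routing value (the document ID by default) to find the target primary shard and forwards the request to the node that holds it. The primary shard indexes the document and replicates it to the replica shards, after which the coordinating node returns the response to the client.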
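The routing step can be illustrated with a minimal sketch. It assumes the commonly described rule that the primary shard is chosen from a hash of the routing value modulo the number of primary shards. Elasticsearch uses a Murmur3 hash internally; `zlib.crc32` below is only a deterministic stand-in, so the shard numbers will not match a real cluster. The function name `pick_primary_shard` is hypothetical.

```python
import zlib


def pick_primary_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document _id) to a primary shard."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash: Elasticsearch uses Murmur3, not CRC32.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % num_primary_shards


if __name__ == "__main__":
    for doc_id in ("user-1", "user-2", "user-3"):
        print(doc_id, "-> shard", pick_primary_shard(doc_id, 5))
```

Because the result depends on the number of primary shards, changing that number would send existing document IDs to different shards, which is why the primary shard count is fixed when the index is created.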
Read operations follow a similar pattern: the client contacts a coordinating node, which selects a target node (primary or replica) using round‑robin, retrieves the document, and forwards it back to the client.
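A round-robin choice between the copies of a shard might look like the sketch below. It is a simplification for illustration only: the class name `ShardCopySelector` and the node labels are made up, and a real cluster may also take other factors into account when picking a copy.

```python
from itertools import cycle


class ShardCopySelector:
    """Rotate read requests across the copies (primary + replicas) of one shard."""

    def __init__(self, copies):
        copies = list(copies)
        if not copies:
            raise ValueError("at least one shard copy is required")
        self._copies = cycle(copies)

    def next_copy(self):
        """Return the copy that should serve the next read."""
        return next(self._copies)


if __name__ == "__main__":
    selector = ShardCopySelector(
        ["node-a (primary)", "node-b (replica)", "node-c (replica)"]
    )
    print([selector.next_copy() for _ in range(4)])
```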
Search queries are dispatched from the coordinating node to one copy (primary or replica) of every relevant shard. Each shard returns the IDs and scores of its top matching documents; the coordinating node merges, sorts, and paginates these partial results, then fetches the full documents for the final page before responding.
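The two phases described above can be sketched in a few lines. This is a toy model, not the actual coordinating-node code: it assumes each shard returns its own top `from_ + size` hits as `(score, doc_id)` pairs sorted by descending score, and `store` is a hypothetical dictionary standing in for the shards' stored documents.

```python
import heapq


def merge_shard_hits(shard_results, from_, size):
    """Merge per-shard (score, doc_id) lists into one globally sorted page."""
    # heapq.merge expects ascending input, so merge on negated scores.
    merged = heapq.merge(
        *[[(-score, doc_id) for score, doc_id in hits] for hits in shard_results]
    )
    ranked = [(doc_id, -neg_score) for neg_score, doc_id in merged]
    return ranked[from_:from_ + size]


def fetch_documents(page, store):
    """Fetch phase: load full documents only for the IDs on the final page."""
    return [store[doc_id] for doc_id, _score in page]


if __name__ == "__main__":
    shard_results = [
        [(3.2, "a"), (1.1, "b")],  # top hits from shard 0
        [(2.7, "c"), (0.4, "d")],  # top hits from shard 1
    ]
    store = {k: {"id": k} for k in "abcd"}
    page = merge_shard_hits(shard_results, from_=1, size=2)
    print(fetch_documents(page, store))
```

Since every shard has to return `from_ + size` entries, deep pagination makes each shard and the coordinating node do more work, which is one reason to keep result sets small.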
Internally, a document is first written to an in-memory buffer and appended to the translog. When the buffer fills up or the refresh interval elapses (1 second by default), Elasticsearch refreshes the buffer into a new segment in the OS file system cache, at which point the data becomes searchable.
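A toy model can make the buffer, translog, and segment relationship concrete. The class below is purely illustrative: `ToyShard` and its methods are invented names, segments are plain Python lists, and `flush` is a simplified stand-in for committing segments to disk and trimming the translog.

```python
class ToyShard:
    """Toy model of the write path: buffer + translog -> segment on refresh."""

    def __init__(self):
        self.buffer = []    # in-memory indexing buffer, not yet searchable
        self.translog = []  # operation log kept for crash recovery
        self.segments = []  # refreshed segments, visible to search

    def index(self, doc):
        """Accept a document: it goes to both the buffer and the translog."""
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        """Turn the buffer into a new searchable segment."""
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        """Simplified flush: refresh, then drop the now-redundant translog."""
        self.refresh()
        self.translog.clear()

    def search(self, predicate):
        """Search sees only documents that have reached a segment."""
        return [doc for seg in self.segments for doc in seg if predicate(doc)]


if __name__ == "__main__":
    shard = ToyShard()
    shard.index({"id": 1})
    print("before refresh:", shard.search(lambda d: True))
    shard.refresh()
    print("after refresh:", shard.search(lambda d: True))
```

The model shows why a freshly indexed document is not visible until the next refresh, and why a longer refresh interval trades search freshness for fewer, larger segments.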
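System‑level tuning emphasizes allocating sufficient JVM heap without overshooting: give the heap no more than about half of physical RAM so the rest is left for the OS file system cache, and keep it below roughly 32 GB (31 GB is a common ceiling on large‑memory nodes) so the JVM can keep using compressed object pointers. Also disable swap and lock memory (bootstrap.memory_lock: true) to avoid performance penalties.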
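The heap guidance can be turned into a small calculation. The sketch assumes the widely used rule of thumb of giving the heap half of physical RAM, capped just below 32 GB; the 31 GB ceiling and the function name `recommended_heap_gb` are assumptions for illustration, and the right value for a given node should still be verified by measurement.

```python
def recommended_heap_gb(total_ram_gb: float, ceiling_gb: float = 31.0) -> float:
    """Half of physical RAM, capped below the ~32 GB compressed-pointer limit."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(total_ram_gb / 2, ceiling_gb)


if __name__ == "__main__":
    for ram in (16, 64, 128):
        heap = recommended_heap_gb(ram)
        # -Xms and -Xmx are set to the same value so the heap is not resized.
        print(f"{ram} GB RAM -> -Xms{heap:g}g -Xmx{heap:g}g")
```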
Shard and replica configuration advises starting from the default number of primary shards (5 before Elasticsearch 7.0, 1 from 7.0 onward) unless data volume calls for more, adding 1‑3 replicas based on fault‑tolerance needs, and keeping the number of shards per node reasonable (e.g., index.routing.allocation.total_shards_per_node: 2).
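A per-node shard limit is a hard limit, so it is worth checking that the cluster can still place every shard copy. The sketch below checks two necessary conditions under simplifying assumptions (a single index, identical data nodes); `shards_fit` is a hypothetical helper, not an Elasticsearch API.

```python
def shards_fit(primaries: int, replicas: int, data_nodes: int, per_node_limit: int) -> bool:
    """Check necessary conditions for allocating all copies of one index."""
    copies_per_shard = 1 + replicas
    # A primary and its replicas are never placed on the same node.
    if copies_per_shard > data_nodes:
        return False
    # The total number of copies must fit under the per-node limit.
    return primaries * copies_per_shard <= data_nodes * per_node_limit


if __name__ == "__main__":
    print(shards_fit(primaries=5, replicas=1, data_nodes=5, per_node_limit=2))
    print(shards_fit(primaries=5, replicas=1, data_nodes=3, per_node_limit=2))
```

When the check fails, some shards would stay unassigned until nodes are added or the limit is raised.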
Key Elasticsearch settings for performance include adjusting index.merge.scheduler.max_thread_count, increasing indices.memory.index_buffer_size, setting index.refresh_interval to a larger value (e.g., 30s) during bulk loads, and using asynchronous translog writes:
{
  "index.translog": {
    "sync_interval": "120s",
    "durability": "async",
    "flush_threshold_size": "1g"
  }
}
Bulk indexing dramatically improves write throughput; determine the optimal bulk size by incremental testing (e.g., 100, 200, 400 documents) while ensuring each request stays below ~10 MB.
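To experiment with bulk sizes, it helps to build request bodies that respect both a document count and a byte budget. The sketch below produces newline-delimited JSON in the shape the `_bulk` endpoint accepts (an action line followed by a source line). The function name `build_bulk_bodies`, the index name `logs`, and the default limits are assumptions for illustration; sending the bodies over HTTP is left out.

```python
import json


def build_bulk_bodies(index_name, docs, max_docs=200, max_bytes=10 * 1024 * 1024):
    """Yield NDJSON bodies for the _bulk API, bounded by doc count and byte size."""
    lines, doc_count, byte_count = [], 0, 0
    for doc in docs:
        action = json.dumps({"index": {"_index": index_name}})
        source = json.dumps(doc)
        # +2 accounts for the newline that terminates each of the two lines.
        pair_bytes = len(action.encode("utf-8")) + len(source.encode("utf-8")) + 2
        if lines and (doc_count >= max_docs or byte_count + pair_bytes > max_bytes):
            yield "\n".join(lines) + "\n"
            lines, doc_count, byte_count = [], 0, 0
        lines.extend((action, source))
        doc_count += 1
        byte_count += pair_bytes
    if lines:
        yield "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [{"n": i} for i in range(5)]
    for body in build_bulk_bodies("logs", docs, max_docs=2):
        print(repr(body))
```

Doubling `max_docs` between test runs and recording throughput is one way to find the point where larger requests stop helping.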
Multi‑threaded bulk ingestion further saturates cluster resources, but monitor for EsRejectedExecutionException to avoid overloading.
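One way to push bulk requests from several threads while reacting to rejections is to retry with exponential backoff. The sketch below is client-agnostic: `send` is any callable that posts one bulk body, and `BulkRejected` is a hypothetical exception standing in for an HTTP 429 response or EsRejectedExecutionException. The retry counts and delays are illustrative defaults, not recommended values.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class BulkRejected(Exception):
    """Hypothetical stand-in for a 429 / EsRejectedExecutionException response."""


def send_with_backoff(send, body, max_retries=5, base_delay=0.5, sleep=time.sleep):
    """Call send(body); on rejection wait exponentially longer and retry."""
    for attempt in range(max_retries + 1):
        try:
            return send(body)
        except BulkRejected:
            if attempt == max_retries:
                raise  # give up and surface the rejection to the caller
            sleep(base_delay * (2 ** attempt))


def ingest_parallel(send, bodies, workers=4):
    """Send bulk bodies from a small thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda body: send_with_backoff(send, body), bodies))


if __name__ == "__main__":
    print(ingest_parallel(lambda body: f"sent {len(body)} bytes", ["a\n", "bb\n"]))
```

If rejections keep appearing even with backoff, that is a sign the worker count or bulk size is too high for the cluster.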
Query performance tips include consolidating multiple fields into a single copy‑to field, avoiding scripts when possible, limiting result set sizes, using pre‑aggregated fields (e.g., age_group), and configuring caches (query cache, field data cache, shard request cache) appropriately.
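Two of these tips, the combined copy‑to field and the pre‑aggregated age_group field, can be sketched together. The mapping below assumes the typeless mapping format of Elasticsearch 7.x or later, and the field names `first_name`, `last_name`, and `full_name`, as well as the helpers `age_group` and `enrich`, are hypothetical examples rather than anything prescribed by the article.

```python
import json

# Hypothetical mapping: first_name and last_name are copied into full_name,
# so a query can target one field instead of two.
MAPPING = {
    "mappings": {
        "properties": {
            "first_name": {"type": "text", "copy_to": "full_name"},
            "last_name": {"type": "text", "copy_to": "full_name"},
            "full_name": {"type": "text"},
            "age": {"type": "integer"},
            "age_group": {"type": "keyword"},
        }
    }
}


def age_group(age: int, width: int = 10) -> str:
    """Bucket an age at index time, e.g. 34 falls into '30-39'."""
    if age < 0:
        raise ValueError("age must be non-negative")
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def enrich(doc: dict) -> dict:
    """Return a copy of the document with the pre-computed age_group field."""
    return {**doc, "age_group": age_group(doc["age"])}


if __name__ == "__main__":
    print(json.dumps(MAPPING, indent=2))
    print(enrich({"first_name": "Ada", "last_name": "Lovelace", "age": 34}))
```

With the bucket stored as a keyword, a terms aggregation on age_group can replace a range aggregation that would otherwise be computed at query time.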
Example script field that should be avoided (shown in the legacy Groovy syntax; Groovy scripting was removed in Elasticsearch 6.0 in favor of Painless):
{
  "script_fields": {
    "test1": {
      "lang": "groovy",
      "script": "while(true){print 'don\\'t use script'}"
    }
  }
}
Linux tuning recommendations cover increasing the maximum open file handles (e.g., nofile 65535) and adjusting vm.dirty_ratio and vm.dirty_background_ratio to control write‑back behavior:
sysctl -w vm.dirty_ratio=10
sysctl -w vm.dirty_background_ratio=5
Overall, combining proper hardware (SSD, ample RAM), OS cache sizing, and the above Elasticsearch configurations yields a high‑performance, scalable search platform.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
