Boost Elasticsearch Performance: Bulk API, Gateway & Caching Secrets
This article explains how to dramatically improve Elasticsearch throughput by using the bulk API, tuning bulk request sizes, configuring gateway settings, optimizing cluster state updates, managing caches, leveraging fielddata and doc values, and employing tools like Curator and the Profiler for efficient cluster operations.
Batch Submission
In the CRUD chapter we learned how data is written to Elasticsearch. Simple programs that index documents one by one achieve only a few hundred writes per second, far from Elasticsearch's potential. Each document requires a full HTTP POST request, which is inefficient, so Elasticsearch provides a bulk API for batch indexing and an mget API for batch reads.
The bulk request uses a line‑delimited format where each action line (metadata) is followed by the source document line. This format allows ES nodes to process each line without parsing a full JSON array, reducing memory usage and GC pressure. Production tools like Logstash, rsyslog, and Spark use the bulk API by default. For custom programs, Perl's Search::Elasticsearch::Bulk or Python's elasticsearch.helpers libraries are recommended.
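For custom code the helper libraries above are preferable, but the NDJSON format itself is simple to build by hand. A minimal Python sketch of the body construction (the function name build_bulk_body is illustrative, not part of any library):

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Build an NDJSON bulk body: one action (metadata) line, then one source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk body must be newline-delimited and end with a trailing newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body("logstash-2017.01.01", "logs", [
    {"message": "first event"},
    {"message": "second event"},
])
```

Because each line is a standalone JSON object, a receiving node can hand lines off to the right shards without ever materializing the whole request as one parsed array.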
Bulk Size
When configuring bulk indexing, the request body size must stay below http.max_content_length. However, the bulk size should not be set close to this limit because the entire request body must fit into JVM heap. Oversized requests can exhaust heap and degrade performance. A practical recommendation is to keep bulk request bodies around 15 MB, adjusting based on actual document size and testing.
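One way to respect such a limit is to cut the stream of action/source pairs by encoded byte size rather than by document count, since document sizes vary. A hypothetical Python sketch (chunk_bulk_lines is not a library function; real helpers such as elasticsearch.helpers offer similar chunking options):

```python
import json

MAX_BULK_BYTES = 15 * 1024 * 1024  # ~15 MB; keep well under http.max_content_length

def chunk_bulk_lines(action_source_pairs, max_bytes=MAX_BULK_BYTES):
    """Yield lists of NDJSON lines whose total encoded size stays under max_bytes."""
    chunk, size = [], 0
    for action, source in action_source_pairs:
        lines = [json.dumps(action), json.dumps(source)]
        pair_size = sum(len(l.encode("utf-8")) + 1 for l in lines)  # +1 per newline
        if chunk and size + pair_size > max_bytes:
            yield chunk          # flush the current batch before it grows too large
            chunk, size = [], 0
        chunk.extend(lines)
        size += pair_size
    if chunk:
        yield chunk              # flush the final partial batch

# Tiny limit, purely to illustrate the splitting behavior:
chunks = list(chunk_bulk_lines(
    [({"index": {}}, {"n": i}) for i in range(3)],
    max_bytes=50))
```

Each yielded chunk can then be joined with newlines and sent as one bulk request.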
Gateway
Elasticsearch stores index data through a gateway. By default gateway.type is local, meaning data lives on each node's local disks. Several settings control how recovery starts after a full cluster restart:
gateway.recover_after_nodes: start recovery only after this many nodes have joined the cluster.
gateway.recover_after_time: once the node-count condition is met, wait this long before recovering.
gateway.expected_nodes: if this many nodes have already joined, begin recovery immediately without waiting out recover_after_time.
More granular settings such as gateway.recover_after_data_nodes and gateway.recover_after_master_nodes are also available.
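In elasticsearch.yml these settings look like the following (the numeric values are illustrative for a ten-node cluster, not recommendations):

```yaml
gateway.recover_after_nodes: 8    # wait for at least 8 nodes before recovering
gateway.recover_after_time: 5m    # then hold for 5 more minutes...
gateway.expected_nodes: 10        # ...unless all 10 expected nodes are already present
```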
Shadow Replicas on Shared Storage
Although Elasticsearch discourages NFS/iSCSI storage for the gateway, from version 1.5 onward it supports shadow replicas, which store index segments on shared storage instead of fully replicating them. Enable the feature by setting node.enable_custom_paths: true on each node and creating the index with "shadow_replicas": true. Shadow replicas reduce write pressure on replica shards and avoid network copies during recovery, but the primary shard still does all the indexing work, so CPU savings are limited. For most cases, the local gateway plus snapshots to HDFS or other backup storage is preferred.
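Assuming node.enable_custom_paths: true is set and the shared mount is visible on every node, index creation might look like this (index name and mount path are hypothetical):

```json
PUT /my_index
{
  "index": {
    "number_of_replicas": 4,
    "data_path": "/mnt/nfs/my_index",
    "shadow_replicas": true
  }
}
```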
Cluster State Maintenance
The master node manages the cluster state, which includes cluster‑wide settings, node list, index mappings, and shard allocations. All nodes store a copy of the state and can retrieve it via /_cluster/state. Only the master can modify the state, and most changes are lightweight except mapping updates, which occur when new fields appear in documents. Bulk index creation (e.g., daily time‑based indices) can cause noticeable cluster‑state blocking under high load.
Bulk New Index Creation
Creating many new indices at once can block writes while the master propagates the updated state. Scheduling index creation during off‑peak hours (e.g., 3–4 am) mitigates this issue.
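A small sketch of the idea: compute tomorrow's time-based index name ahead of time, so a scheduled off-peak job can create it before midnight writes arrive (next_daily_index is a hypothetical helper; the actual index-creation API call is left out):

```python
from datetime import datetime, timedelta

def next_daily_index(prefix, now=None):
    """Return tomorrow's daily index name, e.g. 'logstash-2017.01.02'."""
    now = now or datetime.utcnow()
    return prefix + (now + timedelta(days=1)).strftime("%Y.%m.%d")

# A 3 am cron job would create this index hours before it starts receiving data:
name = next_daily_index("logstash-", datetime(2017, 1, 1))
```

Pre-creating the index means the mapping/state update happens when the cluster is quiet, not at the stroke of midnight under full write load.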
Excessive Field Updates
Storing every URL parameter as a separate field (e.g., via Logstash kv filter) inflates the mapping and consumes heap memory, potentially causing OOM. Using a nested object to store key/value pairs reduces mapping explosion but requires nested queries and aggregations.
Nested Object Example
{
"urlargs": [
{"key": "uid", "value": "1234567890"},
{"key": "action", "value": "payload"}
]
}
When indexed as a nested type, queries must use the nested query to correctly match key/value pairs.
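For example, matching documents where the parameter uid has the value 1234567890 requires a nested query scoped to the urlargs path, so that both term clauses must match within the same array element (field names follow the sketch above):

```json
{
  "query": {
    "nested": {
      "path": "urlargs",
      "query": {
        "bool": {
          "must": [
            {"term": {"urlargs.key": "uid"}},
            {"term": {"urlargs.value": "1234567890"}}
          ]
        }
      }
    }
  }
}
```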
Cache
Filter Cache
Before ES 2.0, queries and filters were separate APIs, and filter results could be cached. Since 2.0, filters have been merged into the query DSL, but the engine still distinguishes query context (scoring) from filter context (no scoring) to decide what is eligible for caching. The cache is node-level; before 2.0 it was sized with indices.cache.filter.size, and since 2.0 with indices.queries.cache.size (default 10% of heap).
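In the post-2.0 DSL, the cacheable non-scoring part is expressed through the filter clause of a bool query. Here the term clause runs in filter context and can be cached, while the match clause still contributes to scoring (field names are illustrative):

```json
{
  "query": {
    "bool": {
      "must":   [{"match": {"message": "error"}}],
      "filter": [{"term": {"status": 500}}]
    }
  }
}
```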
Shard Request Cache
The shard request cache stores the results of immutable queries (e.g., on read‑only indices). It is effective when the request JSON does not change (e.g., time‑range queries where the range part is constant). The cache size is set with indices.requests.cache.size (default 1%).
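The cache can also be requested explicitly per search via the request_cache URL parameter; by default it only stores size:0 results such as aggregations and hit counts, with the request JSON serving as the cache key (index and field names are illustrative):

```json
GET /logstash-2017.01.01/_search?request_cache=true
{
  "size": 0,
  "aggs": {
    "status_codes": {"terms": {"field": "status"}}
  }
}
```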
Field Data
Fielddata (an uninverted index) accelerates sorting and aggregations on analyzed fields but is built entirely in heap memory. Its size can be limited with indices.fielddata.cache.size and indices.fielddata.cache.expire (the latter should not be used). Elasticsearch also provides a circuit breaker (indices.breaker.fielddata.limit, default 60% of heap) that rejects loads that would otherwise cause an OOM.
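In elasticsearch.yml the cache cap and the breaker might be combined like this (the percentages are illustrative; the breaker limit should stay above the cache size so eviction kicks in before requests start getting rejected):

```yaml
indices.fielddata.cache.size: 20%     # evict least-recently-used fielddata beyond 20% of heap
indices.breaker.fielddata.limit: 60%  # abort loads that would push fielddata past 60% of heap
```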
Doc Values
Doc values store the same column-oriented data as fielddata, but on disk and for exact (not-analyzed) field types, trading a little disk I/O for much lower heap usage. Since ES 5.0, keyword fields use doc values by default, while text fields only support fielddata, which is disabled unless explicitly enabled in the mapping.
Enabling Fielddata on Text Fields
{
"mappings": {
"my_type": {
"properties": {
"message": {
"type": "text",
"fielddata": true,
"fielddata_frequency_filter": {
"min": 0.1,
"max": 1.0,
"min_segment_size": 500
}
}
}
}
}
}
Curator
When indices exceed cluster capacity, the elasticsearch‑curator tool automates index deletion, closing, and optimization. Example commands:
curator --host 10.0.0.100 delete indices --older-than 5 --time-unit days --timestring '%Y.%m.%d' --prefix logstash-mweibo-nginx-
curator --host 10.0.0.100 close indices --older-than 7 --time-unit days --timestring '%Y.%m.%d' --prefix logstash-
curator --host 10.0.0.100 optimize --max_num_segments 1 indices --older-than 1 --newer-than 7 --time-unit days --timestring '%Y.%m.%d' --prefix logstash-
These commands keep recent indices, close older ones, and force-merge segments of indices that are no longer being written to.
Profiler
Elasticsearch 5.0 introduced the profile API to break down query and aggregation execution times. Adding "profile": true to a search request returns detailed timing for collectors, rewrites, scoring, and aggregation phases, helping to tune collect_mode and execution_hint parameters.
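Enabling it is a one-line addition to the search body (index and query are illustrative):

```json
GET /logstash-2017.01.01/_search
{
  "profile": true,
  "query": {
    "match": { "message": "error" }
  }
}
```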
Source: http://wangnan.tech/post/elkstack-es03/