Elasticsearch Performance Tuning Guide – Cluster State Size, Topology, and Memory Lock
The guide explains how to optimize Elasticsearch performance by managing cluster state size through shard planning, assigning dedicated master, data, ingest, and coordinating node roles, and preventing memory swapping with bootstrap.memory_lock and proper heap and OS settings.
This article, translated from the QBox official blog series "Elasticsearch Performance Tuning Authoritative Guide", introduces the basic principles and practical strategies for tuning Elasticsearch performance, covering cluster topology, shard and replica planning, capacity planning, and memory optimization.
1. Cluster State Size (Index and Shard Capacity Planning) – The number of shards directly influences the size of the cluster state. Too many shards inflate the cluster state, increase the master node's management overhead, and slow down cluster state updates, which can destabilize the cluster. The Cluster State API returns comprehensive information about indices, mappings, routing tables, and more. A truncated example of a cluster state response is shown below:
{
  "cluster_name": "elasticsearch",
  "version": 6,
  "state_uuid": "skxF0gCYTAGQAUU-ZW4_GQ",
  "master_node": "VyKDGurkQiygV-of4B1ZAQ",
  "nodes": {"VyKDGurkQiygV-of4B1ZAQ": {"name": "Siege", "transport_address": "127.0.0.1:9300"}},
  "metadata": {"templates": {...}, "indices": {...}},
  "routing_table": {...},
  "routing_nodes": {...}
}
To reduce the response size, filter it with the metrics path parameter (e.g., version,master_node,nodes,routing_table,metadata,blocks), or request the state held by the receiving node itself with local=true instead of fetching it from the master.
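As a sketch of both options (assuming a node reachable at localhost:9200):

```shell
# Fetch only selected top-level metrics of the cluster state
# instead of the full response.
curl -XGET "localhost:9200/_cluster/state/version,master_node,nodes,routing_table,metadata,blocks"

# Return the cluster state held locally by the receiving node,
# rather than routing the request to the elected master.
curl -XGET "localhost:9200/_cluster/state?local=true"
```

Both forms are supported by the standard Cluster State API; only the host address is an assumption here.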
2. Elasticsearch Cluster Topology – Nodes can assume different roles:
Master-eligible nodes: manage cluster-level operations such as creating indices and tracking cluster membership. Example configuration:
node.master: true
node.data: false
node.ingest: false
Data nodes: store shard data and handle CRUD, search, and aggregation requests. Example configuration:
node.master: false
node.data: true
node.ingest: false
Ingest nodes: run ingest pipelines that pre-process documents before indexing. Example configuration:
node.master: false
node.data: false
node.ingest: true
Coordinating (client) nodes: route requests to the right shards and perform the reduce phase of search results. Example configuration:
node.master: false
node.data: false
node.ingest: false
Separating these roles improves stability and allows fine-grained resource allocation. (Note: the node.master/node.data/node.ingest flags shown here apply to Elasticsearch 6.x and early 7.x; from 7.9 onward they are replaced by the single node.roles setting.)
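To confirm which roles each node actually took after a restart, the cat nodes API can be queried (the host address is an assumption):

```shell
# node.role prints one letter per role (m = master-eligible, d = data,
# i = ingest); a coordinating-only node shows "-". The master column
# marks the currently elected master with "*".
curl -XGET "localhost:9200/_cat/nodes?v&h=name,node.role,master"
```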
3. Disabling Memory Swapping – Swapping degrades performance and can cause long GC pauses. Elasticsearch provides a setting to lock the JVM heap in memory: bootstrap.mlockall (before 5.x) or bootstrap.memory_lock (5.x and later). Add the following to config/elasticsearch.yml:
bootstrap.mlockall: true
In newer versions use:
bootstrap.memory_lock: true
After restarting the node, verify the setting:
curl -XGET localhost:9200/_nodes?filter_path=**.mlockall
Expected response:
{"nodes":{"VyKDGurkQiygV-of4B1ZAQ":{"process":{"mlockall":true}}}}
If the response shows false, the lock failed, usually because the operating system refused the request due to insufficient permissions. Resolve this by granting the Elasticsearch user an unlimited memlock limit (e.g., ulimit -l unlimited before startup, MAX_LOCKED_MEMORY=unlimited for SysV init scripts, or LimitMEMLOCK=infinity for systemd-managed installs).
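As a sketch of the permission fix (unit names, user names, and file paths vary by distribution and install method):

```shell
# For SysV-style installs: raise the memlock limit for the
# elasticsearch user by appending to /etc/security/limits.conf:
#   elasticsearch soft memlock unlimited
#   elasticsearch hard memlock unlimited

# For systemd-managed installs: add an override to the service unit.
sudo systemctl edit elasticsearch
# In the editor, add:
#   [Service]
#   LimitMEMLOCK=infinity
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch
```

After the restart, re-run the _nodes check above to confirm mlockall reports true.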
Additional ways to avoid swapping:
Set the JVM heap to a fixed size with ES_HEAP_SIZE or matching -Xms and -Xmx values, e.g., export ES_HEAP_SIZE=10g. A common rule of thumb is to use no more than half of physical RAM, leaving the rest for the filesystem cache.
Temporarily disable swap with sudo swapoff -a or permanently comment out swap entries in /etc/fstab.
Reduce kernel swap tendency by setting vm.swappiness=1 via sysctl.
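The swappiness change can be checked and applied as follows (the sysctl.d file name below is an illustrative choice):

```shell
# Inspect the current swap tendency (0-100; lower means the kernel
# avoids swapping application memory out).
cat /proc/sys/vm/swappiness

# Apply the recommended value at runtime (requires root):
#   sysctl -w vm.swappiness=1
# Persist it across reboots by dropping a file under /etc/sysctl.d:
#   echo 'vm.swappiness=1' > /etc/sysctl.d/90-elasticsearch.conf
```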
Following these guidelines helps maintain a healthy Elasticsearch cluster with optimal performance.
vivo Internet Technology
Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.
