How to Diagnose and Fix Elasticsearch Throttling Allocation Issues

This guide explains how to use the Elasticsearch GET /_cluster/allocation/explain API to identify throttling deciders, interpret the underlying allocation limits, and adjust persistent or transient cluster routing settings—such as node_concurrent_recoveries and indices.recovery.max_bytes_per_sec—to resolve shard allocation bottlenecks.

Practical DevOps Architecture

Use the GET /_cluster/allocation/explain API to view the current shard allocation details. In the response, each entry under node_allocation_decisions carries a "deciders" array. A "throttling" decider with decision "THROTTLE" indicates that a node has reached one of its recovery concurrency limits, for example the limit on outgoing shard recoveries defined by cluster.routing.allocation.node_concurrent_outgoing_recoveries (default 2).
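As a rough illustration of reading that response, the Python sketch below walks the node_allocation_decisions list of an already-parsed explain response and collects every decider that returned THROTTLE. The function name find_throttle_deciders and the sample payload are assumptions made for this example. The sample is hand-written and trimmed, not captured from a real cluster.

```python
from typing import Any


def find_throttle_deciders(explain: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Return (node_name, decider, explanation) for every THROTTLE decision."""
    hits = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "THROTTLE":
                hits.append((
                    node.get("node_name", "?"),
                    decider.get("decider", "?"),
                    decider.get("explanation", ""),
                ))
    return hits


# Hand-written, trimmed sample for illustration only (not real cluster output).
SAMPLE_EXPLAIN = {
    "index": "my-index",
    "shard": 0,
    "primary": False,
    "node_allocation_decisions": [
        {
            "node_name": "node-1",
            "node_decision": "throttled",
            "deciders": [
                {
                    "decider": "throttling",
                    "decision": "THROTTLE",
                    "explanation": "reached the limit of outgoing shard recoveries",
                }
            ],
        },
        {
            "node_name": "node-2",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "same_shard",
                    "decision": "NO",
                    "explanation": "a copy of this shard is already allocated to this node",
                }
            ],
        },
    ],
}

for node_name, decider, explanation in find_throttle_deciders(SAMPLE_EXPLAIN):
    print(f"{node_name}: {decider} -> {explanation}")
```

Only the throttled node is reported; nodes rejected by other deciders are ignored because they point to a different cause than recovery throttling.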

If the throttling decider returns THROTTLE, the node's recovery concurrency limit has usually been hit. When cluster resource utilization (CPU, disk I/O, network) is low, you can raise the recovery concurrency settings to speed up shard allocation; when utilization is high, consider lowering them so that recoveries do not compete with indexing and search traffic.

Key allocation settings include:

cluster.routing.allocation.node_initial_primaries_recoveries: number of concurrent initial primary shard recoveries per node (default 4).

cluster.routing.allocation.cluster_concurrent_rebalance: number of concurrent shard rebalances across the cluster (default 2).

cluster.routing.allocation.node_concurrent_recoveries: shortcut that sets both the incoming and outgoing recovery limits per node (default 2).

cluster.routing.allocation.node_concurrent_incoming_recoveries: concurrent incoming recoveries per node (default 2).

cluster.routing.allocation.node_concurrent_outgoing_recoveries: concurrent outgoing recoveries per node (default 2).

indices.recovery.max_bytes_per_sec: bandwidth limit for recovery traffic per node (default 40mb).
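Before changing any of these settings it helps to know their current values. The Python sketch below assumes you have already fetched and parsed the response of GET /_cluster/settings?include_defaults=true&flat_settings=true, and looks each key up in the transient, persistent, and defaults sections in that order. The helper name effective_setting and the sample response are assumptions for illustration. The sketch does a plain per-key lookup and does not model how one setting can supply the default for another.

```python
from typing import Any, Optional

RECOVERY_SETTINGS = [
    "cluster.routing.allocation.node_initial_primaries_recoveries",
    "cluster.routing.allocation.cluster_concurrent_rebalance",
    "cluster.routing.allocation.node_concurrent_incoming_recoveries",
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries",
    "indices.recovery.max_bytes_per_sec",
]


def effective_setting(settings: dict[str, Any], key: str) -> Optional[str]:
    """Look the key up in transient, then persistent, then defaults."""
    for scope in ("transient", "persistent", "defaults"):
        value = settings.get(scope, {}).get(key)
        if value is not None:
            return value
    return None


# Hand-written sample response for illustration only.
SAMPLE_SETTINGS = {
    "persistent": {
        "cluster.routing.allocation.node_concurrent_outgoing_recoveries": "4",
    },
    "transient": {
        "indices.recovery.max_bytes_per_sec": "60mb",
    },
    "defaults": {
        "cluster.routing.allocation.node_initial_primaries_recoveries": "4",
        "cluster.routing.allocation.cluster_concurrent_rebalance": "2",
        "cluster.routing.allocation.node_concurrent_incoming_recoveries": "2",
        "cluster.routing.allocation.node_concurrent_outgoing_recoveries": "2",
        "indices.recovery.max_bytes_per_sec": "40mb",
    },
}

for key in RECOVERY_SETTINGS:
    print(f"{key} = {effective_setting(SAMPLE_SETTINGS, key)}")
```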

Solution

Adjust the relevant settings as needed. Raise the initial primary recovery count if appropriate, but keep the rebalance count modest so that rebalancing does not degrade read/write performance. As a rule of thumb, keep the concurrent recovery and allocation values at or below the number of CPU cores on a node.
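The core-count rule of thumb can be applied mechanically. The Python sketch below caps each requested concurrency value at a node's core count. The function name cap_to_cores is hypothetical, and the core count must be supplied by you for the Elasticsearch data nodes (the machine running the script is not necessarily one of them). The bandwidth setting is left out because it is not a concurrency count.

```python
def cap_to_cores(requested: dict[str, int], cores: int) -> dict[str, int]:
    """Cap every requested concurrency value at the node's CPU core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return {key: min(value, cores) for key, value in requested.items()}


# Requested values mirror the concurrency settings discussed in this article.
requested = {
    "cluster.routing.allocation.node_concurrent_recoveries": 8,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 8,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 2,
}

# Assumption for the example: each data node has 4 cores.
capped = cap_to_cores(requested, cores=4)
for key, value in capped.items():
    print(f"{key}: {value}")
```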

Persistent settings survive a full cluster restart, while transient settings are reset by one. Use the PUT _cluster/settings API to apply changes. Example:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.node_concurrent_recoveries": 8,
    "cluster.routing.allocation.node_concurrent_incoming_recoveries": 8,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 8,
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries": 8,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 8,
    "indices.recovery.max_bytes_per_sec": "60mb"
  },
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 8,
    "cluster.routing.allocation.node_concurrent_incoming_recoveries": 8,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 8,
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries": 8,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 8,
    "indices.recovery.max_bytes_per_sec": "60mb"
  }
}
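If you prefer to send the request from a script, a minimal Python sketch using only the standard library might look like the following. It builds the PUT request without sending it. The base URL http://localhost:9200 and the absence of authentication are assumptions, and the helper name build_settings_request is hypothetical. A secured cluster would also need credentials and TLS configuration.

```python
import json
import urllib.request


def build_settings_request(base_url: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT _cluster/settings request."""
    url = base_url.rstrip("/") + "/_cluster/settings"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


body = {
    "persistent": {
        "cluster.routing.allocation.node_concurrent_recoveries": 8,
        "indices.recovery.max_bytes_per_sec": "60mb",
    }
}

# Assumed endpoint: an unsecured local node on the default HTTP port.
request = build_settings_request("http://localhost:9200", body)
print(request.get_method(), request.full_url)

# To actually send it against a reachable cluster:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to inspect the body before applying it to a production cluster.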

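If you only need a temporary change, set only the transient block. Transient values take precedence over persistent ones until they are removed or the cluster undergoes a full restart.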
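A temporary change can also be undone without a restart, because assigning null to a setting in PUT _cluster/settings removes the override. The Python sketch below builds both request bodies: one that applies transient values and one that clears them. The helper names transient_body and reset_body are hypothetical, and the bodies are only printed here, not sent.

```python
import json


def transient_body(settings: dict) -> dict:
    """Wrap the given settings in a transient-only request body."""
    return {"transient": dict(settings)}


def reset_body(keys: list[str], scope: str = "transient") -> dict:
    """Build a body that clears the given keys (None serializes to JSON null)."""
    return {scope: {key: None for key in keys}}


temporary = {
    "cluster.routing.allocation.node_concurrent_recoveries": 8,
    "indices.recovery.max_bytes_per_sec": "60mb",
}

# Body to apply during a recovery window.
print(json.dumps(transient_body(temporary), indent=2))

# Body to send afterwards so the overrides are removed again.
print(json.dumps(reset_body(list(temporary)), indent=2))
```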
