Understanding and Improving Elasticsearch Shard Balancing Strategies
This article analyzes Elasticsearch shard imbalance incidents, explains the built‑in shard balancing algorithm and its configuration parameters, demonstrates weight calculations with source code, and proposes practical improvements—including shard count adjustments and a custom load‑aware balancing tool—to achieve more effective cluster load distribution.
From the Incident
During a morning peak, an Elasticsearch cluster raised many query‑timeout alerts, and monitoring showed that only node 123 had a large backlog of queries, with its I/O usage far exceeding other nodes.
Further investigation revealed that high‑load index shards (high QPS, large data, complex queries) were concentrated on node 123, causing severe load imbalance and occasional failures.
ES Shard Balancing Strategy
Concept
Elasticsearch shard balancing distributes index shards evenly across the nodes of a cluster to achieve load balancing and high availability. An index is split into multiple shards, and each shard can be stored and served on a different node. The goal of balancing is to keep the number of shards per node roughly equal, so that no node is overloaded while others sit nearly idle.
Balancing Policy
The core weight calculation uses indexBalance and shardBalance factors. The weight for a node is computed as:
WeightFunction(float indexBalance, float shardBalance) {
    float sum = indexBalance + shardBalance;
    if (sum <= 0.0f) {
        throw new IllegalArgumentException("Balance factors must sum to a value > 0 but was: " + sum);
    }
    theta0 = shardBalance / sum;
    theta1 = indexBalance / sum;
    this.indexBalance = indexBalance;
    this.shardBalance = shardBalance;
}

float weight(Balancer balancer, ModelNode node, String index) {
    final float weightShard = node.numShards() - balancer.avgShardsPerNode();
    final float weightIndex = node.numShards(index) - balancer.avgShardsPerNode(index);
    return theta0 * weightShard + theta1 * weightIndex;
}

The resulting formula is:
weight(node, index) = shard weight factor × (node's total shard count − cluster-wide average) + index weight factor × (node's shard count for the index − cluster-wide average for the index)
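To make the formula concrete, here is a small self-contained sketch (not the real BalancedShardsAllocator class) that applies the default factors 0.45/0.55 to two hypothetical nodes; the allocator prefers the node with the lowest weight:

```java
// Standalone sketch of the weight function above; the node and shard
// counts in main() are invented for illustration.
public class WeightSketch {
    public static float weight(float shardBalance, float indexBalance,
                               int nodeShards, float avgShardsPerNode,
                               int nodeIndexShards, float avgIndexShardsPerNode) {
        float sum = indexBalance + shardBalance;
        float theta0 = shardBalance / sum;  // factor for total shards on the node
        float theta1 = indexBalance / sum;  // factor for this index's shards on the node
        return theta0 * (nodeShards - avgShardsPerNode)
                + theta1 * (nodeIndexShards - avgIndexShardsPerNode);
    }

    public static void main(String[] args) {
        // 3-node cluster, 9 shards total (avg 3); index "logs" has 3 shards (avg 1).
        // Node A: 4 shards, 2 of them from "logs". Node B: 2 shards, none from "logs".
        float wA = weight(0.45f, 0.55f, 4, 3f, 2, 1f); // 0.45*1 + 0.55*1 = 1.0
        float wB = weight(0.45f, 0.55f, 2, 3f, 0, 1f); // 0.45*-1 + 0.55*-1 = -1.0
        System.out.println("node A: " + wA + ", node B: " + wB);
        // Node B has the lower weight, so a new "logs" shard would land on B.
    }
}
```

Because both terms are deviations from a cluster average, a perfectly balanced node scores 0; overloaded nodes score positive and underloaded nodes negative.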
Related Configuration
cluster.routing.allocation.balance.shard: 0.45f (shard weight factor)
cluster.routing.allocation.balance.index: 0.55f (index weight factor)
cluster.routing.allocation.balance.threshold: 1.0f (rebalance trigger threshold)
cluster.routing.allocation.total_shards_per_node: -1 (no limit)
cluster.routing.allocation.cluster_concurrent_rebalance: 2 (max concurrent shard moves)
cluster.routing.allocation.node_concurrent_recoveries: 2 (concurrent incoming and outgoing shard recoveries per node)
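These settings can be changed at runtime through the cluster settings API; the request body below simply restates the defaults as an illustration:

```json
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.balance.shard": 0.45,
    "cluster.routing.allocation.balance.index": 0.55,
    "cluster.routing.allocation.balance.threshold": 1.0,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 2
  }
}
```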
Rebalance Enable Setting
public Decision canRebalance(ShardRouting shardRouting, RoutingAllocation allocation) {
    if (allocation.ignoreDisable()) {
        return allocation.decision(Decision.YES, NAME, "allocation is explicitly ignoring any disabling of rebalancing");
    }
    Settings indexSettings = allocation.metadata().getIndexSafe(shardRouting.index()).getSettings();
    final Rebalance enable;
    final boolean usedIndexSetting;
    if (INDEX_ROUTING_REBALANCE_ENABLE_SETTING.exists(indexSettings)) {
        enable = INDEX_ROUTING_REBALANCE_ENABLE_SETTING.get(indexSettings);
        usedIndexSetting = true;
    } else {
        enable = this.enableRebalance;
        usedIndexSetting = false;
    }
    switch (enable) {
        case ALL:
            return allocation.decision(Decision.YES, NAME, "all rebalancing is allowed");
        case NONE:
            return allocation.decision(Decision.NO, NAME, "no rebalancing is allowed due to %s", setting(enable, usedIndexSetting));
        case PRIMARIES:
            if (shardRouting.primary()) {
                return allocation.decision(Decision.YES, NAME, "primary rebalancing is allowed");
            } else {
                return allocation.decision(Decision.NO, NAME, "replica rebalancing is forbidden due to %s", setting(enable, usedIndexSetting));
            }
        case REPLICAS:
            if (!shardRouting.primary()) {
                return allocation.decision(Decision.YES, NAME, "replica rebalancing is allowed");
            } else {
                return allocation.decision(Decision.NO, NAME, "primary rebalancing is forbidden due to %s", setting(enable, usedIndexSetting));
            }
        default:
            throw new IllegalStateException("Unknown rebalance option");
    }
}
Disk Watermark Settings
cluster.routing.allocation.disk.threshold_enabled: true
cluster.routing.allocation.disk.watermark.low: "85%" (no new shards are allocated to nodes above this level)
cluster.routing.allocation.disk.watermark.high: "90%" (shards are relocated away from nodes above this level)
public Decision canAllocate(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation) {
    ...
    // low watermark: stop allocating new shards to this node
    if (freeDiskPercentage < diskThresholdSettings.getFreeDiskThresholdLow()) {
        return allocation.decision(Decision.NO, NAME, "the node is above the low watermark ...");
    }
    ...
}

// The high-watermark check lives in canRemain(), which forces shards off an over-full node
public Decision canRemain(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation) {
    ...
    if (freeBytes < diskThresholdSettings.getFreeBytesThresholdHigh().getBytes()) {
        return allocation.decision(Decision.NO, NAME, "the shard cannot remain on this node because it is above the high watermark ...");
    }
    ...
}
Trigger Conditions
Even when the weight formula is satisfied, rebalance occurs only if the rebalance enable setting permits it (e.g., cluster.routing.rebalance.enable: all ) and the node’s disk usage is below the configured watermarks.
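For example, both preconditions can be set explicitly via the cluster settings API (the values shown here are the defaults):

```json
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.rebalance.enable": "all",
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}
```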
How to Improve
Adjust Shard Count
Ensure the total number of primary shards is a multiple of the number of data nodes, so the built-in balancer can distribute shards evenly. For example, with 6 data nodes, an index with 12 primaries and 1 replica places exactly 4 shards (2 primaries and 2 replicas) on each node.
Custom Balancing Tool
Design a tool that balances based on actual load metrics (CPU, I/O, network, RAM) rather than shard count alone. Define a load index for each node:
E_i = o × IO_i + d × MBPS_i + c × CPU_i + r × RAM_i

where IO_i, MBPS_i, CPU_i, and RAM_i are node i's I/O utilization, network throughput, CPU usage, and memory usage, and o, d, c, and r are tunable weighting coefficients. When one node's load index exceeds another's by more than 50% and its own load is above 50%, migrate a high-load shard from the hot node to the cooler one, preferably during a low-traffic window.
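A minimal sketch of such a tool's decision step, assuming metrics normalized to [0, 1] and example coefficients o = 0.4, d = 0.2, c = 0.3, r = 0.1 (the article does not fix these values, and "exceeds by more than 50%" is read here as an absolute difference):

```java
// Hypothetical sketch of the load index E_i = o*IO + d*MBPS + c*CPU + r*RAM.
// Coefficient values and the threshold interpretation are assumptions.
public class LoadBalancerSketch {
    static final double O = 0.4, D = 0.2, C = 0.3, R = 0.1; // example weights, sum to 1

    public static double loadIndex(double io, double mbps, double cpu, double ram) {
        // All inputs are assumed to be utilization ratios in [0, 1].
        return O * io + D * mbps + C * cpu + R * ram;
    }

    /** Migrate only if the hot node is above 50% load and leads the cold node by more than 50%. */
    public static boolean shouldMigrate(double hotLoad, double coldLoad) {
        return hotLoad > 0.5 && (hotLoad - coldLoad) > 0.5;
    }

    public static void main(String[] args) {
        double hot = loadIndex(0.9, 0.8, 0.95, 0.7);  // a saturated node -> 0.875
        double cold = loadIndex(0.1, 0.1, 0.2, 0.2);  // a mostly idle node -> 0.14
        System.out.println(shouldMigrate(hot, cold)); // prints "true"
    }
}
```

In a real tool the metrics would come from node stats sampling, and the migration itself could be issued through the cluster reroute API during a low-traffic window.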
Future Outlook
Intelligent Balancing Strategies
Future tools may incorporate advanced algorithms to evaluate node load more accurately and decide shard movements.
Real‑time Monitoring and Feedback
More frequent monitoring, flexible alerts, and rapid response mechanisms will enable quicker detection of imbalance.
Dynamic Migration Windows
Automatically select low‑traffic periods for shard relocation to minimize impact on production workloads.
Customizable Policies
Allow users to tune thresholds, specify per‑index rules, and enable or disable automatic moves for critical indices.
Conclusion
Elasticsearch’s default shard balancer only equalizes shard counts, ignoring data size, hotness, and hardware differences, which can lead to performance degradation. Effective load balancing must consider actual resource usage, prioritize hot indices, and minimize migration overhead.
References
Elasticsearch source code.
ZCY Technology
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.