How to Recover a Failing Elasticsearch Cluster: Master Loss, Shard Corruption, and More

This guide explains Elasticsearch cluster architecture, node roles, and metadata storage, then details step‑by‑step recovery procedures for master‑node loss, complete master outage, data‑node failures, shard allocation problems, corrupted shards, translog issues, and missing segment files, including relevant API commands and tool usage.

dbaplus Community

Mar 5, 2024

1. Introduction

Elasticsearch is a distributed search engine that provides high availability under normal conditions but can become unusable in special failure scenarios such as loss of a majority of master‑eligible nodes, corrupted index shards, or data‑node failures that prevent nodes from rejoining the cluster.

2. Basic Knowledge

2.1 Classic Cluster Architecture

A typical production cluster consists of dedicated master‑eligible nodes, coordinating (gateway) nodes, and data nodes. Master‑eligible nodes store cluster‑level metadata, and their count should be odd (commonly three) so that the cluster keeps a majority quorum when one of them fails.

Figure: Elasticsearch classic cluster architecture
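
The quorum arithmetic behind the odd‑number recommendation can be sketched in a few lines. This is only an illustration of majority voting; the function names are made up here and are not part of any Elasticsearch API.

```python
def quorum(master_eligible: int) -> int:
    """Smallest number of master-eligible nodes that forms a majority."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can fail while a majority remains."""
    return master_eligible - quorum(master_eligible)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} nodes: quorum={quorum(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Three and four master‑eligible nodes both tolerate a single failure, which is why an even count adds cost without adding resilience.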

2.2 Node Roles

Master‑eligible node: Can be elected master, stores cluster metadata, and participates in master elections. Use an odd number of these nodes.

Coordinating (gateway) node: Routes client requests and can run ingest pipelines for preprocessing.

Data node: Stores shard data and executes CRUD, search, and aggregation operations. Data nodes can be hot, warm, or cold based on attributes.

Additional roles such as remote cluster client, machine learning (ml), and transform are assigned through explicit node configuration.

2.3 Cluster Metadata Storage

Metadata files are persisted under the following paths:

nodes/0/_state/ – cluster‑level metadata.

nodes/0/indices/{index_uuid}/_state/ – index‑level metadata.

nodes/0/indices/{index_uuid}/0/_state/ – shard‑level metadata.
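
When inspecting a node by hand, it can help to compute these directories rather than type them. The sketch below assumes the layout listed above and treats the last numeric path segment as the shard number; the data path and index UUID in the usage lines are made‑up placeholders.

```python
from pathlib import Path


def metadata_dirs(data_path: str, index_uuid: str, shard_id: int = 0) -> dict:
    """Return the _state directories for cluster, index, and shard metadata."""
    node_dir = Path(data_path) / "nodes" / "0"
    index_dir = node_dir / "indices" / index_uuid
    return {
        "cluster": node_dir / "_state",
        "index": index_dir / "_state",
        "shard": index_dir / str(shard_id) / "_state",
    }


if __name__ == "__main__":
    # Both the data path and the UUID below are hypothetical.
    dirs = metadata_dirs("/var/lib/elasticsearch", "example-index-uuid")
    for level, path in dirs.items():
        print(f"{level}: {path} (exists: {path.exists()})")
```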

Since Elasticsearch 7.6.0, cluster metadata has been stored in a Lucene index under _state/ rather than in individual state files.

3. Disaster Scenarios and Solutions

3.1 Master‑Node Loss (majority lost)

Pre‑7.0.0:

Lower the quorum by setting discovery.zen.minimum_master_nodes (e.g., to 1) in elasticsearch.yml and restart the surviving nodes.

Re‑add the missing master‑eligible nodes after they are rebuilt.

Restore the original configuration and restart nodes, starting with the master‑eligible nodes.

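7.0.0 and later:

If a majority of master‑eligible nodes survives and a master can still be elected, exclude the lost nodes from the voting configuration:

POST _cluster/voting_config_exclusions?node_names={node_names}

If a majority of master‑eligible nodes is lost, stop one surviving master‑eligible node and run the unsafe‑bootstrap tool on it to form a new cluster. Recent cluster‑state changes may be lost:

bin/elasticsearch-node unsafe-bootstrap

Detach the surviving data nodes from the old cluster so that they can join the new one. Run the tool on each node while it is stopped:

bin/elasticsearch-node detach-cluster

Define the new master nodes and seed hosts in elasticsearch.yml: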

cluster.initial_master_nodes:
  - {master-0}
  - {new-master-1}
  - {new-master-2}
discovery.seed_hosts:
  - {master-ip-0}
  - {new-master-ip-1}
  - {new-master-ip-2}
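
The choice between the two 7.x paths can be sketched as a small planning function. It is a rough aid under stated assumptions, not an official procedure: it assumes the voting configuration consisted of exactly the surviving and lost nodes passed in, and it picks the survivor with the highest (term, version) pair, which is the value the node tool reports for a node's last cluster state. The function name and the step wording are illustrative.

```python
from urllib.parse import urlencode


def plan_master_recovery(survivors: dict, lost_names: list) -> list:
    """Return ordered recovery actions for a 7.x cluster.

    survivors maps each surviving master-eligible node name to the
    (term, version) pair of its last cluster state; lost_names lists the
    master-eligible nodes that will not come back.
    """
    if not survivors:
        raise ValueError("no surviving master-eligible node; see Section 3.2")
    total = len(survivors) + len(lost_names)
    majority = total // 2 + 1
    if len(survivors) >= majority:
        # A master can still be elected, so the exclusions API is sufficient.
        query = urlencode({"node_names": ",".join(lost_names)}, safe=",")
        return [f"POST /_cluster/voting_config_exclusions?{query}"]
    # Majority lost: prefer the survivor with the highest (term, version).
    best = max(survivors, key=lambda name: survivors[name])
    return [
        f"stop {best}, then run: bin/elasticsearch-node unsafe-bootstrap",
        "stop every other surviving node, then run: "
        "bin/elasticsearch-node detach-cluster",
        "set discovery.seed_hosts and restart the nodes",
    ]


if __name__ == "__main__":
    # Hypothetical node names and cluster-state pairs.
    for step in plan_master_recovery({"master-0": (4, 12)}, ["master-1", "master-2"]):
        print(step)
```

Comparing tuples orders candidates by term first and by version second, so a node with a newer term wins even if its version number is lower.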

3.2 All Master Nodes Lost

Pre‑7.0.0: Disable X‑Pack security, start new master‑eligible nodes, and adjust discovery.zen.minimum_master_nodes and discovery.zen.ping.unicast.hosts accordingly.

7.0.0 and later: Rebuild a new cluster from the surviving data nodes using the unsafe‑bootstrap and detach‑cluster tools, then import any dangling indices with the dangling indices API (see References).
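
Importing dangling indices is a list‑then‑import loop, sketched below. The endpoints follow the dangling indices API documented for 7.9 (GET /_dangling and POST /_dangling/{index_uuid}?accept_data_loss=true); the response shape used here is an assumption based on that documentation, and the helper name and sample values are hypothetical.

```python
def dangling_import_requests(list_response: dict) -> list:
    """Turn a GET /_dangling response into one import request per index."""
    requests = []
    for entry in list_response.get("dangling_indices", []):
        uuid = entry["index_uuid"]
        # accept_data_loss=true is required: the imported copy may be stale.
        requests.append(("POST", f"/_dangling/{uuid}?accept_data_loss=true"))
    return requests


if __name__ == "__main__":
    # Made-up response for illustration only.
    sample = {"dangling_indices": [{"index_name": "logs", "index_uuid": "uuid-1"}]}
    for method, path in dangling_import_requests(sample):
        print(method, path)
```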

3.3 Data‑Node Failure

If a data node cannot rejoin, copy its nodes/0/ data directory to a fresh node and add the new node to the cluster using the same steps as for master‑node loss.

3.4 Shard Allocation Issues

Diagnose allocation problems with the allocation‑explain API:

POST _cluster/allocation/explain

If the shards are healthy, retry the failed allocations:

POST _cluster/reroute?retry_failed

Manual allocation examples:

Allocate a replica:

POST /_cluster/reroute
{
  "commands": [
    {"allocate_replica": {"index": "{indexName}", "shard": {shardId}, "node": "{node}"}}
  ]
}

Allocate a stale primary (accept data loss):

POST /_cluster/reroute
{
  "commands": [
    {"allocate_stale_primary": {"index": "{indexName}", "shard": {shardId}, "node": "{node}", "accept_data_loss": true}}
  ]
}

Allocate an empty primary (data loss):

POST /_cluster/reroute
{
  "commands": [
    {"allocate_empty_primary": {"index": "{indexName}", "shard": {shardId}, "node": "{node}", "accept_data_loss": true}}
  ]
}
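
The three reroute bodies above differ only in the command name and the accept_data_loss flag, so they can be generated from one helper. This is a minimal sketch: the function names are illustrative, and the rule in choose_command (replica first, stale primary if a copy is still on disk, empty primary as a last resort) is a simplification of the judgment an operator would make from the allocation‑explain output.

```python
import json

COMMANDS = {"allocate_replica", "allocate_stale_primary", "allocate_empty_primary"}


def choose_command(is_primary: bool, stale_copy_on_disk: bool) -> str:
    """Pick the least destructive reroute command for an unassigned shard."""
    if not is_primary:
        return "allocate_replica"
    if stale_copy_on_disk:
        return "allocate_stale_primary"   # may lose recent writes
    return "allocate_empty_primary"       # loses all data in the shard


def reroute_body(command: str, index: str, shard: int, node: str) -> dict:
    """Build the JSON body for POST /_cluster/reroute."""
    if command not in COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    args = {"index": index, "shard": shard, "node": node}
    if command != "allocate_replica":
        # Both primary commands require explicit acknowledgement of data loss.
        args["accept_data_loss"] = True
    return {"commands": [{command: args}]}


if __name__ == "__main__":
    # Index and node names are placeholders.
    cmd = choose_command(is_primary=True, stale_copy_on_disk=True)
    print(json.dumps(reroute_body(cmd, "my-index", 0, "data-node-1"), indent=2))
```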

3.5 Corrupted Shards

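Use the elasticsearch-shard tool (available since Elasticsearch 6.5) to remove corrupted data. Stop the node that holds the shard before running the tool, and note that the corrupted portion of the data is discarded: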

bin/elasticsearch-shard remove-corrupted-data --index {indexName} --shard-id {shardId}

For translog corruption, add the --truncate-clean-translog flag:

bin/elasticsearch-shard remove-corrupted-data --index {indexName} --shard-id {shardId} --truncate-clean-translog

After the tool finishes, restart the node and, if the shard remains unassigned, reroute it with allocate_stale_primary as shown in Section 3.4.
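
For scripted runbooks, the two command lines above can be assembled from one helper. The sketch only builds the argument list and does not execute anything; the function name is hypothetical, and the relative tool path assumes the command is run from the Elasticsearch home directory.

```python
import shlex


def shard_tool_argv(index: str, shard_id: int,
                    truncate_clean_translog: bool = False) -> list:
    """Build the elasticsearch-shard command line for one shard."""
    argv = [
        "bin/elasticsearch-shard", "remove-corrupted-data",
        "--index", index,
        "--shard-id", str(shard_id),
    ]
    if truncate_clean_translog:
        # Only needed when the translog itself is corrupted.
        argv.append("--truncate-clean-translog")
    return argv


if __name__ == "__main__":
    # "my-index" is a placeholder index name.
    print(shlex.join(shard_tool_argv("my-index", 0, truncate_clean_translog=True)))
```

Keeping the arguments as a list avoids shell quoting problems if the list is later passed to subprocess.run with the working directory set to the Elasticsearch home.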

3.6 Missing segments_N Files

This is the hardest scenario: no official tool exists yet, and recovery requires case‑by‑case investigation.

References

Elasticsearch cluster startup flow – [1]

Dangling indices API – https://elastic.co/guide/en/elasticsearch/reference/7.9/dangling-indices-list.html

Node tool documentation – https://elastic.co/guide/en/elasticsearch/reference/7.10/node-tool.html
