How to Seamlessly Migrate Elasticsearch from Cloud to On‑Premises Without Downtime

This article walks through a practical, step‑by‑step migration of an Elasticsearch cluster from a public‑cloud environment to a self‑hosted data‑center, covering strategy, configuration changes, node role separation, manual data transfer, and post‑migration re‑enabling of automatic balancing to ensure a smooth, low‑impact transition.

dbaplus Community

Jun 6, 2020

Preface

Elasticsearch automatically balances shard load when nodes join or leave a cluster. The author, an Elastic‑Stack power user and ES‑certified engineer, shares a real‑world migration from a public‑cloud Elasticsearch cluster to a self‑built data‑center while keeping services available.

Background

The existing big-data platform runs on a public cloud; Elasticsearch serves most external queries and some real-time compute. Business requirements demand moving the cluster to an on-premises environment without degrading user experience. The migration scope covers:

Custom API services that are tightly coupled with the Elasticsearch cluster.

The Elasticsearch cluster itself, including data and nodes.

Migration Strategy

The migration emphasizes stability over speed and follows these high‑level actions:

Disable automatic shard rebalancing.

Start new on‑premises nodes and join them to the existing cluster.

Manually move data to the new nodes.

Switch external traffic to the new nodes.

Shut down the public‑cloud nodes.

Re‑enable automatic balancing.

Migration Steps

All steps must be performed in strict order to avoid cluster turbulence.

1. Original Cluster Architecture

The original cluster separates master-eligible nodes from data nodes to avoid performance bottlenecks.

# Master node settings
node.master: true
node.data: false

# Data node settings
node.master: false
node.data: true

2. Configure New Cluster

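The new nodes also separate master and data roles. The unicast host list names both the old and the new master nodes, so the new nodes join the existing cluster instead of forming a separate one.

# Master discovery
discovery.zen.ping.unicast.hosts: ["old_master_ip:port", "new_master_ip:port"]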

When several instances share one physical server, cap the CPU resources of each instance:

# Processors per instance: at most CPU_cores / instance_count
# (example value for 32 cores shared by 4 instances)
processors: 8
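The per-instance cap is plain integer arithmetic. The sketch below is only an illustration: the function name `processors_per_instance` and the sample core counts are assumptions, not values from the original cluster.

```python
def processors_per_instance(cpu_cores: int, instance_count: int) -> int:
    """Upper bound for the `processors` setting of each instance on one host."""
    if cpu_cores < 1 or instance_count < 1:
        raise ValueError("cpu_cores and instance_count must be positive")
    # Floor division keeps the sum over all instances within the physical cores.
    # If there are more instances than cores, fall back to 1 per instance.
    return max(1, cpu_cores // instance_count)


print(processors_per_instance(32, 4))  # 8
```

Rounding down rather than up means the instances never claim more processors in total than the host has, except in the oversubscribed case noted in the comment.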

3. Disable Cluster Auto‑Balancing

# Disable shard allocation (valid values: all, primaries, new_primaries, none)
cluster.routing.allocation.enable: none
# Disable shard rebalancing (valid values: all, primaries, replicas, none)
cluster.routing.rebalance.enable: none
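Both keys are dynamic cluster settings, so one way to apply them is a `PUT /_cluster/settings` request. The sketch below only builds the request body. The helper name `balancing_settings` is hypothetical, and the `transient` scope is an assumption, since the source does not say which scope was used. `none` and `all` are among the values these keys accept.

```python
import json


def balancing_settings(enabled: bool, persistent: bool = False) -> dict:
    """Build a body for PUT /_cluster/settings that toggles allocation and rebalancing."""
    value = "all" if enabled else "none"
    # Transient settings are lost on a full cluster restart; persistent ones survive it.
    scope = "persistent" if persistent else "transient"
    return {
        scope: {
            "cluster.routing.allocation.enable": value,
            "cluster.routing.rebalance.enable": value,
        }
    }


# Body to send before the new nodes are started.
print(json.dumps(balancing_settings(enabled=False), indent=2))
```

The same helper with `enabled=True` produces the body for the final re-enabling step.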

4. Start New Data Nodes

With balancing disabled, all new data nodes can be started safely, because no shards move onto them automatically. Assign custom attributes to each node so that later allocation rules can target them.

# Node attributes
node.attr.rack: rack1
node.attr.zone: zone1
node.attr.disk: ssd1
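Custom attributes become useful once an index-level allocation filter refers to them. Elasticsearch supports `index.routing.allocation.include.<attr>`, `.exclude.<attr>` and `.require.<attr>`. The sketch below builds such a settings body; the helper name `attribute_filter` is hypothetical, and whether the original migration filtered on these attributes is not stated in the source.

```python
def attribute_filter(kind: str, attrs: dict) -> dict:
    """Build index settings that filter shard allocation on custom node attributes."""
    if kind not in ("include", "exclude", "require"):
        raise ValueError("kind must be include, exclude or require")
    # One setting per attribute, e.g. index.routing.allocation.require.zone
    return {
        f"index.routing.allocation.{kind}.{name}": value
        for name, value in attrs.items()
    }


# Body for PUT /<index>/_settings that pins an index to nodes tagged zone1.
print(attribute_filter("require", {"zone": "zone1"}))
```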

5. Switch Cluster Access

Three consumer groups need updating:

Hadoop‑ES connector: recreate Hive‑ES mapping tables with new IPs.

Custom API services: point proxy to new data nodes.

Real‑time compute (Kafka): start consumers on new nodes, then stop old ones.

6. Manual Data Transfer

Reasons for manual transfer:

Limited cross‑segment bandwidth.

Large indices (hundreds of GB) would overload I/O.

Co‑existence of old and new nodes would cause excessive network traffic.

Prioritise small, offline, or low‑query‑frequency indices and control parallelism to stay within bandwidth limits.
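The ordering and parallelism rule can be sketched as a small planner that walks the indices from smallest to largest and groups them into batches. Everything here is illustrative: `plan_batches`, the size limit, the parallelism limit, and the sample index names and sizes are assumptions, not figures from the original migration.

```python
def plan_batches(index_sizes_gb: dict, max_batch_gb: float, max_parallel: int) -> list:
    """Group indices smallest-first into batches bounded by total size and count."""
    batches, current, current_gb = [], [], 0.0
    ordered = sorted(index_sizes_gb.items(), key=lambda kv: (kv[1], kv[0]))
    for name, size in ordered:
        over_size = bool(current) and current_gb + size > max_batch_gb
        over_count = len(current) >= max_parallel
        if over_size or over_count:
            # Close the current batch before it exceeds either limit.
            batches.append(current)
            current, current_gb = [], 0.0
        current.append(name)
        current_gb += size
    if current:
        batches.append(current)
    return batches


sizes = {"logs-a": 5, "logs-b": 40, "orders": 300, "users": 12}
for batch in plan_batches(sizes, max_batch_gb=60, max_parallel=2):
    print(batch)
```

An index larger than the size limit ends up in a batch of its own, so the largest indices are moved last and one at a time.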

# Move an index: allow its shards only on the new nodes
"index.routing.allocation.include._ip": "new_node_ip1,new_node_ip2"

# Hold back an index that has not been migrated yet: keep its shards on the old nodes
"index.routing.allocation.include._ip": "old_node_ip1,old_node_ip2"

7. Shut Down Old Data Nodes

Gradually power off old data nodes after confirming data migration and traffic switch.
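Before a node is powered off, it helps to confirm that no shard copy still lives on it. The sketch below assumes the rows come from `GET /_cat/shards?format=json`, which returns one object per shard copy with fields such as `index`, `shard`, `prirep` and `ip`. The function name `shards_left_on` and the sample addresses are hypothetical.

```python
def shards_left_on(old_node_ips: set, cat_shards: list) -> list:
    """Return (index, shard, prirep) for every shard copy still on an old node."""
    return [
        (row["index"], row["shard"], row["prirep"])
        for row in cat_shards
        if row.get("ip") in old_node_ips
    ]


# Hypothetical sample of _cat/shards output, reduced to the fields used here.
rows = [
    {"index": "orders", "shard": "0", "prirep": "p", "ip": "10.0.0.5"},
    {"index": "orders", "shard": "0", "prirep": "r", "ip": "192.168.1.10"},
    {"index": "users", "shard": "1", "prirep": "p", "ip": "192.168.1.11"},
]
remaining = shards_left_on({"10.0.0.5"}, rows)
print(remaining or "old nodes are empty")
```

An empty result for a node's address is the signal that the node holds no data and can be stopped.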

8. Start New Master Nodes

Activate new master‑eligible nodes one by one, retiring old standby masters to avoid split‑brain scenarios.
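The `discovery.zen.*` settings used earlier suggest a cluster on Zen discovery (Elasticsearch 6.x or older), where split-brain protection depends on `discovery.zen.minimum_master_nodes` being a majority of the master-eligible nodes. That the original cluster relied on this setting is an assumption. The sketch below shows how the majority changes as masters are added and retired; the function name is hypothetical.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Majority quorum for Zen discovery: (master-eligible nodes / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("at least one master-eligible node is required")
    return master_eligible // 2 + 1


# Quorum while the number of master-eligible nodes changes during the swap.
for count in (3, 4, 5, 6):
    print(count, minimum_master_nodes(count))
```

Changing the node count by one at a time keeps the required quorum within reach of the nodes that are actually running.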

9. Re‑enable Auto‑Balancing

# Re-enable allocation and rebalancing
cluster.routing.allocation.enable: all
cluster.routing.rebalance.enable: all
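After balancing is switched back on, shards may relocate for a while. One way to tell when the cluster has settled is to inspect the response of `GET /_cluster/health`, which reports `status`, `relocating_shards`, `initializing_shards` and `unassigned_shards`. The check below is a sketch with a hypothetical function name; how the response is fetched is left out.

```python
def cluster_is_settled(health: dict) -> bool:
    """True when the cluster is green and no shard is moving or waiting."""
    counters = ("relocating_shards", "initializing_shards", "unassigned_shards")
    # A missing counter is treated as "not settled" rather than assumed to be zero.
    return health.get("status") == "green" and all(
        health.get(key) == 0 for key in counters
    )


sample = {
    "status": "green",
    "relocating_shards": 2,
    "initializing_shards": 0,
    "unassigned_shards": 0,
}
print(cluster_is_settled(sample))  # False
```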

Key Elasticsearch Concepts Used

Cluster elasticity: Nodes can join or leave without service interruption.

Master election: Only one master-eligible node acts as the elected master at a time; the others stand by.

Node roles: Master, Data, Ingest, Coordinating, Voting-only, Machine Learning.

Routing: Requests are forwarded to the node that holds the target shard.

Shards and replicas: Allow data to be migrated in small units, which limits the I/O load of each move.

Conclusion

After migration, the on-premises cluster delivered significantly higher throughput, parallel writes that were several times faster, and less interference between queries and writes. The experience shows that while Elastic's documentation is thorough, practical migration requires hands-on experimentation, a deep understanding of shard allocation, and careful sequencing of operations.
