ClickHouse Cluster Expansion Using internal_replication with ReplicatedMergeTree and MergeTree Engines
This article explains how to expand a ClickHouse cluster by adding new replica nodes, detailing the use of the internal_replication parameter, configuration steps for ReplicatedMergeTree and MergeTree engines, and verification of data synchronization across replicas.
Expansion Approach
internal_replication parameter description
Optional. Determines whether data is written to just one of the replicas. Default: false (data is written to all replicas). When set to true, each write operation selects a single healthy replica. For Distributed tables whose local tables use ReplicatedMergeTree, this offloads data copying to the ReplicatedMergeTree engine, reducing load on the Distributed table.
If false (default), writes go to all replicas, causing the Distributed table to copy data itself, which may lead to inconsistency over time.
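In a cluster definition, this parameter sits inside each shard block of remote_servers. A minimal fragment for illustration (cluster and host names here are placeholders, not from the test environment):

```xml
<remote_servers>
    <my_cluster>
        <shard>
            <!-- true: let ReplicatedMergeTree copy data; false: Distributed copies it -->
            <internal_replication>true</internal_replication>
            <replica><host>host1</host><port>9000</port></replica>
            <replica><host>host2</host><port>9000</port></replica>
        </shard>
    </my_cluster>
</remote_servers>
```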
1. ReplicatedMergeTree ENGINE
ReplicatedMergeTree provides built‑in synchronization, so internal_replication should be true and Zookeeper handles replica coordination. Expansion steps:
Add the full cluster configuration of the expanded cluster to the new replica node.
Modify configuration files of existing replica nodes to include the new replica (hot‑update, no downtime).
Start the new replica and create corresponding local and distributed tables.
The ReplicatedMergeTree engine, coordinated through Zookeeper, automatically syncs historical data from existing replicas to the new node.
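The table creation in step 3 can be sketched as follows. The database and table names are illustrative; the {shard} and {replica} macros let the same DDL run unchanged on every node (the concrete DDL for the test case appears later in the article):

```sql
-- Local replicated table; macros expand per node from each replica's config.
CREATE TABLE db.events_local
(
    id UInt32,
    ts DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/events_local', '{replica}')
ORDER BY (id, ts);

-- Distributed table routing reads and writes across the cluster.
CREATE TABLE db.events_all AS db.events_local
ENGINE = Distributed(test_action, db, events_local, rand());
```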
2. MergeTree ENGINE
MergeTree lacks built‑in replica sync, so internal_replication must be false and the Distributed table copies data. New replicas do not receive historical data automatically; historical data is synced via backup/restore.
Add full cluster configuration to the new replica.
Update existing replicas' config to include the new node.
Start the new node and create local and distributed tables.
Export historical data from an existing replica and import it into the new node.
Case Validation
Environment
OS: CentOS 7.5 (4C4G); ClickHouse 21.8.4.51; Zookeeper 3.7.0. Three nodes (node1, node2, node3) with ports 9000 for ClickHouse and 2181 for Zookeeper.
ReplicatedMergeTree ENGINE (single shard, double replica)
1. Cluster Information
metrika.xml defines a cluster named test_action with internal_replication=true and two replicas (node1, node2).
<yandex>
<zookeeper-servers>
<node index="1">
<host>node1</host>
<port>2181</port>
</node>
</zookeeper-servers>
<remote_servers>
<test_action>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>node1</host>
<port>9000</port>
</replica>
<replica>
<host>node2</host>
<port>9000</port>
</replica>
</shard>
</test_action>
</remote_servers>
<macros>
<cluster>test_action</cluster>
<shard>1</shard>
<replica>node1</replica>
</macros>
</yandex>
Macros differ per replica to identify them uniquely.
2. Adding a Replica
Created metrika.xml on node3 with an additional replica entry, updated node1 and node2 configs, and restarted node3.
<replica>
<host>node3</host>
<port>9000</port>
</replica>
Cluster information now shows three replicas.
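A standard way to confirm the new topology (not shown in the original steps) is to query the system.clusters table on any node; it should list three rows for the test_action cluster:

```sql
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'test_action';
```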
3. Synchronization
Created a ReplicatedMergeTree table on node3 and a Distributed table pointing to it.
CREATE TABLE table_test (
    label_id UInt32,
    label_name String,
    insert_time Date
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/test_action/1/table_test', 'node3', insert_time, (label_id, insert_time), 8192);
CREATE TABLE table_test_all AS table_test ENGINE = Distributed(test_action, default, table_test, rand());
Data inserted via the Distributed table replicated to all three nodes, confirming successful expansion.
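To verify replication end to end, one can insert through the Distributed table and then count rows on each node's local table (a query sketch; the row values are illustrative):

```sql
-- Run on any node: the write goes through the Distributed table.
INSERT INTO table_test_all VALUES (1, 'label_a', today());

-- Run on node1, node2, and node3: each local table
-- should report the same count once replication completes.
SELECT count() FROM table_test;
```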
MergeTree ENGINE (single shard, double replica)
Configuration is identical except internal_replication=false. After adding node3, local tables were created and historical data was exported from node1 as TSV and imported into node3 to achieve data consistency.
# Export from node1
clickhouse-client --query="select * from t_cluster where id < 5" > /var/lib/clickhouse/backup/t_cluster.tsv
# Copy the file to node3 (e.g. with scp), then import it there
cat /tmp/t_cluster.tsv | clickhouse-client --query="insert into t_cluster FORMAT TSV"
New inserts via the Distributed table synchronized across all replicas, while historical data required a manual import.
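After the import, row counts on the two replicas can be compared in a single query from node3, using the standard remote() table function (a verification sketch; the two counts should match):

```sql
SELECT
    (SELECT count() FROM t_cluster) AS node3_rows,
    (SELECT count() FROM remote('node1:9000', default, t_cluster)) AS node1_rows;
```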
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise-grade MySQL open-source tools and services, releases a premium open-source component each year on 1024 (Programmer's Day), and continuously operates and maintains them.