ClickHouse Cluster Expansion Using internal_replication with ReplicatedMergeTree and MergeTree Engines
This article explains how to expand a ClickHouse cluster by adding new replica nodes, detailing the use of the internal_replication parameter, configuration steps for ReplicatedMergeTree and MergeTree engines, and verification of data synchronization across replicas.
Expansion Approach
internal_replication parameter description
Optional. Determines whether data is written to just one of the replicas. Default: false (data is written to all replicas). When set to true, each write operation selects a single healthy replica. For Distributed tables whose local tables use ReplicatedMergeTree, this offloads data copying to the ReplicatedMergeTree engine, reducing load on the Distributed table.
If false (default), writes go to all replicas, causing the Distributed table to copy data itself, which may lead to inconsistency over time.
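In a cluster definition, this parameter sits inside each shard block of remote_servers. A minimal fragment for illustration (cluster and host names here are placeholders, not from the test environment):

```xml
<remote_servers>
    <my_cluster>
        <shard>
            <!-- true: let ReplicatedMergeTree copy data; false: Distributed copies it -->
            <internal_replication>true</internal_replication>
            <replica><host>host1</host><port>9000</port></replica>
            <replica><host>host2</host><port>9000</port></replica>
        </shard>
    </my_cluster>
</remote_servers>
```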
1. ReplicatedMergeTree ENGINE
ReplicatedMergeTree provides built‑in synchronization, so internal_replication should be true and Zookeeper handles replica coordination. Expansion steps:
Add the full cluster configuration of the expanded cluster to the new replica node.
Modify configuration files of existing replica nodes to include the new replica (hot‑update, no downtime).
Start the new replica and create corresponding local and distributed tables.
The ReplicatedMergeTree engine, coordinated through Zookeeper, automatically syncs historical data from existing replicas to the new node.
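The table creation in step 3 can be sketched as follows. The database and table names are illustrative; the {shard} and {replica} macros let the same DDL run unchanged on every node (the concrete DDL for the test case appears later in the article):

```sql
-- Local replicated table; macros expand per node from each replica's config.
CREATE TABLE db.events_local
(
    id UInt32,
    ts DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/events_local', '{replica}')
ORDER BY (id, ts);

-- Distributed table routing reads and writes across the cluster.
CREATE TABLE db.events_all AS db.events_local
ENGINE = Distributed(test_action, db, events_local, rand());
```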
2. MergeTree ENGINE
MergeTree lacks built‑in replica sync, so internal_replication must be false and the Distributed table copies data. New replicas do not receive historical data automatically; historical data is synced via backup/restore.
Add full cluster configuration to the new replica.
Update existing replicas' config to include the new node.
Start the new node and create local and distributed tables.
Export historical data from an existing replica and import it into the new node.
Case Validation
Environment
OS: CentOS 7.5 (4C4G); ClickHouse 21.8.4.51; Zookeeper 3.7.0. Three nodes (node1, node2, node3) with ports 9000 for ClickHouse and 2181 for Zookeeper.
ReplicatedMergeTree ENGINE (single shard, double replica)
1. Cluster Information
metrika.xml defines a cluster named test_action with internal_replication=true and two replicas (node1, node2).
<yandex>
<zookeeper-servers>
<node index="1">
<host>node1</host>
<port>2181</port>
</node>
</zookeeper-servers>
<remote_servers>
<test_action>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>node1</host>
<port>9000</port>
</replica>
<replica>
<host>node2</host>
<port>9000</port>
</replica>
</shard>
</test_action>
</remote_servers>
<macros>
<cluster>test_action</cluster>
<shard>1</shard>
<replica>node1</replica>
</macros>
</yandex>
Macros differ per replica to identify them uniquely.
2. Adding a Replica
Created metrika.xml on node3 with an additional replica entry, updated node1 and node2 configs, and restarted node3.
<replica>
<host>node3</host>
<port>9000</port>
</replica>
Cluster information now shows three replicas.
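A standard way to confirm the new topology (not shown in the original steps) is to query the system.clusters table on any node; it should list three rows for the test_action cluster:

```sql
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'test_action';
```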
3. Synchronization
Created a ReplicatedMergeTree table on node3 and a Distributed table pointing to it.
CREATE TABLE table_test (
    label_id UInt32,
    label_name String,
    insert_time Date
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/test_action/1/table_test', 'node3', insert_time, (label_id, insert_time), 8192);
CREATE TABLE table_test_all AS table_test ENGINE = Distributed(test_action, default, table_test, rand());
Data inserted via the Distributed table replicated to all three nodes, confirming successful expansion.
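To verify replication end to end, one can insert through the Distributed table and then count rows on each node's local table (a query sketch; the row values are illustrative):

```sql
-- Run on any node: the write goes through the Distributed table.
INSERT INTO table_test_all VALUES (1, 'label_a', today());

-- Run on node1, node2, and node3: each local table
-- should report the same count once replication completes.
SELECT count() FROM table_test;
```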
MergeTree ENGINE (single shard, double replica)
Configuration is identical except internal_replication=false. After adding node3, local tables were created and historical data was exported from node1 as TSV and imported into node3 to achieve data consistency.
# Export from node1
clickhouse-client --query="select * from t_cluster where id < 5" > /var/lib/clickhouse/backup/t_cluster.tsv
# Copy the file to node3 (e.g. with scp), then import it there
cat /tmp/t_cluster.tsv | clickhouse-client --query="insert into t_cluster FORMAT TSV"
New inserts via the Distributed table synchronized across all replicas, while historical data required a manual import.
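After the import, row counts on the two replicas can be compared in a single query from node3, using the standard remote() table function (a verification sketch; the two counts should match):

```sql
SELECT
    (SELECT count() FROM t_cluster) AS node3_rows,
    (SELECT count() FROM remote('node1:9000', default, t_cluster)) AS node1_rows;
```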
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise-grade MySQL open-source tools and services, releases a premium open-source component each year on 1024 (Programmer's Day), and continuously operates and maintains them.