
How to Migrate 100 Billion ClickHouse Rows to Doris: Three Practical Strategies

When a ClickHouse cluster needed to be decommissioned, the author evaluated three migration approaches—using Doris' ClickHouse catalog, exporting to files with Broker/Stream Load, and leveraging Spark—to move roughly 100 billion rows to Doris, comparing their complexity, reliability, and performance.

ITPUB

Background

A ClickHouse cluster was scheduled for shutdown after project completion, leaving about 100 billion rows of potentially reusable data that needed to be backed up and migrated to Doris (CK version 25.x, Doris version 2.1.2).

Migration Options

Based on experience and AI suggestions, three feasible methods were identified:

1. Use Doris' ClickHouse catalog and execute INSERT INTO … SELECT * to copy data directly.

2. Export each ClickHouse table to a file (CSV/TSV/JSON/Parquet) and load the files into Doris via Broker Load or Stream Load.

3. Read ClickHouse tables with Spark and write the results into Doris target tables.

Evaluation of Options

Option 1 is the simplest and most efficient but suffers from the most bugs; it works for local small tables but fails for partitioned or large tables. Option 2 requires two steps, is a bit more cumbersome, but has fewer bugs than option 1. Option 3 is the most robust and has been repeatedly verified, yet it demands additional coding and compute resources.

Testing Option 1

First, create a ClickHouse catalog in Doris:

CREATE CATALOG `jdbc_clickhouse01` PROPERTIES (
  "user" = "default",
  "type" = "jdbc",
  "password" = "***",
  "jdbc_url" = "jdbc:clickhouse://192.168.xxx.xxx:8123",
  "driver_url" = "clickhouse-jdbc-0.4.6-all.jar",
  "driver_class" = "com.clickhouse.jdbc.ClickHouseDriver"
);

Note: the ClickHouse JDBC jar must be placed in /usr/share/java on each Doris node if the servers cannot access the internet.
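Copying the jar to every node by hand is tedious on a larger cluster; a small loop helps. A minimal sketch, assuming hypothetical hostnames for the FE/BE nodes (the echo prefix makes it a dry run; remove it to actually copy):

```shell
# Copy the ClickHouse JDBC driver to every Doris node.
# The hostnames below are hypothetical placeholders for your FE/BE hosts.
JAR=clickhouse-jdbc-0.4.6-all.jar
NODES="doris-fe01 doris-be01 doris-be02"

for node in $NODES; do
  # 'echo' makes this a dry run; remove it to perform the copy.
  echo scp "$JAR" "$node:/usr/share/java/"
done
```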

After switching to the catalog, the tables become visible. An INSERT INTO … SELECT * works for a small local table (225 rows), but fails for larger or partitioned tables, producing errors such as IndexOutOfBoundsException.
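For reference, the Option 1 statement takes this shape; the database and table names here are hypothetical, and since Doris speaks the MySQL protocol, the statement can be sent through a standard mysql client (the invocation is left commented out):

```shell
# Render the Option 1 INSERT ... SELECT statement.
# 'doris_db.small_table' and 'ck_db.small_table' are hypothetical names.
SQL='INSERT INTO internal.doris_db.small_table
SELECT * FROM jdbc_clickhouse01.ck_db.small_table;'

echo "$SQL"
# To run it against a Doris frontend (9030 is the default MySQL-protocol port):
# mysql -h 192.168.xxx.xxx -P 9030 -u doris_user -p -e "$SQL"
```

Fully qualifying both sides (internal catalog for the target, the JDBC catalog for the source) avoids having to SWITCH catalogs between statements.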

Testing Option 2

Export a ClickHouse table to TSV (tab-separated values):

clickhouse-client --query="SELECT * FROM ck_db.ck_big_table FORMAT TSV" > /tmp/ck_big_table.tsv

Transfer the file to a Doris client node and load it via Stream Load with a tab separator:

curl --location-trusted -u doris_user:**** \
  -H "label:label01" \
  -H "column_separator:\t" \
  -T ck_big_table.tsv \
  http://192.168.xxx.xxx:8030/api/doris_db/ck_big_table/_stream_load

A successful load returns a JSON summary:

{
  "TxnId": 13429556,
  "Label": "label01",
  "Status": "Success",
  "NumberTotalRows": 1406520,
  "NumberLoadedRows": 1406520,
  "LoadTimeMs": 11456,
  "ReadDataTimeMs": 7769,
  "WriteDataTimeMs": 11165
}

CSV with commas was avoided because many fields contain commas, which would corrupt the import. JSON was not used because it requires an explicit schema definition on the command line.
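At 100-billion-row scale a single TSV file is impractical; splitting the export and issuing one Stream Load per chunk keeps each load small and retryable. A dry-run sketch (the sample data, chunk size, and credentials are placeholders; each load gets a unique label, which matters because Doris rejects duplicate labels and thus makes an accidental re-load of the same chunk a no-op):

```shell
# Split a TSV export into fixed-size chunks and print one Stream Load
# command per chunk. All names and credentials are placeholders.
TSV=/tmp/ck_big_table.tsv
printf 'a\t1\nb\t2\nc\t3\nd\t4\n' > "$TSV"   # tiny stand-in for the export

rm -f /tmp/ck_chunk_*
split -l 2 "$TSV" /tmp/ck_chunk_             # 2 lines/chunk for the demo;
                                             # use millions in practice
i=0
for chunk in /tmp/ck_chunk_*; do
  i=$((i + 1))
  # 'echo' makes this a dry run; remove it to issue the real request.
  echo curl --location-trusted -u 'doris_user:****' \
    -H "label:ck_big_table_$i" \
    -H 'column_separator:\t' \
    -T "$chunk" \
    http://192.168.xxx.xxx:8030/api/doris_db/ck_big_table/_stream_load
done
```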

[Screenshot: ClickHouse export formats]
[Screenshot: Doris import formats]

Performance

Exporting 1.4 million rows from ClickHouse took less than 2 seconds. Loading the same amount into Doris via Stream Load completed in under 20 seconds, demonstrating that both export and import are fast for moderate data volumes.
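Extrapolating the measured numbers to the full dataset shows why a single sequential pipeline is not enough. A back-of-envelope sketch, assuming the per-batch rate above holds across all 100 billion rows:

```shell
# Back-of-envelope: how long would 100 billion rows take at the
# measured Stream Load rate, if loaded one batch at a time?
TOTAL_ROWS=100000000000
BATCH_ROWS=1406520        # rows per load observed in the test above
BATCH_SECONDS=20          # upper bound observed for one Stream Load

# Ceiling division: number of batches needed.
batches=$(( (TOTAL_ROWS + BATCH_ROWS - 1) / BATCH_ROWS ))
days=$(( batches * BATCH_SECONDS / 86400 ))
echo "$batches batches, roughly $days days if loaded sequentially"
```

Roughly 71 thousand batches and over two weeks of sequential loading, which is why chunked exports should be loaded in parallel across several Doris BE nodes.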

Conclusion

There is no single “high‑efficiency, simple, elegant” solution for heterogeneous database migration. In practice, one must combine multiple strategies: create Doris tables that match ClickHouse schemas, use the ClickHouse catalog with Stream Load for straightforward tables, and fall back to Spark for complex cases.

Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.
