Databases 6 min read

ClickHouse Overview, MySQL Migration, Performance Benchmark, and Practical Tips

This article introduces ClickHouse as an OLAP columnar database, explains the differences between row‑ and column‑oriented storage, details a real‑world migration from MySQL using a CREATE TABLE AS SELECT approach, presents performance comparisons, discusses synchronization methods, and shares common pitfalls and solutions.

Architecture Digest
Architecture Digest
Architecture Digest
ClickHouse Overview, MySQL Migration, Performance Benchmark, and Practical Tips

1. What is ClickHouse?

ClickHouse is a column‑oriented DBMS designed for online analytical processing (OLAP).

Key concepts:

OLTP – traditional relational databases focusing on transaction consistency (e.g., banking, e‑commerce).

OLAP – warehouse‑type databases for complex data analysis and decision support.

Row‑oriented databases (MySQL, Postgres, MS SQL Server) store data row by row, while ClickHouse stores data column by column, reducing I/O and enabling high compression.

2. Business Problem

A MySQL table with 50 million rows required over 3 minutes for a join query. After indexing and sharding, performance was still unsatisfactory, prompting a migration to ClickHouse.

After migration, query time dropped to under 1 second, a 200× speed‑up.

3. ClickHouse Practice

3.1 Installing ClickHouse on macOS

Installation via Docker is recommended; alternatively, compile from source.

3.2 Data Migration: MySQL → ClickHouse

ClickHouse supports most MySQL syntax, offering five migration strategies. The article uses the "CREATE TABLE AS SELECT" method:

CREATE TABLE [IF NOT EXISTS] db.table_name ENGINE = MergeTree AS SELECT *
FROM mysql('host:port','db','database','user','password')

3.3 Performance Test Comparison

Type

Data Volume

Table Size

Query Speed

MySQL

50 M

10 GB

205 s

ClickHouse

50 M

600 MB

<1 s

3.4 Data Synchronization Solutions

Temporary Table Method

Create a temporary table in ClickHouse, bulk‑load MySQL data, then replace the original table. Suitable for moderate data volumes with frequent incremental changes.

Synch Tool

Open‑source tool synch captures MySQL binlog statements and forwards them via a message queue for eventual consistency.

3.5 Why ClickHouse Is Fast

Only reads columns required for computation, avoiding full‑row I/O.

Same‑type column storage enables up to ten‑fold compression, further reducing I/O.

Custom storage‑aware search algorithms optimize data access patterns.

4. Pitfalls Encountered

4.1 Data Type Differences Between ClickHouse and MySQL

Queries may fail due to mismatched types; casting to a common unsigned type (e.g., toUInt32 ) resolves the issue.

4.2 Asynchronous Delete/Update Operations

Even MergeTree guarantees only eventual consistency. For strict consistency, perform full‑table synchronization.

5. Conclusion

Using ClickHouse solved the MySQL query bottleneck: queries on sub‑billion‑row datasets return within 1 second, and the system scales to clusters for larger workloads.

References:

ClickHouse official documentation: https://clickhouse.tech/docs/zh/

ClickHouse case study at Ctrip Hotels

Choosing ClickHouse engines

PerformanceClickHouseOLAPdatabase migrationColumnar Database
Architecture Digest
Written by

Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.