
How ClickHouse Turns MySQL Bottlenecks into Sub‑Second OLAP Queries

This article introduces ClickHouse, compares column‑store and row‑store databases, shows how migrating a 50‑million‑row MySQL table to ClickHouse reduced query time from minutes to under one second, and shares practical installation, migration, performance testing, and synchronization tips.

21CTO

What is ClickHouse?

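ClickHouse is an open-source column-store database, originally developed at Yandex, designed for real-time analytical queries. The project reports analytical queries running 100 to 1000× faster than on traditional row-oriented systems, with a single server scanning on the order of hundreds of millions to billions of rows per second for suitable queries.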

Fundamental Concepts

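OLTP (online transaction processing) workloads are transactional and are typically served by row-oriented stores; OLAP (online analytical processing) workloads are analytical and suit column-oriented stores. A columnar store reads only the columns a query needs, which reduces I/O, whereas a row store must read entire rows even when only a few fields are used.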
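
To make the distinction concrete, here is a minimal sketch of the two query shapes. The orders table and its columns are hypothetical and are not taken from the article.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE orders
(
    order_id    UInt64,
    customer_id UInt32,
    amount      Decimal(12, 2),
    created_at  DateTime
)
ENGINE = MergeTree
ORDER BY (created_at, order_id);

-- OLTP-style access: fetch one complete row by key.
-- A row store serves this well because the whole row is stored together.
SELECT * FROM orders WHERE order_id = 1001;

-- OLAP-style access: aggregate two columns over many rows.
-- A column store only needs to read created_at and amount from disk.
SELECT
    toStartOfMonth(created_at) AS month,
    sum(amount)                AS revenue
FROM orders
GROUP BY month
ORDER BY month;
```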

Business Problem

A join query against a MySQL table with 50 million rows took more than three minutes. Indexing, sharding, and logical optimization brought only limited improvement, so ClickHouse was introduced, which cut query time to under one second, a speedup of roughly 200×.

ClickHouse Practice

1. Installing ClickHouse on macOS

Installation can be done via Docker or by compiling from source.

2. Migrating Data from MySQL to ClickHouse

ClickHouse's SQL dialect is close to MySQL's for common queries, and there are five ways to migrate data: the MySQL table engine (engine mapping), INSERT ... SELECT, CREATE TABLE ... AS SELECT, CSV import, and StreamSets. The article uses the CREATE TABLE ... AS SELECT approach:

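CREATE TABLE [IF NOT EXISTS] db.table_name ENGINE = MergeTree ORDER BY id AS SELECT * FROM mysql('host:port', 'database', 'table', 'user', 'password')

The arguments to mysql() are, in order, the MySQL host and port, database, table, user, and password. MergeTree requires a sorting key; ORDER BY id assumes the source table has an id column.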
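
Two of the other methods listed above, the MySQL table engine and INSERT ... SELECT, can be sketched as follows. The host, credentials, database, table, and column names are placeholders rather than values from the article, and the column types are assumptions that must be adapted to the real schema.

```sql
-- Engine mapping: a ClickHouse table that proxies reads to MySQL.
-- No data is copied; each query is forwarded to the MySQL server.
CREATE TABLE analytics.mysql_posts
(
    id         UInt32,
    title      String,
    created_at DateTime
)
ENGINE = MySQL('mysql-host:3306', 'blog', 'posts', 'reader', 'secret');

-- INSERT ... SELECT: create the target table first, then copy rows into it.
CREATE TABLE IF NOT EXISTS analytics.posts
(
    id         UInt32,
    title      String,
    created_at DateTime
)
ENGINE = MergeTree
ORDER BY id;

-- The WHERE clause limits the copy to a slice of the source table,
-- which is handy for loading a large table in batches.
INSERT INTO analytics.posts
SELECT id, title, created_at
FROM mysql('mysql-host:3306', 'blog', 'posts', 'reader', 'secret')
WHERE created_at >= '2024-01-01 00:00:00';
```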

3. Performance Test Comparison

For a 50 million‑row dataset (≈10 GB in MySQL, 600 MB in ClickHouse):

MySQL query time: 205 seconds

ClickHouse query time: under 1 second
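
The storage figures above work out to roughly a 17× reduction (10 GB down to 600 MB). The MySQL figure may include indexes and row overhead, so it is not a pure compression ratio. One way to check the ClickHouse side is to query the system.parts table; the database and table names below are placeholders.

```sql
-- Placeholder names: database 'analytics', table 'posts'.
SELECT
    sum(rows)                                          AS total_rows,
    formatReadableSize(sum(data_compressed_bytes))     AS compressed,
    formatReadableSize(sum(data_uncompressed_bytes))   AS uncompressed,
    round(sum(data_uncompressed_bytes)
          / sum(data_compressed_bytes), 2)             AS compression_ratio
FROM system.parts
WHERE active                      -- ignore parts already replaced by merges
  AND database = 'analytics'
  AND table = 'posts';
```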

4. Data Synchronization Strategies

1) Temporary table: load full MySQL data into a ClickHouse temporary table, then replace the target table. Suitable for moderate data volumes with frequent incremental changes.

2) synch: an open‑source tool that reads MySQL binlog, converts statements to tasks, and streams them via a message queue.
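
The first strategy might look like the sketch below. It uses an ordinary staging table rather than a session-scoped CREATE TEMPORARY TABLE, and every name (analytics.posts, the MySQL host, and the credentials) is a placeholder. The article does not show its exact statements, so treat this as one possible implementation.

```sql
-- Clean up anything left over from a previous run.
DROP TABLE IF EXISTS analytics.posts_tmp;
DROP TABLE IF EXISTS analytics.posts_old;

-- 1. Load the full MySQL table into a staging table.
CREATE TABLE analytics.posts_tmp
ENGINE = MergeTree
ORDER BY id                       -- assumes the source table has an id column
AS SELECT *
FROM mysql('mysql-host:3306', 'blog', 'posts', 'reader', 'secret');

-- 2. Swap the staging table in for the live table.
--    On databases using the Atomic engine, EXCHANGE TABLES is an alternative.
RENAME TABLE analytics.posts     TO analytics.posts_old,
             analytics.posts_tmp TO analytics.posts;

-- 3. Drop the previous copy once the swap has succeeded.
DROP TABLE analytics.posts_old;
```

Because each run copies the whole table, this approach trades load time for simplicity, which is why it fits moderate data volumes.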

5. Why ClickHouse Is Fast

Only the required columns are read, reducing I/O.

Same‑type column storage enables high compression (up to 10×).

ClickHouse also chooses algorithms suited to the storage layout and the data being processed, which accelerates queries further.
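
The effect of reading only the required columns can be observed in system.query_log, which records how many bytes each query read. This is a rough sketch: the table analytics.posts and the column author_id are hypothetical, and it assumes query logging is enabled on the server.

```sql
-- Narrow query: touches a single column.
SELECT count(), uniq(author_id) FROM analytics.posts;

-- Wide query: forces every column to be read; FORMAT Null discards the output.
SELECT * FROM analytics.posts FORMAT Null;

-- Make sure the log entries are written before inspecting them.
SYSTEM FLUSH LOGS;

-- Compare bytes read and duration for the two queries above.
SELECT
    query,
    read_rows,
    formatReadableSize(read_bytes) AS bytes_read,
    query_duration_ms
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_date = today()
  AND query LIKE '%analytics.posts%'
  AND query NOT LIKE '%system.query_log%'
ORDER BY event_time DESC
LIMIT 5;
```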

Pitfalls Encountered

1. Data Type Differences Between ClickHouse and MySQL

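ClickHouse is stricter than MySQL about types in join conditions. A type-mismatch error on a join can be resolved by casting both keys to the same type, such as an unsigned integer, e.g., LEFT JOIN B b ON toUInt32(h.id) = toUInt32(b.post_id), where h is the alias of the left-hand table.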
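
A fuller version of that fix, with hypothetical tables whose key columns were imported with different integer types:

```sql
-- Hypothetical tables: the key is Int64 on one side and UInt32 on the other.
CREATE TABLE posts
(
    id    Int64,
    title String
)
ENGINE = MergeTree
ORDER BY id;

CREATE TABLE comments
(
    post_id UInt32,
    body    String
)
ENGINE = MergeTree
ORDER BY post_id;

-- Cast both join keys to the same type explicitly.
-- Note: toUInt32 wraps values outside the UInt32 range instead of failing,
-- so this is only safe when all ids are known to fit in 0 .. 4294967295.
SELECT p.id, p.title, c.body
FROM posts AS p
LEFT JOIN comments AS c
    ON toUInt32(p.id) = toUInt32(c.post_id);
```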

2. Asynchronous Deletes/Updates

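In the MergeTree engine, deletes and updates are executed asynchronously as background mutations, so queries may still return the old data for a while; the engine guarantees only eventual consistency. For strict consistency, perform a full data synchronization.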
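
A short sketch of what this looks like in practice, using a hypothetical analytics.posts table with id and title columns:

```sql
-- Deletes are issued as mutations and return before the data is rewritten.
ALTER TABLE analytics.posts DELETE WHERE id = 42;

-- Check which mutations are still pending for the table.
SELECT mutation_id, command, is_done, parts_to_do, latest_fail_reason
FROM system.mutations
WHERE database = 'analytics'
  AND table = 'posts'
  AND NOT is_done;

-- Optionally make this session wait for each mutation to finish.
SET mutations_sync = 1;
ALTER TABLE analytics.posts UPDATE title = 'archived' WHERE id = 43;
```

Waiting on mutations makes individual statements slower, so it is best kept for occasional corrections, not for high-frequency updates.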

Conclusion

ClickHouse eliminated the MySQL query bottleneck. The article reports sub-second responses for datasets of up to about two billion rows, with clustering available for larger volumes.

