Why ClickHouse Is the Ideal Choice for Massive Data Storage and Real‑Time Analytics
This article examines the massive‑scale data requirements of an activity‑tracking platform, compares MySQL, Elasticsearch and HBase, and explains why ClickHouse—with its columnar storage, MergeTree engine, vectorized execution, and distributed architecture—offers the best combination of storage capacity, write performance, real‑time analysis, and query speed for billions of records.
Background
The Magic Flute activity platform needs to record user behavior data for customer service, operations, product, and R&D, requiring storage and analysis of tens to hundreds of billions of rows, with real‑time write performance and analytical capabilities.
Store massive data
High write performance
Support real‑time analytics
Good query performance
Technology Options
2.1 MySQL Single Table
MySQL is single‑node; a single table can hold from a few hundred thousand to a few hundred million rows, far below the required tens of billions, so it is unsuitable.
2.2 MySQL Sharding
Sharding increases capacity and write throughput but adds resource consumption, complex OLAP queries, and latency due to distributed aggregation; it still struggles with tens of billions of rows.
2.3 Elasticsearch
Distributed search engine with inverted index, capable of storing large data and real‑time analysis, but suffers from write throughput bottlenecks, high latency, and higher storage cost.
2.4 HBase
Column‑family store built on HDFS, good write performance, but not a real‑time analytics engine, has limited query performance and high read latency for analytical workloads.
2.5 ClickHouse
ClickHouse is a high‑speed, scalable, low‑cost columnar DBMS supporting OLAP, real‑time analytics, and massive data volumes. It uses SQL, supports many data formats, and offers excellent compression and partitioning.
ClickHouse Detailed Introduction
3.1 Origin
Developed by Yandex in 2016, written in C++, originally for Yandex.Metrica click‑stream analytics.
3.2 Architecture
Multi‑Master architecture with equal‑role nodes; supports both single‑master and multi‑master modes, horizontal scaling via segment nodes, and automatic sharding.
3.3 Features
3.3.1 Columnar Storage
Compared with row storage, columnar storage stores each column’s values contiguously, reducing I/O for queries that need only a subset of columns and enabling better compression.
3.3.2 Complete DBMS Functions
DDL/DML support
Fine‑grained permission control
Backup and restore
Distributed cluster management
3.3.3 Data Compression
Default LZ4 compression achieves up to 8:1 ratio; compression works on blocks of 8192 rows, reducing CPU overhead.
压缩前:ABCDE_BCD 压缩后:ABCDE_(5,3)3.3.4 Vectorized Execution Engine
Eliminates loops by using SIMD instructions; example:
for (size_t i = 0; i < 100; ++i) c[i] = a[i] + b[i]; c[0] = a[0] + b[0];
c[1] = a[1] + b[1];
...3.3.5 Sharding and Distributed Queries
Data is horizontally sharded across nodes; local tables hold data shards, while Distributed tables act as proxies to query all shards in parallel.
3.3.6 Multithreading
Combines vectorized execution (data‑level parallelism) with multithreaded processing (core‑level parallelism) for high throughput.
3.3.7 Relational Model and Standard SQL
Supports standard SQL (SELECT, GROUP BY, JOIN, etc.) and many storage engines; MergeTree family provides primary key, partitioning, and sampling.
3.5 Table Engine – MergeTree
MergeTree supports primary‑key sorting, partitioning, replication, and sampling. Example creation:
CREATE TABLE IF NOT EXISTS db.table (
col1 UInt64,
col2 String
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (col1, date);Key parameters: PARTITION BY, ORDER BY, optional PRIMARY KEY, SAMPLE BY, and SETTINGS such as index_granularity.
3.6 Indexes
3.6.1 Primary (Sparse) Index
One index entry per index_granularity rows (default 8192), stored in primary.idx, enabling fast range scans.
3.6.2 Secondary (Skip) Indexes
Optional skip indexes (minmax, set, ngrambf_v1, tokenbf_v1) reduce scan ranges further.
3.6.3 Index Granularity
Controls both primary index and data block size; smaller granularity gives finer‑grained pruning at the cost of more index overhead.
3.7 Query and Write Paths
Queries use partition, primary, and secondary indexes plus mark files to minimize data read. Writes generate new partition directories, primary index entries, and compressed data blocks; background merges compact data.
ClickHouse Application Scenarios
Data warehouses and BI reporting
Real‑time analytics and monitoring
Time‑series storage
Data visualization dashboards
Pros and Cons
Advantages
High‑speed columnar storage with strong compression
Vectorized execution and multithreading
Distributed architecture with automatic sharding
Rich SQL support and mature DBMS features
Drawbacks
No full ACID transactions
No built‑in full‑text tokenization
Limited low‑latency row‑level updates/deletes
Joins are not a strength
Not optimized for very high QPS workloads (recommended ~100 QPS)
Why Queries Are Fast
ClickHouse minimizes disk I/O through columnar storage, pre‑sorting, compression, vectorized functions, and avoiding joins; it performs GROUP BY in memory using hash tables and leverages multithreading and distributed execution.
Why Writes Are Fast
Writes are appended sequentially in large blocks (LSM‑tree‑like), compressed, and merged in the background, achieving 50‑200 MB/s (≈500 k‑2 M rows/s) on commodity hardware.
-end-
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
