Unlocking ClickHouse’s Lightning‑Fast Queries: The ‘Nine Swords’ Architecture Explained
This article explores ClickHouse’s high‑performance OLAP design—including its MPP architecture, columnar storage, vectorized execution, pre‑sorting, sharding, replication, index strategies, and compute engine—showing how each innovation contributes to ultra‑fast, scalable data analysis in the big‑data era.
Introduction
In the era of big data, data volumes are growing explosively, which makes efficient processing and analysis critical. ClickHouse, an open‑source distributed OLAP system created by Yandex, delivers real‑time analytical capabilities for massive datasets.
Overall Architecture
ClickHouse separates storage and query processing layers. The storage layer holds, loads, and maintains tables, while the query layer executes user queries. Unlike many engines that treat storage as a service, ClickHouse’s native storage enables many query‑time optimizations.
It adopts an MPP (massively parallel processing) architecture where each node is a peer that can serve queries independently, allowing distributed query execution across multiple servers.
Data is pre‑sorted by primary key, stored column‑wise, and processed with a vectorized engine, all of which accelerate query performance.
Columnar Storage
ClickHouse stores each column in a separate file, allowing queries to read only the columns they need, dramatically reducing I/O.
Columnar layout also improves compression; similar values within a column compress efficiently, often achieving an 8:1 compression ratio.
During query execution, only the required column blocks are decompressed, further speeding up reads.
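As a rough illustration of the two points above, the following sketch compares a row-oriented and a column-oriented serialization of the same toy table. The table, its column names, and the helper `bytes_read_for` are assumptions made for this example and do not reflect ClickHouse's actual file format.

```python
import zlib

# Hypothetical toy table: (user_id, country, amount) rows.
rows = [(i, "US" if i % 2 == 0 else "DE", i % 7) for i in range(10_000)]

# Row-oriented layout: all fields of a row are stored together.
row_bytes = "\n".join(f"{u},{c},{a}" for u, c, a in rows).encode()

# Column-oriented layout: each column is serialized on its own.
columns = {
    "user_id": "\n".join(str(u) for u, _, _ in rows).encode(),
    "country": "\n".join(c for _, c, _ in rows).encode(),
    "amount": "\n".join(str(a) for _, _, a in rows).encode(),
}


def bytes_read_for(column_names):
    """Bytes a columnar reader must touch for the given columns."""
    return sum(len(columns[name]) for name in column_names)


# A query such as SELECT sum(amount) only needs one column.
needed = bytes_read_for(["amount"])

# Values within one column are similar, so they compress well.
compressed_country = zlib.compress(columns["country"])
```

Here `needed` is much smaller than `len(row_bytes)`, and `compressed_country` is far smaller than the raw `country` column. The exact ratios depend entirely on the data and codec.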
Vectorization
Vectorized execution processes data in batches (e.g., 1024 elements) instead of one row at a time, so that a single SIMD instruction can operate on multiple values at once.
This batch processing improves cache locality and reduces CPU cache misses, delivering significant speed gains for analytical workloads.
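The difference in control flow can be sketched as follows. This is only an illustration of the batch shape: pure Python cannot issue SIMD instructions directly, and the function names are invented for this example. The batch size of 1024 is taken from the text above.

```python
from array import array

BATCH_SIZE = 1024  # batch size used as an example in the text


def sum_row_at_a_time(values):
    """One interpreter-level step per value."""
    total = 0
    for v in values:
        total += v
    return total


def sum_in_batches(values, batch_size=BATCH_SIZE):
    """Process one contiguous batch of a column per step."""
    total = 0
    for start in range(0, len(values), batch_size):
        batch = values[start:start + batch_size]  # contiguous slice of the column
        total += sum(batch)  # the whole batch is handled by one built-in call
    return total


# A column of 64-bit integers stored contiguously in memory.
column = array("q", range(10_000))
```

Both functions return the same result; the batched version simply moves the per-element loop out of the interpreter, which is loosely analogous to how a vectorized engine amortizes per-row overhead.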
Pre‑sorting
ClickHouse writes data in an LSM‑like fashion: each insert is sorted before it is persisted to disk, and background merges (compaction) combine the sorted parts, which keeps disk writes largely sequential.
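A minimal sketch of this write path, assuming a hypothetical `ToyMergeTree` class in which every insert becomes one sorted part and a merge step combines the parts. It is a simplified model, not ClickHouse's implementation.

```python
import heapq


class ToyMergeTree:
    """Hypothetical stand-in: each insert becomes one sorted, immutable part."""

    def __init__(self):
        self.parts = []

    def insert(self, rows):
        # Sort the batch before "persisting" it as a new part.
        self.parts.append(sorted(rows))

    def merge_parts(self):
        # heapq.merge streams already-sorted inputs into one sorted output,
        # so the merged part can be written sequentially.
        self.parts = [list(heapq.merge(*self.parts))]
```

For example, inserting `[(3, "c"), (1, "a")]` and then `[(2, "b"), (0, "z")]` yields two sorted parts, and `merge_parts()` leaves a single part ordered by key.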
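MergeTree‑family tables require a sorting key (ORDER BY) and accept an optional primary key (PRIMARY KEY); if the primary key is omitted it defaults to the sorting key, and if both are given the primary key must be a prefix of the sorting key.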
Sorted data reduces the amount of data read for range scans and ordering operations, boosting query speed.
Table Engines
Different table engines provide specialized storage and access patterns. The engine chosen for a table determines:
Data storage method and location, write/read paths;
Supported queries and how they are executed;
Concurrent data access;
Index usage (if any);
Multithreaded request handling;
Replication parameters.
Data Types
ClickHouse supports over 100 data types, each defining in‑memory layout, on‑disk serialization, and computation behavior.
Columns are immutable arrays; operations create new column objects, enabling parallel processing.
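Fixed‑width types (such as integers) store only a data array, while variable‑length types (such as strings and arrays) also keep an offsets array that marks where each value ends; both layouts keep values contiguous, which helps storage and compute efficiency.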
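The role of the offset array can be sketched with a toy string column. The class `ToyStringColumn` is hypothetical and only illustrates the idea of one contiguous byte buffer plus per-value end offsets; the real in-memory layout may differ in detail.

```python
from array import array


class ToyStringColumn:
    """Hypothetical variable-length column: one byte buffer plus end offsets."""

    def __init__(self, values):
        self.data = bytearray()
        self.offsets = array("Q")  # offsets[i] = end position of value i in data
        for v in values:
            self.data += v.encode("utf-8")
            self.offsets.append(len(self.data))

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, i):
        # Value i spans from the previous value's end offset to its own.
        start = self.offsets[i - 1] if i > 0 else 0
        return self.data[start:self.offsets[i]].decode("utf-8")
```

A fixed-width column needs no such bookkeeping, because the position of value `i` is simply `i` times the element size.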
Sharding and Replication
Data is partitioned horizontally into shards, and each shard is copied to one or more replicas. Sharding distributes data across nodes to improve query parallelism; replicas store identical copies for fault tolerance.
Shard placement can use fixed fields, random functions, or hash functions. Replicas are selected for query execution via load‑balancing strategies such as Random, Nearest hostname, Hostname Levenshtein distance, In Order, First or Random, and Round Robin.
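Two of the options above, hash-based shard placement and round-robin replica selection, can be sketched as follows. The function `shard_for` and the class `RoundRobinReplicas` are names invented for this example, and CRC32 is used only because it is a stable hash available in the standard library.

```python
import zlib
from itertools import count


def shard_for(key, num_shards):
    """Map a sharding key to a shard number with a stable hash."""
    # crc32 gives the same value in every process, unlike hash() on str.
    return zlib.crc32(str(key).encode()) % num_shards


class RoundRobinReplicas:
    """Hand out the replicas of one shard in rotating order."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._counter = count()

    def pick(self):
        return self.replicas[next(self._counter) % len(self.replicas)]
```

With two replicas `"a"` and `"b"`, successive calls to `pick()` return `"a"`, `"b"`, `"a"`, `"b"`, while the same key is always routed to the same shard.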
Index Design
ClickHouse uses sparse primary‑key indexes (one entry per granule, default 8192 rows) stored in separate index files, enabling fast location of relevant data blocks.
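A sparse index of this kind can be sketched with a sorted list and binary search. The function names are assumptions for this example; the granule size of 8192 comes from the text above.

```python
from bisect import bisect_left, bisect_right

GRANULE_ROWS = 8192  # default granule size mentioned in the text


def build_sparse_index(sorted_keys, granule_rows=GRANULE_ROWS):
    """Keep only the first key of every granule."""
    return sorted_keys[::granule_rows]


def granules_for_range(index, low, high):
    """Granule numbers that may contain keys in [low, high]."""
    # The granule just before the first mark >= low may still hold `low`,
    # so the lower bound is deliberately conservative.
    first = max(bisect_left(index, low) - 1, 0)
    last = bisect_right(index, high)  # exclusive upper bound
    return range(first, last)
```

For 100,000 sorted keys the index holds only 13 entries, and a lookup for keys between 10,000 and 17,000 narrows the scan to granules 1 and 2 without touching the rest of the data.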
Additional skip indexes—minmax, set, and Bloom filter—allow the engine to bypass blocks that cannot satisfy a predicate, reducing scan volume.
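The minmax variant is the simplest to illustrate. The sketch below assumes a hypothetical pair of helpers that record the minimum and maximum of each block and then keep only the blocks whose range overlaps the predicate.

```python
def build_minmax(values, block_rows):
    """One (min, max) pair per block of `block_rows` consecutive values."""
    return [
        (min(values[i:i + block_rows]), max(values[i:i + block_rows]))
        for i in range(0, len(values), block_rows)
    ]


def blocks_to_scan(minmax, low, high):
    """Blocks whose [min, max] overlaps the predicate low <= x <= high."""
    return [n for n, (mn, mx) in enumerate(minmax) if mx >= low and mn <= high]
```

A block is skipped only when its range proves that no row can match, so the index never causes a matching row to be missed; it can only reduce the number of blocks read.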
Compute Engine
The compute engine translates SQL into physical plans and executes them using multithreading and distributed query processing.
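One common way to parallelize an aggregation is to compute partial results per chunk and merge them at the end. The sketch below shows that shape for a GROUP BY count; the function names are invented for this example, and Python threads are used only to show the structure, since they do not give true CPU parallelism for this kind of work.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_count(chunk):
    """Aggregate one chunk independently of the others."""
    return Counter(chunk)


def parallel_group_count(values, workers=4):
    """Split the input, aggregate each chunk, then merge the partial states."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(partial_count, chunks):
            total.update(partial)  # merge step: add the partial counts
    return total
```

The same split, aggregate, and merge pattern applies whether the chunks live on different threads of one server or on different shards of a cluster.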
While highly parallel, ClickHouse lacks a sophisticated optimizer and has limited JOIN support, which some consider a shortcoming.
Conclusion
The combination of columnar storage, vectorized execution, aggressive compression, MPP architecture, and flexible sharding/replication makes ClickHouse exceptionally fast for large‑scale analytical workloads, positioning it as a leading solution in the big‑data landscape.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.