Unlock ClickHouse’s Secret Weapons: The 9 Techniques Behind Lightning‑Fast Queries
This article explores ClickHouse’s high‑performance OLAP architecture, covering its MPP design, columnar storage, vectorized execution, pre‑sorting, table engines, data types, sharding and replication strategies, as well as index designs that together enable rapid analysis of massive datasets.
Introduction
In the era of big data, the explosive growth of data volume demands efficient processing and analysis. ClickHouse, an open‑source distributed OLAP system developed by Yandex, offers a powerful solution for real‑time analytics on massive datasets.
ClickHouse “Nine Swords”
Overall Architecture
ClickHouse separates storage and query processing layers, providing innovations in both to achieve extremely fast inserts and selects. Unlike other engines that treat computation as a service on external storage, ClickHouse’s native storage layer enables many query‑time optimizations, effectively making storage serve computation.
It adopts a massively parallel processing (MPP) architecture in which nodes are peers: each can independently serve queries, and distributed queries execute efficiently by parallelizing work across nodes.
Primary key pre‑sorting, columnar storage, and a vectorized engine further ensure rapid query performance.
Columnar Storage
Columnar storage, widely used in big data (e.g., Parquet, ORC), stores each column’s data in separate files, allowing queries to read only relevant columns and dramatically reducing the amount of data scanned.
This design also yields high compression ratios—often 8:1—because similar values within a column compress efficiently. During queries, only the needed column blocks are decompressed, further cutting I/O.
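To make the I/O benefit concrete, here is a minimal Python sketch (not ClickHouse internals) that models per-column files as compressed blobs: a query filtering on one column only has to scan that column's file, and the low-cardinality column compresses far better than the raw data.

```python
import zlib

n = 100_000
country = ["US", "DE", "US", "FR"] * (n // 4)        # low-cardinality column
amount = [str(i * 37 % 1000) for i in range(n)]      # higher-entropy column

# One compressed blob per column, as a stand-in for per-column files on disk.
files = {
    "country": zlib.compress(",".join(country).encode()),
    "amount": zlib.compress(",".join(amount).encode()),
}

raw_size = len(",".join(country)) + len(",".join(amount))
# A query like COUNT(*) WHERE country = 'US' touches only the country file.
scanned = len(files["country"])

print(f"raw size: {raw_size} bytes, scanned for one-column query: {scanned} bytes")
```

The repeating country values compress dramatically because similar values sit next to each other, which is exactly why columnar layouts reach high compression ratios.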
Vectorized execution processes data in batches (e.g., 1024 elements) rather than one row at a time, leveraging SIMD instructions to accelerate computation.
ClickHouse’s vectorized engine uses modern CPU SIMD to handle multiple data elements per instruction, reducing cache misses and boosting single‑node performance. This approach enables billions of rows to be processed on a single server, lowering cost and complexity compared to traditional data warehouses.
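As a rough analogy in plain Python (real engines use SIMD over typed column blocks, which Python cannot express directly), the sketch below contrasts row-at-a-time evaluation with batch-at-a-time evaluation; the batch size of 1024 mirrors the example above.

```python
BATCH = 1024

def batched(values, size=BATCH):
    # Yield fixed-size batches of a column, the unit a vectorized engine works on.
    for i in range(0, len(values), size):
        yield values[i:i + size]

prices = list(range(10_000))

# Row-at-a-time: one dispatch per row.
row_total = 0
for p in prices:
    row_total += p * 2

# Batch-at-a-time: the inner work runs over a whole batch per call,
# which is what lets a real engine apply one SIMD instruction across many elements.
batch_total = sum(sum(b) * 2 for b in batched(prices))

print(row_total, batch_total)
```

Both produce the same result; the point is that per-element overhead is paid once per batch instead of once per row.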
Pre‑Sorting
ClickHouse writes data to disk after sorting it using an LSM‑like algorithm, ensuring on‑disk order. Background compaction further maintains this order, allowing sequential writes that fully utilize disk throughput.
When creating a table, a sorting key must be specified; the primary key, if declared separately, must be the sorting key or a prefix of it. Ordered data reduces the amount of data read for range queries, enhancing speed.
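A small Python sketch (using `bisect` as a stand-in for index lookups on pre-sorted data; the dataset is hypothetical) shows why on-disk order helps: a range predicate on the sorting key becomes a contiguous slice instead of a full scan.

```python
import bisect

# Rows on disk are kept sorted by the sorting key (here, event_date),
# so a range predicate maps to a contiguous slice of the data.
event_dates = sorted(f"2024-01-{d:02d}" for d in range(1, 32) for _ in range(1000))

lo = bisect.bisect_left(event_dates, "2024-01-10")
hi = bisect.bisect_right(event_dates, "2024-01-12")
matching = event_dates[lo:hi]  # rows for 2024-01-10 .. 2024-01-12 only

print(f"scanned {hi - lo} of {len(event_dates)} rows")
```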
Table Engines
Different table engines optimize storage and query performance for various scenarios. They define how data is stored, written, read, indexed, accessed concurrently, replicated, and whether multithreading is supported.
- Data storage method and location, write destination, and read source;
- Supported query types and how they are handled;
- Concurrent data access;
- Index usage (if any);
- Multithreaded request execution;
- Data replication parameters.

Data Types
ClickHouse supports over 100 data types, each influencing in‑memory layout, on‑disk serialization, and computation. Columns are immutable arrays; operations produce new column objects, enabling parallel processing.
Aligned (fixed-width) types store only the data array, while non-aligned (variable-width) types also keep an offset array, a layout that benefits both storage and compute efficiency.
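The two layouts can be sketched in Python (an illustration of the idea, not ClickHouse's actual column format): a fixed-width column locates element *i* by arithmetic, while a variable-width column needs a data array plus cumulative offsets.

```python
import struct

# Fixed-width (aligned) column: element i lives at a computable offset.
u32_col = struct.pack("<4I", 10, 20, 30, 40)

def read_u32(col, i):
    return struct.unpack_from("<I", col, i * 4)[0]

# Variable-width (non-aligned) column: one data array plus an offset array
# holding the cumulative end position of each element.
strings = ["click", "house", "fast"]
data = "".join(strings).encode()
offsets, pos = [], 0
for s in strings:
    pos += len(s)
    offsets.append(pos)

def read_str(data, offsets, i):
    start = offsets[i - 1] if i > 0 else 0
    return data[start:offsets[i]].decode()

print(read_u32(u32_col, 2), read_str(data, offsets, 1))
```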
Sharding and Replication Strategy
ClickHouse partitions data horizontally into shards and vertically into replicas. Sharding distributes data across nodes to improve query parallelism, while replicas provide fault tolerance and load balancing.
Sharding keys are integer values derived from fixed fields, random functions, or hash functions. Replicas are selected via load-balancing strategies such as Random (the default), Nearest hostname, Hostname Levenshtein distance, In Order, First or Random, and Round Robin.
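Hash-based routing can be sketched as follows (a minimal illustration; `crc32` here stands in for a ClickHouse hash function, and the user ids are hypothetical):

```python
import zlib

NUM_SHARDS = 4

def shard_for(user_id: str) -> int:
    # Hash the sharding key to an integer, then map it onto a shard.
    return zlib.crc32(user_id.encode()) % NUM_SHARDS

placements = {uid: shard_for(uid) for uid in ("u1", "u2", "u3", "u42")}
print(placements)
```

Because the mapping is deterministic, rows with the same key always land on the same shard, while different keys spread across the cluster.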
Index Design
ClickHouse relies on sparse primary key indexes and skip indexes (minmax, set, Bloom filter) to reduce I/O. The primary key index stores a mark for each granularity (default 8192 rows), enabling fast range scans while keeping the index small enough to fit in memory.
Skip indexes further prune data blocks: minmax stores per‑block min/max values, Bloom filters efficiently test element existence for high‑cardinality columns, and set indexes handle unordered collections.
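Granule pruning with a minmax index can be sketched in Python (a simplified model, not ClickHouse's on-disk index format): keep one min/max pair per granule of 8192 rows, and skip any granule whose range cannot intersect the query predicate.

```python
GRANULE = 8192

rows = list(range(100_000))  # column values, stored in key order
granules = [rows[i:i + GRANULE] for i in range(0, len(rows), GRANULE)]
minmax = [(g[0], g[-1]) for g in granules]  # per-granule min/max marks

def granules_to_read(lo, hi):
    # A granule can be skipped entirely when its range misses [lo, hi].
    return [i for i, (mn, mx) in enumerate(minmax) if not (mx < lo or mn > hi)]

hit = granules_to_read(20_000, 21_000)
print(f"read {len(hit)} of {len(granules)} granules: {hit}")
```

The index stays tiny (one pair per 8192 rows), which is what lets it live in memory while still eliminating most of the I/O for selective queries.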
Computation Engine
ClickHouse’s computation engine translates SQL into physical plans and executes them using multithreading and distributed query processing. While it lacks a sophisticated optimizer and has limited JOIN support, its vectorized execution and MPP architecture deliver high performance.
Conclusion
ClickHouse’s impressive query speed results from deliberate architectural choices: columnar storage, vectorized execution, aggressive compression, and a distributed MPP design. These innovations collectively enable ClickHouse to excel in large‑scale data analysis, making it a standout solution in the big‑data landscape.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.