
An Overview of ClickHouse: Features, Performance, Use Cases, and Limitations

ClickHouse is an open-source, column-oriented OLAP database developed by Yandex. It combines high-compression columnar storage, vectorized execution, and high read and write throughput, which makes it well suited to large-scale analytics. It also targets specific usage scenarios and has notable limitations, such as the lack of true transactions and of conventional secondary indexes.

Big Data Technology Architecture

May 31, 2020

ClickHouse is a column-oriented database management system (DBMS) designed for online analytical processing (OLAP). It was open-sourced by the Russian company Yandex and is now used by major Chinese companies such as Tencent, ByteDance, Ctrip, and Kuaishou, with clusters reported to scale to thousands of nodes. Alibaba Cloud also offers ClickHouse as a cloud service.

ClickHouse Features

ClickHouse was designed around OLAP requirements and implements its own high-efficiency columnar storage engine. In columnar storage, data is stored and scanned column by column, which reduces I/O, improves compression ratios, and suits analytical workloads.

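Columnar Storage : The values of each column are stored together, so a query reads only the columns it references instead of entire rows. Because neighboring values in a column share a type and are often similar, they also compress better. The original article illustrates this with a diagram.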
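To make the I/O argument concrete, here is a minimal Python sketch. It is not ClickHouse's actual on-disk format. It lays out the same hypothetical three-column table once row by row and once column by column, then counts the bytes a single-column aggregate has to touch. The table, the column names, and the fixed-width encoding are assumptions chosen purely for illustration.

```python
import struct

# Hypothetical table: (user_id int64, duration_ms int64, country 2-byte code).
rows = [(i, (i * 10) % 997, b"US" if i % 2 else b"DE") for i in range(1000)]

ROW_FORMAT = "<qq2s"  # little-endian, no padding: 8 + 8 + 2 = 18 bytes per row
ROW_SIZE = struct.calcsize(ROW_FORMAT)

# Row layout: all fields of one row sit next to each other.
row_store = b"".join(struct.pack(ROW_FORMAT, *row) for row in rows)

# Column layout: each column is one contiguous buffer.
column_store = {
    "user_id": struct.pack(f"<{len(rows)}q", *(r[0] for r in rows)),
    "duration_ms": struct.pack(f"<{len(rows)}q", *(r[1] for r in rows)),
    "country": b"".join(r[2] for r in rows),
}


def sum_duration_row_layout():
    """Sum duration_ms by walking every full row record."""
    total, bytes_read = 0, 0
    for offset in range(0, len(row_store), ROW_SIZE):
        _user_id, duration, _country = struct.unpack_from(ROW_FORMAT, row_store, offset)
        total += duration
        bytes_read += ROW_SIZE  # the whole record is read to reach one field
    return total, bytes_read


def sum_duration_column_layout():
    """Sum duration_ms by reading only that column's buffer."""
    buf = column_store["duration_ms"]
    values = struct.unpack(f"<{len(buf) // 8}q", buf)
    return sum(values), len(buf)


row_total, row_bytes = sum_duration_row_layout()
col_total, col_bytes = sum_duration_column_layout()
print(row_total == col_total, row_bytes, col_bytes)
```

Both functions return the same sum, but the row layout touches 18 bytes per row while the column layout touches only the 8 bytes of the requested column. The gap widens as the table gets wider.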

Speed : ClickHouse achieves very fast queries by combining columnar storage, efficient compression, and a vectorized execution engine that processes column values in batches to make full use of the CPU. For simple queries it can scan up to billions of rows per second on a single server. It also sustains high write throughput, which makes it suitable for bulk ingestion of large data volumes.
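One reason column data compresses well is that similar values sit next to each other. The sketch below uses run-length encoding, a deliberately simple stand-in for the general-purpose codecs a real engine would apply, to compare the same hypothetical sorted table encoded column-wise and row-wise. The data and helper names are assumptions for illustration only.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical two-column table (country, event_id), sorted by country.
rows = [("DE", i) for i in range(500)] + [("US", i) for i in range(500)]

# Column-wise: all country values are adjacent, so long runs appear.
country_column = [country for country, _ in rows]
column_runs = rle_encode(country_column)

# Row-wise: fields of different columns interleave and break up the runs.
row_major = [field for row in rows for field in row]
row_runs = rle_encode(row_major)

print(len(column_runs), len(row_runs))
```

The country column collapses to two runs, while the interleaved row-wise stream yields one run per field because no two neighboring values are equal.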

Community benchmarks indicate that ClickHouse outperforms many other engines on single-table queries, although multi-table joins tend to perform less well.

Rich Functionality

Beyond speed, ClickHouse supports most SQL syntax (with some limitations) and real-time ingestion, in which newly inserted data becomes queryable without locking the table. It also scales well, from single-node deployments to distributed clusters with hundreds or thousands of nodes, where a single node can store trillions of rows or more than 100 TB of data.

Additional features include a sparse primary-key index, data sharding, partitioning, TTL (time-to-live) expiry, and asynchronous multi-master replication.
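The idea behind a sparse primary-key index can be sketched in a few lines of Python. Rather than indexing every row, the index keeps one entry (a "mark") per block of rows, called a granule, over data sorted by the key. A lookup then narrows the scan to the granules that may contain the key. The granule size of 4 and the function names below are assumptions for illustration. ClickHouse's MergeTree engine defaults to 8192 rows per granule, and its real implementation is considerably more involved.

```python
from bisect import bisect_left, bisect_right

GRANULE = 4  # tiny on purpose; MergeTree's default index_granularity is 8192


def build_sparse_index(sorted_keys, granule=GRANULE):
    """Keep only the first key of every granule."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule)]


def candidate_row_range(marks, n_rows, key, granule=GRANULE):
    """Return the half-open row range [start, end) that may contain `key`.

    The range is conservative: when `key` equals a mark, the preceding
    granule is included too, because duplicates of the key could end there.
    """
    first_granule = max(bisect_left(marks, key) - 1, 0)
    end_granule = bisect_right(marks, key)  # granules whose first key <= key
    start = first_granule * granule
    end = min(end_granule * granule, n_rows)
    return (start, max(start, end))


def lookup(sorted_keys, marks, key):
    """Scan only the candidate granules and return matching row positions."""
    start, end = candidate_row_range(marks, len(sorted_keys), key)
    return [i for i in range(start, end) if sorted_keys[i] == key]


keys = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]  # hypothetical sorted key column
marks = build_sparse_index(keys)
print(marks, candidate_row_range(marks, len(keys), 11), lookup(keys, marks, 11))
```

The index holds 3 marks for 10 rows, and a lookup for key 11 scans only rows 4 through 7. The trade-off is that the index stays small enough to keep in memory, at the cost of scanning a whole granule even for a single-row match.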

Application Scenarios and Constraints

ClickHouse targets a specific workload profile. Typical scenarios share the following characteristics:

1. Most requests are reads.
2. Data is written in large batches (more than 1,000 rows) rather than as single rows.
3. Data is primarily appended, not modified.
4. Queries read many rows but only a few columns.
5. Tables are wide (many columns).
6. Query frequency is relatively low.
7. Simple queries tolerate latencies of around 50 ms.
8. Column values are small numbers or short strings.
9. Each query needs high throughput (up to billions of rows per second).
10. Transactions are not required.
11. Consistency requirements are low.
12. Queries typically involve a single large table plus small auxiliary tables.
13. The result set is much smaller than the source data because of filtering and aggregation.
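The preference for large batched writes over single-row inserts is usually handled on the client side. The following minimal sketch buffers rows and hands them to a sink in batches. `BatchBuffer` and the `flush` callback are hypothetical names, and the callback stands in for whatever bulk-insert call an actual client library provides. No real ClickHouse API is used here.

```python
class BatchBuffer:
    """Accumulate rows and emit them in batches of at least `min_batch_rows`."""

    def __init__(self, flush, min_batch_rows=1000):
        self._flush = flush          # hypothetical sink: called with a list of rows
        self._min_batch_rows = min_batch_rows
        self._rows = []

    def add(self, row):
        self._rows.append(row)
        if len(self._rows) >= self._min_batch_rows:
            self.flush()

    def flush(self):
        """Send whatever is buffered; call once more at shutdown for the tail."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._flush(batch)


# Usage: collect the batches in a list instead of sending them to a server.
sent_batches = []
buffer = BatchBuffer(sent_batches.append, min_batch_rows=1000)
for event_id in range(2500):
    buffer.add((event_id, "page_view"))
buffer.flush()  # push the remaining partial batch

print([len(batch) for batch in sent_batches])
```

With 2,500 rows and a threshold of 1,000, the sink receives two full batches and one final batch of 500 rows. A production version would typically also flush on a timer so that slow streams do not hold rows indefinitely.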

At the time of writing, the corresponding limitations of ClickHouse include no true row-level delete/update (modifications run as heavy, asynchronous batch operations), no built-in transactions (future versions may add them), no conventional secondary indexes, limited SQL support (especially for joins), no window functions, and manual metadata management.

References

Official documentation: https://clickhouse.tech/docs/en/

Additional articles: https://zhuanlan.zhihu.com/p/98135840 , https://zhuanlan.zhihu.com/p/22165241 , https://zhuanlan.zhihu.com/p/71014268
