Introduction to ClickHouse: Features, Installation, Performance Testing, and Comparison
This article introduces ClickHouse, an open‑source column‑oriented OLAP database, detailing its key features, appropriate use cases, installation steps, performance benchmark queries, and how it compares with other columnar storage solutions while highlighting its adoption by major internet companies.
1. Introduction
ClickHouse is an open-source column-oriented database developed by Yandex for real-time analytical workloads. Its developers report that it processes analytical queries 100-1000× faster than traditional row-oriented systems.
2. Key Features
Fast: uses all available hardware; peak processing performance for a single query exceeds 2 TB/s (measured after decompression).
Fault‑tolerant: asynchronous multi‑host replication, no single point of failure.
Scalable: vertical and horizontal scaling to thousands of nodes.
Easy to use: SQL‑based, no custom APIs required.
Hardware‑efficient: columnar storage reduces I/O by up to 10‑100×.
CPU‑efficient: vectorized execution with SIMD and JIT.
Optimized disk access: minimizes the number of seeks for range queries.
Minimizes data transfer: works without specialized high‑performance networks.
3. When to Use / Not Use
Suitable for immutable, well-structured event or log streams. Not suitable for OLTP, key-value lookups at a high request rate, or blob and document storage.
4. Why It Is Fast
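OLAP workloads are read-heavy, query wide tables but touch only a few columns at a time, and write data in batches. Columnar storage fits this pattern: a query reads only the columns it needs, values of the same column stored together compress well, and vectorized execution processes those values in blocks. Together these reduce disk I/O and account for most of the performance advantage.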
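The effect of selective column reads can be illustrated with a small, self-contained Python sketch. It is not how ClickHouse stores data internally; the table, its column names (`id`, `event`, `amount`), and the text encoding are all hypothetical and chosen only to compare how many bytes a row layout and a column layout would have to read for the same aggregate.

```python
import random
import zlib

# Hypothetical event table (illustration only): 10,000 rows, three columns.
random.seed(0)
rows = [
    (i, random.choice(["click", "view", "purchase"]), random.randint(1, 100))
    for i in range(10_000)
]

# Row-oriented layout: all values of one row are stored next to each other.
row_bytes = "\n".join(f"{i},{event},{amount}" for i, event, amount in rows).encode()

# Column-oriented layout: each column is stored as its own contiguous run.
columns = {
    "id": "\n".join(str(r[0]) for r in rows).encode(),
    "event": "\n".join(r[1] for r in rows).encode(),
    "amount": "\n".join(str(r[2]) for r in rows).encode(),
}

# An aggregate over one column only needs that column's bytes in the
# column layout, but must scan every row in the row layout.
bytes_read_row_store = len(row_bytes)
bytes_read_column_store = len(columns["amount"])
total_amount = sum(int(v) for v in columns["amount"].decode().split("\n"))

# Compress both layouts with zlib to compare sizes on this sample data.
row_compressed = len(zlib.compress(row_bytes))
col_compressed = sum(len(zlib.compress(c)) for c in columns.values())

print("bytes read, row layout:   ", bytes_read_row_store)
print("bytes read, column layout:", bytes_read_column_store)
print("compressed, row layout:   ", row_compressed)
print("compressed, column layout:", col_compressed)
```

The column layout reads only a fraction of the bytes for the same sum. Compressed sizes depend on the data, so the sketch prints them rather than assuming a particular ratio.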
5. Installation
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
sudo yum install clickhouse-server clickhouse-client
Configuration directories after yum installation:
/etc/clickhouse-server/ (config files)
/var/lib/clickhouse/ (data files)
/var/log/clickhouse-server/ (log files)
These paths can be edited in /etc/clickhouse-server/config.xml and /etc/clickhouse-server/users.xml.
Start and connect:
sudo /etc/init.d/clickhouse-server start
clickhouse-client -m # -m enables multi-line queries; logs in as user 'default'
clickhouse-client --user=xxx --password=xxx --host=xxx
6. Performance Test
SELECT C_CITY, S_CITY, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue
FROM lineorder_flat
WHERE (C_CITY='UNITED KI1' OR C_CITY='UNITED KI5')
AND (S_CITY='UNITED KI1' OR S_CITY='UNITED KI5')
AND year>=1992 AND year<=1997
GROUP BY C_CITY, S_CITY, year
ORDER BY year ASC, revenue DESC;
The query processed 546.67 million rows in 1.723 seconds, achieving ~4.46 GB/s throughput (≈317 million rows/s).
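The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above and assumes 1 GB = 10^9 bytes, which the source does not specify. The derived bytes-per-row value is an estimate under that assumption, not a measured result.

```python
# Figures as reported for the benchmark query.
rows_processed = 546.67e6      # rows scanned
elapsed_seconds = 1.723        # wall-clock time
throughput_gb_per_s = 4.46     # reported throughput

# Assumption: decimal gigabytes (1 GB = 1e9 bytes).
BYTES_PER_GB = 1e9

rows_per_second = rows_processed / elapsed_seconds
bytes_per_row = throughput_gb_per_s * BYTES_PER_GB / rows_per_second

print(f"rows per second: {rows_per_second / 1e6:.1f} million")
print(f"average bytes read per row: {bytes_per_row:.1f}")
```

The row rate matches the ≈317 million rows/s quoted above. The small per-row byte count is consistent with the query reading only the four columns it references rather than the full width of lineorder_flat.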
7. Comparison with Other Column Stores
Different storage types suit different scenarios; there is no universal “silver bullet.”
8. Adoption
Major internet companies using ClickHouse include:
Toutiao – thousands of nodes, dozens of PB daily.
Tencent – game data analytics with a dedicated ops system.
Ctrip – >80% of business runs on ClickHouse, billions of rows daily.
Kuaishou – ~10 PB total, 200 TB daily, 90% queries < 3 s.
Yandex, CloudFlare, Spotify, and others worldwide.
9. Conclusion
The article provides a concise overview of ClickHouse’s capabilities, installation steps, and performance characteristics, encouraging readers to consult the official documentation for deeper exploration.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.