Design and Advantages of a Cloud‑Native ClickHouse OLAP System
This article presents the architecture, key features, and operational benefits of a cloud‑native ClickHouse OLAP platform, describing how storage‑compute separation, a unified master node, and shared storage reduce cost, improve availability, and simplify management while remaining fully compatible with the open‑source ClickHouse ecosystem.
The document introduces a cloud‑native ClickHouse solution that builds on ClickHouse’s high‑performance OLAP engine and incorporates design ideas from Snowflake to provide a one‑stop data‑analysis platform for multiple scenarios.
Key advantages include:
Simplicity and easy maintenance through unified cluster management and shared distributed task scheduling.
High availability and scalability, supporting more than five million tables.
At least a 50% reduction in storage cost.
Full compatibility with ClickHouse protocols, syntax, and storage formats.
Open-source ClickHouse today has gaps in usability, stability, maintainability, and features: users must understand the distinction between local and distributed tables, heavy reliance on ZooKeeper creates bottlenecks, and there is no true MPP query layer.
The proposed architecture adopts a three‑layer design:
Cluster Management Layer: the brain of the system, providing metadata management and a shared distributed task scheduler built on a consistency protocol.
Compute Layer: multiple compute clusters where user queries run; all clusters share the management layer.
Storage Layer: shared storage accessible by every compute cluster, offering cheap, on-demand, effectively unlimited capacity.
Data flow connects directly to ClickHouse nodes, bypassing the master node to avoid a central bottleneck. Control flow is coordinated by a lightweight master that handles DDL tasks, schema storage, and node join/leave, and achieves high availability through multi-replica consensus.
Storage-compute separation enables strong consistency and multi-read/multi-write, and eliminates ZooKeeper as a single point of failure. Parts live in shared storage, and a commit log records part changes, providing conflict handling, replay, and snapshot mechanisms.
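The commit-log mechanism above can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: part changes (add/remove) are appended to a log in shared storage, any replica can replay the log from the last snapshot to reconstruct the current part set, and snapshots compact the log. Conflict handling is omitted for brevity; all names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class CommitLog:
    entries: list = field(default_factory=list)  # ordered (op, part) changes
    snapshot: frozenset = frozenset()            # part set as of snapshot_lsn
    snapshot_lsn: int = 0                        # log position of the snapshot

    def append(self, op: str, part: str) -> None:
        """Record a part change ('add' or 'remove') in the log."""
        self.entries.append((op, part))

    def replay(self) -> set:
        """Rebuild the visible part set: start from the snapshot,
        then apply every change logged after it."""
        parts = set(self.snapshot)
        for op, part in self.entries[self.snapshot_lsn:]:
            if op == "add":
                parts.add(part)
            elif op == "remove":
                parts.discard(part)
        return parts

    def take_snapshot(self) -> None:
        """Compact the log so future replays start from the current state."""
        self.snapshot = frozenset(self.replay())
        self.snapshot_lsn = len(self.entries)

log = CommitLog()
log.append("add", "part_1")
log.append("add", "part_2")
log.append("remove", "part_1")  # e.g. part_1 was merged away
print(log.replay())  # {'part_2'}
```

Because the log and parts live in shared storage, a replica that takes over only needs to replay metadata, not copy data.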
Benefits of this design include:
At least 50% reduction in storage cost due to shared physical storage among replicas.
Elimination of dedicated ZooKeeper clusters, saving resources.
Higher resource utilization with no read‑only replica waste.
Improved fault tolerance: any replica can take over reads and writes instantly.
Operationally, cluster scaling becomes seconds‑level: new nodes fetch schema from the master and part metadata from shared storage, while removed nodes can be shut down without data migration.
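The seconds-level scale-out path can be sketched as follows. This is an illustrative sketch only: `join_cluster`, `FakeMaster`, `FakeStorage`, and their method names are hypothetical stand-ins for the platform's internals, showing why no data migration is needed.

```python
def join_cluster(master, shared_storage):
    """A joining node pulls table schemas from the master and part
    metadata from shared storage; no data blocks are copied."""
    schemas = master.fetch_schemas()              # table definitions (DDL)
    parts = shared_storage.list_part_metadata()   # references to shared parts
    return {"schemas": schemas, "parts": parts}   # node can now serve queries

# Toy stand-ins for the master node and shared storage.
class FakeMaster:
    def fetch_schemas(self):
        return ["CREATE TABLE t1 (...) ENGINE=MergeTree() ..."]

class FakeStorage:
    def list_part_metadata(self):
        return ["t1/part_1", "t1/part_2"]

state = join_cluster(FakeMaster(), FakeStorage())
print(len(state["parts"]))  # 2
```

Removal is the mirror image: since the node holds no exclusive data, it can simply be shut down.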
Compatibility with the open‑source ClickHouse ecosystem is preserved; only minimal, non‑intrusive changes are made, allowing seamless upgrades to upstream ClickHouse releases.
Future work includes adding an MPP query engine with distributed joins and aggregations, and removing shard concepts to provide a fully abstracted distributed system.
The article concludes with a recruitment call for engineers interested in high‑performance OLAP system development.
Example command to add a backend node via the master node:
ALTER CLUSTER cluster_name ADD BACKEND 'ip:port' TO SHARD 2;

Example query to list clusters:
SELECT * FROM system.clusters;

Example table creation using the new architecture:
CREATE TABLE t1 (
partition_col_1 String,
tc1 Int32,
tc2 Int32
) ENGINE=MergeTree()
PARTITION BY partition_col_1
ORDER BY tc1;

Tencent Architect
We share insights on storage, computing, and networking, and explore leading industry technologies together.