
How REDck Transforms ClickHouse into a Scalable Cloud‑Native Real‑Time Data Warehouse

Xiaohongshu built REDck, a cloud‑native, storage‑compute separated real‑time OLAP warehouse on ClickHouse, addressing scaling, cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, bucketing, and exactly‑once transaction support.

ITPUB

Background

ClickHouse is a high‑performance OLAP database used for ad‑tech, community, live‑streaming, and e‑commerce workloads. Its native shared‑nothing MPP architecture provides sub‑second query latency but suffers from high operational overhead, limited elastic scaling, and fragile fault tolerance.

Challenges of the Original ClickHouse Deployment

Elastic scaling difficulty: Compute and storage are tightly coupled; adding nodes requires manual data rebalancing and weeks of migration.

Low resource utilization: Multi‑replica storage inflates CPU and storage costs; compute capacity often exceeds storage needs.

Stability issues: ZooKeeper‑based coordination becomes a single point of failure at large scale; query latency spikes under load.

Lack of distributed transactions: Ingestion pipelines have no exactly‑once guarantees, which leads to duplicate or inconsistent data.

REDck Architecture Overview

REDck (Real‑time Elastic Data warehouse on ClickHouse) is a cloud‑native redesign that separates compute from storage, introduces a stateless unified metadata service, and uses object storage as the primary data lake.

REDck overall design diagram

Unified Metadata Service (Metastore)

Metadata is centralized in a stateless Metastore. Internal metadata is stored in MySQL, which provides transactional consistency, while external catalogs such as Hive or Iceberg can be integrated. Compute nodes retrieve up‑to‑date schema and partition information from the Metastore, which removes the need for per‑node local metadata and ZooKeeper coordination.

Metastore interaction diagram

Object‑Storage Access Optimizations

Data resides in cloud object storage (e.g., S3, OSS), which offers virtually unlimited capacity but has higher latency and lower single‑thread throughput than local disks. REDck mitigates these drawbacks through:

Multi‑level caching: In‑memory → local‑disk → distributed cache. Cached reads can be up to 100× faster; parallel downloads achieve ~10× speedup for uncached data.

Query‑plan reordering: Parts are read in parallel and HTTP round‑trips are minimized by grouping mark ranges per connection.

Robust access module: Timeout detection, retry logic, and data‑integrity checks improve stability.
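
The retry, integrity-check, and parallel-download ideas above can be sketched roughly as follows. This is a minimal illustration, not REDck's access module: the function names `fetch_with_retry` and `parallel_read`, the SHA-256 checksum, the backoff values, and the chunking scheme are all assumptions made for the example.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_with_retry(fetch, expected_sha256, retries=3, backoff_s=0.01):
    """Call fetch() until it returns bytes whose SHA-256 matches, or give up."""
    last_error = None
    for attempt in range(retries):
        try:
            data = fetch()
            if hashlib.sha256(data).hexdigest() == expected_sha256:
                return data
            last_error = ValueError("checksum mismatch")
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
        if attempt + 1 < retries:
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"giving up after {retries} attempts") from last_error


def parallel_read(read_range, size, chunk_size, workers=4):
    """Split [0, size) into chunks, read them concurrently, reassemble in order."""
    ranges = [(start, min(start + chunk_size, size))
              for start in range(0, size, chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of `ranges`, not completion order.
        return b"".join(pool.map(lambda r: read_range(*r), ranges))
```

Here `fetch` and `read_range` stand in for whatever object-storage client call is actually used; in practice `read_range(start, end)` would issue a ranged GET for the bytes in `[start, end)`.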

Object‑storage optimization flow

Multi‑Level Caching Strategy

REDck provides two caching policies:

Passive cache: Data is cached on‑demand during query execution.

Active cache: Hot data is pre‑loaded based on user‑defined rules and query history. Eviction uses LRU or Clock‑Sweep.
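
To make the two policies concrete, here is a small sketch of a tiered lookup with LRU eviction. It is an illustration under assumptions, not REDck's cache: the class names `LRUCache` and `TieredCache`, the `preload` method, and the loader callback are hypothetical, and only LRU (not Clock‑Sweep) is shown.

```python
from collections import OrderedDict


class LRUCache:
    """One cache tier with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


class TieredCache:
    """Checks tiers fastest-first; fills faster tiers on a hit or a load."""

    def __init__(self, tiers, load_from_object_storage):
        self.tiers = tiers
        self.load = load_from_object_storage

    def get(self, key):
        for i, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                for faster in self.tiers[:i]:
                    faster.put(key, value)  # promote toward the fastest tier
                return value
        value = self.load(key)  # passive caching: fill on a miss
        for tier in self.tiers:
            tier.put(key, value)
        return value

    def preload(self, keys):
        """Active caching: warm the tiers before queries arrive."""
        for key in keys:
            self.get(key)
```

A caller would build something like `TieredCache([memory_tier, disk_tier], loader)` and call `preload` with keys chosen by its own rules or query history; how those keys are selected is outside this sketch.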

Multi‑level cache architecture

Distributed Task Scheduling

One Server is elected to the global Master role and coordinates cluster‑wide tasks (compaction, mutation, inserts, cache refresh). Scheduling is bucket‑based, so task assignments adapt automatically to scale‑out and scale‑in events and conflicting tasks are avoided.
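
One way to picture bucket‑based scheduling is that every bucket has exactly one owner at a time, so two tasks for the same bucket never run on different nodes. The sketch below assumes a simple round‑robin mapping over sorted worker names; the functions `assign_buckets` and `plan_tasks` are hypothetical, and the source does not say which assignment scheme REDck actually uses.

```python
def assign_buckets(num_buckets, workers):
    """Map every bucket to exactly one worker (round-robin over sorted names)."""
    if not workers:
        raise ValueError("at least one worker is required")
    ordered = sorted(workers)
    return {b: ordered[b % len(ordered)] for b in range(num_buckets)}


def plan_tasks(task_kind, buckets, assignment):
    """Group per-bucket tasks by the worker that currently owns each bucket."""
    plan = {}
    for bucket in buckets:
        plan.setdefault(assignment[bucket], []).append((task_kind, bucket))
    return plan
```

After a scale‑out, the Master would recompute the assignment with the new worker list and plan subsequent tasks against it. A modulo mapping moves many buckets when the worker count changes; a production scheduler would likely prefer a scheme that limits that movement.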

Distributed task scheduling architecture

Data Bucketing

Tables can be bucketed on a chosen key (e.g., user_id). A hash function maps rows to a fixed number of buckets, enabling:

Fast point‑lookups using bucket keys.

Reduced shuffle for joins and aggregations.

Bucket‑level task scheduling that supports elastic scaling.
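
The sketch below illustrates why bucketing helps: a stable hash sends equal keys to the same bucket, so a lookup touches one bucket and a join between two tables bucketed the same way can proceed bucket by bucket without a shuffle. The CRC32 hash and the function names are assumptions for illustration; the source does not specify REDck's hash function.

```python
import zlib


def bucket_of(key, num_buckets):
    """Stable bucket id: CRC32 of the key's UTF-8 text, modulo the bucket count."""
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets


def bucketize(rows, key, num_buckets):
    """Distribute rows (dicts) into num_buckets lists by hashing row[key]."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_of(row[key], num_buckets)].append(row)
    return buckets


def bucket_join(left, right, key, num_buckets):
    """Join two identically bucketed tables one bucket pair at a time."""
    out = []
    pairs = zip(bucketize(left, key, num_buckets),
                bucketize(right, key, num_buckets))
    for left_bucket, right_bucket in pairs:
        index = {}
        for r in right_bucket:
            index.setdefault(r[key], []).append(r)
        for l in left_bucket:
            for r in index.get(l[key], []):
                out.append({**l, **r})
    return out
```

Because matching keys always land in the same bucket index on both sides, each bucket pair can be joined independently, which is also what makes the bucket a convenient unit for scheduling.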

Bucket diagram

Exactly‑Once Distributed Transactions

REDck implements a two‑phase commit (2PC) protocol managed by the Metastore. The protocol provides exactly‑once semantics for ingestion pipelines such as Hive→REDck, Spark→REDck, and Flink→REDck (via Flink checkpoint integration). This eliminates duplicate writes and ensures global visibility of committed data.
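
A toy version of the two‑phase commit flow is sketched below. It is an in‑memory illustration under assumptions, not REDck's protocol: `Writer` and `Coordinator` are hypothetical classes, the coordinator stands in for the Metastore's transaction log, and a transaction id such as a Flink checkpoint id is assumed to be what makes replays idempotent.

```python
class Writer:
    """Toy participant: stages rows on prepare, publishes them on commit."""

    def __init__(self, rows, healthy=True):
        self.rows = rows
        self.healthy = healthy
        self.staged = {}    # txn_id -> rows waiting for the commit decision
        self.visible = []   # rows that queries can see

    def prepare(self, txn_id):
        if not self.healthy:
            return False    # vote "no"
        self.staged[txn_id] = list(self.rows)
        return True         # vote "yes"

    def commit(self, txn_id):
        # pop() with a default makes a replayed commit a no-op
        self.visible.extend(self.staged.pop(txn_id, []))

    def abort(self, txn_id):
        self.staged.pop(txn_id, None)


class Coordinator:
    """Toy coordinator; records one decision per transaction id."""

    def __init__(self):
        self.decisions = {}  # txn_id -> "committed" | "aborted"

    def run(self, txn_id, writers):
        if txn_id in self.decisions:  # replay after a retry or restart
            return self.decisions[txn_id]
        votes = [w.prepare(txn_id) for w in writers]   # phase 1: collect votes
        decision = "committed" if all(votes) else "aborted"
        self.decisions[txn_id] = decision              # decision point
        for w in writers:                              # phase 2: apply decision
            if decision == "committed":
                w.commit(txn_id)
            else:
                w.abort(txn_id)
        return decision
```

Running the same transaction id twice publishes the rows once, which is the property an ingestion job relies on when it retries after a failure. A real implementation would persist the decision durably before phase 2 and handle participants that crash between the phases.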

Two‑phase commit flow

Offline Sync Optimizations

Batch ingestion is performed with Spark instead of Flink micro‑batches, simplifying the pipeline, removing compaction‑induced write amplification, and supporting INSERT OVERWRITE semantics to avoid reading partially loaded data.
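
The INSERT OVERWRITE behaviour can be pictured as building the new parts off to the side and then publishing them in a single metadata update, so readers see either the old partition or the new one, never a mix. The sketch below is a hypothetical in‑memory model (`PartitionCatalog` is not a REDck or Spark API) that only illustrates that visibility rule.

```python
class PartitionCatalog:
    """Toy catalog mapping each partition to the parts visible to readers."""

    def __init__(self):
        self.current = {}  # partition -> tuple of part names

    def read(self, partition):
        return self.current.get(partition, ())

    def insert_overwrite(self, partition, write_parts):
        """Write all new parts first, then swap them in with one assignment."""
        # If write_parts() raises midway, the assignment below never runs,
        # so readers keep seeing the previous parts.
        new_parts = tuple(write_parts())
        self.current[partition] = new_parts
```

In this model a failed batch job leaves the previous partition contents untouched, and a rerun simply replaces them once it succeeds.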

Offline sync architecture

Performance and Operational Impact

After two years in production, REDck serves more than 10 business lines, stores more than 30 PB of data, and runs on clusters of tens of thousands of CPU cores. Compared with the original ClickHouse deployment:

CPU efficiency improved ~10× (more data processed per core).

Storage cost per TB reduced ~10× thanks to object storage and the elimination of multi‑replica overhead.

Query latency remains comparable to native ClickHouse despite object‑storage latency.

Elastic scaling time reduced from weeks to minutes; cluster availability >99.9%.

These gains enable data retention extensions from months to years and support a rapidly growing set of analytical use cases.
