Big Data 28 min read

How ClickHouse Powers YiBei's Scalable Advertising Data Platform

This article details YiBei's advertising data platform built on ClickHouse, covering business requirements, why ClickHouse was chosen over Druid, storage engine and compression choices, real‑time and offline ingestion pipelines, partitioning, Zookeeper bottlenecks, atomic data replacement, and testing and release strategies for a high‑throughput, low‑latency ad analytics system.

dbaplus Community
dbaplus Community
dbaplus Community
How ClickHouse Powers YiBei's Scalable Advertising Data Platform

Business Scenario Overview

YiBei operates a first‑party seller advertising platform where merchants pay per click or per sale to boost product visibility and revenue. Sellers need real‑time reports on click‑through rates, sales conversion, and other metrics across multiple ad placements such as home page, search, product pages, and mobile interfaces.

YiBei advertising system overview
YiBei advertising system overview

Why ClickHouse?

Compared with Druid, ClickHouse offers superior columnar compression, partition‑key optimization, full SQL support (joins, sub‑queries, primary‑key ordering), and a simpler MPP architecture that better utilizes CPU resources. Operationally, ClickHouse requires fewer node types and less complex component management, reducing maintenance overhead.

Benchmark tests show ClickHouse ingesting billions of rows in roughly 50 minutes per day, while Druid can take several hours depending on cluster load.

ClickHouse vs Druid performance comparison
ClickHouse vs Druid performance comparison

Table Engine and Primary‑Key Design

YiBei uses ClickHouse's ReplicatedMergeTree engine to handle massive user‑behavior streams while providing high availability through replication. Data is pre‑aggregated on insert, reducing storage and query load.

Primary keys are kept minimal to fit in memory; non‑essential dimensions are removed from the key to improve query speed.

ClickHouse storage engine selection
ClickHouse storage engine selection

Compression and LowCardinality Optimizations

ClickHouse's default LZ4 compression is complemented by higher‑ratio algorithms such as ZSTD and LZ4HC for different column types. In production, LZ4HC level 6 saves ~30 % storage but reduces ingestion speed by ~60 %.

LowCardinality encoding on low‑cardinality columns (e.g., enums) cuts storage by up to 75 % in extreme tests.

Compression algorithm comparison
Compression algorithm comparison

Real‑Time Data Ingestion

Kafka topics are sharded per seller ID; each topic partition exceeds the number of ClickHouse replicas to allow parallel consumption. The pipeline writes directly from Kafka to local ClickHouse tables, achieving million‑level rows per second with sub‑second latency.

Real‑time data ingestion architecture
Real‑time data ingestion architecture

Offline Data Ingestion and Replacement

Batch jobs run on Spark via a LIVY server, generating pre‑partitioned files on HDFS. ClickHouse replica nodes pull these files, load them into temporary tables, and then replace whole partitions using the ATTACH / DETACH API, ensuring the old data remains invisible until the swap completes.

Offline data ingestion architecture
Offline data ingestion architecture

Partition Data Replacement and Atomicity

Version columns (A, B) and an active flag stored in a ClickHouse dictionary track the two most recent data versions. Swapping versions is performed by atomically updating the active column, providing instant visibility changes and rollback capability.

Data versioning and atomic replacement
Data versioning and atomic replacement

Zookeeper Pressure and Mitigation

Both real‑time and offline pipelines rely on Zookeeper for coordination, leading to thousands of client connections and session timeouts under heavy load. The solution involved reducing unnecessary global operations, splitting Zookeeper clusters by shard, and tuning operationTimeout to alleviate contention.

Zookeeper pressure issue diagram
Zookeeper pressure issue diagram

Data Quality Assurance

Quality checks occur at multiple stages: Spark output validation, ClickHouse client import verification, and a dedicated monitoring platform that alerts on anomalies, enabling rapid rollback of problematic versions.

Data quality control workflow
Data quality control workflow

Query Architecture

External APIs expose a unified data access layer. Queries are routed to ClickHouse for most workloads; high‑QPS or heavy‑aggregation requests are delegated to auxiliary stores via the query service layer.

Overall query architecture
Overall query architecture

Testing and Release Process

Major version upgrades employ a dual‑source strategy: traffic is gradually shifted from the old data source to the new one while mirrored queries validate consistency. This approach ensures safe rollouts without service disruption.

Testing and release workflow
Testing and release workflow
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Real-TimeAdvertisingStorage OptimizationLambda architecture
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.