ClickHouse in Self‑Service Analytics: Architecture, Optimization Practices and Future Roadmap at ZuanZuan Platform
This article details how ZuanZuan leveraged ClickHouse as the core OLAP engine for its massive self‑service analytics platform, covering OLAP engine selection criteria, system architecture, real‑world use cases, performance tuning, operational challenges, and future development plans.
ZuanZuan processes massive volumes of user behavior (event) data for reporting and analytics, which requires instant, flexible, high‑performance queries. Traditional pre‑aggregated warehouses cannot meet dynamic, personalized reporting needs, which prompted the adoption of an ad‑hoc query engine.
The OLAP engine selection considered three dimensions: performance (data volume and latency), flexibility (support for both aggregated and detailed queries, real‑time ingestion, high concurrency), and complexity (deployment simplicity, low operational overhead, strong scalability). Open‑source candidates evaluated included MOLAP solutions (Kylin, Druid) and MPP‑based ROLAP engines (Impala, Presto). ClickHouse was chosen for its columnar storage, vectorized execution, millisecond‑level query latency, and strong single‑node performance, despite lacking full transaction support.
The Gauss platform, built on ClickHouse, consists of four layers: data collection (MySQL via Flink‑CDC, log streams via Flume to Kafka/HDFS), storage (Kafka, HDFS, ClickHouse wide tables with offline processing via SeaTunnel and online ingestion via Flink ClickHouseSink), service (HTTP API and SQL client), and application (self‑service analytics and user‑profile products). The architecture employs replicated and distributed table engines managed by ZooKeeper.
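The replicated-plus-distributed layout described above can be sketched as a pair of DDL statements: a ReplicatedMergeTree local table on each shard, and a Distributed table that fans queries out across the cluster. The following is a minimal sketch rather than the platform's actual schema. The database, table, column, and cluster names, the `_local`/`_all` suffixes, and the ZooKeeper path layout are all assumptions made for illustration; `{shard}` and `{replica}` are ClickHouse macros expected to be defined in each server's configuration.

```python
def replicated_local_ddl(database, table, columns, order_by, cluster):
    """Build DDL for the per-shard replicated local table.

    All names passed in are hypothetical; the ZooKeeper path layout is
    a common convention, not something the article specifies.
    """
    cols = ",\n    ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE TABLE {database}.{table}_local ON CLUSTER {cluster}\n"
        f"(\n    {cols}\n)\n"
        "ENGINE = ReplicatedMergeTree("
        # Doubled braces emit the literal {shard} / {replica} macros.
        f"'/clickhouse/tables/{{shard}}/{database}/{table}_local', '{{replica}}')\n"
        f"ORDER BY ({', '.join(order_by)})"
    )


def distributed_ddl(database, table, cluster, sharding_key):
    """Build DDL for the Distributed table that routes to the local tables."""
    return (
        f"CREATE TABLE {database}.{table}_all ON CLUSTER {cluster}\n"
        f"AS {database}.{table}_local\n"
        f"ENGINE = Distributed({cluster}, {database}, {table}_local, {sharding_key})"
    )


if __name__ == "__main__":
    # Hypothetical wide table of user events.
    columns = [
        ("event_date", "Date"),
        ("user_id", "UInt64"),
        ("event_name", "String"),
    ]
    print(replicated_local_ddl("analytics", "events", columns,
                               ["event_date", "user_id"], "gauss_cluster"))
    print()
    print(distributed_ddl("analytics", "events", "gauss_cluster",
                          "cityHash64(user_id)"))
```

In this layout, writes typically target the local tables (or the Distributed table, which forwards them), while readers query the `_all` table; replication between replicas of a shard is coordinated through ZooKeeper.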
Typical workloads include interactive report queries, user‑profile generation, A/B testing, and real‑time monitoring, each benefiting from ClickHouse’s fast columnar scans and built‑in functions. Specific optimization practices cover concurrency and memory tuning (adjusting max_concurrent_queries, max_memory_usage, and the external sort/group‑by thresholds that control when operations spill to disk), ZooKeeper configuration, and parameter tuning for merges and background pools.
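To make the memory-related settings concrete, here is a minimal sketch of how the per-query limits might be derived and attached to a query. The concrete byte values and the helper names are illustrative assumptions, not figures from the talk. The 0.5 ratio follows the ClickHouse documentation's guidance that max_memory_usage be roughly twice max_bytes_before_external_group_by; applying the same ratio to external sort is an assumption of this sketch. Note that max_concurrent_queries is a server-level setting and is therefore not part of a per-query SETTINGS clause.

```python
def memory_settings(max_memory_usage_bytes, spill_ratio=0.5):
    """Derive per-query memory settings from a single memory budget.

    spill_ratio is the fraction of the budget at which GROUP BY and
    ORDER BY start spilling to disk (illustrative default of 0.5).
    """
    if not 0 < spill_ratio < 1:
        raise ValueError("spill_ratio must be between 0 and 1")
    spill_threshold = int(max_memory_usage_bytes * spill_ratio)
    return {
        "max_memory_usage": max_memory_usage_bytes,
        "max_bytes_before_external_group_by": spill_threshold,
        "max_bytes_before_external_sort": spill_threshold,
    }


def settings_clause(settings):
    """Render a settings dict as a ClickHouse SETTINGS clause."""
    return "SETTINGS " + ", ".join(f"{key} = {value}" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical 20 GiB per-query budget.
    settings = memory_settings(20 * 1024**3)
    query = (
        "SELECT event_name, count() FROM analytics.events_all "
        "GROUP BY event_name " + settings_clause(settings)
    )
    print(query)
```

Spilling to disk trades query speed for stability: a large aggregation that would otherwise fail on the memory limit completes more slowly instead.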
Operational pain points identified are limited high‑concurrency capacity, lack of transactional DDL, absence of row‑level updates/deletes, and missing automatic rebalancing. Future directions involve platform‑level fault isolation, containerized deployment for elastic scaling, intelligent routing between ClickHouse and Doris for high‑concurrency scenarios, and core engine enhancements such as distributed transaction support and removal of ZooKeeper dependency.
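The planned routing between ClickHouse and Doris is not specified in detail, so the following is only a sketch of one possible policy. It sends queries tagged as high-concurrency to Doris, and also diverts traffic once ClickHouse reaches a limit on in-flight queries. The `QueryProfile` type, the `high_concurrency` tag, the `choose_engine` function, and the default limit of 100 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryProfile:
    # Hypothetical tag set by the service layer for workloads that issue
    # many small queries at once (for example, dashboard endpoints).
    high_concurrency: bool


def choose_engine(profile, clickhouse_inflight, clickhouse_limit=100):
    """Pick a backend engine for one query.

    clickhouse_inflight is the number of queries currently running on
    ClickHouse; clickhouse_limit is an illustrative threshold that would
    sit below the server's max_concurrent_queries.
    """
    if profile.high_concurrency:
        return "doris"
    if clickhouse_inflight >= clickhouse_limit:
        # Protect ClickHouse from overload by diverting overflow traffic.
        return "doris"
    return "clickhouse"


if __name__ == "__main__":
    print(choose_engine(QueryProfile(high_concurrency=False), clickhouse_inflight=12))
    print(choose_engine(QueryProfile(high_concurrency=True), clickhouse_inflight=12))
```

A real router would also need both engines to hold the same data and would likely weigh query cost, not just a tag and a counter.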
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
