WeChat's ClickHouse Real‑Time Data Warehouse: Challenges, Co‑Construction, and Performance Gains
Facing Hadoop’s minute‑to‑hour query latency on petabyte‑scale data, WeChat partnered with Tencent Cloud to build a ClickHouse‑based real‑time warehouse, adding custom ingestion, query‑optimisation and management tools that deliver billion‑row throughput, sub‑5‑second queries and over ten‑fold performance gains across millions of daily queries.
WeChat, as a national‑level application covering social, payment, travel and many other scenarios, faces massive and diverse data‑analysis demands. Traditional Hadoop‑based data warehouses could not meet the requirements for fast, interactive queries on petabyte‑scale data.
1. Challenges faced by WeChat
Key analysis scenarios include scientific exploration, dashboards for operations, and A/B testing platforms. All require low latency, high throughput, and the ability to handle "trillions of rows per day". The Hadoop stack suffered from slow response (minutes to hours), cumbersome metric development, and an overly complex multi‑layer architecture that became difficult to maintain at scale.
2. Selecting ClickHouse
After evaluating many OLAP solutions, ClickHouse was chosen for two main reasons: (1) efficiency – real‑world tests showed >10× speedup over Hadoop; (2) open‑source nature – allowing deep kernel customisation for WeChat’s specialised needs.
3. Joint construction with Tencent Cloud
Tencent Cloud’s ClickHouse team provided a fully managed, one‑stop service, relieving WeChat of stability concerns and sharing extensive query‑optimisation expertise. Together they built a "batch‑stream integrated" warehouse centred on ClickHouse.
4. ClickHouse ecosystem built by the collaboration
QueryServer – intelligent gateway with caching, large‑query interception and rate‑limiting.
Sinker – high‑performance ingestion layer handling back‑pressure, hash routing, priority and write throttling.
OP‑Manager – cluster management, data balancing, disaster recovery and migration.
Monitor – health monitoring, alerting and query‑performance analysis, integrated with OP‑Manager.
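To make the QueryServer description concrete, here is a minimal sketch of a gateway that combines the three behaviours named above: result caching, large-query interception and rate limiting. It is not WeChat's implementation. The class name `QueryGateway`, the `run_query` and `estimate_rows` callbacks, and all thresholds are hypothetical, chosen only for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class QueryGateway:
    """Hypothetical gateway: per-user rate limit, result cache, large-query interception."""

    def __init__(self,
                 run_query: Callable[[str], list],      # assumed backend call
                 estimate_rows: Callable[[str], int],   # assumed scan-size estimator
                 max_rows: int = 10**9,
                 cache_ttl: float = 60.0,
                 max_per_window: int = 5,
                 window: float = 1.0,
                 clock: Callable[[], float] = time.monotonic):
        self._run_query = run_query
        self._estimate_rows = estimate_rows
        self._max_rows = max_rows
        self._cache_ttl = cache_ttl
        self._max_per_window = max_per_window
        self._window = window
        self._clock = clock
        self._cache: Dict[str, Tuple[float, list]] = {}
        self._hits: Dict[str, List[float]] = {}

    def execute(self, user: str, sql: str) -> list:
        now = self._clock()

        # 1. Rate limiting: sliding window of recent request timestamps per user.
        recent = [t for t in self._hits.get(user, []) if now - t < self._window]
        if len(recent) >= self._max_per_window:
            self._hits[user] = recent
            raise RuntimeError("rate limit exceeded")
        recent.append(now)
        self._hits[user] = recent

        # 2. Caching: serve a fresh cached result without touching the cluster.
        key = sql.strip()
        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self._cache_ttl:
            return cached[1]

        # 3. Large-query interception: reject before the query reaches the cluster.
        if self._estimate_rows(sql) > self._max_rows:
            raise ValueError("query rejected: estimated scan too large")

        result = self._run_query(sql)
        self._cache[key] = (now, result)
        return result
```

In this sketch the checks run from cheapest to most expensive, so a throttled or oversized request never reaches the backend. A production gateway would also need cache invalidation when underlying data changes, which is omitted here.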
5. Key technical achievements
High‑performance ingestion achieving billion‑level throughput with token‑based flow control and exactly‑once guarantees.
Extreme query optimisation: custom syntax and internal tuning reduced P95 latency to <5 seconds on tables with trillions of rows, and <3 seconds for A/B experiments, delivering >50× speedup.
Real‑time feature computation with scan volumes of billions and end‑to‑end latency <3 seconds (P95 ≈ 1 second).
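The article does not describe how the token‑based flow control works. One common way to realise the idea is a token bucket, sketched below under that assumption: each write consumes tokens in proportion to its row count, and a writer that finds the bucket empty must back off. The `TokenBucket` class, its parameters and the rates shown are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, holds at most `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity      # start with a full bucket
        self._last = clock()

    def try_acquire(self, n: float = 1.0) -> bool:
        """Take n tokens if available; return False so the caller can back off."""
        now = self._clock()
        # Refill according to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if n <= self._tokens:
            self._tokens -= n
            return True
        return False


def write_batch(bucket: TokenBucket, rows: list, sink: list) -> bool:
    """Write a batch only if the bucket grants one token per row (illustrative)."""
    if not bucket.try_acquire(len(rows)):
        return False                 # signal back-pressure to the producer
    sink.extend(rows)
    return True
```

A producer would typically retry a rejected batch after a short delay. The capacity bounds the largest burst, while the rate bounds sustained write throughput.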
6. Performance impact
WeChat now operates a thousand‑node ClickHouse cluster handling petabyte‑scale data, with daily query volumes in the millions and cluster TPS reaching hundreds of millions. Average query latency is measured in seconds, representing a >10× improvement over the previous Hadoop solution.
7. Towards a cloud‑native, storage‑compute separated warehouse
The next goal is a cloud‑native data warehouse with elastic scaling, no ZooKeeper bottlenecks, seamless read/write separation, and simplified operations. Planned features include elastic expansion within seconds, high availability, and full‑stack query optimisation.
Overall, the WeChat‑Tencent Cloud co‑construction demonstrates how an open‑source OLAP engine can be industrialised at massive scale, delivering fast, reliable analytics for a variety of business scenarios.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
