Building a Real-Time OLAP Analytics Platform for QQ Music with ClickHouse and Tencent Cloud EMR
QQ Music's data team migrated from Hive to a ClickHouse-based OLAP platform, integrated with Tencent Cloud EMR and Superset, to handle real-time analytics on PB-scale data. The result is low-latency, highly available data processing, self-service visualization, and efficient read/write scaling for billions of daily events.
QQ Music, a leading music streaming service with over 800 million registered users and a catalog of more than 30 million songs, generates petabyte‑scale daily data that requires fast, flexible analysis.
Traditional offline warehouses built on Hive suffered from low timeliness, poor usability, and lengthy processing cycles, making it difficult to support real‑time dashboards, instant queries, and rapid decision‑making.
To address these issues, the QQ Music big‑data team partnered with Tencent Cloud EMR to design a high‑availability, low‑latency OLAP platform based on ClickHouse and Superset, leveraging EMR’s elastic Hadoop services.
ClickHouse Overview – an open‑source, column‑oriented OLAP database from Yandex that delivers millisecond‑level query performance on PB‑scale data.
Key technical challenges and solutions:
SSD‑based ZooKeeper improves metadata I/O, reducing replica synchronization delays.
Global load‑balancing ensures idempotent writes, preserving data consistency after retries.
Tube message queue enables efficient ingestion of both real‑time streams and offline batch results.
Partition strategy keeps each table under 10,000 partitions by switching from hourly to daily partitions, which avoids file-descriptor exhaustion.
Read/write separation using temporary ClickHouse nodes on Kubernetes offloads heavy write workloads, allowing the main cluster to serve fast reads.
Local, hash-based cross-table joins keep matching rows on the same node, avoiding costly GLOBAL IN / GLOBAL JOIN operations and improving query speed.
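The last point can be illustrated with a minimal Python sketch. It is not QQ Music's implementation: the shard count, the table contents, and the helper names (`shard_for`, `distribute`, `local_join`, `sharded_join`) are all hypothetical. The idea it models is that when two tables are distributed by the same hash of the join key, every matching pair of rows lands on the same shard, so each shard can join its own slice without shipping one side of the join to every node.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical cluster size, chosen only for illustration


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a join key to a shard with a stable hash (CRC32 here, as an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def distribute(rows, key_field, num_shards=NUM_SHARDS):
    """Place each row on the shard selected by the hash of its join key."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(row[key_field], num_shards)].append(row)
    return shards


def local_join(left_rows, right_rows, key_field):
    """Plain hash join over rows that live on a single node."""
    index = defaultdict(list)
    for right in right_rows:
        index[right[key_field]].append(right)
    return [
        {**left, **right}
        for left in left_rows
        for right in index.get(left[key_field], [])
    ]


def sharded_join(left, right, key_field, num_shards=NUM_SHARDS):
    """Join shard by shard; no rows cross shard boundaries."""
    left_shards = distribute(left, key_field, num_shards)
    right_shards = distribute(right, key_field, num_shards)
    result = []
    for left_slice, right_slice in zip(left_shards, right_shards):
        result.extend(local_join(left_slice, right_slice, key_field))
    return result


# Hypothetical sample data: play events and a user dimension table.
plays = [
    {"user_id": "u1", "song_id": "s1"},
    {"user_id": "u2", "song_id": "s2"},
    {"user_id": "u1", "song_id": "s3"},
    {"user_id": "u3", "song_id": "s4"},  # no matching user row
]
users = [
    {"user_id": "u1", "region": "CN"},
    {"user_id": "u2", "region": "US"},
]

joined = sharded_join(plays, users, "user_id")
```

Because both tables use the same key and the same hash, the per-shard results taken together equal the result of joining the full tables on one node. In ClickHouse terms this roughly corresponds to giving both distributed tables the same sharding key so that an ordinary local join suffices.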
Apache Superset, an open-source BI tool, was integrated for self-service data visualization, allowing non-technical staff to create thousands of dashboards and achieve "data analysis for everyone".
The combined ClickHouse + Superset solution on Tencent Cloud EMR provides rapid cluster provisioning, elastic scaling, automated operations, and 24/7 professional support, delivering a production‑ready, cloud‑native big‑data analytics stack.
Overall, the cloud‑based OLAP infrastructure has enabled QQ Music to perform real‑time analysis on PB‑level data with sub‑second latency, supporting a wide range of business scenarios while reducing operational complexity and cost.
Big Data Technology Architecture
Exploring Open Source Big Data and AI Technologies
