Tencent Game Data Analysis: Lakehouse Integration Practice
This article presents Tencent Game's comprehensive lakehouse integration practice, detailing the project background, storage‑compute separation, data layering, unified DDL/DML operations, performance optimizations, and future plans, illustrating how StarRocks, Iceberg, and Spark are combined to achieve scalable, cost‑effective analytics for massive game data.
The presentation introduces Tencent Game's data platform evolution from a complex Lambda architecture to a unified lakehouse solution, aiming to simplify architecture, improve flexibility, and reduce storage and compute costs.
Key objectives include merging real‑time and batch processing pipelines, leveraging StarRocks as the core analytical engine, integrating Iceberg tables stored in Tencent Cloud Object Storage (COS), and supporting both Kafka‑based streaming and Spark‑based batch loads.
Compute‑storage separation is achieved by introducing Compute Nodes (CN) that offload computation from BE nodes. CN nodes scale elastically through a Kubernetes operator with automatic horizontal pod autoscaling, and group‑based isolation keeps different workloads from competing for the same resources.
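To make the autoscaling step concrete, the following is a minimal sketch of the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance. The group names, metric values, and min/max bounds are illustrative assumptions and do not come from the talk.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count following the documented HPA rule, clamped to [min, max]."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current size (still clamped).
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical CN groups, each scaled independently so workloads stay isolated.
# Tuple layout: (current replicas, observed CPU %, target CPU %).
cn_groups = {
    "adhoc": (4, 90, 60),
    "reporting": (4, 62, 60),
}
plan = {name: desired_replicas(*args) for name, args in cn_groups.items()}
print(plan)
```

Scaling each group separately is one way to read "group‑based isolation": a burst in one workload grows only its own CN group.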
Data layering strategies involve sinking cold data to the data lake using export commands or IcebergTableSink, and unifying queries across multiple StarRocks clusters through shared Iceberg metadata, allowing cross‑cluster joins.
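As an illustration of the hot/cold split behind data layering, here is a small sketch that decides which date partitions stay in local StarRocks storage and which become candidates for sinking to the lake. The 30‑day threshold and the function name are assumptions for the example; the talk does not state the actual retention rule.

```python
from datetime import date, timedelta


def split_hot_cold(partitions, today, hot_days=30):
    """Split partition dates into hot (kept local) and cold (sink to the lake).

    hot_days is a hypothetical retention window, not a value from the source.
    """
    cutoff = today - timedelta(days=hot_days)
    hot = sorted(p for p in partitions if p >= cutoff)
    cold = sorted(p for p in partitions if p < cutoff)
    return hot, cold


partitions = [date(2024, 3, 30), date(2024, 1, 1), date(2024, 3, 1)]
hot, cold = split_hot_cold(partitions, today=date(2024, 3, 31))
print("keep local:", hot)
print("sink to lake:", cold)
```

In practice the cold list would drive whichever mechanism the platform uses to move data (the export command or IcebergTableSink mentioned above); that step is omitted here because its exact syntax is not given in the source.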
Performance enhancements include aggregation push‑down, batch data reading via a fused Spark connector, and extensive metadata caching to reduce I/O, resulting in up to six‑fold query speed improvements.
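The source does not describe how the aggregation push‑down is implemented, so the sketch below only illustrates the general idea: each data split is reduced to a small partial state close to the data, and only those partial states are merged centrally. The function names and sample rows are hypothetical.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Near the data: collapse one split's (key, value) rows into per-key (sum, count)."""
    acc = defaultdict(lambda: [0, 0])
    for key, value in rows:
        acc[key][0] += value
        acc[key][1] += 1
    return {k: tuple(v) for k, v in acc.items()}


def merge_partials(partials):
    """At the coordinator: combine partial states into final sum, count, and avg."""
    total = defaultdict(lambda: [0, 0])
    for part in partials:
        for key, (s, c) in part.items():
            total[key][0] += s
            total[key][1] += c
    return {k: {"sum": s, "count": c, "avg": s / c} for k, (s, c) in total.items()}


# Two hypothetical splits of the same table.
splits = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 5), ("b", 4)],
]
partials = [partial_aggregate(s) for s in splits]
result = merge_partials(partials)
print(result)
```

Carrying (sum, count) rather than a per‑split average keeps the merge exact, since averages of averages are wrong when splits differ in size. The benefit grows with the number of rows per key, because the data moved upward scales with distinct keys rather than with rows.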
Real‑world results show a game service handling 2 PB of local data and 3 PB in the lake, ingesting 50 TB daily, and serving 20 000 concurrent queries with 2‑second P90 latency. Future plans focus on materialized views, further offloading of parsing, and accelerated bulk imports.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.