Tencent Game Big Data Analysis Engine: Architecture, Practices, and Future Directions
This article presents the design, implementation, and operational experience of Tencent's game big‑data analysis platform, covering its background, the offline, online, and real‑time multi‑dimensional analysis engines, practical use cases, performance optimizations, and future roadmap.
Introduction – The article introduces iData, Tencent's game big‑data analysis system, which combines iDataCharts for visualization and iDataEngine for analysis to overcome limitations of traditional BI tools and databases.
1. Tencent Game Big‑Data Analysis Background
Tencent operates more than 110 PC titles (e.g., League of Legends, DNF) and 390 mobile games (e.g., Honor of Kings), and this rapid growth creates a complex data environment.
Daily data volume exceeds 300 TB and 20 billion records, spread across roughly 1,300 tables with 430+ dimensions per business.
Each game has its own intricate data model, requiring fine‑grained operations and rapid analytics.
These challenges motivate the use of big‑data techniques for precise, efficient product operation.
2. Architecture Overview
The platform is layered from bottom to top:
Data lake: cloud storage, relational databases (MySQL, PostgreSQL), Hadoop, and object storage.
Service engine: visualization engine, multi‑dimensional & real‑time analysis engine, and AI research engine.
Capability model: functional abstractions built on the engines.
Analysis methods & decision support: user‑facing applications.
The focus of this article is the data analysis engine.
3. Big‑Data Analysis Engine Components
3.1 Offline Multi‑Dimensional Analysis – TGMars
Pre-processing plus per-shard binding of storage and compute avoids shuffles.
Bitmap indexes accelerate hot‑spot calculations.
Materialized views (monthly/annual) reduce full‑scan time.
Deeply customized Spark‑SQL via DataSourceV2 for push‑down filtering.
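The bitmap-index idea above can be sketched in a few lines: each distinct dimension value maps to a bitmap of row IDs, and a multi-dimension filter reduces to AND-ing bitmaps and counting set bits. This is a minimal illustration only; `BitmapIndex`, `and_count`, and the sample data are hypothetical, not TGMars's real API.

```python
# Minimal sketch of bitmap indexing for hot-spot dimension filters.
# A Python int serves as the bit set (bit i set == row i present).

class BitmapIndex:
    """Maps each distinct dimension value to a bitmap of row IDs."""

    def __init__(self):
        self.bitmaps = {}  # value -> int used as a bitmap

    def add(self, row_id, value):
        self.bitmaps[value] = self.bitmaps.get(value, 0) | (1 << row_id)

    def rows(self, value):
        return self.bitmaps.get(value, 0)

def and_count(first, *rest):
    """Rows satisfying every predicate: AND the bitmaps, count set bits."""
    result = first
    for b in rest:
        result &= b
    return bin(result).count("1")

# Index two dimensions over three rows.
region, channel = BitmapIndex(), BitmapIndex()
for i, (r, c) in enumerate([("CN", "ios"), ("CN", "android"), ("US", "ios")]):
    region.add(i, r)
    channel.add(i, c)

# Count rows matching region=CN AND channel=ios.
print(and_count(region.rows("CN"), channel.rows("ios")))  # 1
```

Production engines use compressed bitmap libraries (e.g., roaring bitmaps) rather than raw integers, but the query shape is the same.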
3.2 Online Profile Analysis – TGFace
Handles massive user‑profile queries (e.g., 50 M new users) using columnar storage and dynamic bitmap indexes.
Workflow: TGMars extracts raw user packs → scheduler → Datanode columnar storage → SQL parser → optimizer → JIT‑compiled DAG execution.
Performance: on 10⁸ records, a 6-dimension drill-down completes in 1.25 s; a 10-dimension pivot takes ~3.4 s.
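The drill-down/pivot operation in that workflow amounts to a group-by over dimension columns in a columnar layout. A toy sketch, assuming made-up column names and data (TGFace's actual storage and JIT execution are far more involved):

```python
# Columnar-storage sketch: columns are parallel lists; a pivot is a
# group-by over chosen dimension columns, aggregating a measure column.
from collections import defaultdict

columns = {
    "channel": ["ios", "android", "ios", "ios"],
    "region":  ["CN", "CN", "US", "CN"],
    "payment": [6, 0, 30, 12],
}

def pivot(cols, dims, measure):
    """Group rows by the dimension columns, summing the measure column."""
    out = defaultdict(int)
    for i in range(len(cols[measure])):
        key = tuple(cols[d][i] for d in dims)
        out[key] += cols[measure][i]
    return dict(out)

print(pivot(columns, ["region"], "payment"))
# {('CN',): 18, ('US',): 30}
```

Scanning only the columns a query touches is what makes the columnar layout fast for wide tables with hundreds of dimensions.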
3.3 Real‑Time Multi‑Dimensional Analysis – TGDruid
Real‑time logs from game servers are ingested via Kafka/Pulsar, processed by Storm/Flink ETL, and fed into Druid.
Druid serves queries from memory: only recent segments (≤2 days) are kept in memory, while older aggregates are persisted to MySQL for reporting.
Configuration‑driven ETL enables task launch within ~5 minutes without code changes.
Optimizations include time‑based partitioning, dimension validation, automatic failure detection, and Prophet‑based real‑time forecasting.
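The configuration-driven ETL above can be pictured as a small dict that is validated and compiled into an ingestion spec, so launching a new task means editing config rather than code. This is a hedged sketch: the field names below are illustrative and do not match Druid's actual ingestion-spec schema.

```python
# Compile a task config (topic, dimensions, metrics) into a
# Druid-style ingestion spec; field names are illustrative only.

def build_ingestion_spec(cfg):
    """Validate a task config dict and produce an ingestion spec dict."""
    required = {"topic", "datasource", "dimensions", "metrics"}
    missing = required - cfg.keys()
    if missing:
        raise ValueError(f"config missing fields: {sorted(missing)}")
    return {
        "source": {"type": "kafka", "topic": cfg["topic"]},
        "datasource": cfg["datasource"],
        "dimensionsSpec": {"dimensions": cfg["dimensions"]},
        "metricsSpec": [{"type": "longSum", "fieldName": m} for m in cfg["metrics"]],
        "granularity": cfg.get("granularity", "HOUR"),
    }

spec = build_ingestion_spec({
    "topic": "game_login_log",
    "datasource": "login_rt",
    "dimensions": ["region", "channel"],
    "metrics": ["login_cnt"],
})
```

Validation up front (the `missing` check, plus the dimension validation the article mentions) is what keeps a five-minute, no-code launch from producing a silently broken task.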
4. Application Scenarios
Typical use cases include:
User segmentation based on activity, payment, or in‑game metrics.
Tracking and profiling churned users to understand behavior before loss.
Real‑time monitoring of key indicators (DAU, revenue, match counts) for newly launched games or events.
Targeted marketing campaigns linked with custom metrics.
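The segmentation scenario above is, at its simplest, a rule over activity and payment metrics. A toy sketch with made-up thresholds and field names (real segmentation would draw on the profile engine's metrics):

```python
# Rule-based user segmentation on 30-day payment and activity;
# thresholds and field names are invented examples.

def segment(user):
    """Assign a user dict to a segment by payment, then activity."""
    if user["pay_30d"] >= 500:
        return "whale"
    if user["pay_30d"] > 0:
        return "payer"
    if user["active_days_30d"] >= 15:
        return "active_free"
    return "casual"

print(segment({"pay_30d": 600, "active_days_30d": 2}))   # whale
print(segment({"pay_30d": 0, "active_days_30d": 20}))    # active_free
```

In practice such rules are combined with the profile engine's bitmap filters so a segment can be materialized as a user pack for targeted campaigns.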
5. Summary and Future Plans
The roadmap includes building the three engines into a broader ecosystem, deepening open-source collaboration, moving toward scientific, prediction-driven analysis and decision-making, and expanding the data-science lab with Jupyter-based experimentation.
Thank you for reading.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.