How 37 Mobile Games Boosted Analytics with StarRocks: A Real‑World Performance Case Study
37 Mobile Games, a leading mobile game publisher, migrated its user‑profile analytics from a Hadoop–Hudi–Kafka–Hive–Flink stack to StarRocks. The move delivered sub‑second query latency on billion‑row tables, simplified operations, reduced storage costs, and enabled real‑time data synchronization. This technical case study details how.
Background and Technical Pain Points
37 Mobile Games operates several popular titles and relies on a data‑driven pipeline for game operation analytics. The main technical challenges were:
Complex user‑dimension queries, such as lifetime value (LTV) calculations, caused long‑running jobs and slow response times.
Real‑time analytics were limited because data cleaning introduced noticeable latency.
High operational cost due to a large number of big‑data components.
Need for second‑level synchronization from MySQL to an OLAP store while guaranteeing consistency and low latency.
Multi‑billion‑row dimension‑table joins performed poorly.
Linear cost growth as data volume increased.
Legacy Data Architecture
The original stack followed a classic real‑time/offline model:
Apache Hadoop – batch processing.
Apache Hudi – data lake storage.
Apache Kafka – messaging layer.
Apache Hive – batch query engine.
Apache Flink – stream processing.
User‑profile data were eventually written to Elasticsearch for low‑latency lookups, with historical data pushed through Kafka to alleviate write pressure. This approach failed to reduce load significantly, and Elasticsearch could not satisfy complex join and aggregation queries.
OLAP Engine Evaluation
Apache Kylin: Pre‑aggregation gave fast lookups but required heavy compute and storage, and its refresh latency did not meet the near‑real‑time requirements.
ClickHouse: Excellent single‑table performance, but scaling to a cluster introduced operational complexity, stability issues, and data‑consistency concerns.
StarRocks: MySQL‑compatible protocol, standard SQL support, simple deployment/monitoring, strong large‑table join performance, and efficient real‑time ingestion.
StarRocks Production Cluster
After benchmark testing, StarRocks satisfied the performance thresholds. The production cluster consists of:
3 Frontend (FE) nodes – each 8 CPU / 32 GB RAM.
5 Backend (BE) nodes – each 32 CPU / 128 GB RAM.
New User‑Profile Architecture with StarRocks
The migration introduced Flink CDC to stream changes from MySQL directly into StarRocks, enabling real‑time synchronization and richer data associations.
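The MySQL‑to‑StarRocks sync described above can be sketched in Flink SQL using the mysql‑cdc source connector and the StarRocks sink connector. All table, column, host, and credential names here are illustrative, not taken from the case study:

```sql
-- Source: capture row changes from MySQL via the mysql-cdc connector.
CREATE TABLE mysql_user_profile (
    uid       BIGINT,
    tag_type  INT,
    tag_value STRING,
    PRIMARY KEY (uid, tag_type) NOT ENFORCED
) WITH (
    'connector'     = 'mysql-cdc',
    'hostname'      = 'mysql-host',
    'port'          = '3306',
    'username'      = 'sync_user',
    'password'      = '***',
    'database-name' = 'game_db',
    'table-name'    = 'user_profile'
);

-- Sink: write into StarRocks through the flink-connector-starrocks sink.
CREATE TABLE sr_user_profile (
    uid       BIGINT,
    tag_type  INT,
    tag_value STRING
) WITH (
    'connector'     = 'starrocks',
    'jdbc-url'      = 'jdbc:mysql://fe-host:9030',
    'load-url'      = 'fe-host:8030',
    'database-name' = 'profile_db',
    'table-name'    = 'user_profile',
    'username'      = 'sync_user',
    'password'      = '***'
);

-- Continuous sync: inserts, updates, and deletes propagate in near real time.
INSERT INTO sr_user_profile
SELECT uid, tag_type, tag_value FROM mysql_user_profile;
```

With a StarRocks Primary Key table as the target, upstream updates and deletes are applied in place rather than appended, which is what makes second‑level consistency feasible.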
Read/Write Performance: Broker Load can ingest billions of Hive rows in ~120 minutes, matching the previous Elasticsearch throughput while supporting multiple import methods.
Federated Queries: External tables allow StarRocks to join Hive and other heterogeneous sources without data movement.
Bitmap‑Based User‑Profile Tables: Complex audience segmentation queries now finish in seconds instead of minutes.
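A Broker Load job of the kind mentioned above reads files directly from HDFS into a StarRocks table. This is a minimal sketch; the label, HDFS path, broker name, and column list are assumptions for illustration:

```sql
-- Bulk-load one Hive partition (ORC files on HDFS) into StarRocks.
LOAD LABEL profile_db.load_user_profile_example
(
    DATA INFILE("hdfs://nameservice/warehouse/user_profile/dt=2022-01-01/*")
    INTO TABLE user_profile
    FORMAT AS "orc"
    (uid, tag_type, tag_value)
)
WITH BROKER "hdfs_broker"
PROPERTIES ("timeout" = "7200");
```

Progress can be tracked with `SHOW LOAD WHERE LABEL = 'load_user_profile_example'`; the job runs asynchronously on the BE nodes, which is what allows billion‑row backfills without blocking queries.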
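The federated‑query setup can be sketched with a Hive external table, assuming the older resource‑based syntax; the metastore URI, database, and table names are illustrative:

```sql
-- Register the Hive metastore as a resource.
CREATE EXTERNAL RESOURCE "hive_res"
PROPERTIES (
    "type" = "hive",
    "hive.metastore.uris" = "thrift://metastore-host:9083"
);

-- Map a Hive table into StarRocks; data stays in the lake.
CREATE EXTERNAL TABLE hive_events (
    uid        BIGINT,
    event_time DATETIME,
    revenue    DOUBLE
) ENGINE = HIVE
PROPERTIES (
    "resource" = "hive_res",
    "database" = "dw",
    "table"    = "events"
);

-- Join Hive data with a native StarRocks table, no data movement required.
SELECT p.tag_value, SUM(e.revenue) AS total_revenue
FROM hive_events e
JOIN user_profile p ON e.uid = p.uid
GROUP BY p.tag_value;
```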
Bitmap Table Design and Tuning
Vertical detail tables from Hive are pushed to StarRocks and aggregated into bitmap tables. An initial sort‑key order of (uid, tag_type) caused excessive I/O, because bitmap construction filters on tag_type, and with uid as the leading key column every query had to scan the full table. Reordering the key to (tag_type, uid) turned those filters into contiguous range scans and cut I/O to less than 20% of the original.
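The design above can be sketched as a StarRocks Aggregate table whose sort key leads with tag_type and whose uid sets are folded into BITMAP columns via BITMAP_UNION. Table and tag names are illustrative, not from the case study:

```sql
-- Bitmap table: one row per (tag_type, tag_value), uids packed into a bitmap.
-- Leading the key with tag_type keeps tag filters to a contiguous range scan.
CREATE TABLE user_tag_bitmap (
    tag_type   INT,
    tag_value  STRING,
    uid_bitmap BITMAP BITMAP_UNION
) ENGINE = OLAP
AGGREGATE KEY (tag_type, tag_value)
DISTRIBUTED BY HASH (tag_type) BUCKETS 16;

-- Build bitmaps from the detail table; to_bitmap() collapses uids per tag.
INSERT INTO user_tag_bitmap
SELECT tag_type, tag_value, to_bitmap(uid)
FROM user_profile;

-- Audience segmentation: count users matching both tags via set intersection.
SELECT bitmap_count(
    bitmap_and(
        (SELECT bitmap_union(uid_bitmap) FROM user_tag_bitmap
         WHERE tag_type = 1 AND tag_value = 'ios'),
        (SELECT bitmap_union(uid_bitmap) FROM user_tag_bitmap
         WHERE tag_type = 2 AND tag_value = 'payer')
    )
) AS audience_size;
```

The segmentation query touches only compressed bitmaps rather than raw uid rows, which is why audience selection drops from minutes to seconds.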
Performance Comparison
Benchmark on a fact table (>200 M rows) with dimension tables (>40 M rows):
ClickHouse join query time ≈ 50 s.
StarRocks join query time < 1 s (join‑push‑down optimization).
Bitmap‑based audience selection in StarRocks ~26× faster than Apache Impala.
Future Outlook
Additional business units plan to replace their existing OLAP engines with StarRocks, extending use cases and improving cluster stability. Planned technical enhancements include:
Reducing memory consumption of primary‑key models.
Supporting more flexible partial‑column updates.
Further bitmap query optimizations.
Improved multi‑tenant resource isolation.
StarRocks
StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.