
How Hypergryph Built a High‑Performance Real‑Time Analytics Platform with StarRocks

This case study details how Hypergryph used Alibaba Cloud EMR Serverless StarRocks, Flink, and Kafka to replace a ClickHouse data warehouse with a high‑performance, elastic, and easy‑to‑operate real‑time analytics platform. For its gaming business, the new platform improved query speed, stability, and operational efficiency while lowering cost.

Alibaba Cloud Big Data AI Platform

Customer Background and Business Challenges

Hypergryph, founded in 2017 in Shanghai, is a leading Chinese game developer and publisher known for titles such as Arknights, Popucom, Ex Astris, and Arknights: Endfield. The company builds its data platform on Alibaba Cloud to support game operations, community ecosystems, and user behavior analysis, and both its data volume and its real‑time requirements keep growing.

Business Features

Real‑time business: log analysis, ad attribution, community operations, and auditing.

Account log analysis – queries on user registrations, logins, and order details.

Ad attribution – tracing new and returning users back to specific ads.

Community operations – real‑time traffic and new/active user statistics.

Audit – analysis of black‑market activity and account bans.
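
As a rough illustration of the ad attribution workload, the sketch below applies a last‑touch rule: a registration is credited to the most recent ad click by the same user inside a lookback window. The source does not say which attribution model Hypergryph uses, so the rule, the 7‑day window, and every name here (`AdClick`, `Registration`, `attribute`) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lookback window; the article does not specify one.
ATTRIBUTION_WINDOW_S = 7 * 24 * 3600


@dataclass(frozen=True)
class AdClick:
    user_id: str
    ad_id: str
    ts: int  # event time, epoch seconds


@dataclass(frozen=True)
class Registration:
    user_id: str
    ts: int  # event time, epoch seconds


def attribute(reg: Registration, clicks: List[AdClick]) -> Optional[str]:
    """Return the ad_id of the latest click by the same user that happened
    at or before the registration and inside the lookback window."""
    candidates = [
        c for c in clicks
        if c.user_id == reg.user_id
        and 0 <= reg.ts - c.ts <= ATTRIBUTION_WINDOW_S
    ]
    if not candidates:
        return None  # treated as organic / unattributed
    return max(candidates, key=lambda c: c.ts).ad_id


clicks = [
    AdClick("u1", "ad_A", 1_000),
    AdClick("u1", "ad_B", 5_000),
    AdClick("u2", "ad_A", 2_000),
]
print(attribute(Registration("u1", 6_000), clicks))  # ad_B
print(attribute(Registration("u3", 6_000), clicks))  # None
```

In a warehouse this logic would normally be a join between a click table and a registration table, which is one reason join performance matters for this workload.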

Existing Architecture Pain Points

High operational complexity: dynamic scaling is cumbersome, and cluster stability is heavily affected by load fluctuations.

Insufficient ingestion performance: throughput for high‑frequency real‑time writes is limited and cannot keep up with high‑QPS demand.

Data consistency risk: distributed tables lack transactional guarantees, so queries may return inconsistent results when nodes lag behind one another.

Limited compute model: the Scatter‑Gather architecture does not support complex queries such as those that require a Shuffle Join.

Poor metadata stability: metadata maintained in ZooKeeper easily causes service jitter under heavy load.
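
To make the Scatter‑Gather limitation concrete: in that model each node works only on its own shard and a coordinator merges the partial results, so two large tables can be joined only if matching keys already sit on the same node. A Shuffle Join first redistributes both inputs by join key. The sketch below is a toy in‑memory model of that idea, not the internals of ClickHouse or StarRocks; the table contents and function names are made up for illustration.

```python
import zlib
from collections import defaultdict


def node_for(value, num_nodes):
    """Deterministically map a join-key value to a node id."""
    return zlib.crc32(str(value).encode()) % num_nodes


def shuffle(rows, key, num_nodes):
    """Redistribute rows so that equal join keys land on the same node."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[node_for(row[key], num_nodes)].append(row)
    return partitions


def shuffle_join(left, right, key, num_nodes=3):
    """Inner join: shuffle both sides by key, then hash-join per node."""
    left_parts = shuffle(left, key, num_nodes)
    right_parts = shuffle(right, key, num_nodes)
    result = []
    for node in range(num_nodes):
        # Build a hash table on this node's slice of the right side.
        build = defaultdict(list)
        for r in right_parts.get(node, []):
            build[r[key]].append(r)
        # Probe it with this node's slice of the left side.
        for l in left_parts.get(node, []):
            for r in build.get(l[key], []):
                result.append({**l, **r})
    return result


logins = [
    {"user_id": 1, "login_ts": 100},
    {"user_id": 2, "login_ts": 110},
    {"user_id": 3, "login_ts": 120},
]
orders = [
    {"user_id": 1, "amount": 30},
    {"user_id": 1, "amount": 12},
    {"user_id": 3, "amount": 5},
    {"user_id": 4, "amount": 9},
]
joined = shuffle_join(logins, orders, "user_id")
print(sorted((r["user_id"], r["amount"]) for r in joined))
```

Because every row with a given `user_id` is routed to one node, each node can join its slice independently, and no single coordinator has to hold either table in full.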

Technical Solution Design

Solution Goals

Build a high‑performance, highly elastic, easy‑to‑operate real‑time analysis platform that satisfies:

Real‑time: millisecond‑level query response and second‑level data ingestion.

Elastic scaling: dynamically adapt to traffic spikes such as game launches and event peaks.

Stability: eliminate cluster load jitter and data consistency risks.

Compatibility: seamless integration with existing toolchains and development habits.

Architecture Design

Overall Architecture Diagram

Real‑time Data Warehouse Architecture

OLTP source: MySQL and other business databases generate row‑level changes.

Extract (CDC): Debezium running on Kafka Connect captures the binlog, converts it into event streams, and writes them to Kafka for buffering and decoupling.

Transform (real‑time compute): Flink reads from Kafka, performs cleaning, joins, and aggregations, and outputs fact and dimension streams.

Load & Query (StarRocks): StarRocks provides columnar storage and high‑concurrency OLAP queries, directly serving BI, reports, self‑service query tools, and APIs.
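
The hand‑off between the CDC and load stages can be sketched as a small mapping function. It assumes Debezium's JSON change‑event envelope (`before`, `after`, `op`, `ts_ms`) and a StarRocks Primary Key table, which accepts an `__op` field on load (0 for upsert, 1 for delete). The article does not describe Hypergryph's actual table models or job code, so the `orders`‑style columns and the `event_ts_ms` field are hypothetical, and in practice this mapping would typically be handled by the Flink job or its sink connector rather than hand‑written.

```python
import json
from typing import Optional

# StarRocks Primary Key tables read `__op` on load: 0 = upsert, 1 = delete.
UPSERT, DELETE = 0, 1


def to_starrocks_row(message: str) -> Optional[dict]:
    """Map one Debezium change event (JSON string) to a loadable row."""
    event = json.loads(message)
    if not event:
        return None  # e.g. a tombstone record with a null value
    # With schemas enabled, Debezium wraps the envelope in "payload".
    payload = event.get("payload", event)
    op = payload.get("op")
    if op in ("c", "u", "r"):  # create, update, snapshot read
        row = dict(payload["after"])
        row["__op"] = UPSERT
    elif op == "d":  # delete: only the "before" image is populated
        row = dict(payload["before"])
        row["__op"] = DELETE
    else:
        return None
    row["event_ts_ms"] = payload["ts_ms"]  # hypothetical audit column
    return row


update_event = json.dumps({
    "before": {"order_id": 7, "user_id": 42, "amount": 10},
    "after": {"order_id": 7, "user_id": 42, "amount": 15},
    "op": "u",
    "ts_ms": 1700000000000,
})
print(to_starrocks_row(update_event))
```

Treating inserts, updates, and snapshot reads uniformly as upserts keeps the load path idempotent, so replaying a Kafka partition after a failure does not duplicate rows.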

Migration Results and Value

Technical Impact

Performance boost: core query latency fell by more than 30%; complex ad‑attribution analysis dropped from minutes to seconds; peak QPS capacity increased fivefold, supporting concurrent requests at the million level.

Stability improvement: cluster load jitter fell by 40%; a 99.99% SLA was achieved, and MTTR was shortened to minutes.

Operational efficiency: scaling during new server launches succeeded 100% of the time; automated monitoring cut manual interventions by 70%.
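
For a sense of scale, the 99.99% availability figure can be converted into a downtime budget. The article does not state the measurement window, so the 30‑day and 365‑day periods below are assumptions chosen only to illustrate the arithmetic.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float) -> float:
    """Minutes of downtime allowed in a period at a given availability."""
    return (1 - availability_pct / 100) * period_days * 24 * 60


print(round(downtime_budget_minutes(99.99, 30), 2))   # 4.32
print(round(downtime_budget_minutes(99.99, 365), 2))  # 52.56
```

With only about four minutes of budget per month, an MTTR measured in minutes leaves little room for more than one incident in that window.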

Financial Impact

Cost optimization: hardware cost is 22% lower than with ClickHouse under the same load, and the serverless pay‑as‑you‑go model avoids paying for idle resources.

Business Impact

Development productivity: the MySQL‑compatible protocol reduces integration effort; the built‑in function library covers 90% of scenarios, and UDF development is 50% faster.

Ecosystem adaptability: an active open‑source community delivers bug fixes and new features on shorter cycles than ClickHouse.

Future Plans

Compute‑storage separation architecture: explore migrating to instances that separate compute from storage, to lower the cost of storing hot and cold data.

Enhanced permission management: introduce Ranger for fine‑grained access control to meet compliance requirements.

Multimodal analysis: gradually migrate existing Elasticsearch workloads so that full‑text search and structured queries can be served together.

Conclusion

By working closely with Alibaba Cloud and adopting EMR Serverless StarRocks, Hypergryph built a high‑performance, elastic, and easy‑to‑operate real‑time analytics platform that underpins fine‑grained game operations and user insights. Ongoing technical exploration will continue to unlock the value of real‑time data and keep the company competitive in the gaming industry.

References:

https://help.aliyun.com/zh/emr/emr-serverless-starrocks/user-guide/migrate-data-from-clickhouse-to-serverless-starrocks

https://help.aliyun.com/zh/emr/emr-serverless-starrocks/use-cases/data-warehouse-solution-near-real-time-analysis-of-data-at-the-minute-level

https://help.aliyun.com/zh/emr/emr-serverless-starrocks/getting-started/use-compute-storage-separation-instances
