How StarRocks Boosted MaFengWo’s OLAP Performance by 4×

MaFengWo’s data platform moved its Kylin‑centred OLAP workloads to StarRocks, with the remaining Presto and Druid workloads planned to follow. The team redesigned the analysis layer of its four‑layer architecture, unified metadata, and optimized single‑table, multi‑table, and precise‑deduplication queries. The result: queries roughly 4× faster, about 87% less storage, and lower operational complexity.

StarRocks

May 19, 2022

Background

MaFengWo’s data platform uses a four‑layer architecture (storage, compute, analysis, application). The analysis layer originally combined multiple engines: Presto for ad‑hoc queries, Apache Kylin for multi‑dimensional and fixed BI reports, Apache Druid for real‑time analysis, and Hive as a fallback. Kylin handled ~80% of daily queries, but growing data volume and query complexity exposed several issues.

Limitations of Apache Kylin

Pre‑computation is mandatory: any change to the source data forces a costly cube rebuild, which causes downtime.

Performance was unstable: Kylin stores its cubes in HBase, and a query whose filter columns did not follow the RowKey order could not use the prefix index, so it fell back to a full‑table scan with high latency.

High dimensionality caused cube explosion and excessive storage consumption.

Maintaining four separate engines increased operational overhead, required different SQL dialects, and created data‑consistency challenges.
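
The prefix‑index limitation above can be illustrated with a small Python sketch. It is not Kylin or HBase code: the table, the column names, and both functions are hypothetical, and a sorted list stands in for storage ordered by a composite key. It only shows why a filter on the leading key column touches few rows while a filter on a later column reads everything.

```python
from bisect import bisect_left, bisect_right

# Rows kept sorted by the composite key (city, dt), analogous to a RowKey
# or Sort Key. All names and values are made up for illustration.
ROWS = sorted([
    ("beijing", "2022-05-01", 10),
    ("beijing", "2022-05-02", 12),
    ("chengdu", "2022-05-01", 7),
    ("chengdu", "2022-05-02", 9),
    ("shanghai", "2022-05-01", 15),
])


def scan_by_leading_column(rows, city):
    """Filter on the first key column: binary search finds the matching range."""
    # A real engine keeps this index precomputed; it is rebuilt here for brevity.
    keys = [r[0] for r in rows]
    lo, hi = bisect_left(keys, city), bisect_right(keys, city)
    return rows[lo:hi], hi - lo  # (matching rows, rows touched)


def scan_by_second_column(rows, dt):
    """Filter on a non-leading column: the sort order does not help, so every row is read."""
    matches = [r for r in rows if r[1] == dt]
    return matches, len(rows)  # (matching rows, rows touched)
```

Calling `scan_by_leading_column(ROWS, "chengdu")` touches only the two matching rows, whereas `scan_by_second_column(ROWS, "2022-05-01")` has to read all five.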

Evaluation and Selection of StarRocks

Several OLAP products were benchmarked. StarRocks offered MySQL‑compatible syntax, strong support for both fixed and flexible queries, and lower maintenance compared with ClickHouse (which struggled with multi‑table joins and scaling). The chosen StarRocks version is 1.19.5.

Migration and Optimization Cases

Single‑Table Aggregation

The team created a StarRocks aggregation model whose fact‑table schema matches the Kylin cube, and aligned its Sort Key with Kylin’s RowKey. After migration, average query latency dropped dramatically and became stable.
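
As a rough illustration of what an aggregation model does, the Python sketch below merges rows that share the same key columns and sums their metric columns, then returns them in key order. It mimics the semantics only; the function name, the column names, and the sample data are assumptions, not the team’s actual schema.

```python
from collections import defaultdict


def aggregate_rows(rows, key_cols, sum_cols):
    """Merge rows with identical key columns, summing the metric columns.

    Mirrors the idea of an aggregation model: one stored row per distinct
    key, with metrics pre-aggregated as data arrives.
    """
    merged = defaultdict(lambda: [0] * len(sum_cols))
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        for i, col in enumerate(sum_cols):
            merged[key][i] += row[col]
    # Emit rows in key order, analogous to storage sorted by the Sort Key.
    return [
        {**dict(zip(key_cols, key)), **dict(zip(sum_cols, sums))}
        for key, sums in sorted(merged.items())
    ]


# Hypothetical order facts: two rows share the key (dt, city).
facts = [
    {"dt": "2022-05-01", "city": "beijing", "orders": 1, "gmv": 100},
    {"dt": "2022-05-01", "city": "beijing", "orders": 2, "gmv": 50},
    {"dt": "2022-05-01", "city": "chengdu", "orders": 1, "gmv": 30},
]
stored = aggregate_rows(facts, ["dt", "city"], ["orders", "gmv"])
```

Here three input rows collapse into two stored rows, which is why queries that group by the key columns read far less data than they would from the raw facts.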

Multi‑Table Join

For a transaction model with more than 60 dimensions, the team built a StarRocks aggregation model for the fact table and detail (Duplicate Key) models for the dimension tables. Matching the Sort Key to Kylin’s RowKey reduced join latency significantly.

Precise Deduplication for User‑Behavior Analysis

Bitmap columns were used in StarRocks to replace Kylin’s pre‑computed bitmap and Presto’s self‑join approach. The new design achieved sub‑second query times for 7‑day retention analyses.
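
The idea behind bitmap‑based retention can be sketched in Python, with a `set` standing in for each day’s bitmap of user IDs. This is an illustration under assumptions: the event format and both function names are hypothetical, and the article does not show the actual queries. In StarRocks the equivalent set operations are available as bitmap functions such as `bitmap_and` and `bitmap_count`.

```python
def build_daily_bitmaps(events):
    """Group user IDs by day; each set stands in for one bitmap value.

    Adding an ID twice has no effect, which is what makes the count exact.
    """
    daily = {}
    for day, user_id in events:
        daily.setdefault(day, set()).add(user_id)
    return daily


def retention(daily, base_day, later_day):
    """Share of base-day users who were also active on the later day."""
    base = daily.get(base_day, set())
    if not base:
        return 0.0
    retained = base & daily.get(later_day, set())  # bitmap intersection
    return len(retained) / len(base)


# Hypothetical events as (day, user_id); user 1 appears twice on day 0.
events = [("d0", 1), ("d0", 2), ("d0", 3), ("d0", 1),
          ("d7", 2), ("d7", 3), ("d7", 9)]
daily = build_daily_bitmaps(events)
rate = retention(daily, "d0", "d7")
```

Because each day is reduced to one compact set, retention becomes an intersection plus a count rather than a self‑join over raw events.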

Integration with the Existing Data Platform

StarRocks metadata was added to the central metadata system that already manages Hive, HBase, and Kafka tables. This enables intelligent routing: the query service selects the optimal engine based on metric definitions.
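
A minimal sketch of metadata‑driven routing might look like the following. The article does not describe the real routing rules, so the `Metric` fields, the rule order, and the engine names returned here are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """Hypothetical slice of a metric definition held in the metadata system."""
    name: str
    realtime: bool       # needs streaming freshness
    in_starrocks: bool   # a StarRocks table is registered for this metric


def route(metric: Metric) -> str:
    """Pick an engine for a metric. The rule order is an assumption."""
    if metric.in_starrocks:
        return "starrocks"   # migrated metrics go to StarRocks first
    if metric.realtime:
        return "druid"       # real-time metrics not yet migrated
    return "presto"          # everything else stays on ad-hoc querying


engines = [
    route(Metric("daily_gmv", realtime=False, in_starrocks=True)),
    route(Metric("live_clicks", realtime=True, in_starrocks=False)),
    route(Metric("adhoc_export", realtime=False, in_starrocks=False)),
]
```

Keeping the decision in one function that reads only metadata means a metric can be moved between engines by updating its registration, without changing callers.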

Data ingestion uses a Hive‑to‑StarRocks broker‑load pipeline. The pipeline supports custom field selection, type conversion, and pre‑processing before loading. Load jobs are scheduled during off‑peak hours to avoid CPU contention with online queries.

Operations, Monitoring, and Auditing

CPU spikes on BE nodes were observed when concurrent broker loads and online queries ran together. Since StarRocks 1.19.5 does not support resource isolation, the team staggers load tasks and online traffic.
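
One simple way to stagger loads is to gate each load job on an off‑peak window. The sketch below assumes a 01:00 to 06:00 window, which is a made‑up value (the article does not state the actual schedule), and handles windows that wrap past midnight.

```python
from datetime import time

# Assumed off-peak window; the real schedule is not given in the article.
OFF_PEAK_START = time(1, 0)
OFF_PEAK_END = time(6, 0)


def in_off_peak(now: time, start: time = OFF_PEAK_START,
                end: time = OFF_PEAK_END) -> bool:
    """Return True if `now` falls inside the half-open window [start, end)."""
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 22:00 to 04:00.
    return now >= start or now < end
```

A scheduler would call `in_off_peak(datetime.now().time())` before submitting a load and defer the job otherwise.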

Monitoring stack: Prometheus + Grafana tracks core metrics (e.g., BE CPU usage) and triggers alerts when thresholds are exceeded.

Audit logs are collected with Filebeat, stored in StarRocks, and used for slow‑SQL analysis and historical query profiling. A typical audit query, here listing statements that ran longer than 5 seconds (assuming exec_time is recorded in milliseconds):

SELECT query_id, exec_time, cpu_time FROM starrocks_audit WHERE exec_time > 5000 ORDER BY exec_time DESC;

These logs also feed a SQL audit dashboard that reports P95/P99 latency, error trends, and resource‑heavy statements.
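
The dashboard figures can be derived from audit records along these lines. This is a sketch, not the team’s implementation: the record layout and the `exec_time_ms` field are hypothetical, and the percentile uses the nearest‑rank definition, which is one of several common conventions.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def slow_queries(records, threshold_ms=5000):
    """Return records slower than the threshold, slowest first."""
    slow = [r for r in records if r["exec_time_ms"] > threshold_ms]
    return sorted(slow, key=lambda r: r["exec_time_ms"], reverse=True)


# Hypothetical audit records; the field names are assumptions.
records = [
    {"query_id": "q1", "exec_time_ms": 120},
    {"query_id": "q2", "exec_time_ms": 7400},
    {"query_id": "q3", "exec_time_ms": 5100},
    {"query_id": "q4", "exec_time_ms": 900},
]
latencies = [r["exec_time_ms"] for r in records]
p95 = percentile(latencies, 95)
slowest_first = [r["query_id"] for r in slow_queries(records)]
```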

Benefits and Outcomes

Query speed: P95 latency stabilized around 5 seconds, a ~4× improvement over the previous stack.

Operational cost: The simplified stack reduced maintenance effort compared with Kylin’s Hadoop‑heavy ecosystem.

Storage savings: The storage footprint fell from ~400 TB in HBase to ~50 TB after migration, a reduction of about 87%.

Development efficiency: Modeling no longer requires extensive RowKey configuration or full cube rebuilds for dimension changes.

Future Plans

Isolate resources across tenant clusters and monitor upcoming StarRocks releases that add resource‑isolation features.

Explore external tables to simplify data ingestion and reduce ETL pipeline complexity.

Migrate additional Presto‑based ad‑hoc workloads to StarRocks for better interactive performance.

Extend StarRocks to cover real‑time scenarios currently handled by Druid, aiming for a unified OLAP engine.
