How iQIYI Boosted Ad Query Performance 400% with StarRocks – A Deep Dive into OLAP Evolution
This article details iQIYI's transition from Impala+Kudu and ClickHouse to StarRocks. It covers the platform's OLAP architecture, performance gains of up to 400% on advertising workloads, the technical comparison around data consistency, lake‑warehouse fusion, and operational scaling, and the step‑by‑step migration process built on a dual‑run platform.
iQIYI OLAP System Overview
iQIYI runs a large‑scale data analysis platform that supports historical reporting (dashboards, show popularity, member operations) and future forecasting (user growth, revenue estimation). The platform uses a multi‑layer OLAP stack.
Architecture
Storage layer: Hive for offline data, Kafka for real‑time streams, Iceberg for near‑real‑time storage. Query layer: multiple engines (Impala, Kudu, Druid, ClickHouse, Spark, Trino) accessed through an in‑house intelligent SQL gateway that abstracts engine details.
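The article does not describe the gateway's routing logic, but a minimal rule‑based sketch conveys the idea. All function names and thresholds below are illustrative assumptions, not iQIYI's implementation; only the engine names come from the article.

```python
# Hypothetical sketch of a rule-based SQL gateway router. It picks a
# query engine from simple workload heuristics so that callers never
# need to know which engine serves their query.

def route_query(latency_sensitive: bool, scan_bytes: int) -> str:
    """Choose an engine for a query based on latency needs and scan size."""
    TB = 1024 ** 4
    if scan_bytes > 10 * TB:
        return "Spark"       # very large batch scans: lakehouse batch engine
    if not latency_sensitive:
        return "Trino"       # ad-hoc analysis over lake data
    return "StarRocks"       # low-latency, high-concurrency dashboards
```

A real gateway would also consider engine health, quota, and table location, but the principle is the same: the routing decision, not the user, selects the engine.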
Engine evolution
Traditional warehouse (compute‑storage integrated): high concurrency, low latency. Progression: MySQL/Elasticsearch → Impala+Kudu → Druid (time‑series) → ClickHouse (extreme performance).
Lakehouse (compute‑storage separated): ad‑hoc analysis with emphasis on scale and cost. Progression: Hive → Spark → Trino → StarRocks (replacing ClickHouse and Druid, with support for detailed queries on massive data).
StarRocks vs. ClickHouse
Advertising workloads: replacing Impala+Kudu with StarRocks increased interface throughput by 400% and cut P90 latency by 4.6×. In the “Magic Mirror” analytics platform, StarRocks achieved a 33× P50 and 15× P90 speedup over Spark, saving roughly 4.6 person‑days of analysis per day.
Data consistency
StarRocks’ Flink connector provides exactly‑once semantics, avoiding the duplicate records that can occur with ClickHouse’s at‑least‑once connector. StarRocks also supports atomic partition‑level replacement, so online (real‑time) partitions can be overwritten with corrected offline data in a single atomic step.
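The atomic refresh relies on StarRocks' temporary‑partition feature: load corrected data into a temporary partition, then swap it in atomically so readers never see a half‑refreshed state. The sketch below only composes the SQL statements (which would run over StarRocks' MySQL‑compatible protocol); the table name, partition names, and the `_offline_corrected` source table are hypothetical.

```python
# Sketch of StarRocks' atomic partition replacement via temporary
# partitions. Returns the three SQL statements for one refresh cycle;
# executing them against a live cluster is out of scope here.

def build_partition_swap(table: str, partition: str, temp_partition: str,
                         value_range: str) -> list[str]:
    """Compose SQL for an atomic partition refresh (create, load, swap)."""
    return [
        # 1. Create a temporary partition covering the same key range.
        f"ALTER TABLE {table} ADD TEMPORARY PARTITION {temp_partition} "
        f"VALUES {value_range}",
        # 2. Load the offline-corrected data into the temporary partition.
        f"INSERT INTO {table} TEMPORARY PARTITION ({temp_partition}) "
        f"SELECT * FROM {table}_offline_corrected",
        # 3. Atomically replace the online partition with the temporary one.
        f"ALTER TABLE {table} REPLACE PARTITION ({partition}) "
        f"WITH TEMPORARY PARTITION ({temp_partition})",
    ]
```

Because the final `REPLACE PARTITION` step is atomic, concurrent queries see either the old data or the fully corrected data, never a mix.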
Lake‑warehouse fusion
StarRocks can query Hive external tables with performance comparable to native StarRocks tables, sometimes better on large‑scale queries. This enables a simple HA strategy: a failed StarRocks cluster can be replaced by launching an elastic compute cluster that directly accesses Hive.
Operational complexity – scaling
ClickHouse requires manual rebalancing after node addition and cannot shrink clusters easily. StarRocks provides automatic, seamless scaling and rebalancing, automatically replicating data when nodes fail or are removed.
Migration workflow (Pilot dual‑run platform)
1. Select SQL statements for dual execution (by filter or manual selection).
2. Configure the experiment: define a control group (Spark) and a test group (StarRocks) and generate sub‑tasks.
3. Redirect writes to temporary tables, execute both groups, and compare results row by row.
4. Generate a report: promote SQLs that pass verification; investigate and tune those that fail.
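The row‑by‑row comparison step can be sketched as follows. This is a minimal, assumption‑laden version: real engine clients would supply the result sets, and the float tolerance and order‑insensitive sort are illustrative choices, not iQIYI's actual rules.

```python
# Minimal sketch of the dual-run verification step: compare the control
# group's (Spark) result set against the test group's (StarRocks) row
# by row. Rows are sorted first so the check is order-insensitive, and
# floats are compared with a tolerance to absorb engine-level rounding.

def compare_results(control_rows, test_rows, float_tol=1e-9):
    """Return (passed, diffs) for two result sets given as lists of tuples."""
    if len(control_rows) != len(test_rows):
        return False, [f"row count: {len(control_rows)} vs {len(test_rows)}"]
    diffs = []
    for i, (a, b) in enumerate(zip(sorted(control_rows), sorted(test_rows))):
        for col, (x, y) in enumerate(zip(a, b)):
            if isinstance(x, float) and isinstance(y, float):
                if abs(x - y) > float_tol:
                    diffs.append(f"row {i} col {col}: {x} != {y}")
            elif x != y:
                diffs.append(f"row {i} col {col}: {x} != {y}")
    return not diffs, diffs
```

A query is promoted to production only when `passed` is true; otherwise the `diffs` list feeds the investigation step.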
Production results
Advertising interfaces: 400% performance boost and P90 latency improved 4.6×.
All switched workloads: average query speed increased 33× (P50) and 15× (P90).
Data‑consistency issues dropped from 19% to 0% after fixing 13 cases.
StarRocks adoption reached 50%, with a 67% success rate among switched workloads.
Future plans
Upgrade to StarRocks 3.3, evaluate materialized‑view acceleration, and introduce tiered storage (archiving older data to HDFS).
Replace ClickHouse and Druid within the next year to lower operational overhead.
Expand dual‑run rollout, close remaining UDF compatibility gaps, and continue performance improvements via materialized views and cache optimizations.
StarRocks
StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.