How Cisco Migrated from Pinot to StarRocks and Boosted Query Performance by Up to 70%

This article details Cisco Webex's migration from a complex Pinot‑Trino OLAP stack to StarRocks, covering the challenges of the legacy system, the step‑by‑step migration process—including storage, compute, and SQL dialect transformation—and the resulting performance gains, cost reductions, and operational improvements.

Background and Motivation

Cisco Webex relied on an intricate OLAP stack built around Apache Pinot for low‑latency real‑time queries and Trino for complex joins and sub‑queries. The stack suffered from high maintenance costs, limited functionality in Pinot (no multi‑table joins, sub‑queries, or materialized views), poor data freshness, and a fragmented user experience.

Key Challenges of the Existing Stack

High operational overhead and complex monitoring due to dual engines (Pinot + Trino).

Pinot lacked support for joins, sub‑queries, and materialized views, forcing reliance on Trino.

Data back‑fill was difficult because Pinot did not support partitioning.

Limited DML capabilities (no INSERT/UPDATE/DELETE) made data correction cumbersome.

Inconsistent resource isolation and tenant management.

Migration Goals

Achieve better query performance and support complex SQL (joins, sub‑queries, materialized views).

Provide robust handling of semi‑structured data (Flat JSON, Variant).

Reduce storage costs and improve disk utilization.

Unify query experience across teams.

Enable automatic scaling and fine‑grained resource isolation.

Migration Path and Practices

The team adopted a two‑pronged approach: moving to a store‑compute‑separated architecture and, where needed, a store‑compute‑integrated deployment, both powered by StarRocks.

Store‑Compute Separation

StarRocks was deployed on Kubernetes with Horizontal Pod Autoscaler (HPA) monitoring CPU and memory, allowing dynamic scaling of compute pods. Resource isolation was achieved using Rack labels to group nodes per business unit, ensuring that heavy workloads in one service did not impact others.
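The scaling behaviour described above can be approximated with the replica formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with the largest proposal winning when several metrics are watched. The sketch below is a simplified model of that rule, not Cisco's configuration. The metric values, targets, replica bounds, and the function names are hypothetical, and the 0.1 tolerance is the Kubernetes default.

```python
import math

def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count proposed by one metric, following the documented HPA formula."""
    ratio = current_value / target_value
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

def scale_decision(current_replicas, metrics, min_replicas, max_replicas):
    """Combine several metrics: take the largest proposal, then clamp to the bounds.

    metrics maps a metric name to (current utilization %, target utilization %).
    """
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics.values()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))

# Hypothetical compute-pod group: CPU is hot, memory is comfortable.
example = scale_decision(
    current_replicas=4,
    metrics={"cpu": (90, 60), "memory": (50, 70)},
    min_replicas=3,
    max_replicas=10,
)
```

Because the larger proposal wins, a CPU spike scales the compute pods out even while memory alone would suggest scaling in.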

Store‑Compute Integration

For latency‑critical workloads, StarRocks' native MPP engine provided high‑performance query execution without the need for an external compute layer.

SQL Dialect Transformation

A custom Pinot Dialect Transformer automatically rewrote existing Pinot/Trino SQL into StarRocks syntax, covering over 70% of statements without manual changes. The transformer adjusts function names and argument orders, and it is designed so that further rewrite rules can be added later.
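To make the idea of rule-based rewriting concrete, here is a minimal sketch of a transformer that renames functions and reorders arguments. It is not Cisco's transformer: the names `RENAMES`, `ARG_REORDER`, and `transform` are hypothetical, and the two rules are placeholders chosen for illustration that should be checked against each engine's function reference before use. The scanner is deliberately naive and does not protect function-like text inside string literals.

```python
import re

# Placeholder rules for illustration only (assumptions, not a verified mapping).
RENAMES = {"datetrunc": "date_trunc"}      # source name -> target name
ARG_REORDER = {"date_diff": (0, 2, 1)}     # function -> new order of argument positions

CALL = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")

def find_close(sql, open_idx):
    """Index of the ')' matching the '(' at open_idx, skipping quoted text."""
    depth, quote = 0, None
    for i in range(open_idx, len(sql)):
        ch = sql[i]
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced parentheses")

def split_args(text):
    """Split an argument list on top-level commas only."""
    args, cur, depth, quote = [], [], 0, None
    for ch in text:
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append("".join(cur).strip())
            cur = []
            continue
        cur.append(ch)
    args.append("".join(cur).strip())
    return args

def transform(sql):
    """Apply rename and reorder rules to every function call, including nested ones."""
    out, pos = [], 0
    while True:
        m = CALL.search(sql, pos)
        if not m:
            out.append(sql[pos:])
            return "".join(out)
        name, open_idx = m.group(1), m.end() - 1
        close_idx = find_close(sql, open_idx)
        inner = sql[open_idx + 1:close_idx]
        key = name.lower()
        out.append(sql[pos:m.start()])
        if key in RENAMES or key in ARG_REORDER:
            args = [transform(a) for a in split_args(inner)]
            order = ARG_REORDER.get(key)
            if order and len(args) == len(order):
                args = [args[i] for i in order]
            out.append(RENAMES.get(key, name) + "(" + ", ".join(args) + ")")
        else:
            # Unknown call or keyword such as IN (...): keep it, rewrite only its contents.
            out.append(sql[m.start():open_idx + 1] + transform(inner) + ")")
        pos = close_idx + 1

rewritten = transform(
    "SELECT DATETRUNC('day', ts), date_diff('day', start_ts, end_ts) FROM t WHERE x IN (1, 2)"
)
```

Keeping the rules in plain dictionaries is one way to get the extensibility the article mentions: covering a new function means adding an entry rather than changing the scanner. A production transformer would more likely work on a parsed syntax tree than on raw text.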

Semi‑Structured Data Handling

StarRocks introduced Flat JSON and Variant data types. Flat JSON reduced disk usage by ~80% and, after table‑level configuration of sparsity and null factors, improved query latency. Variant provides efficient storage and query of dynamic JSON schemas, with metadata and value separation for fast access.
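The role of a null factor can be illustrated with a small conceptual model: fields that are present in most rows are pulled out into their own columns, while rare fields stay in a leftover JSON blob. This is only a sketch of the general idea, not StarRocks' implementation. The function names, the 0.3 threshold, and the sample rows are assumptions made for the example.

```python
import json
from collections import Counter

def choose_flat_columns(rows, null_factor=0.3):
    """Pick keys whose share of missing/NULL values is at most null_factor."""
    total = len(rows)
    present = Counter()
    for row in rows:
        for key, value in row.items():
            if value is not None:
                present[key] += 1
    return sorted(k for k, n in present.items() if 1 - n / total <= null_factor)

def flatten(rows, flat_keys):
    """Split rows into per-key columns plus a JSON remainder for everything else."""
    columns = {k: [row.get(k) for row in rows] for k in flat_keys}
    remainder = [
        json.dumps({k: v for k, v in row.items() if k not in flat_keys}, sort_keys=True)
        for row in rows
    ]
    return columns, remainder

# Hypothetical event payloads: "user" is always set, "region" mostly, "debug" rarely.
rows = [
    {"user": "a", "region": "us", "debug": "x"},
    {"user": "b", "region": "eu"},
    {"user": "c", "region": None},
    {"user": "d", "region": "us"},
]
flat_keys = choose_flat_columns(rows)
columns, remainder = flatten(rows, flat_keys)
```

In this model a query that reads only `user` or `region` scans a dense column instead of parsing every JSON document, which is the intuition behind the latency improvement; tuning the threshold trades the number of extracted columns against how sparse they are allowed to be.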

Performance Improvements

~70% of queries run faster on StarRocks than on Trino.

Average query latency improved by ~50%; on cache‑hit runs the gain was smaller, at up to 21%.

Materialized view usage yielded >10× performance gains.

Flat JSON reduced storage footprint by ~80% and cut query latency by a similar margin after bucket‑key and sort‑key optimizations.

Operational Enhancements

Unified permission management via Apache Ranger and LDAP, integrated with UDP Auth.

Automatic backup & restore, and Sync Tool for cross‑cluster data migration.

Enhanced indexing: new tokenize function for debugging inverted indexes, and extended match operators (MATCH_ALL, MATCH_ANY) with push‑down support.
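The difference between the two match operators is easiest to see on a toy inverted index. The sketch below models the semantics only: match-any returns rows containing at least one query token, match-all returns rows containing every query token. The class, the lowercase alphanumeric tokenizer, and the sample log lines are hypothetical and do not reflect StarRocks internals or its tokenize function's actual behaviour.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Assumed tokenizer: lowercase, split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())

class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)   # token -> set of row ids

    def add(self, row_id, text):
        for token in tokenize(text):
            self.postings[token].add(row_id)

    def match_any(self, query):
        """Rows containing at least one of the query tokens."""
        ids = set()
        for token in tokenize(query):
            ids |= self.postings.get(token, set())
        return ids

    def match_all(self, query):
        """Rows containing every query token."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        return set.intersection(*(self.postings.get(t, set()) for t in tokens))

index = InvertedIndex()
index.add(1, "Meeting join failed: timeout")
index.add(2, "Meeting join succeeded")
index.add(3, "Media timeout on reconnect")
```

Inspecting the tokenizer output directly, for example `tokenize("join-failed")`, is the kind of check a tokenize function enables when a match unexpectedly returns no rows.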

Future Roadmap

Query Insight: richer profiling and automated optimization suggestions.

Enhanced semi‑structured support: continue improving Variant shredding and Flat JSON indexing.

Text search optimization: expand inverted index capabilities, explore new engines (Tantivy, native StarRocks search).

Figures: Original OLAP Stack Overview; Challenges of Pinot; Desired OLAP Engine Features; Why Upgrade to StarRocks
Written by

StarRocks

StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.
