
How Ctrip Cut Query Latency by 85% with StarRocks’ Compute‑Storage Separation

Ctrip migrated its massive User Behavior Tracking system from ClickHouse to a compute‑storage separated StarRocks cluster on Kubernetes, achieving millisecond‑level query latency, halving storage usage, reducing node count, and sustaining millions‑of‑rows‑per‑second write throughput while simplifying scaling and operations.

Ctrip Technology

Background

UBT (User Behavior Tracking) is Ctrip's core data collection and analysis platform. It ingests tens of terabytes of user‑behavior events per day and retains data for up to 30 days (some tables for a year). The system serves Android, iOS, and Node.js SDKs and powers troubleshooting, monitoring, and aggregation analytics.

Problems with ClickHouse

The write path suffered data loss and backlog because historical backfills triggered heavy merge and compression work on large partitions, exhausting CPU and I/O.

Horizontal scaling required costly data migration; ClickHouse’s monolithic compute‑storage design limited elasticity and demanded three‑replica storage, inflating hardware costs.

Queries over large time ranges were slow, with average response times measured in seconds, failing to meet real‑time analysis requirements.

Gohangout, the custom ClickHouse ingestion tool, was single‑process, fragile and hard to configure.

StarRocks Migration

StarRocks originally used an integrated compute‑storage model that delivered sub‑second query latency but suffered from rigid resource utilization and high replication cost. The team adopted StarRocks' newer compute‑storage separation architecture, where stateless compute nodes (CN) query data stored in remote object storage (OSS/S3) and optionally cache hot data locally via DataCache. This design eliminates data movement during scaling, allowing a node to be added or removed in seconds; a typical UBT scaling operation now completes in about 5 seconds.
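
As an illustration of how a shared‑data cluster is pointed at object storage, here is a minimal sketch (not Ctrip's actual configuration): it registers an S3/OSS‑backed storage volume through the FE's MySQL‑protocol endpoint using pymysql. Host names, bucket paths, and credentials are placeholders.

```python
# Minimal sketch: register an object-storage volume for a shared-data cluster.
# Host, bucket, and credentials are placeholders, not Ctrip's real settings.
import pymysql

# The FE exposes a MySQL-protocol endpoint (default port 9030).
conn = pymysql.connect(host="starrocks-fe", port=9030, user="root", password="")

CREATE_VOLUME = """
CREATE STORAGE VOLUME IF NOT EXISTS ubt_volume
TYPE = S3
LOCATIONS = ("s3://ubt-bucket/starrocks/")
PROPERTIES (
    "aws.s3.region"     = "ap-southeast-1",
    "aws.s3.endpoint"   = "https://s3.ap-southeast-1.amazonaws.com",
    "aws.s3.access_key" = "<access-key>",
    "aws.s3.secret_key" = "<secret-key>"
)
"""

with conn.cursor() as cur:
    # Tables created on this volume keep their data files in object storage,
    # so CN nodes stay stateless and can be added or removed without
    # rebalancing any data.
    cur.execute(CREATE_VOLUME)
conn.close()
```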

StarRocks on Kubernetes

The production cluster is managed with the open‑source starrocks‑kubernetes‑operator (https://github.com/StarRocks/starrocks-kubernetes-operator). The team extended the operator by:

Implementing an ElasticVolumeSet controller to replace StatefulSet and provide dynamic disk expansion.

Adding a SET‑based CRD that lets a single StarRocks custom resource define multiple instance specifications, enabling gray‑release, rolling upgrades and A/B testing without downtime.
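
To make the SET idea concrete, here is a hedged sketch of submitting a StarRocksCluster custom resource from Python with the official kubernetes client. The group/kind and the starRocksFeSpec field come from the upstream operator; the cnSets list is hypothetical shorthand for the multi‑instance‑specification extension described above, and the images and replica counts are placeholders.

```python
# Sketch: apply a StarRocksCluster CR with the kubernetes Python client.
# "cnSets" is a hypothetical stand-in for the SET-based extension described in
# the text; it is NOT part of the upstream operator's CRD.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

cr = {
    "apiVersion": "starrocks.com/v1",
    "kind": "StarRocksCluster",
    "metadata": {"name": "ubt-starrocks", "namespace": "ubt"},
    "spec": {
        "starRocksFeSpec": {"image": "starrocks/fe-ubuntu:latest", "replicas": 3},
        # Hypothetical SET-style groups: a small canary set runs a new image
        # while the stable set keeps serving, enabling gray release / A/B tests.
        "cnSets": [
            {"name": "stable", "image": "starrocks/cn-ubuntu:latest", "replicas": 36},
            {"name": "canary", "image": "starrocks/cn-ubuntu:next", "replicas": 4},
        ],
    },
}

api.create_namespaced_custom_object(
    group="starrocks.com", version="v1",
    namespace="ubt", plural="starrocksclusters", body=cr,
)
```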

A unified portal wraps common operational tasks into point‑and‑click actions, improving operational efficiency for non‑engineers.

Stability and Tuning

5.1 Storage Design

UBT stores petabytes of data; the largest tables reach hundreds of terabytes and ingest millions of rows per second. Tables use system‑time hourly partitions with a bucket count of 128, which reduces write fragmentation. Compression was switched from LZ4 to ZLIB, saving roughly 30% of storage.
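
A hedged sketch of what such a table definition might look like follows; the column names and database name are illustrative rather than UBT's real schema. It combines hourly expression partitioning, 128 hash buckets, ZLIB compression, and the storage volume and data cache from the shared‑data setup.

```python
# Illustrative DDL for an hourly-partitioned detail table; the schema is invented.
import pymysql

conn = pymysql.connect(host="starrocks-fe", port=9030, user="root", password="")

CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS ubt.events (
    event_time DATETIME NOT NULL,
    uid        VARCHAR(64),
    page_id    VARCHAR(128),
    payload    JSON
)
DUPLICATE KEY(event_time, uid)                 -- sort key doubles as the prefix index
PARTITION BY date_trunc('hour', event_time)    -- system-time hourly partitions
DISTRIBUTED BY HASH(uid) BUCKETS 128           -- bucket count used by UBT
PROPERTIES (
    "storage_volume"   = "ubt_volume",
    "datacache.enable" = "true",
    "compression"      = "ZLIB"                -- switched from the LZ4 default
)
"""

with conn.cursor() as cur:
    cur.execute("CREATE DATABASE IF NOT EXISTS ubt")
    cur.execute(CREATE_TABLE)
conn.close()
```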

5.2 Compaction

Control compaction parameters (file count, thread count).

Leverage hourly partitions and bucket sizing to lower compaction frequency.

Enable MergeCommit to merge small writes into larger versions, cutting file count and I/O.

Monitoring relies on the be_cloud_native_compactions and partitions_meta system tables plus SHOW PROC '/compactions', with Grafana dashboards keeping the compaction score below 100.
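
A small monitoring sketch in the spirit of those dashboards is shown below. It assumes the information_schema.partitions_meta view of recent shared‑data releases, whose MAX_CS column carries the per‑partition compaction score; verify the view and column names against your version before relying on them.

```python
# Sketch: flag partitions whose compaction score exceeds the target ceiling.
# View and column names are assumptions about recent shared-data releases.
import pymysql

THRESHOLD = 100  # the article's target ceiling for the compaction score

conn = pymysql.connect(host="starrocks-fe", port=9030, user="root", password="")

QUERY = """
SELECT DB_NAME, TABLE_NAME, PARTITION_NAME, MAX_CS
FROM information_schema.partitions_meta
WHERE MAX_CS > %s
ORDER BY MAX_CS DESC
LIMIT 20
"""

with conn.cursor() as cur:
    cur.execute(QUERY, (THRESHOLD,))
    for db, table, partition, score in cur.fetchall():
        # In production this would feed Grafana or an alerting pipeline.
        print(f"high compaction score {score}: {db}.{table} partition {partition}")
conn.close()
```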

5.3 Data Backfill

Approximately 100 TB of historical data were migrated from ClickHouse/Hive to StarRocks using Spark Load. Spark cleanses the data and writes it to HDFS, and StarRocks then pulls it from HDFS, bypassing the in‑memory write path and minimizing compaction overhead.
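
For context, a Spark Load job is submitted as a LOAD LABEL statement that points StarRocks at the Spark‑cleansed files on HDFS. The sketch below uses placeholder paths, label, and resource name, and assumes a Spark resource has already been created on the cluster.

```python
# Sketch: kick off a Spark Load backfill from HDFS. Label, paths, and the
# Spark resource name are placeholders; the resource must already exist.
import pymysql

conn = pymysql.connect(host="starrocks-fe", port=9030, user="root", password="")

SPARK_LOAD = """
LOAD LABEL ubt.backfill_2024_01_01
(
    DATA INFILE("hdfs://nameservice1/ubt/cleaned/dt=2024-01-01/*")
    INTO TABLE events
    FORMAT AS "parquet"
)
WITH RESOURCE 'ubt_spark_resource'
(
    "spark.executor.memory" = "8g",
    "spark.executor.instances" = "40"
)
PROPERTIES ("timeout" = "14400")
"""

with conn.cursor() as cur:
    cur.execute(SPARK_LOAD)
    # Progress can then be tracked with: SHOW LOAD WHERE LABEL = "backfill_2024_01_01"
conn.close()
```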

5.4 Real‑time Increment

Real‑time ingestion uses Flink → StarRocks with MergeCommit. Compared with per‑request commits, MergeCommit reduces version count, I/O operations, and latency, sustaining up to 10× higher effective write IOPS with roughly 10× larger I/O per operation.
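
Under the hood the Flink connector writes through Stream Load; the sketch below shows that raw HTTP call with the requests library, sent straight to a CN's HTTP port to sidestep the FE redirect. Hosts, the label, and especially any merge‑commit‑related load parameters are assumptions; check the connector and server documentation for the exact names in your version.

```python
# Sketch of the Stream Load call the Flink connector issues under the hood.
# Host, port, label, and merge-commit parameters are placeholders/assumptions.
import json
import requests

rows = [
    {"event_time": "2024-06-01 10:00:00", "uid": "u_1", "page_id": "home"},
    {"event_time": "2024-06-01 10:00:01", "uid": "u_2", "page_id": "search"},
]

resp = requests.put(
    # Sent directly to a CN/BE HTTP port (default 8040); in production the
    # connector goes through the FE and handles its redirect.
    "http://starrocks-cn-0:8040/api/ubt/events/_stream_load",
    auth=("root", ""),
    headers={
        "label": "ubt-stream-demo-0001",
        "format": "json",
        "strip_outer_array": "true",
        "Expect": "100-continue",
        # Merge-commit behaviour is toggled via additional load parameters in
        # recent releases; the exact names are version-dependent assumptions.
    },
    data=json.dumps(rows),
)
print(resp.json())  # a healthy cluster returns {"Status": "Success", ...}
```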

5.5 Query Optimization

Detail queries apply mandatory partition pruning and prefix indexes. Aggregation queries use hourly partition‑level materialized views (Partition MV) that refresh only changed partitions. An optimization changed the MV refresh algorithm from O(M×N) to O(M+N), dramatically improving FE response when partitions exceed ten thousand.
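
As a sketch of the aggregation side, the statement below creates an asynchronous, hour‑partitioned materialized view over the illustrative events table from earlier, so that only partitions whose base data changed are refreshed. The schema, refresh interval, and properties are illustrative, and the exact partition‑expression syntax varies slightly across StarRocks versions.

```python
# Illustrative hour-partitioned async MV; it reuses the invented ubt.events
# schema from the earlier sketch.
import pymysql

conn = pymysql.connect(host="starrocks-fe", port=9030, user="root", password="")

CREATE_MV = """
CREATE MATERIALIZED VIEW IF NOT EXISTS ubt.events_hourly_agg
PARTITION BY event_hour                          -- aligns with the base table's hourly partitions
DISTRIBUTED BY HASH(page_id) BUCKETS 16
REFRESH ASYNC EVERY (INTERVAL 10 MINUTE)         -- only changed partitions are recomputed
PROPERTIES ("partition_refresh_number" = "2")    -- cap partitions refreshed per run
AS
SELECT
    date_trunc('hour', event_time) AS event_hour,
    page_id,
    count(*)            AS pv,
    count(DISTINCT uid) AS uv
FROM ubt.events
GROUP BY date_trunc('hour', event_time), page_id
"""

with conn.cursor() as cur:
    cur.execute(CREATE_MV)
conn.close()
```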

Results

Storage volume halved due to single‑replica design.

Node count reduced by 10, improving resource utilization.

Average query latency dropped from 1.4 s to 203 ms (≈1/7, an ~85% reduction).

P95 latency fell from 8 s to 800 ms (≈1/10).

Write throughput remains stable at several million rows per second.

The team plans to continue moving other lake‑warehouse workloads to the compute‑storage separated model for greater flexibility and scalability.

Figures: UBT architecture diagram; migration flow diagram; StarRocks compute‑storage separation; compaction monitoring; write throughput chart.
Tags: data migration, performance optimization, big data, Kubernetes, StarRocks, ClickHouse, compute‑storage separation
Written by Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.