Databases 13 min read

How StarRocks Revamped Ctrip’s Ticket Metrics Platform for Lightning‑Fast Queries

Ctrip’s ticket business rebuilt its multi‑engine metrics platform by consolidating ClickHouse, Kylin, and Presto into a single StarRocks database, introducing temporary tables, materialized views, and streamlined ETL, which cut complex query times from minutes to seconds and doubled user traffic.

dbaplus Community

Mar 24, 2024

How StarRocks Revamped Ctrip’s Ticket Metrics Platform for Lightning‑Fast Queries

Background and Pain Points

Ctrip’s ticket business group operates several brands (Tiexin, Ctrip Train, Qunar Train) and needed a metrics platform that could quickly serve diverse KPI queries. The legacy platform combined three storage engines—ClickHouse, Apache Kylin, and Presto—mixing detail and summary layers, which caused data source chaos, high learning and maintenance costs, and painfully slow query performance (some queries waited over 30 minutes).

Why StarRocks?

To simplify the stack and boost speed, the team selected StarRocks, an MPP database that offers:

High performance: sub‑second responses for multi‑billion‑row joins.

MySQL compatibility: low entry barrier for analysts.

High concurrency support.

Multiple data models: detail, aggregate, update, primary‑key.

Materialized view routing.

Flexible ingestion: StreamLoad, SparkLoad, RoutineLoad.

Overall Design of the Rebuilt Platform

The new architecture uses a single StarRocks cluster. Queries first load raw detail data into temporary tables via internal ETL; subsequent aggregations read from these temporary tables, eliminating repeated scans of massive tables.

Query Process

When a KPI request arrives, the system splits it into sub‑queries:

Detail sub‑queries – accumulate raw data for the requested time window (and year‑over‑year data), store results in a daily‑partitioned temporary table (T+1 partition). Example partition creation command:

ALTER TABLE ${table} ADD PARTITION IF NOT EXISTS p${partition} VALUES [('${zdt.addDay(1).format("yyyy-MM-dd")}'),('${zdt.addDay(2).format("yyyy-MM-dd")}')];

Aggregate sub‑queries – compute KPI cards, same‑period comparisons, line charts, and dimension drill‑downs directly in StarRocks, removing the previous Java‑based aggregation layer.

Cache‑Like Reuse

The platform records the first execution of each KPI (dimensions, time range, MD5 of the original SQL, success flag). Subsequent identical requests reuse the previous result via a unique groupkey, effectively acting as a query‑result cache.

Data Synchronization

Over 300 KPI definitions were analyzed, extracting 51 common DWD/DIM tables for unified sync to StarRocks. Three sync patterns are used:

Full sync for small tables (e.g., shareout_trn.dim_ibu_alliance, 608 KB).

Incremental partition sync – daily sync of the previous day’s partition.

Update sync – for large order tables with frequent status changes, using hash‑based diff detection:

SELECT t1.* FROM (SELECT … WHERE d='${cur_day}') t1 LEFT JOIN (SELECT … WHERE d='${pre_day}') t2 ON t1.business_pk_id=t2.business_pk_id WHERE t1.hash_code!=t2.hash_code OR t2.order_id IS NULL;

Partition management scripts (dynamic partitioning, temporary partitions, replace‑partition) are also provided in pre blocks.

Specialized UBT Tables

For ultra‑big‑traffic (UBT) data, three summary tables were created: ubt_for_pv – daily count(uid) per dimension, size only tens of KB. ubt_for_duration – daily sum(duration) per dimension, enabling average stay‑time calculations. ubt_for_uv – daily distinct count, minimizing storage.

Operational Monitoring

Data volume fluctuations exceeding 3σ trigger email alerts; all sync jobs complete within 15 minutes.

Experience Sharing

Table design : avoid excessive bucket numbers (ideal 500 MB–1 GB per bucket) and use appropriate partition granularity (daily for hot tables, monthly for cold tables).

Query tuning : leverage StarRocks indexes, avoid functions in filter predicates, and ensure partitions are used to prevent full‑table scans.

Function differences : StarRocks’ split function is 1‑based, unlike SparkSQL’s 0‑based indexing, which can cause silent result errors. Example:

SELECT split('a,b,c', ',')[0];  -- returns NULL in StarRocks

SELECT split('a,b,c', ',')[1];  -- returns 'a' in StarRocks (should be index 0 in SparkSQL)

Performance Gains

Before reconstruction, complex KPI queries took several minutes; after migration, typical queries finish in ~10 seconds, with P99 latency under 2 seconds, and query volume doubled.

New capabilities such as concurrent sub‑query execution, same‑period comparison, and dimension drill‑down are now available with near‑instant response.

Future Optimizations

Adopt bitmap indexes for UV distinct‑count queries to reduce storage and speed up.

Use aggregate models for full/incremental tables to pre‑aggregate data.

Replace temporary tables with materialized views to simplify the pipeline.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

SQL StarRocks Data Warehouse ETL Metrics Platform

Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.