
How Ctrip Built a Near‑Real‑Time Lakehouse with Flink & Paimon

This article details Ctrip Business Travel’s implementation of a near‑real‑time data warehouse and lakehouse using Flink CDC and Apache Paimon, covering order wide‑table construction, automated ticket reminders, ad attribution, batch‑stream integration, and lessons on Partial Update, Aggregation, and Tag‑based incremental processing.

Ctrip Technology

Background

Ctrip Business Travel provides an end‑to‑end travel management platform for corporate customers, covering flights, hotels, car rentals, and trains. Rapid business growth increased data volume and complexity, making the traditional T+1 offline warehouse insufficient for near‑real‑time analytics. The team therefore explored an integrated lakehouse approach built on Apache Paimon to improve data freshness.

Near‑Real‑Time Data Warehouse Construction

2.1 Real‑Time Order Wide Table Development

The order detail wide table is a core application layer supporting ad‑hoc queries. Previously built with hourly offline jobs, it suffered from latency and complex ETL involving dozens of tables. Attempts with Flink + Kafka revealed instability and high maintenance costs.

Introducing Paimon solved these issues: its lakehouse nature supports upserts and dynamic writes, while the Partial Update feature replaces multi‑stream joins, improving stability and development efficiency.

Data flow: ODS data from MySQL is captured by Flink CDC and written to Paimon; the EDW layer uses Paimon’s Partial Update and Aggregation engines to build wide tables, also using Paimon tables as dimension stores to replace HBase/Redis for lookup joins.
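Schematically, this ingestion path can be expressed in Flink SQL roughly as follows. This is a minimal sketch: the catalog path, connection settings, and the ods_order table with its columns are illustrative, not the production schema.

-- Register a Paimon catalog backed by the lake warehouse path (path is illustrative).
CREATE CATALOG paimon_catalog WITH (
  'type' = 'paimon',
  'warehouse' = 'hdfs:///warehouse/paimon'
);

-- Hypothetical MySQL CDC source for an order table, captured via flink-connector-mysql-cdc.
CREATE TEMPORARY TABLE ods_order_src (
  order_id BIGINT,
  order_status INT,
  update_time TIMESTAMP(3),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'mysql-host',
  'port' = '3306',
  'username' = 'reader',
  'password' = '******',
  'database-name' = 'biz_travel',
  'table-name' = 'ord_order'
);

-- Stream the change log into the Paimon ODS table; Paimon applies the upserts.
CREATE TABLE IF NOT EXISTS paimon_catalog.ods.ods_order (
  order_id BIGINT,
  order_status INT,
  update_time TIMESTAMP(3),
  PRIMARY KEY (order_id) NOT ENFORCED
);

INSERT INTO paimon_catalog.ods.ods_order
SELECT order_id, order_status, update_time FROM ods_order_src;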

2.1.1 Building Order Product Information Wide Table with Partial Update

Product tables (train tickets, flight tickets, generic product info) share the same primary key (col1, col2). A Paimon wide table is created with the merge‑engine set to partial‑update, and a sequence group controls update order across streams, resulting in a consolidated order product wide table.
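A minimal sketch of such a table definition follows; only the (col1, col2) key comes from the article, while the product columns and the sequence fields g_1 and g_2 are illustrative.

-- Order product wide table merged from several product streams sharing (col1, col2).
CREATE TABLE dwd_order_product_wide (
  col1 BIGINT,
  col2 BIGINT,
  -- columns owned by the train-ticket stream
  train_no STRING,
  train_seat STRING,
  g_1 BIGINT,
  -- columns owned by the flight-ticket stream
  flight_no STRING,
  cabin_class STRING,
  g_2 BIGINT,
  PRIMARY KEY (col1, col2) NOT ENFORCED
) WITH (
  'merge-engine' = 'partial-update',
  -- each sequence group orders the updates for the columns of one upstream stream
  'fields.g_1.sequence-group' = 'train_no,train_seat',
  'fields.g_2.sequence-group' = 'flight_no,cabin_class'
);

-- Each stream writes only the columns it owns and passes NULL for the rest;
-- with partial-update, NULLs do not overwrite values written by other streams.
INSERT INTO dwd_order_product_wide
SELECT col1, col2, train_no, seat, update_version,
       CAST(NULL AS STRING), CAST(NULL AS STRING), CAST(NULL AS BIGINT)
FROM train_ticket_stream;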

Compared with Flink multi‑stream joins, Partial Update eliminates state storage overhead, lightens checkpointing, and directly writes field‑level changes to Paimon, achieving higher stability and lower latency.

2.1.2 Building Intermediate Order Wide Table with Aggregation

When source tables have mismatched primary keys, the Aggregation engine with nested_update is used. Three streams with different keys are aggregated into a wide table keyed by col1. Since Paimon’s Aggregation does not support count, a sum(case when …) pattern emulates counting.
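A sketch of this pattern with an illustrative schema: nested_update folds detail rows (keyed differently upstream) into an array field, while the count is emulated by summing a 0/1 flag. The field names and the fields.<field>.nested-key option are assumptions to be checked against the Paimon version in use.

-- Intermediate wide table keyed by col1 (schema is illustrative).
CREATE TABLE dwm_order_agg (
  col1 BIGINT,
  -- detail rows from a stream keyed by (col1, item_id), folded into a nested array
  items ARRAY<ROW<item_id BIGINT, item_status INT>>,
  -- emulated count: writers emit 0/1 and the engine sums it
  paid_item_cnt BIGINT,
  PRIMARY KEY (col1) NOT ENFORCED
) WITH (
  'merge-engine' = 'aggregation',
  'fields.items.aggregate-function' = 'nested_update',
  'fields.items.nested-key' = 'item_id',
  'fields.paid_item_cnt.aggregate-function' = 'sum'
);

-- count is emulated with sum(case when ...): each incoming row contributes 0 or 1.
INSERT INTO dwm_order_agg
SELECT col1,
       ARRAY[ROW(item_id, item_status)],
       CASE WHEN item_status = 1 THEN 1 ELSE 0 END
FROM order_item_stream;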

2.1.3 Dimension Lookup Join Using Paimon

Traditional real‑time warehouses rely on external stores (HBase, Redis, MySQL) for dimension data. By storing dimensions in Paimon, the architecture simplifies, reduces operational cost, and maintains acceptable lookup performance for small dimension tables.
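For example, a lookup join against a small Paimon dimension table can be written in Flink SQL roughly as follows; dim_corp and the column names are illustrative.

-- The fact stream needs a processing-time attribute for the lookup join.
CREATE TEMPORARY VIEW order_with_proctime AS
SELECT *, PROCTIME() AS proc_time FROM ods_order_stream;

-- dim_corp is a small Paimon primary-key table used in place of HBase/Redis.
SELECT o.order_id,
       o.corp_id,
       d.corp_name,
       d.corp_grade
FROM order_with_proctime AS o
LEFT JOIN dim_corp FOR SYSTEM_TIME AS OF o.proc_time AS d
  ON o.corp_id = d.corp_id;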

2.2 Real‑Time Ticket Refund Reminder Optimization

The original T+1 offline job calculated the next day's reminders, causing up to a two‑day data lag. By moving the core fields (order number, ticket status) to a Flink + Paimon near‑real‑time pipeline and keeping the final reminder selection in a Spark batch job, the system achieves minute‑level freshness while preserving stability.

2.3 Near‑Real‑Time Advertising Order Attribution

Advertising requires minute‑level reporting of hotel bookings linked to ad clicks within the preceding three days. The solution aggregates seven MySQL tables using Paimon’s Aggregation + Partial Update, employs partition‑level expiration to keep click logs for three days, and registers result tables via an internal DaaS API for downstream consumption.
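A sketch of the click‑log table with partition‑level expiration; the schema is illustrative, and the 4‑day retention matches the implementation note below so that the full 3‑day attribution window stays available.

-- Click logs partitioned by day; partitions older than the retention are dropped automatically.
CREATE TABLE dwd_ad_click_log (
  click_id BIGINT,
  uid BIGINT,
  ad_id BIGINT,
  click_time TIMESTAMP(3),
  dt STRING,
  PRIMARY KEY (click_id, dt) NOT ENFORCED
) PARTITIONED BY (dt) WITH (
  'partition.expiration-time' = '4 d',
  'partition.expiration-check-interval' = '1 h',
  'partition.timestamp-formatter' = 'yyyy-MM-dd'
);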

Key Implementation Steps

ODS layer built with Flink CDC; bucket count tuned to ~1 GB per bucket.

The Aggregation and Partial Update engines together construct the wide tables, with sum used to accumulate promotional discounts.

Partition‑level expiration (4‑day retention) ensures timely cleanup of click logs.

A filesystem catalog is used to read and write the Paimon tables, simplifying permission management; for example, a Paimon ODS table can be queried directly from spark-sql:

/opt/app/spark-3.2.0/bin/spark-sql \
  --conf 'spark.sql.catalog.paimon_catalog=org.apache.paimon.spark.SparkGenericCatalog' \
  --conf 'spark.sql.extensions=org.apache.paimon.spark.extensions.PaimonSparkSessionExtensions' \
  --conf 'spark.sql.storeAssignmentPolicy=ansi' \
  -e "select * from paimon.dp_lakehouse.ods_xxx limit 10;"

Batch‑Stream Integration Practice

Full stream‑batch unification is not yet feasible because Flink’s batch semantics differ from Spark’s. The current Lambda architecture uses Flink CDC to write to Paimon (stream) and Spark to read the same Paimon tables for batch processing.

Tag‑Based Incremental Computation

Paimon supports Tags, which snapshot the table at a point in time. By creating daily Tags on ODS tables, incremental queries between Tags retrieve only changed data, enabling efficient incremental ETL. In practice, this reduced processing time by 4‑5× compared with full‑load jobs.
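A minimal sketch of the pattern, assuming Flink SQL with Paimon's procedure support and the incremental-between scan option; the table and tag names are illustrative.

-- Create a daily tag on the ODS table from a scheduled job.
CALL sys.create_tag('dp_lakehouse.ods_order', '2024-06-01');

-- Read only the changes between two daily tags instead of scanning the full table.
SELECT *
FROM ods_order /*+ OPTIONS('incremental-between' = '2024-05-31,2024-06-01') */;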

Conclusion

The Flink CDC + Paimon architecture successfully supports multiple near‑real‑time scenarios at Ctrip Business Travel. Partial Update outperforms multi‑stream joins, and the Aggregation merge engine replaces continuously running GROUP BY aggregations. Tag‑based incremental processing further improves batch efficiency, delivering higher data freshness, lower latency, and easier operations.

Future Plans

The team will continue to explore deeper Flink‑Paimon stream‑batch integration, aiming to move beyond the current Lambda architecture toward tighter compute‑storage convergence.

Tags: Flink, Paimon, Real-time Data Warehouse, Lakehouse, Batch-Stream Integration, Partial Update
Written by Ctrip Technology

Official Ctrip Technology account, sharing and discussing growth.