Design and Implementation of a Real-Time Dynamic Tag Processing Platform for Trip.com International Business
This article covers the background, challenges, architecture, operator design, DAG processing, tag persistence, and business applications of a real-time dynamic tag processing platform (CDP) built to grow revenue and reduce costs for Trip.com's international operations.
To address the high acquisition cost, complex product mix, and diverse market channels of Trip.com's international business, the team proposes a Customer Data Platform (CDP) that provides fine‑grained, real‑time tag processing to grow revenue and reduce costs.
The system serves two major scenarios: a real-time trigger path, which subscribes to change messages and pushes filtered results downstream, and a tag-persistence path, which stores processed features in TiDB, a distributed relational database, for both OLAP and OLTP queries. Architecturally, it combines a Kappa-style streaming layer with a Lambda-style batch layer.
For the real‑time trigger, dynamic rules are configured in JSON and interpreted by a rule engine that assembles operators such as Stream, Priority, Join, Filter, and Sink into a directed acyclic graph (DAG). The DAG is executed eagerly, allowing rule updates without restarting tasks, and results are pushed directly to downstream queues (QMQ) to meet low‑latency requirements.
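As a rough illustration of how a JSON rule might be turned into a chain of operators, here is a minimal Python sketch. The rule schema, the `RULE_JSON` contents, the `make_*` builders, and the in-memory `RIGHT_TABLE` and `QUEUES` stand-ins (for Redis and QMQ) are all assumptions made for this example; the article does not show the platform's actual rule format or engine code. The Priority operator is omitted, and the Stream operator is represented by the input iterable.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Operator = Callable[[Iterable[Record]], Iterator[Record]]

# Hypothetical rule format: an ordered list of operator specs.
RULE_JSON = """
{
  "operators": [
    {"type": "filter", "field": "event", "equals": "search"},
    {"type": "join", "key": "uid"},
    {"type": "sink", "target": "push_queue"}
  ]
}
"""

# In-memory stand-ins for the Redis right table and the downstream queue.
RIGHT_TABLE: Dict[object, Record] = {"u1": {"country": "SG"}, "u2": {"country": "JP"}}
QUEUES: Dict[str, List[Record]] = {}


def make_filter(spec: dict) -> Operator:
    field, expected = spec["field"], spec["equals"]

    def op(records: Iterable[Record]) -> Iterator[Record]:
        # Keep only records whose field matches the configured value.
        return (r for r in records if r.get(field) == expected)

    return op


def make_join(spec: dict) -> Operator:
    key = spec["key"]

    def op(records: Iterable[Record]) -> Iterator[Record]:
        # Inner join: records without a right-table match are dropped.
        for r in records:
            extra = RIGHT_TABLE.get(r.get(key))
            if extra is not None:
                yield {**r, **extra}

    return op


def make_sink(spec: dict) -> Operator:
    target = spec["target"]

    def op(records: Iterable[Record]) -> Iterator[Record]:
        # Append each record to the named queue and pass it through.
        for r in records:
            QUEUES.setdefault(target, []).append(r)
            yield r

    return op


BUILDERS = {"filter": make_filter, "join": make_join, "sink": make_sink}


def build_pipeline(rule_json: str) -> Callable[[Iterable[Record]], List[Record]]:
    """Parse a rule and chain its operators in the order they are listed."""
    rule = json.loads(rule_json)
    ops = [BUILDERS[spec["type"]](spec) for spec in rule["operators"]]

    def run(records: Iterable[Record]) -> List[Record]:
        stream: Iterable[Record] = iter(records)
        for op in ops:
            stream = op(stream)
        return list(stream)  # drain the chain so the sink actually runs

    return run


if __name__ == "__main__":
    events = [
        {"uid": "u1", "event": "search"},
        {"uid": "u2", "event": "click"},
        {"uid": "u3", "event": "search"},
    ]
    print(build_pipeline(RULE_JSON)(events))
```

In this sketch only the `u1` search event reaches the sink: `u2` fails the filter and `u3` has no row in the right table. Because the pipeline is rebuilt from the JSON text, a changed rule can be applied by calling `build_pipeline` again while the hosting process keeps running, which is the property the paragraph above describes.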
Operator details include: Stream – ingestion from Kafka or QMQ; Priority – ranking based on business flow importance; Join – Redis‑backed right‑table join; Filter – conditional pruning of records; Sink – output to QMQ, TiDB, MySQL, etc. Basic atomic operators (+, -, *, /, >, <, =, IN, LIKE, etc.) and custom functions (string, time, JSON) are also supported.
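The atomic operators listed above could be evaluated with a small lookup table. The following Python sketch is one possible shape, not the platform's implementation: the expression format (`{"op": ..., "args": [...]}` and `{"field": ...}`), the `evaluate` function, and the `sql_like` helper are hypothetical names introduced for illustration.

```python
import operator
import re
from typing import Any, Callable, Dict


def sql_like(value: str, pattern: str) -> bool:
    """SQL-style LIKE: '%' matches any run of characters, '_' matches one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Symbol -> binary function, covering the operators named in the text.
ATOMIC_OPS: Dict[str, Callable[[Any, Any], Any]] = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
    "IN": lambda item, collection: item in collection,
    "LIKE": sql_like,
}


def evaluate(expr: Any, record: Dict[str, Any]) -> Any:
    """Evaluate a nested expression against one record.

    An expression is a literal, {"field": name}, or
    {"op": symbol, "args": [left, right]} (assumed format).
    """
    if isinstance(expr, dict):
        if "field" in expr:
            return record[expr["field"]]
        left, right = (evaluate(arg, record) for arg in expr["args"])
        return ATOMIC_OPS[expr["op"]](left, right)
    return expr  # literal value (number, string, list, ...)


if __name__ == "__main__":
    booking = {"price": 120, "nights": 3, "city": "Singapore", "channel": "app"}
    total_over_300 = {
        "op": ">",
        "args": [{"op": "*", "args": [{"field": "price"}, {"field": "nights"}]}, 300],
    }
    print(evaluate(total_over_300, booking))
```

A Filter operator could then keep a record only when `evaluate(condition, record)` is truthy, so new conditions can be expressed in configuration instead of code.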
The DAG concept is inspired by Spark’s lineage model but simplified: all tasks run in a single stage without a separate DAGScheduler, enabling high concurrency and scalability.
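To make the single-stage idea concrete, the sketch below runs every operator of a small DAG in one pass, in topological order, with no stage splitting or separate scheduler. It relies on Python's standard `graphlib` module (Python 3.9+); the `run_dag` function and the node names in the usage note are assumptions for illustration and are not taken from the platform.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Iterable, List


def run_dag(
    nodes: Dict[str, Callable[[List[Any]], List[Any]]],
    upstream: Dict[str, List[str]],
    source_records: Iterable[Any],
) -> Dict[str, List[Any]]:
    """Run each operator once, after all of its upstream operators.

    nodes:    operator name -> function from input list to output list
    upstream: operator name -> names of the operators feeding it
    Operators with no upstream entry read the source records directly.
    """
    outputs: Dict[str, List[Any]] = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(upstream).static_order():
        parents = upstream.get(name, [])
        if parents:
            # Concatenate the outputs of all upstream operators.
            inputs = [record for parent in parents for record in outputs[parent]]
        else:
            inputs = list(source_records)
        outputs[name] = nodes[name](inputs)
    return outputs
```

For example, a graph with `upstream = {"filter": ["stream"], "sink_push": ["filter"], "sink_store": ["filter"]}` fans one filtered stream out to two sinks, and `run_dag` visits `stream`, then `filter`, then both sinks within a single loop.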
Tag persistence uses TiDB's two storage engines, TiKV (row-oriented) for OLTP and TiFlash (column-oriented) for OLAP. Real-time incremental writes (via Kafka/QMQ) and batch loads (via Spark) both feed the store, keeping tag data fresh and query-ready.
Business applications include real‑time push notifications, email campaigns, and other marketing scenarios where dynamic rules automatically generate tasks, filter user behavior, assign depth‑based tags, and deliver personalized content without additional code development.
The article closes with a short recruitment note, highlighting the team's focus on cutting-edge distributed computing, large-scale data processing, and algorithmic innovation.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.