Design and Architecture of the User Profiling System at Ctrip Business Travel
This article describes the concept, tag taxonomy, data flow architecture, and Lambda‑based query service design of Ctrip Business Travel's user profiling system, highlighting how batch and real‑time processing with Spark, Flink, Hive, MongoDB and Redis enable precise marketing, risk control and personalized services.
The article introduces user profiling, a concept first proposed by Alan Cooper, as a virtual representation of real users built from multi‑dimensional data such as demographics, habits and consumption preferences, and explains why it matters for fine‑grained operations and precision marketing at Ctrip Business Travel.
It then details the B2B and B2C tag taxonomy used by Ctrip, covering five major categories—basic attributes, CRM tags, preference tags, real‑time tags, and risk‑control tags—illustrating examples like company ID, activity duration, purchase frequency, flight‑hotel ratios, recent query rates, overdue amounts and credit scores.
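As a rough illustration of how such a taxonomy could be held in memory, the sketch below groups tags under the five categories named above. The class names, tag names, sample values, and the assignment of each example tag to a category are assumptions made for illustration; the source does not describe Ctrip's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict


class TagCategory(Enum):
    """The five tag categories named in the article."""
    BASIC = "basic_attributes"
    CRM = "crm"
    PREFERENCE = "preference"
    REAL_TIME = "real_time"
    RISK_CONTROL = "risk_control"


@dataclass
class Profile:
    """Hypothetical container: one profile per company (B2B) or user (B2C)."""
    subject_id: str
    tags: Dict[TagCategory, Dict[str, Any]] = field(
        default_factory=lambda: {category: {} for category in TagCategory}
    )

    def set_tag(self, category: TagCategory, name: str, value: Any) -> None:
        self.tags[category][name] = value

    def get_tag(self, category: TagCategory, name: str, default: Any = None) -> Any:
        return self.tags[category].get(name, default)


# Example values are invented; the category assignments are assumptions.
profile = Profile(subject_id="company-001")
profile.set_tag(TagCategory.CRM, "purchase_frequency", 12)
profile.set_tag(TagCategory.PREFERENCE, "flight_hotel_ratio", 1.8)
profile.set_tag(TagCategory.RISK_CONTROL, "overdue_amount", 0.0)
```

Keeping the category as an explicit key makes it straightforward to serve or monitor one family of tags (for example, only risk‑control tags) without scanning the whole profile.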
The data flow architecture has five stages: data collection (an offline Hive warehouse and online Kafka streams), feature computation (Spark SQL and UDFs for batch, Flink for streaming), tag modeling (business‑rule, statistical, and machine‑learning methods), tag serving (Hive, MongoDB, Redis), and monitoring (Zeus, Grafana), which safeguards data quality and service reliability.
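To make the statistical and business‑rule modeling styles concrete, here is a minimal sketch in plain Python. It derives a purchase‑frequency count over a time window (a statistical tag) and then maps that count to a label with fixed thresholds (a business rule). The function names, the 90‑day window, and the thresholds are all hypothetical; in the pipeline described above this logic would more plausibly live in Spark SQL or a UDF.

```python
from datetime import date, timedelta
from typing import Iterable


def purchase_frequency(order_dates: Iterable[date], as_of: date,
                       window_days: int = 90) -> int:
    """Statistical tag: number of orders in the window (start, as_of]."""
    start = as_of - timedelta(days=window_days)
    return sum(1 for d in order_dates if start < d <= as_of)


def frequency_label(count: int, high: int = 10, medium: int = 3) -> str:
    """Business-rule tag: bucket the count using assumed thresholds."""
    if count >= high:
        return "high"
    if count >= medium:
        return "medium"
    return "low"


# Invented sample data for illustration only.
orders = [date(2023, 6, 1), date(2024, 1, 5), date(2024, 2, 10), date(2024, 3, 1)]
count = purchase_frequency(orders, as_of=date(2024, 3, 31))
label = frequency_label(count)
```

Separating the numeric feature from the rule that labels it lets the thresholds change without recomputing the underlying statistic.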
The query service adopts a Lambda three‑layer architecture: a Batch Layer (Spark, Hive) for historical data, a Speed Layer (Flink) for low‑latency increments, and a Serving Layer (MongoDB, Redis) that merges both to provide fast, accurate user profile queries for downstream applications such as risk detection, recommendation ranking and fine‑grained operation.
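The merge step in the Serving Layer can be sketched as follows. This is a simplified illustration, not the system's actual logic: in‑memory dictionaries stand in for the backing stores, each tag carries an assumed `updated_at` timestamp, and the rule "the newer value wins" is one common way to combine a batch view with a speed view. The source does not specify how the merge is done.

```python
from typing import Any, Dict, NamedTuple


class TagValue(NamedTuple):
    value: Any
    updated_at: int  # epoch seconds; an assumed field for this sketch


View = Dict[str, TagValue]


def merge_views(batch: View, speed: View) -> View:
    """Start from the batch view and let newer speed-layer values override it."""
    merged = dict(batch)
    for name, candidate in speed.items():
        current = merged.get(name)
        if current is None or candidate.updated_at >= current.updated_at:
            merged[name] = candidate
    return merged


def query_profile(user_id: str,
                  batch_store: Dict[str, View],
                  speed_store: Dict[str, View]) -> Dict[str, Any]:
    """Return the merged tag values for one user; unknown users yield {}."""
    merged = merge_views(batch_store.get(user_id, {}), speed_store.get(user_id, {}))
    return {name: tag.value for name, tag in merged.items()}


# Invented sample data: dicts stand in for the real batch and speed stores.
batch_store = {"u1": {"purchase_frequency": TagValue(3, 100),
                      "credit_score": TagValue(700, 100)}}
speed_store = {"u1": {"purchase_frequency": TagValue(4, 200),
                      "recent_query_rate": TagValue(0.5, 210)}}
result = query_profile("u1", batch_store, speed_store)
```

Here the batch view supplies `credit_score`, the speed view supplies `recent_query_rate`, and `purchase_frequency` takes the fresher speed‑layer value, so a downstream caller sees a single merged profile.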
In conclusion, building a robust user profiling system requires deep business understanding, careful tag design, reliable data pipelines, high‑availability deployment, and continuous monitoring; future work includes closing the tag‑generation loop and sharing B‑side data with the C‑side to address cold‑start problems.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.
