How Ctrip Builds a Scalable User Profile Platform for Personalized Travel
This article explains why Ctrip creates user profiles, describes the product and technical architectures, and details the data collection, computation, storage, high‑availability querying, and monitoring components that power its personalized travel recommendations and services.
1. Why Ctrip Builds User Profiles
Ctrip uses user profiles to power recommendation algorithms that match products to user preferences and to provide personalized services, thereby improving user experience and reducing unwanted interruptions.
2. Architecture of Ctrip User Profiles
2.1 Product Architecture
All profiles are registered in the UserProfile platform, reviewed, and then flow into the data warehouse. The pipeline includes registration, data collection, computation, storage/query, and monitoring.
2.2 Technical Architecture
Ctrip’s large‑scale system emphasizes loose coupling and high cohesion, and is managed per business unit (BU). Profiles are processed across BUs; the open‑source tools DataX and Storm move the data into a cross‑BU UserProfile data warehouse, which is cached in Redis and exposed through real‑time and Elasticsearch‑based APIs.
3. Components of Ctrip User Profiles
3.1 Data Collection
Basic information is gathered from UserInfo, UBT (behavior), orders, crawlers, and mobile apps. Each data source has its own dedicated collection process; the order‑information collection flow is one example.
3.2 Profile Computation
Collected raw data is transformed into valuable profiles. Asynchronous batch jobs (Hive, DataX) handle most calculations, while real‑time streams (Kafka + Storm) update time‑sensitive profiles such as user behavior.
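The split between batch and real‑time computation can be sketched in a few lines. This is only an illustration under stated assumptions: the class name RealtimeProfile, the event fields user_id and city, and the "recent cities" profile are all hypothetical, and an in‑memory dictionary stands in for both the Hive/DataX batch output and the Kafka + Storm stream.

```python
from collections import defaultdict, deque


class RealtimeProfile:
    """Toy profile store: batch tags plus a stream-updated recent-activity list.

    Hypothetical sketch; not Ctrip's actual data model.
    """

    def __init__(self, max_recent=3):
        # Tags produced by the (assumed) nightly batch job, keyed by user ID.
        self.batch_tags = {}
        # Time-sensitive data updated per event; only the newest items are kept.
        self.recent = defaultdict(lambda: deque(maxlen=max_recent))

    def load_batch(self, user_id, tags):
        """Replace a user's batch-computed tags with a fresh snapshot."""
        self.batch_tags[user_id] = dict(tags)

    def on_event(self, event):
        """Apply one behavior event from the (assumed) real-time stream."""
        self.recent[event["user_id"]].append(event["city"])

    def get(self, user_id):
        """Merge the batch tags with the real-time part of the profile."""
        profile = dict(self.batch_tags.get(user_id, {}))
        profile["recent_cities"] = list(self.recent.get(user_id, ()))
        return profile
```

The batch snapshot is overwritten wholesale, while the stream only touches the small, time‑sensitive part of the profile, so a late or failed batch run does not block behavior updates.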
3.3 Data Storage
The profile data, considered classic "big data," is stored in a sharded distributed warehouse with 160 shards across four physical clusters, employing cross‑IDC hot‑standby, SSDs, and other high‑availability technologies.
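As a rough illustration of how a user ID might be routed to one of 160 shards spread over four clusters, consider the sketch below. The article does not describe the actual routing rule, so the MD5‑based hash, the even split of 40 shards per cluster, and the function names are all assumptions.

```python
import hashlib
from typing import Tuple

NUM_SHARDS = 160      # figure from the article
NUM_CLUSTERS = 4      # figure from the article
# Assumption: shards are divided evenly, 40 per physical cluster.
SHARDS_PER_CLUSTER = NUM_SHARDS // NUM_CLUSTERS


def shard_for(user_id: str) -> int:
    """Map a user ID to a shard with a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo 160.
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def cluster_for(shard: int) -> int:
    """Map a shard number to its cluster: shards 0-39 go to cluster 0, and so on."""
    return shard // SHARDS_PER_CLUSTER


def locate(user_id: str) -> Tuple[int, int]:
    """Return (cluster, shard) for a user ID."""
    shard = shard_for(user_id)
    return cluster_for(shard), shard
```

A stable hash (rather than Python's built‑in hash, which is randomized per process for strings) keeps the same user on the same shard across restarts and machines.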
3.4 High‑Availability Query
API response time must stay below 250 ms. The real‑time services average 8 ms, with 99% of requests completing in under 11 ms, and rely on self‑degradation, circuit breaking, and traffic shaping to stay available. Batch queries over large user groups go through Elasticsearch.
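The circuit‑breaker and degradation ideas mentioned above can be shown with a minimal sketch. It is not Ctrip's implementation: the class name, the thresholds, and the fallback behavior are assumptions chosen for illustration.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker with a fallback (degraded) response.

    Hypothetical sketch: thresholds and names are illustrative only.
    """

    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable for testing
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Circuit open: skip the backend and serve the degraded answer.
                return fallback()
            # Cool-down elapsed: let one trial call through (half-open state).
            self.opened_at = None
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        return result
```

A caller would wrap the profile lookup, for example breaker.call(lambda: fetch_profile(uid), lambda: DEFAULT_PROFILE), where fetch_profile and DEFAULT_PROFILE are placeholders for the real query and a safe default. While the circuit is open the backend is not called at all, which protects both the latency budget and the struggling service.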
3.5 Monitoring and Tracing
Multi‑layer monitoring validates profile accuracy across dimensions such as user level, hotel star rating, and flight class, and tracks variance over time to trigger re‑evaluation of algorithms.
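One simple way to track variance over time is to compare the distribution of a profile value (for example, preferred hotel star rating) between two snapshots and raise a flag when it shifts too much. The sketch below uses total variation distance and a threshold of 0.1; both the metric and the threshold are assumptions, since the article does not say how Ctrip measures drift.

```python
from collections import Counter


def distribution(values):
    """Turn a list of profile values into a {value: share} mapping."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: count / total for key, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two distributions (0 = same, 1 = disjoint)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def needs_review(previous, current, threshold=0.1):
    """Flag a profile for algorithm re-evaluation when its distribution drifts.

    The threshold is a hypothetical value chosen for illustration.
    """
    drift = total_variation(distribution(previous), distribution(current))
    return drift > threshold
```

For instance, if the share of users tagged "5-star" jumps from 50% to 80% between two runs, the distance is 0.3 and the profile would be flagged, whereas a move from 50% to 48% would not.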
All these components form Ctrip’s cross‑BU user profile platform, which continues to evolve with new technologies.
21CTO
