
How Ctrip Scales Personalized Travel Recommendations: From Recall to Ranking

This article details Ctrip's end‑to‑end personalized recommendation system for travel, covering data collection, candidate recall methods, ranking models, feature engineering practices, and future directions, illustrating how millions of users receive tailored travel suggestions.

21CTO

Jan 18, 2018


Ctrip, a leading online travel agency (OTA) in China, serves tens of millions of users daily. It relies on personalized recommendation to reduce information overload and match users with suitable travel products.

1. Data

Machine learning is built on data, features, and models. Ctrip leverages product attributes (e.g., location, star rating), product statistics (orders, views, clicks), user profiles (age, gender, preferences), and user behavior (reviews, ratings, browsing, searches, bookings). Ratio statistics such as CTR are often smoothed with Bayesian methods, so that products with very few views do not receive extreme values.
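One common way to smooth CTR is to take the posterior mean under a Beta prior. The source does not say which Bayesian method Ctrip uses, so the sketch below is an assumption; the function name `smoothed_ctr` and the prior values `alpha` and `beta` are illustrative only.

```python
def smoothed_ctr(clicks, views, alpha=1.0, beta=20.0):
    """Posterior mean CTR under an assumed Beta(alpha, beta) prior.

    With no data the estimate equals the prior mean alpha / (alpha + beta);
    as views grow, it approaches the raw ratio clicks / views.
    """
    if clicks < 0 or views < 0 or clicks > views:
        raise ValueError("need 0 <= clicks <= views")
    return (clicks + alpha) / (views + alpha + beta)


# A product seen once and clicked once has a raw CTR of 1.0,
# but the smoothed value stays close to the prior mean.
low_traffic = smoothed_ctr(clicks=1, views=1)
# A product with many views is barely affected by the prior.
high_traffic = smoothed_ctr(clicks=500, views=10_000)
print(low_traffic, high_traffic)
```

In practice the prior parameters would be fitted to historical data (for example per category) rather than hard-coded.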

2. Recall

The recall stage generates a limited candidate set from millions of items, heavily influencing downstream ranking efficiency and quality. Sparse user‑item interactions in travel pose challenges, so Ctrip combines several effective approaches:

Real‑time Intention : Uses a Markov‑based model on recent user actions to predict immediate intent.

Business Rules : Applies domain‑specific constraints (e.g., recommend hotels only after a flight search).

Context‑Based : Considers seasonal contexts such as winter skiing or New Year travel.

LBS : Leverages GeoHash to filter nearby hotels, attractions, and restaurants based on the user's current location.

Collaborative Filtering : Implements a deep hybrid model, aSDAE (additional Stacked Denoising Autoencoder), which incorporates side information to address data sparsity and cold‑start problems.

Sequential Model : Combines Matrix Factorization with Markov chains, and explores RNN/LSTM‑based session recommendations.
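The Markov idea behind the real-time intention and sequential approaches above can be sketched as a first-order transition table. This is a simplified illustration only: it omits the Matrix Factorization component, and the action names and function names are hypothetical, not taken from Ctrip's system.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count how often each action is immediately followed by another."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, last_action, k=2):
    """Return up to k most likely next actions with their probabilities."""
    following = counts.get(last_action)
    if not following:
        return []  # unseen action: fall back to other recall sources
    total = sum(following.values())
    return [(action, n / total) for action, n in following.most_common(k)]


# Hypothetical session logs, each ordered by time.
sessions = [
    ["flight_search", "hotel_view", "hotel_book"],
    ["flight_search", "hotel_view", "attraction_view"],
    ["flight_search", "car_rental_view"],
]
transitions = fit_transitions(sessions)
print(predict_next(transitions, "flight_search"))
```

A first-order chain only looks at the most recent action; the RNN/LSTM session models mentioned above are one way to condition on longer histories.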

Other deep models (DNN, AE, CNN) are also applied where appropriate.
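For the LBS approach listed above, a GeoHash turns a latitude/longitude pair into a short string so that nearby points usually share a prefix. The following is a minimal sketch of the standard encoding plus a prefix filter; the helper `same_cell_candidates` and the sample coordinates are assumptions for illustration, not Ctrip's implementation.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a point by alternately bisecting longitude and latitude."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # the first bit refines longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def same_cell_candidates(items, lat, lon, precision=4):
    """Keep items whose geohash matches the user's cell at this precision."""
    cell = geohash_encode(lat, lon, precision)
    return [name for name, (ilat, ilon) in items.items()
            if geohash_encode(ilat, ilon, precision) == cell]


# Hypothetical products with (latitude, longitude).
items = {
    "hotel_near": (31.2400, 121.4900),
    "hotel_far": (39.9042, 116.4074),
}
print(same_cell_candidates(items, lat=31.2304, lon=121.4737))
```

Two points on opposite sides of a cell boundary can be close yet have different hashes, so a production filter would normally also search the neighboring cells.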

3. Ranking

Personalized ranking can be framed as a multi‑task learning problem in which each user is a separate task. In practice it is often reduced to a single model by adding conjunction features (user‑product cross features). Commonly used models include Logistic Regression with L1 regularization, Factorization Machines for feature crossing, and Wide‑&‑Deep architectures in which the wide part may be replaced by GBDT‑generated features.
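To make the feature-crossing role of Factorization Machines concrete, here is a small pure-Python sketch of the second-order FM score. It uses the standard identity that computes all pairwise interactions in time linear in the number of features. The weights and the input vector are made-up values, not trained parameters.

```python
def fm_score(x, w0, w, V):
    """Second-order FM: w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j.

    The pairwise term is computed per latent factor f as
    0.5 * ((sum_i V[i][f] x_i)^2 - sum_i (V[i][f] x_i)^2).
    """
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        terms = [V[i][f] * x[i] for i in range(len(x))]
        pairwise += 0.5 * (sum(terms) ** 2 - sum(t * t for t in terms))
    return linear + pairwise


def fm_score_naive(x, w0, w, V):
    """Same score with an explicit double loop, used only as a check."""
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            score += dot * x[i] * x[j]
    return score


# Hypothetical one-hot style input: features 0 and 2 are active.
x = [1.0, 0.0, 1.0]
w0 = 0.1
w = [0.2, -0.1, 0.3]
V = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # one 2-d latent vector per feature
print(fm_score(x, w0, w, V), fm_score_naive(x, w0, w, V))
```

Because each feature gets its own latent vector, the model can score a user-product pair even when that exact pair never appeared in training, which is why FMs suit sparse cross features.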

Feature engineering remains indispensable and is divided into explicit and semi‑explicit combinations.

Explicit Feature Combination : Discretize the features, then combine them with a Cartesian product or inner product. How a feature is prepared depends on its type:

Numerical – discretized via equal‑frequency, equal‑width, or supervised methods (1R, entropy‑based).

Ordinal – encoded to reflect order (e.g., three levels of hygiene quality).

Categorical – transformed using OHE, Dummy Encoding, or Hash Trick.
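The preprocessing steps above can be sketched in a few lines of plain Python: equal-width and equal-frequency binning for a numerical feature, the hash trick for a categorical one, and a Cartesian-product cross of two discretized features. All function names and the sample prices are illustrative assumptions; the supervised binning methods (1R, entropy-based) are not shown.

```python
import zlib


def equal_width_bins(values, n_bins):
    """Split the value range into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign bins by rank so each bin holds roughly the same count."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def one_hot(index, size):
    vec = [0] * size
    vec[index] = 1
    return vec


def hashed_index(name, value, n_buckets):
    """Hash trick: map 'name=value' to a fixed number of buckets."""
    return zlib.crc32(f"{name}={value}".encode("utf-8")) % n_buckets


def cross(a_index, a_size, b_index, b_size):
    """Cartesian product of two discrete features as a single one-hot."""
    return one_hot(a_index * b_size + b_index, a_size * b_size)


prices = [100, 120, 150, 400, 900, 2000]  # hypothetical nightly prices
print(equal_width_bins(prices, 3))
print(equal_frequency_bins(prices, 3))
print(cross(a_index=1, a_size=3, b_index=0, b_size=2))
```

With skewed values such as prices, equal-width binning puts most samples in the first bin, while equal-frequency binning spreads them evenly. That difference is one reason to compare both before choosing.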

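Semi‑Explicit Feature Combination : Tree‑based models (GBDT, Random Forest) route each sample down every tree to a leaf. Because the path to a leaf is a conjunction of split conditions, the leaf index acts as a high‑order feature interaction. The leaf indices are then one‑hot encoded and fed to a downstream model.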
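The leaf-index encoding can be illustrated without any library by using two tiny hand-written trees as stand-ins for a trained GBDT. The tree structures, feature names, and thresholds below are invented for the example; in a real pipeline they would come from a fitted model.

```python
def leaf_index(tree, row):
    """Walk a tree until a leaf (an int) is reached and return its index.

    Internal nodes are tuples: (feature, threshold, left, right).
    """
    node = tree
    while not isinstance(node, int):
        feature, threshold, left, right = node
        node = left if row[feature] <= threshold else right
    return node


def leaf_features(trees, leaf_counts, row):
    """Concatenate one one-hot vector per tree marking the reached leaf."""
    encoded = []
    for tree, n_leaves in zip(trees, leaf_counts):
        vec = [0] * n_leaves
        vec[leaf_index(tree, row)] = 1
        encoded.extend(vec)
    return encoded


# Hypothetical trees standing in for a trained ensemble.
tree_a = ("price", 500, ("star", 3, 0, 1), 2)  # 3 leaves
tree_b = ("distance_km", 2.0, 0, 1)            # 2 leaves
trees, leaf_counts = [tree_a, tree_b], [3, 2]

row = {"price": 300, "star": 4, "distance_km": 5.0}
print(leaf_features(trees, leaf_counts, row))
```

Here leaf 1 of the first tree means "price <= 500 and star > 3", so a single binary feature already captures a two-way interaction that a linear model could not learn from the raw inputs alone.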

4. Summary

The complete recommendation system integrates recall, ranking, list generation, data processing, infrastructure, and front‑end display. Ctrip’s platform serves over ten business lines and sixty scenarios. Future work aims to incorporate more deep models, online learning, reinforcement learning, and transfer learning to further improve recommendation quality.
