Optimizing Recall in Travel Recommendation Systems: Challenges and Solutions at Alibaba's Fliggy
This article explains how Fliggy's travel recommendation platform tackles recall challenges such as cold‑start users, sparse behavior, itinerary‑specific needs, and periodic repurchase. It covers user‑attribute models, graph embeddings, dual‑tower architectures, session‑based methods, and statistical repurchase forecasting, all aimed at improving candidate selection and overall recommendation performance.
Recall is a fundamental module of recommendation systems that selects a subset of items from a massive pool as candidates for higher‑level ranking, directly determining the upper bound of recommendation effectiveness.
Common recall methods include user‑profile‑based recall, collaborative filtering, and embedding‑similarity retrieval. Travel scenarios add specific challenges: long demand cycles, sparse and divergent user behavior, severe cold‑start issues, popularity bias, and the need to handle itinerary‑specific requirements.
The Fliggy travel recall presentation covers four main topics: cold‑start user recall, itinerary expression and recall, behavior‑based recall, and periodic repurchase recall.
Fliggy's recommendation pipeline consists of five stages—full item pool, recall, coarse ranking, fine ranking, and mixing. Recall is performed offline using vector retrieval or scoring, while later ranking stages may use real‑time scoring or online learning.
For cold‑start users, three strategies are used: global hot items, cross‑domain mapping, and multi‑attribute user‑based recall (UserAttr2I). The UserAttr2I solution evolved from a linear model trained with FTRL to a deep network (DeepNet) with GBDT feature selection and attention, and it achieves the highest click‑through rate among the cold‑start methods.
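To make the idea of attribute‑based cold‑start recall concrete, here is a minimal sketch. It is not Fliggy's FTRL or DeepNet model: it swaps in a simple count‑based scorer that ranks items by how often users sharing each attribute value clicked them, and falls back to global hot items when no attribute matches. The function names, the log format, and the sample attributes are all hypothetical.

```python
from collections import Counter, defaultdict


def build_attr_index(click_log):
    """Aggregate clicks per (attribute, value) pair and globally.

    click_log: iterable of (user_attrs: dict, item_id) pairs (assumed format).
    """
    attr_item_counts = defaultdict(Counter)
    global_counts = Counter()
    for attrs, item in click_log:
        global_counts[item] += 1
        for key, value in attrs.items():
            attr_item_counts[(key, value)][item] += 1
    return attr_item_counts, global_counts


def recall_cold_start(user_attrs, attr_item_counts, global_counts, k=3):
    """Score items by summing their click share under each user attribute."""
    scores = Counter()
    for key, value in user_attrs.items():
        counts = attr_item_counts.get((key, value))
        if not counts:
            continue
        total = sum(counts.values())
        for item, count in counts.items():
            scores[item] += count / total
    if not scores:
        # No attribute matched: fall back to global hot items.
        return [item for item, _ in global_counts.most_common(k)]
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


click_log = [
    ({"city": "Hangzhou", "age": "20s"}, "ski_pass"),
    ({"city": "Hangzhou", "age": "30s"}, "ski_pass"),
    ({"city": "Hangzhou", "age": "20s"}, "beach_hotel"),
    ({"city": "Beijing", "age": "30s"}, "museum_ticket"),
]
attr_index, hot_items = build_attr_index(click_log)
print(recall_cold_start({"city": "Hangzhou"}, attr_index, hot_items, k=2))
```

A learned model replaces the click‑share score with a predicted click probability over attribute features, but the serving pattern (attributes in, ranked candidate items out) is the same.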
Order2I represents individual orders as vectors and applies graph embedding to learn pairing relationships (e.g., visa, Wi‑Fi, hotels). Challenges such as data sparsity, noisy graphs, and limited coverage are addressed by constructing a weighted heterogeneous graph using travel knowledge graphs and performing weighted graph embedding.
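One common way to run graph embedding on a weighted graph is to generate random walks whose transition probabilities follow the edge weights, then feed the walks to a skip‑gram model. The sketch below shows only the walk‑generation step, under assumptions: the graph layout (a dict of neighbor weights), the node names, and the weights are hypothetical, and the source does not specify which embedding algorithm Fliggy uses.

```python
import random


def weighted_walk(graph, start, length, rng):
    """Walk from `start`, picking each next node in proportion to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break  # dead end: stop the walk early
        nodes = list(neighbors)
        weights = [neighbors[node] for node in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def generate_walks(graph, walks_per_node, length, seed=0):
    """Generate `walks_per_node` weighted walks starting from every node."""
    rng = random.Random(seed)
    return [
        weighted_walk(graph, node, length, rng)
        for node in graph
        for _ in range(walks_per_node)
    ]


# Hypothetical graph: order nodes and item nodes, with edge weights that
# could come from co-purchase counts or knowledge-graph relations.
graph = {
    "order:flight_to_tokyo": {"item:japan_visa": 3.0, "item:japan_wifi": 2.0},
    "item:japan_visa": {"order:flight_to_tokyo": 3.0, "item:tokyo_hotel": 1.0},
    "item:japan_wifi": {"order:flight_to_tokyo": 2.0},
    "item:tokyo_hotel": {"item:japan_visa": 1.0},
}
walks = generate_walks(graph, walks_per_node=2, length=5)
print(walks[0])
```

Heavier edges are visited more often, so strongly paired nodes co‑occur more in the walks and end up closer in the embedding space after skip‑gram training.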
Journey2I aggregates multiple orders within a trip, enriches features with itinerary intent, user attributes, and attention mechanisms, and employs a dual‑tower architecture with offline item vectors and online vector retrieval, resulting in a clear CTR improvement over Order2I.
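The serving side of a dual‑tower setup can be sketched as follows: item vectors are precomputed offline, and at request time a journey vector is scored against them by inner product to select the top candidates. This is an illustration under assumptions, not Fliggy's implementation. The journey tower is replaced by plain mean pooling of order vectors (the source describes a richer tower with attention), the two‑dimensional vectors are made up, and a production system would use an approximate nearest‑neighbor index rather than the brute‑force scan shown here.

```python
import heapq


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def journey_vector(order_vectors):
    """Stand-in for the journey tower: mean-pool the trip's order vectors."""
    n = len(order_vectors)
    return [sum(dim) / n for dim in zip(*order_vectors)]


def retrieve_top_k(query_vec, item_vectors, k):
    """Brute-force top-k retrieval by inner product over offline item vectors."""
    scored = ((dot(query_vec, vec), item_id) for item_id, vec in item_vectors.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Hypothetical item vectors, assumed to be produced offline by the item tower.
item_vectors = {
    "tokyo_hotel": [0.9, 0.1],
    "japan_wifi": [0.7, 0.3],
    "paris_tour": [-0.8, 0.5],
}
# Two orders in the same trip, for example a flight and a visa.
query = journey_vector([[1.0, 0.2], [1.0, -0.2]])
print(retrieve_top_k(query, item_vectors, k=2))
```

Because the two towers only interact through the final inner product, item vectors can be indexed ahead of time and only the journey side needs to be computed per request.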
Behavior‑based recall leverages session‑based I2I to handle sparse and divergent travel behavior. Traditional collaborative filtering (ItemCF, Swing) is enhanced by session segmentation and a meta‑path graph embedding that incorporates the travel knowledge graph, improving relevance while maintaining coverage.
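A minimal sketch of session‑based I2I follows. It assumes sessions are cut wherever the gap between consecutive events exceeds a threshold (the source does not say how Fliggy segments sessions), and it uses cosine‑normalized co‑occurrence as the similarity, which is a plain ItemCF variant rather than Swing or the meta‑path embedding. All names and the 30‑minute default are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def split_sessions(events, gap_seconds=1800):
    """Split one user's time-sorted (timestamp, item_id) events into sessions."""
    sessions, current, last_ts = [], [], None
    for ts, item in events:
        if last_ts is not None and ts - last_ts > gap_seconds:
            sessions.append(current)
            current = []
        current.append(item)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions


def session_i2i(sessions):
    """Item-to-item similarity from co-occurrence within the same session."""
    pair_counts = defaultdict(Counter)
    item_counts = Counter()
    for session in sessions:
        items = sorted(set(session))
        item_counts.update(items)
        for a, b in combinations(items, 2):
            pair_counts[a][b] += 1
            pair_counts[b][a] += 1
    return {
        a: {b: c / (item_counts[a] * item_counts[b]) ** 0.5 for b, c in nbrs.items()}
        for a, nbrs in pair_counts.items()
    }


events = [(0, "sanya_hotel"), (100, "sanya_diving"), (5000, "harbin_ski"), (5100, "sanya_hotel")]
sessions = split_sessions(events)
similarity = session_i2i(sessions)
print(sessions)
print(similarity["sanya_hotel"])
```

Restricting co‑occurrence to a session keeps items from unrelated trips (here, diving and skiing) from being paired just because the same user viewed both.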
Periodic repurchase recall (Rebuy2I) predicts the probability of a user repurchasing a hotel at a specific time using a Poisson‑Gamma statistical model, adjusting for purchase frequency and interval, and demonstrates superior performance compared to retargeting and hot‑item recall.
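The Poisson‑Gamma idea can be illustrated with the standard conjugate setup: purchases of one hotel by one user arrive as a Poisson process with an unknown rate, and that rate has a Gamma prior. After observing a number of purchases over a period, the posterior is again a Gamma, and the probability of at least one purchase in a future window has a closed form. The sketch below covers only this frequency part. The prior values are hypothetical, and the interval adjustment mentioned above (for example, time since the last purchase) is not modeled.

```python
def repurchase_probability(purchase_count, observed_days, window_days,
                           prior_shape=1.0, prior_rate=30.0):
    """P(at least one purchase in the next `window_days`) under Poisson-Gamma.

    Prior on the daily purchase rate: Gamma(prior_shape, prior_rate).
    Posterior after the observations: Gamma(prior_shape + purchase_count,
                                            prior_rate + observed_days).
    Predictive P(no purchase in t days) = (rate / (rate + t)) ** shape.
    """
    shape = prior_shape + purchase_count
    rate = prior_rate + observed_days
    return 1.0 - (rate / (rate + window_days)) ** shape


# A user who booked the same hotel 3 times over 335 observed days.
print(repurchase_probability(purchase_count=3, observed_days=335, window_days=365))
```

With the default prior, the example gives a posterior of Gamma(4, 365), so the probability of no purchase in the next 365 days is (365 / 730)^4 = 0.0625 and the repurchase probability is 0.9375. The prior pulls users with very few observations toward a population‑level rate, which keeps a single purchase from producing an overconfident score.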
The concluding insights emphasize understanding business scenarios, data‑driven problem discovery, targeted model and feature improvements, and balancing recall with ranking to achieve optimal trade‑offs between effectiveness and efficiency.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.