How Alibaba Boosts Search Relevance with Advanced User Modeling and Self‑Attention
This article details Alibaba’s Taobao search CTR/CVR user modeling approach, covering background, model architecture with self‑attention and attention pooling, handling short‑term, long‑term, and on‑device behavior sequences, experimental results showing AUC improvements, and future directions.
Background and Significance
User modeling is a core technology for search and recommendation. In Taobao search, the ranking target is the (user, query, item) triple. Item features are dense and stable, whereas user features are sparse, so the model depends on a wide range of generalizing user features.
Static user and item features improve model generalization, but incorporating real‑time user behavior greatly enhances sample discrimination and classification accuracy. User modeling is thus treated as a problem of abstracting and organizing user information.
We continuously enrich modeling methods:
User profile to represent static user attributes.
Preference tags mined from behavior to predict general user preferences.
Real‑time behavior modeling for fine‑grained interest description.
Behavior data are organized along two axes: time span (short‑term vs. long‑term) and content (explicit clicks and purchases vs. implicit exposures).
Model Architecture
The overall model concatenates user profile, multiple user behavior sequence features, target item features, and real‑time context (weather, network, time) before feeding them into a DNN classifier. Sequence modeling uses self‑attention and attention‑pooling: self‑attention captures inter‑item dependencies, while attention‑pooling matches sequence items to the current query.
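The data flow described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the production model: it assumes single-head attention with identity projections (no learned weights), and the function names (self_attention, attention_pooling, build_dnn_input) and toy vectors are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(seq):
    # Every item attends to every item in the same sequence,
    # which is how inter-item dependencies are captured.
    d = len(seq[0])
    out = []
    for q in seq:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in seq])
        out.append([sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)])
    return out

def attention_pooling(seq, query):
    # Collapse the sequence into one vector, weighting each item
    # by how well it matches the current query vector.
    d = len(query)
    weights = softmax([dot(query, k) / math.sqrt(d) for k in seq])
    return [sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)]

def build_dnn_input(profile, behavior_seqs, query, target_item, context):
    # Concatenate all feature groups into one flat classifier input.
    pooled = []
    for seq in behavior_seqs:
        pooled += attention_pooling(self_attention(seq), query)
    return profile + pooled + target_item + context

# Toy example: one click sequence of three 2-d item embeddings.
clicks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = build_dnn_input([0.5], [clicks], [1.0, 0.0], [0.2, 0.3], [0.0, 1.0])
print(len(x))  # 1 profile + 2 pooled + 2 item + 2 context dims
```

In a real system each sequence would have its own learned projections and the concatenated vector would feed a multi-layer DNN; the sketch only shows the order of operations.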
User Data and Modeling
User profile provides static attributes supplementing user_id. Real‑time behavior (click, add‑to‑cart, purchase) is crucial for capturing current interest. We define a unified behavior schema containing item attributes (item_id, seller_id, etc.) and behavior attributes (type, timestamp, position).
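One way to picture the unified behavior schema is as a small record type. The sketch below is an assumption for illustration: the article names item_id, seller_id, type, timestamp, and position, while category_id, the allowed type strings, and the to_sequence helper are hypothetical additions.

```python
from dataclasses import dataclass

# Behavior types named in the text; the exact string values are assumed.
ALLOWED_TYPES = {"click", "add_to_cart", "purchase"}

@dataclass(frozen=True)
class BehaviorEvent:
    # Item attributes
    item_id: int
    seller_id: int
    category_id: int      # assumed; stands in for the "etc." in the text
    # Behavior attributes
    behavior_type: str
    timestamp: int        # seconds since epoch (assumed unit)
    position: int         # position where the behavior occurred

    def __post_init__(self):
        if self.behavior_type not in ALLOWED_TYPES:
            raise ValueError(f"unknown behavior type: {self.behavior_type}")

def to_sequence(events, max_len=50):
    # Order events by time and keep only the most recent max_len.
    return sorted(events, key=lambda e: e.timestamp)[-max_len:]

events = [
    BehaviorEvent(101, 7, 10, "purchase", 300, 2),
    BehaviorEvent(102, 8, 20, "click", 100, 1),
    BehaviorEvent(103, 7, 10, "add_to_cart", 200, 5),
]
seq = to_sequence(events, max_len=2)
print([e.item_id for e in seq])
```

Using one schema for all behavior types means every sequence (clicks, cart additions, purchases) can share the same embedding and attention code.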
Short‑term (mid‑term) behavior sequences are filtered by query‑predicted categories to keep only relevant history. Long‑term behavior is defined as the user's transactions over the past two years, divided into quarterly sub‑sequences to preserve seasonal preferences.
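The two preprocessing steps above, category filtering and quarterly splitting, might look like the following. This is a sketch under stated assumptions: events are plain dicts with hypothetical keys category_id and timestamp, and purchases from the same calendar quarter of different years are merged into one bucket. Keying by (year, quarter) instead would be an equally plausible reading of the text.

```python
from collections import defaultdict
from datetime import datetime, timezone

def filter_by_categories(events, predicted_categories):
    # Keep only history whose category matches the query's predicted categories.
    return [e for e in events if e["category_id"] in predicted_categories]

def split_by_quarter(events):
    # Group time-ordered events by calendar quarter (1 to 4).
    buckets = defaultdict(list)
    for e in sorted(events, key=lambda e: e["timestamp"]):
        t = datetime.fromtimestamp(e["timestamp"], tz=timezone.utc)
        quarter = (t.month - 1) // 3 + 1
        buckets[quarter].append(e)
    return dict(buckets)

def ts(year, month, day):
    # Helper for the example: UTC date to epoch seconds.
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

events = [
    {"item_id": 1, "category_id": 10, "timestamp": ts(2023, 1, 15)},
    {"item_id": 2, "category_id": 20, "timestamp": ts(2023, 7, 4)},
    {"item_id": 3, "category_id": 10, "timestamp": ts(2024, 2, 1)},
]
relevant = filter_by_categories(events, {10})
by_quarter = split_by_quarter(events)
print([e["item_id"] for e in relevant])
print({q: [e["item_id"] for e in evs] for q, evs in by_quarter.items()})
```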
Short‑Term Sequence Modeling
We replace dot‑product attention with cosine similarity + scaling to improve softmax logits discrimination. Query‑aware attention pooling further activates history items consistent with the current query.
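To see why cosine similarity plus scaling can help, consider the sketch below. Cosine scores are bounded in [-1, 1], so they ignore embedding norms, but fed directly into a softmax they give nearly flat weights; multiplying by a scale factor restores contrast. The scale value of 10.0 is an assumption for illustration; the article does not state the value used or whether it is learned.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b, eps=1e-8):
    # Cosine similarity; eps guards against zero-norm vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / max(na * nb, eps)

def cosine_attention_weights(query, keys, scale=10.0):
    # Scaled cosine logits followed by softmax.
    return softmax([scale * cosine(query, k) for k in keys])

query = [1.0, 0.0]
keys = [[1.0, 0.0], [100.0, 0.0], [0.0, 1.0]]

flat = cosine_attention_weights(query, keys, scale=1.0)
sharp = cosine_attention_weights(query, keys, scale=10.0)
print([round(w, 3) for w in flat])
print([round(w, 3) for w in sharp])
```

The first two keys point in the same direction with very different norms and receive equal weight, which plain dot-product attention would not give them.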
Long‑Term Sequence Modeling
Quarterly sequences are embedded, masked, and passed through multi‑layer self‑attention. Quarterly representations are concatenated (or pooled) to form the final long‑term preference vector, enabling seasonal personalization.
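The quarterly pipeline can be sketched as masked self-attention per quarter followed by concatenation. The code assumes identity projections, two layers, and masked mean pooling within each quarter; these choices and the function names are illustrative assumptions, not details given in the article.

```python
import math

def masked_self_attention(seq, mask):
    # mask[j] is False for padding; padded items are never attended to
    # and produce a zero output row.
    d = len(seq[0])
    out = []
    for i, q in enumerate(seq):
        if not mask[i]:
            out.append([0.0] * d)
            continue
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) if mask[j] else -math.inf
            for j, k in enumerate(seq)
        ]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]  # exp(-inf) is 0.0
        total = sum(exps)
        out.append([sum(e / total * v[c] for e, v in zip(exps, seq)) for c in range(d)])
    return out

def encode_quarter(seq, mask, num_layers=2):
    # Stack self-attention layers, then mean-pool the valid positions.
    d = len(seq[0])
    h = seq
    for _ in range(num_layers):
        h = masked_self_attention(h, mask)
    valid = [v for v, keep in zip(h, mask) if keep]
    if not valid:
        return [0.0] * d  # quarter with no purchases
    return [sum(v[c] for v in valid) / len(valid) for c in range(d)]

def long_term_vector(quarters):
    # Concatenate one pooled vector per quarter.
    vec = []
    for seq, mask in quarters:
        vec += encode_quarter(seq, mask)
    return vec

quarter = ([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]], [True, True, False])
empty = ([[0.0, 0.0]], [False])
vec = long_term_vector([quarter, quarter, quarter, empty])
print(len(vec))
```

Keeping one slot per quarter, even when a quarter is empty, keeps the output length fixed so downstream layers see seasonal preferences in stable positions.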
On‑Device Click and Exposure Modeling
On‑device click sequences provide millisecond‑level real‑time data, including detailed page interactions. Modeling follows the same self‑attention + attention‑pooling pipeline, using short‑term sequence vectors as queries.
Exposure (impression) sequences, representing items shown but not clicked, are aggregated via mean pooling and contrasted with positive clicks using a margin loss to differentiate user dislike from relevance.
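The article does not give the exact form of the margin loss, so the sketch below assumes a common hinge formulation: the user interest vector should be closer (in cosine similarity) to a clicked item than to the mean-pooled exposure vector by at least a margin. The function names, the margin of 0.5, and the use of cosine similarity are all assumptions.

```python
import math

def mean_pool(vectors):
    # Average a list of equal-length vectors into one vector.
    d = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(d)]

def cosine(a, b, eps=1e-8):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / max(na * nb, eps)

def exposure_margin_loss(user_vec, clicked_vec, exposure_vecs, margin=0.5):
    # Hinge loss: zero once the clicked item beats the pooled
    # exposures by at least `margin` in similarity to the user.
    negative = mean_pool(exposure_vecs)
    pos_sim = cosine(user_vec, clicked_vec)
    neg_sim = cosine(user_vec, negative)
    return max(0.0, margin - pos_sim + neg_sim)

user = [1.0, 0.0]
# Click aligned with the user vector, exposures orthogonal: no penalty.
good = exposure_margin_loss(user, [1.0, 0.0], [[0.0, 1.0], [0.0, 3.0]])
# Click orthogonal, exposures aligned: penalized.
bad = exposure_margin_loss(user, [0.0, 1.0], [[1.0, 0.0], [3.0, 0.0]])
print(good, bad)
```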
Experiments and Analysis
Datasets consist of online exposure and click logs; training uses recent N days, testing on the following day. Evaluation focuses on AUC improvements.
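The evaluation protocol can be illustrated with a time-based split and a direct AUC computation. This is a sketch: the sample layout (dicts with an integer day field) and the function names are hypothetical, and the pairwise AUC below is quadratic in sample count, so it is only suitable for small examples.

```python
def split_by_day(samples, test_day, n_train_days):
    # Train on the n_train_days days before test_day, test on test_day.
    train = [s for s in samples if test_day - n_train_days <= s["day"] < test_day]
    test = [s for s in samples if s["day"] == test_day]
    return train, test

def auc(labels, scores):
    # Probability that a random positive is scored above a random
    # negative, counting ties as half a win.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

samples = [{"day": d, "label": d % 2} for d in range(1, 9)]
train, test = split_by_day(samples, test_day=8, n_train_days=3)
print([s["day"] for s in train], [s["day"] for s in test])

value = auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(value)  # 3 of 4 positive/negative pairs are ordered correctly
```

Splitting by time rather than at random avoids leaking future behavior into training, which matters when the features themselves are behavior sequences.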
Compared with baselines (e.g., DUPN), our optimizations yield up to 0.3% absolute AUC gain from sequence modeling and up to 0.7% from new sequence features.
Attention weight visualizations show that early training emphasizes recent items, while later training captures inter‑item relationships, reflected in diagonal‑heavy attention maps.
Removing self‑attention reduces AUC by ~0.001, confirming its importance for modeling item dependencies.
Target attention was omitted in search because explicit query intent already provides strong signals and target attention adds computational overhead.
Long‑term quarterly modeling demonstrates similar attention patterns, confirming the relevance of seasonal behavior.
On‑device click attention maps exhibit strong diagonal and neighboring weights, indicating temporal proximity effects.
Incorporating exposure items improves AUC by 0.002, showing that non‑clicked impressions convey useful negative signals.
Conclusion and Outlook
We presented a comprehensive user modeling framework for Taobao search CTR/CVR, leveraging enriched user profiles, real‑time behavior, and multi‑scale sequence modeling. Deployed in the Double‑11 campaign, the system significantly boosted GMV. Future work includes finer user data perception, more scientific data organization, and exploring better model structures.
References
Devlin J, Chang MW, Lee K, et al. BERT: Pre‑training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805, 2018.
Vaswani A, Shazeer N, Parmar N, et al. Attention Is All You Need. NeurIPS, 2017.
Ni Y, Ou D, Liu S, et al. Perceive Your Users in Depth: Learning Universal User Representations from Multiple E‑Commerce Tasks. KDD, 2018.
Zhou G, Mou N, Fan Y, et al. Deep Interest Evolution Network for Click‑Through Rate Prediction. AAAI, 2019.
Ren K, Qin J, Fang Y, et al. Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction. arXiv:1905.00758, 2019.
Li C, Liu Z, Wu M, et al. Multi‑Interest Network with Dynamic Routing for Recommendation at Tmall. CIKM, 2019.
Pi Q, Bian W, Zhou G, et al. Practice on Long Sequential User Behavior Modeling for Click‑Through Rate Prediction. KDD, 2019.
Ouyang W, Zhang X, Li L, et al. Deep Spatio‑Temporal Neural Networks for Click‑Through Rate Prediction. KDD, 2019.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
