Co‑training Disentangled Domain Adaptation Network for Leveraging Popularity Bias in Recommender Systems
This presentation introduces a decoupled domain‑adaptation network that separates popularity and attribute representations to mitigate popularity bias in recommender systems, describing the problem, existing IPS and causal‑inference solutions, the CD2AN architecture, experimental results, and practical Q&A.
The talk begins by describing the "Good Goods" scenario on Taobao, where popular items dominate exposure in both the first‑jump selection page and the second‑jump content page, producing a Matthew effect that harms long‑tail items and reduces personalization.
Popularity bias is defined as the tendency of recommender models to over‑recommend high‑exposure items, causing two main harms: insufficient personalization for users and reduced motivation for creators of niche products.
Current industry solutions fall into two categories. Inverse Propensity Scoring (IPS) down‑weights high‑exposure items, but depends on exposure‑probability estimates that are hard to obtain stably. Causal‑inference methods instead model a causal graph (user features u, item features i, click probability c, and popularity factor z) and block the influence of popularity on item representations.
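To make the IPS idea concrete, here is a minimal numpy sketch of a propensity‑weighted loss. The clipping threshold and the specific weighting form are assumptions for illustration; the talk only states that IPS down‑weights high‑exposure items and that the propensity estimates are unstable.

```python
import numpy as np

def ips_weighted_loss(losses, exposure_probs, clip=0.1):
    """Inverse Propensity Scoring: weight each sample's loss by 1/p(exposure),
    so frequently exposed (popular) items contribute less per impression.
    Propensities are clipped from below, a common trick to bound the
    variance of the 1/p weights when estimates are unreliable."""
    probs = np.clip(exposure_probs, clip, 1.0)
    weights = 1.0 / probs
    return float(np.mean(weights * losses))
```

With equal per‑sample losses, an item exposed with probability 0.2 receives four times the weight of one exposed with probability 0.8, which is exactly the re‑balancing effect described above.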
The proposed CD2AN (Co‑training Disentangled Domain Adaptation Network) framework addresses both popularity‑distribution and long‑tail distribution gaps. It first disentangles item ID embeddings into a popularity vector and an attribute vector using a feature‑disentanglement module with orthogonal and popularity‑similarity regularizations.
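The orthogonal regularization in the disentanglement module can be sketched as below. The exact penalty used by CD2AN is not specified in the talk; this version (mean squared cosine similarity between the two vectors over a batch) is one standard choice and should be read as an assumption.

```python
import numpy as np

def orthogonal_reg(pop_vecs, attr_vecs, eps=1e-8):
    """Penalize overlap between the popularity and attribute representations
    of a batch of items: mean squared cosine similarity, which is 0 when the
    two vectors are orthogonal and 1 when they are collinear."""
    pop_n = pop_vecs / (np.linalg.norm(pop_vecs, axis=1, keepdims=True) + eps)
    attr_n = attr_vecs / (np.linalg.norm(attr_vecs, axis=1, keepdims=True) + eps)
    cos = np.sum(pop_n * attr_n, axis=1)
    return float(np.mean(cos ** 2))
```

Driving this penalty toward zero encourages the popularity vector and the attribute vector to carry non‑redundant information, which is the stated goal of the disentanglement module.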
To align distributions, the model introduces a popularity encoder that learns true popularity from item statistics and applies a Maximum Mean Discrepancy (MMD) loss to pull the domains of popular and tail items closer. Gradients from exposed samples are stopped on the domain‑alignment loss, and knowledge distillation from ranking scores provides supervision for unexposed samples.
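A minimal sketch of the MMD term follows, assuming an RBF kernel and the biased (V‑statistic) estimator; the talk does not state which kernel or estimator CD2AN uses.

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0):
    """Squared Maximum Mean Discrepancy between two sets of embeddings
    (e.g. popular items X vs. long-tail items Y) under an RBF kernel.
    Near 0 when the two embedding distributions match; grows as they drift."""
    def k(A, B):
        # Pairwise squared Euclidean distances, then the RBF kernel.
        d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
        return np.exp(-gamma * d)
    return float(k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean())
```

Minimizing this term over the attribute representations is what brings the popular and tail domains together while the popularity encoder absorbs the popularity signal.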
Instance‑level alignment is achieved via contrastive learning on co‑occurrence pairs extracted from user behavior sequences, encouraging target items to be close to their contextual items.
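The instance‑level contrastive term can be sketched as an InfoNCE‑style loss over co‑occurrence pairs. The InfoNCE form, cosine similarity, and temperature value are assumptions; the talk only says that contrastive learning pulls a target item toward items it co‑occurs with in user behavior sequences.

```python
import numpy as np

def infonce_cooccur(target, context, negatives, tau=0.1):
    """Contrastive loss on one co-occurrence pair: pull the target item's
    embedding toward an item it co-occurs with (the positive) and away
    from sampled negatives. Lower loss = target closer to its context."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    pos = np.exp(cos(target, context) / tau)
    neg = sum(np.exp(cos(target, n) / tau) for n in negatives)
    return float(-np.log(pos / (pos + neg)))
```

The loss is near zero when the target embedding matches its contextual item and large when it matches a negative instead, which is the alignment behavior described above.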
A biased‑unbiased joint training scheme combines a bias‑only tower (capturing popularity bias) with an unbiased tower (producing debiased item vectors); the final online representation is a weighted fusion of the two, controlled by a parameter α that adjusts the influence of popularity.
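Assuming the fusion is a linear interpolation (the talk specifies only a weighted fusion controlled by α), the online representation could look like:

```python
import numpy as np

def fuse(debiased_vec, bias_vec, alpha):
    """Weighted fusion of the unbiased tower's debiased item vector and the
    bias-only tower's popularity vector. alpha=0 removes popularity entirely;
    alpha=1 keeps it fully; values in between retain a controlled amount."""
    return (1.0 - alpha) * debiased_vec + alpha * bias_vec
```

Tuning α is how the system retains "some" popularity information, which the A/B tests below found to improve efficiency over removing it completely.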
Offline experiments using a C‑Ratio metric show that each module contributes to reducing popularity bias without sacrificing relevance, while online A/B tests confirm that retaining some popularity information improves efficiency, indicating that a balanced use of bias is beneficial.
The session concludes with a Q&A covering how unexposed samples are generated, the impact of adding them, and practical considerations for offline ranking scores and sampling strategies.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.