Exploring Contrastive Learning in Kuaishou Recommendation Systems
This article presents a comprehensive overview of how contrastive learning can alleviate data sparsity and distribution bias in recommendation systems, detailing its theoretical advantages, recent research progress in computer vision and NLP, and a multi‑task self‑supervised framework applied to Kuaishou's short‑video ranking pipeline with significant offline and online performance gains.
Recommendation systems often suffer from data sparsity and distribution bias, which manifest as difficulty capturing diverse user interests, sparse feedback signals, and insensitive negative feedback. Contrastive learning can mitigate these issues by extracting latent label information from the data itself, augmenting samples to approximate the true data distribution, and providing self‑supervised signals that enrich the main task.
Background: Current recommendation models exhibit various biases, such as popularity bias and difficulty capturing individual user preferences, largely due to sparse interactions and skewed data distributions. Traditional remedies such as dimensionality reduction, inverse propensity weighting (IPW), and re-sampling either discard useful information or suffer from high variance.
Advantages of Contrastive Learning:
Alleviates data sparsity by mining implicit label information.
Reduces distribution bias through sample augmentation that better reflects the true data distribution.
Provides self‑supervised signals that improve the main recommendation task.
The method learns representations that satisfy alignment (similar samples are close) and uniformity (representations are evenly spread).
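The alignment and uniformity properties can be written down directly as two losses over L2-normalized embeddings (following Wang & Isola's formulation); a minimal NumPy sketch, with the temperature-like parameter `t` chosen illustratively:

```python
import numpy as np

def alignment(x, y):
    """Mean squared distance between positive pairs (lower = better aligned).
    x, y: (n, d) arrays of L2-normalized embeddings for matched pairs."""
    return np.mean(np.sum((x - y) ** 2, axis=1))

def uniformity(x, t=2.0):
    """Log of the mean pairwise Gaussian potential over distinct pairs
    (lower = embeddings more evenly spread on the sphere)."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(x.shape[0], k=1)   # each unordered pair once
    return np.log(np.mean(np.exp(-t * sq_dists[iu])))
```

Intuitively, alignment pulls positive pairs together while uniformity pushes all embeddings apart; a good contrastive representation minimizes both.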
Recent Research Progress:
In computer vision, MoCo pairs a momentum encoder with a queue-style memory bank, decoupling the number of negatives from the batch size, while SimCLR dropped the queue in favor of large batches and strong augmentations. SimSiam went further and eliminated negative samples entirely, relying on an asymmetric predictor network and stop-gradient.
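The objective shared by MoCo and SimCLR is the InfoNCE loss, which scores one positive against a pool of negatives via a softmax over similarities. A minimal single-query sketch (the temperature and input shapes are illustrative):

```python
import numpy as np

def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE loss for one query: the positive vs. a pool of negatives.
    All inputs are L2-normalized vectors; negatives has shape (k, d)."""
    pos_logit = np.dot(query, positive) / temperature
    neg_logits = negatives @ query / temperature
    logits = np.concatenate([[pos_logit], neg_logits])
    # cross-entropy with the positive at index 0
    return -pos_logit + np.log(np.sum(np.exp(logits)))
```

The loss shrinks toward zero when the query is close to its positive and far from all negatives, which is why abundant, diverse negatives matter so much in practice.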
In NLP, methods such as ConSERT and SimCSE showed that feature‑level and model‑level augmentations (e.g., dropout) can prevent representation collapse.
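SimCSE's dropout trick can be sketched with a toy encoder: the same input passed through the network twice under different dropout masks yields two distinct views that serve as a positive pair. The "encoder" below is a stand-in (identity plus inverted dropout), not a real Transformer:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_with_dropout(x, p=0.1):
    """Toy 'encoder': identity plus inverted dropout, standing in for a
    Transformer. Two passes over the same input give correlated but
    distinct views, which form a positive pair."""
    mask = rng.random(x.shape) >= p
    h = x * mask / (1.0 - p)          # inverted-dropout scaling
    return h / np.linalg.norm(h)      # L2-normalize

x = np.ones(8)
view_a = encode_with_dropout(x)
view_b = encode_with_dropout(x)       # same input, different dropout mask
similarity = float(view_a @ view_b)   # high, but not exactly 1
```

Because the two views are never pushed to be identical, the dropout noise acts as a minimal augmentation that keeps the representation from collapsing.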
In recommendation, Alibaba’s Multi‑CLRec added intent‑aware memory banks, and Meituan’s S³‑Rec designed four self‑supervised tasks to capture item‑attribute, sequence‑item, sequence‑attribute, and segment relationships.
Application in Kuaishou:
Kuaishou’s short‑video recommendation collects multiple feedback signals (watch time, likes, dislikes, shares, etc.). The traditional approach combined them as a linear weighted sum, which ignored scale differences across signals and introduced bias. A new composite reward function (shown in the figure) normalizes each signal’s score into the range [0, 1] before combining them, improving robustness.
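As a hedged illustration of the idea only (the production formula is not public), a composite reward might squash each unbounded signal into [0, 1] with a monotone map and combine the normalized signals additively, consistent with the Q&A note that feedback types are treated as independent. All weights, scales, and the dislike-handling rule below are assumptions:

```python
import math

def squash(x, scale):
    """Monotone map from [0, inf) to [0, 1); `scale` is a hypothetical
    per-signal normalization constant."""
    return 1.0 - math.exp(-x / scale)

def composite_reward(watch_seconds, liked, shared, disliked,
                     weights=(0.6, 0.2, 0.2)):
    """Additive reward over normalized signals (illustrative weights).
    Treating an explicit dislike as zeroing the reward is an assumption."""
    w_watch, w_like, w_share = weights
    r = (w_watch * squash(watch_seconds, scale=60.0)
         + w_like * float(liked)
         + w_share * float(shared))
    if disliked:
        r = 0.0
    return r
```

The key property the sketch preserves is that every signal contributes on a comparable [0, 1] scale, so no single raw-magnitude signal (such as watch seconds) dominates the reward.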
The ranking framework incorporates three auxiliary contrastive tasks:
User‑item contrast to debias popular items.
User‑user contrast to capture diverse user interests using sequence augmentations (mask, shuffle, sample).
Positive‑negative feedback contrast to differentiate fine‑grained user signals.
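The three sequence augmentations named for the user‑user contrast task (mask, shuffle, sample) can be sketched as simple list operations; the mask token, span size, and probabilities here are illustrative, not production values:

```python
import random

def mask_items(seq, p=0.3, mask_token=0, rng=None):
    """Replace a random fraction of items with a mask token
    (the augmentation favored for ranking)."""
    rng = rng or random.Random(0)
    return [mask_token if rng.random() < p else item for item in seq]

def shuffle_span(seq, span=3, rng=None):
    """Shuffle one random contiguous span, perturbing local order only."""
    rng = rng or random.Random(0)
    seq = list(seq)
    if len(seq) <= span:
        rng.shuffle(seq)
        return seq
    i = rng.randrange(len(seq) - span + 1)
    window = seq[i:i + span]
    rng.shuffle(window)
    seq[i:i + span] = window
    return seq

def sample_subseq(seq, k, rng=None):
    """Sample a subsequence, preserving order
    (the augmentation favored for recall)."""
    rng = rng or random.Random(0)
    idx = sorted(rng.sample(range(len(seq)), min(k, len(seq))))
    return [seq[i] for i in idx]
```

Each augmentation yields a second "view" of the same user history, and the two views are then treated as a positive pair in the user‑user contrastive loss.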
Negative samples are drawn from a combination of a global memory bank and in‑batch sampling, trading off bias reduction against computational cost. Positive samples are augmented so that their gradient contribution is not drowned out by the much larger pool of negatives.
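A minimal sketch of mixing a global FIFO memory bank (MoCo‑style queue) with in‑batch negatives; the bank size and the number of global negatives drawn per step are assumptions, not Kuaishou's settings:

```python
from collections import deque
import numpy as np

class NegativePool:
    """Combine a global FIFO memory bank with in-batch negatives.
    Sizes and mixing ratio are illustrative only."""

    def __init__(self, dim, bank_size=1024):
        self.bank = deque(maxlen=bank_size)  # oldest embeddings evicted first
        self.dim = dim

    def draw(self, batch_emb, n_global=8):
        """Return negatives for the current step: the whole batch plus a
        random sample from the global bank, then enqueue the batch."""
        negatives = list(batch_emb)                    # in-batch negatives
        if self.bank:
            idx = np.random.choice(len(self.bank),
                                   size=min(n_global, len(self.bank)),
                                   replace=False)
            negatives += [self.bank[i] for i in idx]   # global negatives
        self.bank.extend(batch_emb)
        return np.stack(negatives)
```

In‑batch negatives are cheap but reflect the current batch's (biased) distribution; the global bank adds older, more diverse negatives at a small memory cost, which is the balance the text describes.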
Experimental results show that the contrastive framework improves average app time by 0.46% in ranking (far above the typical 0.1% confidence interval) and reduces the dislike ("hate") rate by 8.92%. Offline, relative to the SASRec baseline, HR@20 improves by 23%, HR@50 by 16%, and NDCG@20 by 28%.
Summary & Reflections: The multi‑granularity self‑supervised framework successfully addresses popularity bias, user interest diversity, and fine‑grained feedback discrimination. It is modular, plug‑and‑play, and can be extended to other stages (recall, content generation) and domains (commercialization). Key practical takeaways include the importance of abundant negative samples, balanced positive‑sample weighting with augmentation, and selecting appropriate augmentation strategies for ranking (mask) versus recall (sample).
Q&A Highlights:
Contrastive learning focuses on representation similarity, whereas pair‑wise modeling learns relative ordering.
The composite reward is additive because the feedback types are treated as independent in Kuaishou.
Negative sampling combines global and in‑batch strategies to mitigate bias.
Offline validation compares against state‑of‑the‑art baselines on both internal and public datasets, confirming gains especially for long‑tail items.
For further details, the authors provide downloadable resource collections and invite the community to follow DataFunTalk for more AI and big‑data content.