Evolution of Re‑ranking Techniques in Kuaishou Short‑Video Recommendation System
This article details the technical evolution of Kuaishou's short‑video recommendation pipeline, focusing on sequence re‑ranking, multi‑content mixing, and on‑device re‑ranking, and explains how transformer‑based models, generator‑evaluator frameworks, and reinforcement‑learning strategies are employed to maximize overall sequence value, user engagement, and revenue.
Kuaishou operates a large‑scale short‑video and live‑streaming platform with diverse business scenarios, generating massive interaction data. This creates challenges in large‑scale estimation and motivates techniques such as reinforcement learning and causal analysis.
The presentation is organized into four parts: an overview of Kuaishou's recommendation scenario, sequence re‑ranking, multi‑content mixing, and on‑device re‑ranking.
Sequence Re‑ranking addresses the fact that a sequence's overall value is not merely the sum of individual item scores; context and ordering heavily influence user behavior. Traditional point‑wise scoring, greedy shuffling, and MMR/DPP methods have limitations, prompting a shift to transformer or LSTM models that embed upstream content, an optimization objective focused on the whole sequence, and continuous discovery of effective ordering patterns.
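To make the limitation of per‑item scoring concrete, the following is a minimal sketch of MMR (Maximal Marginal Relevance), one of the traditional methods mentioned above. The relevance scores, similarity function, and video names are hypothetical stand‑ins for model outputs, not Kuaishou's actual implementation.

```python
def mmr_rerank(items, relevance, similarity, k, lam=0.7):
    """Greedily pick k items, trading per-item relevance against
    redundancy with what has already been selected."""
    selected = []
    candidates = list(items)
    while candidates and len(selected) < k:
        def mmr_score(item):
            # Redundancy = max similarity to any already-selected item.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return lam * relevance[item] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a same‑topic similarity signal, a lower‑scored but topically fresh video can leapfrog a redundant high scorer, which is exactly the context effect that point‑wise scoring misses; MMR, however, still evaluates items one greedy step at a time rather than scoring the whole sequence.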
The system adopts a generator‑evaluator paradigm: a generator creates diverse candidate sequences from the top‑50 items, and an evaluator (a unidirectional transformer followed by an auxiliary embedding model) predicts the overall sequence score, achieving significant online gains.
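The generator‑evaluator loop can be sketched as follows. This is a toy illustration under stated assumptions: the generator here perturbs the point‑wise order with random shuffles (Kuaishou's generators are richer, per the next section), and the evaluator is a hand‑written listwise value function standing in for the unidirectional‑transformer scorer; `item_score` and `adjacency_bonus` are hypothetical.

```python
import random

def generate_candidates(top_items, n_candidates=8, rng=None):
    """Generator: produce diverse candidate sequences.
    Keeps the point-wise order as a baseline, then adds random shuffles."""
    rng = rng or random.Random(0)
    candidates = [list(top_items)]
    for _ in range(n_candidates - 1):
        seq = list(top_items)
        rng.shuffle(seq)
        candidates.append(seq)
    return candidates

def evaluate_sequence(seq, item_score, adjacency_bonus):
    """Evaluator: toy listwise value = position-discounted item scores
    plus pairwise context terms for adjacent items."""
    value = sum(item_score[x] / (pos + 1) for pos, x in enumerate(seq))
    value += sum(adjacency_bonus(a, b) for a, b in zip(seq, seq[1:]))
    return value

def rerank(top_items, item_score, adjacency_bonus, n_candidates=8):
    """Pick the candidate sequence with the highest evaluator score."""
    candidates = generate_candidates(top_items, n_candidates)
    return max(candidates,
               key=lambda s: evaluate_sequence(s, item_score, adjacency_bonus))
```

The key design point survives the simplification: the evaluator scores entire sequences, so ordering effects that no per‑item score can express (here, the adjacency term) directly influence which candidate wins.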
Various sequence generation strategies are discussed, including beam search, multi‑queue weighting that approximates a Pareto frontier, listwise mixing, and reinforcement‑learning (Dueling DQN) approaches that balance long‑term user experience with short‑term revenue.
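Of these strategies, beam search is the most mechanical, so here is a minimal sketch. `seq_score` stands in for the evaluator described earlier; the beam width, sequence length, and items are illustrative assumptions.

```python
def beam_search(items, seq_score, beam_width=3, length=4):
    """Build sequences item by item, keeping only the beam_width
    best-scoring partial sequences at each step."""
    beams = [([], set())]  # (partial sequence, items already used)
    for _ in range(length):
        expanded = []
        for seq, used in beams:
            for it in items:
                if it not in used:
                    expanded.append((seq + [it], used | {it}))
        # Rank all extensions by the (partial) sequence score and prune.
        expanded.sort(key=lambda b: seq_score(b[0]), reverse=True)
        beams = expanded[:beam_width]
    return beams[0][0]
```

Unlike exhaustive enumeration over the top‑50 candidates (50! orderings), beam search keeps the evaluator in the loop at every step while bounding the number of sequences scored, which is what makes listwise scoring affordable online.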
Multi‑Content Mixing aims to combine results from different business streams into a single sequence that maximizes overall social value while respecting diversity constraints, moving beyond simple scoring‑based ordering.
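As a baseline for what "moving beyond simple scoring‑based ordering" improves on, a slot‑based mixer for two streams might look like the sketch below. The slotting rule, stream names, and interval are hypothetical; the actual system optimizes overall sequence value rather than using fixed positions.

```python
def mix_streams(organic, promos, slot_interval=4):
    """Interleave a second business stream into the organic feed:
    one promoted item after every slot_interval organic items."""
    mixed, promos = [], list(promos)
    for i, item in enumerate(organic, start=1):
        mixed.append(item)
        if promos and i % slot_interval == 0:
            mixed.append(promos.pop(0))
    return mixed
```

Fixed slotting enforces a crude diversity constraint (no two promoted items adjacent) but ignores context entirely; a value‑maximizing mixer would instead let an evaluator choose where each stream's items land.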
On‑Device Re‑ranking tackles real‑time perception, immediate feedback, personalized ("thousand‑users‑thousand‑models") modeling, and compute allocation. By incorporating real‑time user signals (e.g., volume, orientation) and lightweight transformer interactions, the on‑device model improves CTR by 2.53 pp, LTR by 4.81 pp, and WTR by 1.36 pp.
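The core idea of on‑device re‑ranking, reacting to feedback the server has not yet seen, can be sketched as a client‑side adjustment of server scores. All names here (`server_score`, the category penalty, the skip signal) are hypothetical illustrations, not the production model, which uses lightweight transformer interactions over richer real‑time signals.

```python
def on_device_rerank(buffer, server_score, category, recent_skips, penalty=0.5):
    """Re-rank the client-side candidate buffer using immediate feedback:
    categories the user just skipped get their server scores down-weighted."""
    skipped_cats = {v: True for v in (category[s] for s in recent_skips)}
    def adjusted(v):
        s = server_score[v]
        return s * penalty if category[v] in skipped_cats else s
    # Highest adjusted score plays next; no server round-trip needed.
    return sorted(buffer, key=adjusted, reverse=True)
```

Because the adjustment runs locally, it closes the loop within a single swipe, whereas server‑side re‑ranking only sees the skip after the next request, which is precisely the real‑time‑perception gap this stage addresses.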
The talk concludes with a Q&A covering evaluation of generated sequences, personalization strategies, and diversity requirements, emphasizing the importance of causal and contextual coherence in recommendation sequences.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.