Multi-Objective Optimization in Short Video Recommendation at iQIYI

iQIYI improves short‑video recommendation through multi‑objective optimization: weighting clicks by watch duration, fusing separate click and watch‑time models, and applying multi‑task learning (ESMM, MMOE) together with Pareto‑guided PSO hyper‑parameter search. These methods deliver 7%+ watch‑time growth, 20%+ interaction gains, and 1.5%–3% CTR lifts. Planned work includes cross‑scene learning and further model refinements.

iQIYI Technical Product Team

Short videos are characterized by rich content, concentrated information, and high user stickiness. Improving the efficiency of short‑video distribution and the precision of recommendations is a core capability and modeling goal of recommendation systems.

This article shares iQIYI’s experience in multi‑objective optimization for short‑video ranking, covering the business background, a variety of solution approaches, and future planning.

In iQIYI’s short‑video recommendation business, traffic comes from two main sources: (1) the “SuiKe” video feed in the iQIYI APP bottom tab and top‑navigation hot modules, and (2) the homepage feed of the iQIYI SuiKe APP.

User feedback on the feed is divided into explicit actions (click, follow, comment, like, share, dislike, report) and implicit signals (watch duration, completion rate, rapid scrolling).

The initial ranking model jointly optimized two objectives: click and watch time. As the business evolved, the team added stronger interaction signals (comments, likes) and an objective to reduce short‑stop content, which resulted in a 7%+ increase in average watch time and a 20%+ boost in interaction metrics.

1) Click‑through‑rate (CTR) estimation with duration‑weighted samples: positive samples are weighted by watch time. Video duration and play time are split with equal‑frequency bucketing, and each positive sample is assigned a normalized weight in the range [0, 99].
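The weighting in approach 1) can be sketched as follows. This is a minimal illustration rather than iQIYI's implementation: it assumes samples are first grouped by video duration and that play time is then bucketed within each group, and the names `equal_frequency_edges`, `bucket_index`, and `sample_weights`, along with the bucket counts, are hypothetical.

```python
from bisect import bisect_right


def equal_frequency_edges(values, n_buckets):
    """Return n_buckets - 1 cut points so each bucket holds roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_buckets] for i in range(1, n_buckets)]


def bucket_index(value, edges):
    """Map a value to a bucket index in [0, len(edges)]."""
    return bisect_right(edges, value)


def sample_weights(samples, n_duration_buckets=5, n_weight_buckets=100):
    """Assign each positive sample a weight in [0, n_weight_buckets - 1].

    samples: list of (video_duration, play_time) pairs for clicked items.
    Assumption: play time is bucketed separately inside each duration group,
    so a long watch on a short video is not drowned out by long videos.
    """
    durations = [duration for duration, _ in samples]
    duration_edges = equal_frequency_edges(durations, n_duration_buckets)

    # Collect play times per duration group.
    groups = {}
    for duration, play_time in samples:
        group = bucket_index(duration, duration_edges)
        groups.setdefault(group, []).append(play_time)

    # Equal-frequency cut points for play time inside each group.
    play_edges = {
        group: equal_frequency_edges(play_times, n_weight_buckets)
        for group, play_times in groups.items()
    }

    return [
        bucket_index(play_time, play_edges[bucket_index(duration, duration_edges)])
        for duration, play_time in samples
    ]
```

With 100 play-time buckets the bucket index itself falls in [0, 99], which matches the weight range quoted above. How the weights are then fed into the CTR loss is not specified in the article and is left out here.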

2) Multi‑model fusion: a binary‑classification click model and a regression watch‑time model are trained separately, and their scores are combined by addition or multiplication, with the fusion hyper‑parameters tuned via grid search.
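A small sketch of the fusion step in approach 2) is given below. The exact fusion formulas and the offline metric used for tuning are not stated in the article, so the weighted sum, the power-weighted product, the `watch_time_at_k` metric, and the toy data are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical logged candidates: model scores plus observed feedback.
TOY_ITEMS = [
    {"p_click": 0.9, "pred_watch": 5.0, "clicked": 1, "actual_watch": 4.0},
    {"p_click": 0.2, "pred_watch": 60.0, "clicked": 1, "actual_watch": 55.0},
    {"p_click": 0.6, "pred_watch": 30.0, "clicked": 1, "actual_watch": 28.0},
    {"p_click": 0.8, "pred_watch": 2.0, "clicked": 0, "actual_watch": 0.0},
]


def fuse(p_click, pred_watch, alpha, beta, mode):
    """Combine the two model scores (assumed forms, both inputs positive)."""
    if mode == "add":
        return alpha * p_click + beta * pred_watch
    return (p_click ** alpha) * (pred_watch ** beta)  # mode == "mul"


def watch_time_at_k(items, alpha, beta, mode, k):
    """Assumed offline metric: watch time actually obtained from the top-k items."""
    ranked = sorted(
        items,
        key=lambda it: fuse(it["p_click"], it["pred_watch"], alpha, beta, mode),
        reverse=True,
    )
    return sum(it["clicked"] * it["actual_watch"] for it in ranked[:k])


def grid_search(items, alphas, betas, modes=("add", "mul"), k=2):
    """Try every (mode, alpha, beta) combination and keep the best-scoring one."""
    best = None
    for mode, alpha, beta in product(modes, alphas, betas):
        score = watch_time_at_k(items, alpha, beta, mode, k)
        if best is None or score > best[0]:
            best = (score, mode, alpha, beta)
    return best
```

Calling `grid_search(TOY_ITEMS, alphas=[0.0, 0.5, 1.0], betas=[0.0, 0.5, 1.0])` returns the best metric value together with the configuration that produced it. Grid search is simple but its cost grows exponentially with the number of fused objectives, which is one motivation for the swarm-based search described in approach 3).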

3) Multi‑task learning and network optimization: two mainstream approaches are explored. The first models sequential dependencies between tasks (e.g., ESMM, the Entire Space Multi‑Task Model); the second optimizes the shared‑bottom representation (e.g., MMOE, Multi‑gate Mixture‑of‑Experts, combined with Pareto optimization). A particle swarm optimization (PSO) framework is employed to search the hyper‑parameters for multi‑objective fusion, aiming to approximate the Pareto front.
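The PSO search mentioned in approach 3) can be illustrated with a minimal, generic particle swarm over fusion weights in [0, 1]. This is a textbook-style sketch, not iQIYI's framework: the inertia and acceleration constants, the function names, and the `pareto_front` helper for filtering evaluated objective vectors are assumptions.

```python
import random


def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the objective vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def pso(objective, dim, n_particles=20, n_iters=100,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Maximize objective(weights) over the unit hypercube [0, 1]^dim."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the personal best and the swarm's global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Clamp the position to the search bounds.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val > g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val
```

In practice `objective` would score a weight vector by a reward built from several metrics (for example CTR and watch time). One way to approximate the Pareto front under that setup is to run the search with several reward trade-offs, record each candidate's per-metric results, and pass them through `pareto_front`.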

Online results show consistent gains: CTR improvements of 1.5%–3%, UCTR gains of 0.2%–1%, and average watch‑time increases of 0.6%–7% across the different methods.

In summary, iQIYI has explored sample‑weight design, model architecture, and multi‑objective fusion, achieving notable online benefits. Future work will focus on cross‑scene multi‑objective learning, further model optimization (gradient, loss design, shared representation), and online hyper‑parameter search frameworks.
