Multi-Objective Modeling and Practice in DeWu Community Recommendation System
DeWu Community's recommendation system progressed from single-objective CTR modeling to a multi-objective framework. The framework adds models for dwell time, video completion, and user interactions, and combines them through score fusion, learning to rank, and multi-task architectures with shared parameters and gradient blocking. The article credits this progression with higher engagement and retention.
DeWu Community provides a content feed where users browse images and videos, and the recommendation system plays a crucial role in personalized content delivery.
The early system focused on modeling click‑through rate (CTR). As the platform evolved, additional user‑experience metrics such as dwell time, likes, comments, follows, collections, shares, and retention needed to be optimized, leading to multi‑objective recommendation.
Theoretical approaches
Three mainstream solutions for multi‑objective ranking are discussed:
1) Multi-model score fusion: each objective is modeled by an independent model with its own architecture, training data, and features. At inference time, predictions from all models are combined using a handcrafted fusion formula before final sorting.
2) Learning to rank: point-wise, pair-wise, or list-wise methods directly optimize the ordering of items, allowing a single model to handle multiple objectives.
3) Multi-task learning: shared representations are learned across related tasks. Four sharing patterns are presented (hard parameter sharing, soft sharing, hierarchical or layer-wise sharing, and shared-private sharing), with ESMM and MMoE given as example architectures.
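To make the multi-task option more concrete, the following is a minimal sketch of an MMoE-style forward pass for one example, written in plain Python. It is not the article's network: the layer sizes, the random weights, the two task names, and every helper function are assumptions chosen for illustration. It only shows the mechanism, which is that all tasks share the experts while each task has its own softmax gate and its own output tower.

```python
import math
import random

def matvec(weights, x):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def mmoe_forward(x, experts, gates, towers):
    """One MMoE-style forward pass for a single example.

    experts: weight matrices shared by all tasks
    gates:   one weight matrix per task, mapping x to one logit per expert
    towers:  one weight vector per task, mapping the mixed expert output to a logit
    """
    expert_outputs = [relu(matvec(w, x)) for w in experts]
    hidden_size = len(expert_outputs[0])
    predictions = []
    for gate, tower in zip(gates, towers):
        mix = softmax(matvec(gate, x))  # task-specific weights over the shared experts
        mixed = [
            sum(mix[e] * expert_outputs[e][h] for e in range(len(experts)))
            for h in range(hidden_size)
        ]
        logit = sum(t * m for t, m in zip(tower, mixed))
        predictions.append(sigmoid(logit))
    return predictions

# Hypothetical sizes and random weights, for illustration only.
rng = random.Random(0)
n_features, n_hidden, n_experts, n_tasks = 4, 3, 2, 2
experts = [random_matrix(n_hidden, n_features, rng) for _ in range(n_experts)]
gates = [random_matrix(n_experts, n_features, rng) for _ in range(n_tasks)]
towers = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_tasks)]

p_click, p_interact = mmoe_forward([1.0, 0.0, 0.5, -1.0], experts, gates, towers)
print(p_click, p_interact)
```

Hard parameter sharing is the special case in which every task uses the same fixed mixture of experts; the per-task gates are what let related tasks share less when their objectives conflict.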
Practical implementations
Duration model: the first post-CTR objective, predicting user dwell time. Two sampling strategies (exposure-based and click-based) were evaluated; the click-based approach was deployed. Log loss was used as the loss function and RMSE as the evaluation metric. Model scores are fused with CTR scores using a weighted formula.
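The article does not give the exact label definition or fusion formula, so the sketch below is only one plausible reading. It assumes dwell time is capped and scaled into [0, 1] so that log loss can be applied to a soft label, and that fusion is a simple weighted sum. The cap of 60 seconds, the weight of 0.3, and all function names are hypothetical.

```python
import math

DWELL_CAP_SECONDS = 60.0  # hypothetical cap, not taken from the article

def dwell_target(seconds):
    """Map raw dwell time to a soft label in [0, 1] (assumed normalization)."""
    return min(max(seconds, 0.0), DWELL_CAP_SECONDS) / DWELL_CAP_SECONDS

def soft_log_loss(pred, target, eps=1e-7):
    """Log loss against a soft label; pred is clipped away from 0 and 1."""
    pred = min(max(pred, eps), 1.0 - eps)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))

def rmse(preds, targets):
    """Root mean squared error, the evaluation metric named in the text."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds))

def fuse_scores(ctr, duration, duration_weight=0.3):
    """Assumed weighted-sum fusion of the CTR score and the duration score."""
    return ctr + duration_weight * duration

# Toy candidates with made-up scores.
candidates = [
    {"id": "a", "ctr": 0.10, "duration": 0.80},
    {"id": "b", "ctr": 0.12, "duration": 0.20},
    {"id": "c", "ctr": 0.05, "duration": 0.40},
]
ranked = sorted(
    candidates,
    key=lambda c: fuse_scores(c["ctr"], c["duration"]),
    reverse=True,
)
ranked_ids = [c["id"] for c in ranked]
print(ranked_ids)
```

In this toy data, item "b" has the highest CTR but item "a" ranks first after fusion, which is the kind of reordering a duration objective is meant to produce.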
Video completion model: predicts the proportion of a video watched, serving as a proxy for user interest. A DeepFM-style architecture with truncation thresholds was adopted to mitigate length bias.
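How the truncation thresholds are applied is not spelled out in the article. One possible interpretation, sketched here as an assumption, is to cap the video length used in the denominator so that long videos are not penalized for rarely being watched to the end. The function name and the 60-second cap are hypothetical.

```python
def completion_label(watch_seconds, video_seconds, cap_seconds=60.0):
    """Completion ratio with a truncated video length (assumed scheme).

    Videos longer than cap_seconds are treated as if they were cap_seconds
    long, and the resulting ratio is clipped to [0, 1].
    """
    if video_seconds <= 0:
        raise ValueError("video_seconds must be positive")
    effective_length = min(video_seconds, cap_seconds)
    return min(1.0, max(0.0, watch_seconds) / effective_length)

# A 10 s video watched for 5 s and a 300 s video watched for 30 s
# receive the same label under this scheme.
short_video = completion_label(5, 10)
long_video = completion_label(30, 300)
print(short_video, long_video)
```

Without the cap, the long video above would be labeled 0.1 despite six times as much watch time, which illustrates the length bias the thresholds are meant to reduce.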
Interaction model: jointly models explicit actions (likes, comments, follows, collections, shares). In the dual-column feed, interaction labels are aggregated into a single label, and an ESMM network with a single-tower design is used. Gradient blocking is applied to prevent interaction gradients from harming CTR learning. Various fusion formulas, including channel-sort normalization and personalized adjustments, are explored.
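The effect of gradient blocking in an ESMM-style setup can be illustrated with hand-derived gradients on a single example. This is a sketch under stated assumptions, not the article's implementation: it assumes the joint probability is modeled as pCTR times P(interaction | click), that both losses are binary cross-entropy, and that blocking is applied on the pCTR factor inside the joint loss. In a real network the block may sit elsewhere, for example on shared embeddings. All names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def esmm_gradients(z_ctr, z_int, clicked, interacted, block_gradient=True):
    """Gradients of (CTR loss + joint loss) w.r.t. the two logits.

    z_ctr: logit of the click head
    z_int: logit of the interaction-given-click head
    With block_gradient=True, p_ctr is treated as a constant inside the
    joint loss, so the interaction label cannot move the click head.
    """
    p_ctr = sigmoid(z_ctr)
    p_int = sigmoid(z_int)
    p_joint = p_ctr * p_int  # P(click and interaction | exposure)

    # Binary cross-entropy on the click label: d(loss)/d(z_ctr) = p_ctr - clicked.
    grad_ctr = p_ctr - clicked

    # Binary cross-entropy on the interaction label, differentiated w.r.t. p_joint.
    dloss_dpjoint = -(interacted / p_joint) + (1 - interacted) / (1 - p_joint)

    # The interaction head always receives the joint-loss gradient.
    grad_int = dloss_dpjoint * p_ctr * p_int * (1 - p_int)

    # Without blocking, the joint loss also flows back into the click head.
    if not block_gradient:
        grad_ctr += dloss_dpjoint * p_int * p_ctr * (1 - p_ctr)

    return grad_ctr, grad_int

# A clicked but non-interacted exposure.
blocked = esmm_gradients(0.0, 0.0, clicked=1, interacted=0, block_gradient=True)
unblocked = esmm_gradients(0.0, 0.0, clicked=1, interacted=0, block_gradient=False)
print(blocked, unblocked)
```

For this clicked, non-interacted example, the unblocked click gradient is smaller in magnitude than the blocked one: the interaction loss pulls pCTR down even though the item was clicked. That interference is what blocking removes.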
Fusion framework: to combine scores from the CTR, duration, and interaction models, an automated fusion model with a small set of key features is built, reducing latency while supporting rapid iteration of multi-objective ranking.
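The article does not describe the fusion model's form, features, or training target. As a rough illustration of the idea of learning the fusion rather than hand-tuning it, the sketch below fits a tiny logistic regression on three upstream scores. The model class, the toy rows, the binary label, and the hyperparameters are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_fusion(rows, labels, epochs=200, lr=0.5):
    """Fit logistic-regression fusion weights with plain SGD."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y  # gradient of log loss w.r.t. the logit
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def fused_score(x, weights, bias):
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

# Each row holds made-up [ctr, duration, interaction] scores; the label is a
# hypothetical "good outcome" flag, not a metric defined in the article.
rows = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.2, 0.1, 0.1],
    [0.1, 0.3, 0.2],
]
labels = [1, 1, 0, 0]

weights, bias = train_fusion(rows, labels)
strong = fused_score([0.85, 0.7, 0.8], weights, bias)
weak = fused_score([0.15, 0.2, 0.15], weights, bias)
print(strong, weak)
```

A model this small adds little inference cost on top of the upstream models, and retraining it replaces manual tuning of fusion weights whenever an objective is added or reweighted.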
Conclusion
The article summarizes DeWu Community's transition from single-objective CTR modeling to a comprehensive multi-objective framework, covering label fusion, shared-parameter networks, ESMM, gradient blocking, multi-model fusion, and an automated fusion pipeline, and reports measurable improvements in user engagement and retention.
DeWu Technology
A platform for sharing and discussing tech knowledge, guiding you toward the cloud of technology.