
Double DNN Ranking Model with Online Knowledge Distillation for Real‑Time Recommendation at iQIYI

The article introduces iQIYI's double‑DNN ranking architecture that combines a high‑performance teacher network with a lightweight student network through online knowledge distillation, detailing the evolution of deep learning‑based ranking models, the motivation for model upgrades, training pipelines, and experimental results that demonstrate significant latency reduction and ROI improvement.

DataFunTalk

With the rapid development of artificial intelligence, deep learning has become widely adopted in industrial recommendation scenarios. Compared with traditional machine-learning models, deep learning can automatically construct features on the model side, achieving end-to-end learning and better performance, but it also creates a tension between model effectiveness and inference efficiency.

iQIYI proposes an online knowledge‑distillation method to balance these two aspects and introduces a double‑DNN ranking model. The article first explains key concepts such as wider and deeper models, and then reviews the evolution of ranking models in three stages: seed period (introduction of DNN), flourishing period (WDL, DeepFM), and breakthrough period (DCN, xDeepFM).

Analysis of the baseline model (Wide & Deep) reveals three major drawbacks: the GBDT pre-processing component hinders real-time training, sparse and dense features are processed separately, and the architecture lacks flexibility for adding new components. Attempts to replace the baseline with more complex models such as xDeepFM resulted in unacceptable inference latency on CPU.

To address these issues, iQIYI designs a double‑DNN framework consisting of a Teacher DNN (high‑performance, slower) and a Student DNN (lightweight, fast). Both share the same embedding layer, while the Teacher includes an additional Feature Interaction Layer. The Student reuses the Teacher’s input‑representation layer and receives supervision from the Teacher’s hidden layers during joint training.
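The shared-embedding structure can be illustrated with a minimal numpy sketch. All dimensions and the pairwise-inner-product interaction are illustrative assumptions; the article does not disclose iQIYI's actual layer sizes or interaction operator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the real model's dimensions are not public.
n_fields, emb_dim, hidden = 4, 8, 16

# Shared embedding layer: one embedding vector per feature field,
# used by both the Teacher and the Student.
shared_emb = rng.normal(size=(n_fields, emb_dim))

def teacher_forward(emb):
    # Feature Interaction Layer (Teacher only): pairwise inner products
    # between field embeddings, appended to the flattened embeddings.
    inter = np.array([emb[i] @ emb[j]
                      for i in range(n_fields) for j in range(i + 1, n_fields)])
    x = np.concatenate([emb.ravel(), inter])
    W = rng.normal(size=(hidden, x.size))
    return np.maximum(W @ x, 0.0)        # Teacher hidden layer (ReLU)

def student_forward(emb):
    # The Student reuses the same embeddings but skips the interaction
    # layer, keeping its forward pass cheap at inference time.
    x = emb.ravel()
    W = rng.normal(size=(hidden, x.size))
    return np.maximum(W @ x, 0.0)        # Student hidden layer (ReLU)

t_h = teacher_forward(shared_emb)
s_h = student_forward(shared_emb)
print(t_h.shape, s_h.shape)  # both (16,)
```

Matching hidden-layer widths are chosen here so the Teacher's hidden outputs can later supervise the Student's directly, as described above.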

The core advantages are:

Feature Transfer – the Student directly copies and freezes the shared feature representation.

Online Knowledge Distillation – the Teacher’s predictions guide the Student in a single‑stage joint training.

Classifier Transfer – hidden‑layer outputs from the Teacher supervise the Student’s hidden layers, narrowing the performance gap.
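The three mechanisms above can be combined into one joint training objective. The sketch below is a hedged reconstruction, not iQIYI's published loss: the weights `alpha` and `beta`, the use of binary cross-entropy with the Teacher's probabilities as soft targets, and the MSE hint on hidden layers are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y, eps=1e-12):
    # Binary cross-entropy; y may be hard labels or the Teacher's soft targets.
    return -np.mean(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

def joint_loss(t_logit, s_logit, t_hidden, s_hidden, y, alpha=0.5, beta=0.1):
    """Single-stage joint loss; alpha/beta are illustrative weights."""
    p_t, p_s = sigmoid(t_logit), sigmoid(s_logit)
    l_teacher = bce(p_t, y)                       # Teacher fits hard labels
    l_student = bce(p_s, y)                       # Student fits hard labels
    l_distill = bce(p_s, p_t)                     # online distillation: Teacher
                                                  # predictions as soft targets
    l_hint = np.mean((t_hidden - s_hidden) ** 2)  # classifier transfer: Teacher
                                                  # hidden layers supervise Student
    return l_teacher + l_student + alpha * l_distill + beta * l_hint

y = np.array([1.0, 0.0, 1.0])
t_logit = np.array([2.0, -1.5, 1.0])
s_logit = np.array([1.5, -1.0, 0.5])
t_h, s_h = np.zeros((3, 8)), np.zeros((3, 8))
loss = joint_loss(t_logit, s_logit, t_h, s_h, y)
print(np.isfinite(loss))
```

Because both networks are optimized against this single objective in one pass, no separate offline distillation stage is needed, which is what distinguishes online from conventional two-stage distillation.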

The training pipeline includes a 30‑day offline pre‑training of the double‑DNN, followed by fine‑tuning with the latest samples, and an online‑learning hot‑start where the offline model initializes the Student for real‑time updates.
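A minimal driver mirroring these three stages might look as follows. The classes and method names are stand-ins invented for illustration; iQIYI's actual training infrastructure is not described in the article.

```python
from dataclasses import dataclass, field

@dataclass
class Student:
    updates: int = 0
    def partial_fit(self, batch):
        self.updates += 1          # stand-in for one real-time gradient step

@dataclass
class DoubleDNN:
    student: Student = field(default_factory=Student)
    fitted: list = field(default_factory=list)
    def fit(self, dataset_name):
        self.fitted.append(dataset_name)  # stand-in for batch training

def train_pipeline(model, online_stream):
    model.fit("offline_30d")       # 1) 30-day offline pre-training
    model.fit("latest_samples")    # 2) fine-tune on the freshest samples
    student = model.student        # 3) hot-start: the offline model
    for batch in online_stream:    #    initializes the Student, which then
        student.partial_fit(batch) #    updates continuously online
    return student

s = train_pipeline(DoubleDNN(), online_stream=[{"x": 1}, {"x": 2}])
print(s.updates)  # 2
```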

Experimental results on iQIYI’s short‑video and image‑text feed show that the Student model achieves more than three‑fold model size reduction, five‑fold latency improvement, and higher QPS, leading to a higher ROI under the same resource budget. Comparisons with industry practices (Baidu’s CTR‑X, Alibaba’s Rocket Launching) highlight the benefits of joint training and shared embeddings.

In summary, the double‑DNN ranking model with online knowledge distillation provides an effective solution for deploying high‑performance deep ranking models in production, offering significant gains in inference efficiency while maintaining or improving recommendation quality.

Tags: deep learning, recommendation systems, knowledge distillation, online learning, ranking models
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
