Tagged articles
121 articles
DataFunTalk
Feb 27, 2021 · Artificial Intelligence

Optimizing Coarse Ranking Models for Short Video Recommendation: From GBDT to Dual‑Tower DNN and Cascading

This article details the practical upgrades of iQIYI's short‑video recommendation coarse‑ranking pipeline, moving from a GBDT model to a dual‑tower DNN, applying knowledge distillation, embedding compression, inference optimizations, and finally a cascade architecture to align with the fine‑ranking model while reducing resource consumption.

cascading model · coarse ranking · dual-tower DNN
0 likes · 12 min read
iQIYI Technical Product Team
Feb 26, 2021 · Artificial Intelligence

Optimization of Coarse Ranking Models for Short‑Video Recommendation at iQIYI

iQIYI’s short‑video recommendation team replaced a GBDT coarse‑ranking model with a lightweight dual‑tower DNN, applied knowledge distillation, sparse‑aware embedding optimization, and inference merging, then introduced a cascade MMOE architecture, achieving comparable accuracy with half the memory, ~19 ms latency reduction, and measurable gains in watch time, CTR and engagement.

cascade model · coarse ranking · dual-tower DNN
0 likes · 15 min read
58 Tech
Jan 15, 2021 · Artificial Intelligence

Exploring Text Pre‑training Models for Dialogue Classification in Information Security: From TextCNN to RoBERTa and Knowledge Distillation

This article presents a systematic exploration of text pre‑training models for dialogue classification in information‑security scenarios, comparing baseline TextCNN, an enhanced TextCNN_role, RoBERTa with domain‑adaptive pre‑training, and a distilled mini‑model, and discusses their performance, trade‑offs, and future directions.

Dialog Modeling · NLP · knowledge distillation
0 likes · 13 min read
DataFunTalk
Jan 15, 2021 · Artificial Intelligence

Zhihu Search Text Relevance Evolution and BERT Knowledge Distillation Practices

This talk by Zhihu search algorithm engineer Shen Zhan details the evolution of text relevance models from TF‑IDF/BM25 to deep semantic matching and BERT, explains the challenges of deploying BERT at scale, and describes practical knowledge‑distillation techniques that improve both online latency and offline storage while maintaining search quality.

BERT · knowledge distillation · machine learning
0 likes · 14 min read
DataFunTalk
Jan 10, 2021 · Artificial Intelligence

Didi's Machine Translation System: Architecture, Techniques, and WMT2020 Competition Experience

This article presents a comprehensive overview of Didi's machine translation platform, covering its evolution from statistical to neural models, the Transformer architecture with relative position and larger FFN, data preparation, training strategies such as back‑translation and knowledge distillation, deployment optimizations with TensorRT, and the team's successful participation in the WMT2020 news translation task.

BLEU · Neural Networks · TensorRT
0 likes · 14 min read
Sohu Tech Products
Jan 6, 2021 · Artificial Intelligence

Overview of Main Model Compression and Acceleration Techniques: Structural Optimization, Pruning, Quantization, and Knowledge Distillation

This article reviews four mainstream model compression and acceleration methods (structural optimization, pruning, quantization, and knowledge distillation), explaining their principles, implementations, and performance, and presents practical examples such as DistilBERT, TinyBERT, and FastBERT with comparative results.

AI · Deep Learning · knowledge distillation
0 likes · 14 min read
DataFunTalk
Dec 25, 2020 · Artificial Intelligence

Exploring Pretraining Model Optimization and Deployment Challenges in NLP

This article reviews the evolution of pretraining models in NLP, discusses the practical challenges of deploying large models such as inference latency, knowledge integration, and task adaptation, and presents Xiaomi’s optimization techniques including knowledge distillation, low‑precision inference, operator fusion, and multi‑granularity segmentation for dialogue systems.

BERT · Dialogue Systems · Inference Optimization
0 likes · 15 min read
Didi Tech
Oct 27, 2020 · Artificial Intelligence

Didi's Machine Translation System: Architecture, Techniques, and WMT2020 Competition Experience

Didi's machine translation system combines a Transformer‑big architecture with relative position representations, enlarged feed‑forward networks, iterative back‑translation, knowledge distillation, and domain fine‑tuning. Optimized with TensorRT for inference speed, it achieved a BLEU score of 36.6 and third place in the WMT2020 Chinese‑to‑English news task.

BLEU · Neural Networks · TensorRT
0 likes · 15 min read
Didi Tech
Oct 21, 2020 · Artificial Intelligence

Deep Model Compression Techniques for Intelligent Automotive Cockpits

The article reviews deep‑model compression methods—ADMM‑based structured pruning, low‑bit quantization, and teacher‑student knowledge distillation—and their automated AutoCompress workflow, demonstrating how these techniques shrink neural networks enough to run real‑time driver‑monitoring and other intelligent cockpit functions on resource‑limited automotive hardware while preserving accuracy.

ADMM · Deep Learning · edge AI
0 likes · 16 min read
Meituan Technology Team
Jul 9, 2020 · Artificial Intelligence

Optimizing Meituan Search Ranking with BERT: Methods and Practices

The Meituan Search team boosted ranking relevance by training a domain‑specific BERT, applying data augmentation, brand‑sample optimization, knowledge‑graph fusion, multi‑task and pairwise fine‑tuning, joint end‑to‑end training with LambdaLoss ranking models, and compressing the model for low‑latency inference, delivering up to +925 BP offline accuracy gains and measurable CTR and NDCG improvements in production.

BERT · knowledge distillation · machine learning
0 likes · 34 min read
AntTech
Jun 9, 2020 · Artificial Intelligence

Deep Learning Model Compression and Acceleration Techniques for Mobile AI

This article reviews the motivations, challenges, and a comprehensive set of algorithmic, framework, and hardware methods—including structural optimization, quantization, pruning, and knowledge distillation—to compress and accelerate deep learning models for deployment on mobile devices, highlighting benefits such as reduced server load, lower latency, improved reliability, and enhanced privacy.

Mobile AI · knowledge distillation · model compression
0 likes · 17 min read
DataFunTalk
May 26, 2020 · Artificial Intelligence

Knowledge Distillation Techniques for Recommendation Systems: Methods, Scenarios, and Practical Insights

This article reviews how knowledge distillation—using a large teacher model to guide a smaller student model—can be applied across the recall, coarse‑ranking, and fine‑ranking stages of recommendation systems, detailing logits‑based and feature‑based approaches, joint and two‑stage training, and point‑wise, pair‑wise, and list‑wise loss designs.

Recommendation Systems · knowledge distillation · machine learning
0 likes · 31 min read
Tencent Tech
Feb 27, 2020 · Artificial Intelligence

How to Speed Up Deep Learning Models: Cutting-Edge Acceleration Techniques

Deep learning models often suffer from slow training and deployment due to their size, but a range of advanced acceleration methods—including model architecture optimization, pruning, quantization, knowledge distillation, and distributed training techniques—can dramatically improve speed and efficiency while maintaining performance.

Deep Learning · Distributed Training · knowledge distillation
0 likes · 14 min read
Qunar Tech Salon
Feb 27, 2020 · Artificial Intelligence

iQIYI Dual‑DNN Ranking Model with Online Knowledge Distillation

This article describes iQIYI’s dual‑DNN ranking architecture that combines a high‑capacity teacher network with a lightweight student network via online knowledge distillation, addressing the trade‑off between model effectiveness and inference efficiency in large‑scale recommendation systems.

CTR prediction · Online Learning · Ranking Models
0 likes · 12 min read
DataFunTalk
Feb 22, 2020 · Artificial Intelligence

Double DNN Ranking Model with Online Knowledge Distillation for Real‑Time Recommendation at iQIYI

The article introduces iQIYI's double‑DNN ranking architecture that combines a high‑performance teacher network with a lightweight student network through online knowledge distillation, detailing the evolution of deep learning‑based ranking models, the motivation for model upgrades, training pipelines, and experimental results that demonstrate significant latency reduction and ROI improvement.

Deep Learning · Online Learning · Ranking Models
0 likes · 13 min read
iQIYI Technical Product Team
Feb 21, 2020 · Artificial Intelligence

Dual DNN Ranking Model with Online Knowledge Distillation for Recommender Systems

iQIYI’s dual‑DNN ranking model uses an online teacher‑student knowledge‑distillation framework where a complex teacher DNN shares representations with a lightweight student DNN, enabling end‑to‑end training, large‑scale feature crossing, and substantially higher recommendation accuracy while cutting inference latency and model size.

CTR prediction · Online Learning · dual DNN
0 likes · 15 min read
iQIYI Technical Product Team
Jan 17, 2020 · Artificial Intelligence

Ultrafast Video Attention Prediction with Coupled Knowledge Distillation

The paper presents UVA‑Net, a lightweight video‑attention network trained via coupled knowledge distillation, which matches the accuracy of eleven state‑of‑the‑art models while using only 0.68 MB of storage and achieving up to 10,106 FPS on GPU (404 FPS on CPU), thanks to a MobileNetV2‑based CA‑Res block and a teacher‑student framework that leverages low‑resolution inputs to drastically cut parameters and computational cost.

Mobile Video Processing · UVA-Net · knowledge distillation
0 likes · 5 min read
Hulu Beijing
Apr 30, 2019 · Artificial Intelligence

How Can Deep Neural Networks Be Accelerated and Compressed? Key Techniques Explained

This article reviews why deep neural networks are over‑parameterized, outlines the challenges of deploying them on mobile and embedded devices, and presents six major strategies—pruning, low‑rank approximation, filter selection, quantization, knowledge distillation, and novel architecture design—to accelerate and compress models while preserving performance.

Deep Learning · knowledge distillation · model acceleration
0 likes · 11 min read
Alibaba Cloud Developer
Oct 9, 2018 · Artificial Intelligence

How Rocket Launching Boosts Online CTR Prediction Without Slowing Inference

Rocket Launching introduces a novel co‑training framework that jointly trains a lightweight network and a more powerful booster network, sharing parameters and using gradient‑blocking and hint loss to improve click‑through‑rate prediction accuracy while keeping online inference latency unchanged, validated on public datasets and Alibaba’s ad system.

CTR prediction · co-training · gradient block
0 likes · 13 min read
Alibaba Cloud Developer
Sep 11, 2018 · Artificial Intelligence

Rocket Launching: Boosting Real-Time CTR Prediction Without Extra Latency

Online click‑through‑rate (CTR) prediction demands millisecond‑level response times, yet deep models are too slow; this paper introduces a “Rocket Launching” framework that jointly trains a lightweight net and a powerful booster net, sharing parameters and using gradient‑blocking and hint loss to improve accuracy without increasing inference latency.

CTR prediction · Deep Learning · co-training
0 likes · 13 min read