
Baidu's Recommendation Ranking: Background, Feature Design, Algorithms, Architecture, and Future Directions

This article presents Baidu's comprehensive approach to feed recommendation ranking, covering business and data background, feature engineering principles, core algorithmic strategies, system architecture design, and upcoming plans to integrate large language models for more intelligent and fair recommendations.

DataFunTalk

The article shares Baidu's thinking and practice on recommendation ranking in five parts: background (business and data), feature design, algorithm strategies, architecture, and future plans.

Background: Introduces the business and data context of Baidu's integrated information-flow (feed) recommendation, highlighting its massive scale (hundreds of billions of impressions daily) and the resulting need for high-throughput, low-latency models.

Data Background: Describes three challenges that drive the design of scalable, unbiased ranking models: very large data scale, stringent latency requirements, and a strong Matthew effect, in which already-popular resources attract a disproportionate share of exposure.
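One way to make the Matthew-effect challenge concrete is to measure how concentrated exposure is across resources. The sketch below computes a Gini coefficient over per-resource impression counts. The choice of metric and the function name `exposure_gini` are assumptions for illustration; the article does not say how Baidu quantifies this bias.

```python
from typing import Sequence


def exposure_gini(impressions: Sequence[float]) -> float:
    """Gini coefficient of impression counts.

    0 means exposure is spread evenly; values close to 1 mean a few
    resources receive almost all of the exposure.
    """
    values = sorted(impressions)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard closed form over ascending values with 1-based ranks.
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


even = exposure_gini([5, 5, 5, 5])      # every resource gets the same exposure
skewed = exposure_gini([0, 0, 0, 10])   # one resource gets all exposure
```

Tracking a number like this before and after a model change is one possible way to check whether a debiasing technique actually spreads exposure more evenly.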

Feature Design: Explains a four-dimensional feature framework (user, resource, scene, state) that emphasizes high-quality discrete features, cross features, bias features, and sequence features, and outlines three design principles: discriminative power, coverage, and robustness.
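The feature types named above can be illustrated with a small sketch that hashes discrete values into a fixed id space and then adds one cross feature, one bias feature, and one sequence feature. Everything concrete here is hypothetical: the field names (`age_bucket`, `category`, `channel`, `network`), the bucket count, and the use of MD5 hashing are assumptions, not details from the article.

```python
import hashlib
from typing import Dict, List

NUM_BUCKETS = 1_000_003  # hypothetical size of the hashed id space


def hash_feature(slot: str, value: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a (slot, value) pair to a stable bucket id."""
    digest = hashlib.md5(f"{slot}={value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def build_features(
    user: Dict[str, str],
    resource: Dict[str, str],
    scene: Dict[str, str],
    state: Dict[str, str],
    position: int,
    recent_clicks: List[str],
    max_seq_len: int = 3,
) -> Dict[str, object]:
    feats: Dict[str, object] = {}
    # Discrete features from the four dimensions: user, resource, scene, state.
    groups = {"user": user, "resource": resource, "scene": scene, "state": state}
    for group_name, group in groups.items():
        for key, value in group.items():
            slot = f"{group_name}.{key}"
            feats[slot] = hash_feature(slot, str(value))
    # Cross feature: combine a user field with a resource field.
    cross_slot = "cross.user_age_x_resource_category"
    cross_value = f"{user['age_bucket']}|{resource['category']}"
    feats[cross_slot] = hash_feature(cross_slot, cross_value)
    # Bias feature: the display position at which the item was shown.
    feats["bias.position"] = hash_feature("bias.position", str(position))
    # Sequence feature: the most recent clicked resource ids, oldest first.
    feats["seq.recent_clicks"] = [
        hash_feature("seq.click", rid) for rid in recent_clicks[-max_seq_len:]
    ]
    return feats


feats = build_features(
    user={"age_bucket": "25-34"},
    resource={"category": "sports"},
    scene={"channel": "feed"},
    state={"network": "wifi"},
    position=2,
    recent_clicks=["r1", "r2", "r3", "r4"],
)
```

Prefixing each value with its slot name before hashing keeps identical strings in different slots from colliding by construction, which is one simple way to support the discrimination principle.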

Algorithm Strategies: Details the model cascade (recall, coarse ranking, fine ranking, re-ranking), the need to decouple training across stages, the handling of sample-selection bias, and techniques for large-scale discrete DNNs, including embedding-dimension adaptation and overfitting mitigation.
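The staged cascade described above can be sketched as a funnel in which each stage scores a smaller candidate set with a more expensive model. This is a minimal illustration under assumptions: the scoring functions, the cut-off sizes, and the category-cap re-ranking rule are all hypothetical stand-ins, not Baidu's actual models or policies.

```python
from typing import Callable, Dict, List

Item = Dict[str, object]


def top_k(items: List[Item], score_fn: Callable[[Item], float], k: int) -> List[Item]:
    """Keep the k highest-scoring items, best first."""
    return sorted(items, key=score_fn, reverse=True)[:k]


def rerank_with_category_cap(items: List[Item], max_per_category: int) -> List[Item]:
    """Hypothetical re-ranking rule: cap how many items one category may take."""
    kept: List[Item] = []
    counts: Dict[object, int] = {}
    for item in items:
        cat = item["category"]
        if counts.get(cat, 0) < max_per_category:
            kept.append(item)
            counts[cat] = counts.get(cat, 0) + 1
    return kept


def rank_cascade(recalled, coarse_score, fine_score, coarse_k, fine_k, max_per_category):
    coarse = top_k(recalled, coarse_score, coarse_k)  # cheap model, many items
    fine = top_k(coarse, fine_score, fine_k)          # expensive model, few items
    return rerank_with_category_cap(fine, max_per_category)


# Candidates as they might come out of recall; "cheap" and "rich" stand in
# for the coarse-ranking and fine-ranking model scores.
recalled = [
    {"id": "a", "category": "news", "cheap": 0.9, "rich": 0.2},
    {"id": "b", "category": "news", "cheap": 0.8, "rich": 0.9},
    {"id": "c", "category": "video", "cheap": 0.7, "rich": 0.5},
    {"id": "d", "category": "news", "cheap": 0.6, "rich": 0.8},
    {"id": "e", "category": "video", "cheap": 0.1, "rich": 1.0},
]
final = rank_cascade(
    recalled,
    coarse_score=lambda it: it["cheap"],
    fine_score=lambda it: it["rich"],
    coarse_k=4,
    fine_k=3,
    max_per_category=1,
)
final_ids = [it["id"] for it in final]
```

In this toy data, item `e` has the best fine-ranking score but is dropped by the coarse stage, so the fine stage never scores it and it can never produce feedback. That is the kind of gap between what a stage is trained on and what it must score that sample-selection bias refers to.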

Architecture: Presents a layered, divide-and-conquer system design that uses elastic computation and sparse-MoE (mixture-of-experts) concepts to allocate resources efficiently across diverse recall queues and model stages.
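The sparse-MoE idea mentioned above is, in its generic form, a gate that picks a few experts per input so that compute is spent only where it is needed. The sketch below shows textbook top-k gating in plain Python; it is not Baidu's architecture, and the expert functions and gate logits are hypothetical placeholders.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(xs: Sequence[float]) -> List[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_moe(
    x: float,
    gate_logits: Sequence[float],
    experts: Sequence[Callable[[float], float]],
    k: int,
) -> Tuple[float, List[int]]:
    """Run only the k experts with the highest gate logits and mix their outputs."""
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    # Renormalize the gate over the chosen experts only.
    weights = softmax([gate_logits[i] for i in chosen])
    output = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return output, chosen


calls: List[int] = []  # records which experts actually ran


def make_expert(idx: int, scale: float) -> Callable[[float], float]:
    def expert(x: float) -> float:
        calls.append(idx)
        return scale * x
    return expert


experts = [make_expert(0, 1.0), make_expert(1, 2.0), make_expert(2, 100.0)]
output, chosen = sparse_moe(3.0, gate_logits=[0.0, 0.0, -5.0], experts=experts, k=2)
```

Because the unselected expert is never called, cost grows with `k` rather than with the total number of experts, which is the property that makes this pattern attractive for elastic resource allocation.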

Future Plans: Outlines three directions for next-generation recommendation systems powered by large language models: decision-making capabilities, generative recommendation (for example, generating recommendation reasons and augmenting data), and a move from black-box to white-box models to support causality, fairness, and multi-task adaptation.

The article concludes with a thank‑you note and references to related past works.
