Comprehensive Overview of Ranking Models in Recommendation Systems

The article provides a thorough guide to ranking in recommendation systems, detailing the pipeline architecture, sample handling challenges, extensive feature engineering categories, the evolution from collaborative filtering to deep and attention‑based models, and key optimization trade‑offs between memorization, generalization, and efficient user‑interest modeling.

Tencent Cloud Developer

Mar 15, 2022

Introduction – Fine ranking (精排, referred to simply as ranking below) is a crucial module in recommendation algorithms. It is usually model‑driven and rests on three core components: samples, features, and models. This article walks through each of them in turn for developers.

1. Overall Architecture

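Ranking is a key part of the recommendation pipeline. In the commonly used multi‑stage design it sits after recall (candidate retrieval) and coarse ranking and before re‑ranking, so it scores a comparatively small candidate set and can afford the most expressive model.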

2. Samples

Samples are the "food" for models. In a CTR task, an exposure that is clicked is a positive sample and an exposure that is not clicked is a negative sample. The main issues are:

Imbalanced positive/negative samples : e.g., a 5% click‑through rate means one click per 20 exposures, i.e. a positive‑to‑negative ratio of about 1:19. Solutions: down‑sampling negatives, focal loss, etc.

Different activity levels : long‑tail users contribute few samples while high‑frequency users dominate the training set, which biases the model toward active users. Mitigation: down‑sample active users or cap the number of samples per user.

Sample confidence : an unclicked exposure does not always mean the user dislikes the item; sometimes the user simply had no intent to consume anything. Strategies include filtering out users whose exposures are all negative.

Invalid traffic : bot or crawler traffic should be filtered at the recommendation‑engine side.

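User intent saturation : once a user's intent has been satisfied, later unclicked exposures are weak evidence of dislike. Meituan uses "skip‑above" sampling, which drops the negatives that come after the last click.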
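
The sampling steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used by the author or by Meituan: the function names (`skip_above`, `downsample_negatives`, `recalibrate`), the `(item_id, clicked)` tuple layout, and the `keep_rate` parameter are all assumptions made for the example. The recalibration step is the usual correction applied when a model is trained on down‑sampled negatives.

```python
import random


def skip_above(impressions):
    """Keep impressions up to and including the last click; drop the rest.

    impressions: list of (item_id, clicked) tuples in display order
    (hypothetical layout). A list without any click comes back empty.
    """
    last_click = max(
        (i for i, (_, clicked) in enumerate(impressions) if clicked),
        default=-1,
    )
    return impressions[: last_click + 1]


def downsample_negatives(samples, keep_rate, seed=0):
    """Keep every positive and each negative with probability keep_rate."""
    rng = random.Random(seed)
    return [s for s in samples if s[1] == 1 or rng.random() < keep_rate]


def recalibrate(p, keep_rate):
    """Map a probability predicted on down-sampled data back to the
    original scale: q = p / (p + (1 - p) / keep_rate)."""
    return p / (p + (1.0 - p) / keep_rate)


if __name__ == "__main__":
    session = [("a", 0), ("b", 1), ("c", 0), ("d", 1), ("e", 0), ("f", 0)]
    print(skip_above(session))      # items e and f are dropped
    print(recalibrate(0.30, 0.1))   # well below 0.30 on the original scale
```

A session without any click is returned empty by `skip_above`, which has a similar effect, at session level, to the all‑negative filter mentioned under sample confidence. Whether that is desirable depends on the product, so treat it as a design choice rather than a rule.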

3. Features

Feature engineering remains essential despite the rise of deep models. Typical feature groups:

Context features : weekday, hour, network type, OS, client version, etc.

User features (static): user_id, gender, age, city, occupation, income, student status, marital status, registration time, VIP flag, new‑user flag.

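User statistical features : PV (page views), VV (video views), CTR, completion rate, and average view time over the recent 30/14/7 days (both absolute and relative). Beware of data leakage: each statistic must be computed only from events that happened before the sample it is attached to.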

User behavior sequence features : short‑term click sequences, long‑term purchase sequences, positive‑feedback vs. negative‑feedback sequences. Sequence length is a bottleneck for Transformer‑based models, because the cost of self‑attention grows quadratically with sequence length.

Item features (static): item_id, author_id, category_id, publish time, resolution, duration, tags.

Item statistical features : recent PV, VV, CTR, completion rate, average view time (same leakage caution).

Cross features : item‑user cross statistics (e.g., CTR per gender/age group).
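
The leakage caution for statistical features comes down to a point‑in‑time rule: a feature attached to a sample may only use events that happened strictly before that sample. A minimal sketch follows, assuming a hypothetical event log of `(user_id, timestamp, clicked)` tuples; the function name `windowed_ctr` and its parameters are illustrative, not taken from the article.

```python
from datetime import datetime, timedelta


def windowed_ctr(events, user_id, as_of, days, default=0.0):
    """CTR of user_id over the window [as_of - days, as_of).

    The event at as_of itself and anything later are excluded, so the
    label of the sample being built can never leak into its own feature.
    """
    start = as_of - timedelta(days=days)
    shows = clicks = 0
    for uid, ts, clicked in events:
        if uid == user_id and start <= ts < as_of:
            shows += 1
            clicks += clicked
    return clicks / shows if shows else default


if __name__ == "__main__":
    log = [
        ("u1", datetime(2022, 3, 1), 1),
        ("u1", datetime(2022, 3, 5), 0),
        ("u1", datetime(2022, 3, 10), 1),  # the sample being featurized
    ]
    sample_time = datetime(2022, 3, 10)
    print(windowed_ctr(log, "u1", sample_time, days=7))
    print(windowed_ctr(log, "u1", sample_time, days=30))
```

The same rule applies to item statistics and cross statistics. In production these values are usually precomputed, in which case the feature snapshot joined to a sample should predate the exposure.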

Feature processing methods:

Discrete values : direct embedding; watch out for high‑dimensional sparse IDs.

Continuous values : (a) concatenate the raw value with the embeddings (low generalization) or (b) discretize with equal‑frequency binning and embed the bin ID (better generalization).

Multi‑value features (e.g., behavior sequences): mean‑pooling, sum‑pooling, attention‑pooling (DIN), RNN/GRU‑based sequence modeling, Transformer‑based modeling.
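
Equal‑frequency binning, option (b) for continuous values, can be sketched with the standard library alone. The helper names `fit_equal_frequency_edges` and `to_bin` are hypothetical, and the sketch assumes a non‑empty list of training values; the cut points are learned once on training data and then reused at serving time.

```python
from bisect import bisect_right


def fit_equal_frequency_edges(values, n_bins):
    """Return interior cut points so that each bin holds roughly the
    same number of training values (duplicate cut points are merged)."""
    s = sorted(values)
    edges = [s[(len(s) * k) // n_bins] for k in range(1, n_bins)]
    return sorted(set(edges))


def to_bin(value, edges):
    """Map a continuous value to a bin ID in 0..len(edges).

    Values outside the training range fall into the first or last bin.
    """
    return bisect_right(edges, value)


if __name__ == "__main__":
    view_seconds = [1, 2, 2, 3, 5, 8, 13, 40, 120, 900]  # skewed on purpose
    edges = fit_equal_frequency_edges(view_seconds, n_bins=5)
    print(edges)
    print([to_bin(v, edges) for v in view_seconds])
```

The resulting bin ID is then treated like any other discrete feature and looked up in an embedding table, which lets the model learn a non‑linear response to the raw value.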

4. Models

Evolution from collaborative filtering and linear models to deep models:

Collaborative Filtering (CF) : userCF, itemCF, matrix factorization (MF). MF mitigates the sparsity problem by factorizing the user‑item interaction matrix into low‑dimensional user and item latent vectors.

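Logistic Regression (LR) family : LR, FM, FFM, LS‑PLM. LR can incorporate rich user/item features but lacks feature crossing, so a trend that holds within subgroups can be masked or reversed in the aggregate (Simpson’s paradox). FM and FFM add learned second‑order feature crosses.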

Deep Neural Network (DNN) family : DeepCrossing, FNN, PNN – embed sparse IDs, concatenate the embeddings with numeric features, and feed the result into a multi‑layer network (DeepCrossing uses residual units; PNN adds a product layer for explicit crosses).

Wide & Deep (WDL) family : combines an LR‑style wide part (memorization) with a DNN deep part (generalization). Variants include DeepFM, xDeepFM, NFM, etc.

Attention & Sequence models : DIN (attention pooling), DIEN (interest evolution with GRU + target attention), MIMN (memory network for long sequences), SIM (search‑based retrieval over long behavior histories) – address multi‑value feature modeling and user interest dynamics.
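
To make the difference between mean‑pooling and attention‑pooling concrete, here is a simplified sketch of target attention over a behavior sequence. It is an assumption‑laden illustration rather than DIN itself: DIN scores each history item with a small MLP and does not apply softmax normalization, whereas this sketch uses a plain dot product followed by softmax. The names `mean_pool` and `attention_pool` are hypothetical.

```python
import math


def mean_pool(history):
    """Average the history embeddings, ignoring the candidate item."""
    dim = len(history[0])
    return [sum(vec[d] for vec in history) / len(history) for d in range(dim)]


def attention_pool(target, history):
    """Weight each history embedding by its relevance to the target item.

    Simplification: relevance = softmax(dot(target, h)), not DIN's MLP.
    Returns (pooled_vector, weights).
    """
    scores = [sum(t * h for t, h in zip(target, vec)) for vec in history]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(target)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, history)) for d in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    history = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # toy 2-d embeddings
    candidate = [1.0, 0.0]
    print(mean_pool(history))
    print(attention_pool(candidate, history))
```

With mean‑pooling the user vector is the same for every candidate; with target attention it shifts toward the behaviors most related to the candidate being scored, which is the core idea behind DIN‑style interest modeling.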

5. Ranking Optimization

Key optimization dimensions:

Memorization vs. Generalization : high‑frequency items rely on memorization, low‑frequency items need generalization (addressed by WDL).

Feature & Cross Engineering : manual cross features (user‑item statistics) vs. automatic cross (FM, DeepFM, xDeepFM).

Embedding of High‑Dimensional Sparse IDs : crucial, but the embeddings of long‑tail IDs see too few samples to converge well.

Personalized Behavior Modeling : sequence‑based user interest modeling (attention, RNN, Transformer) with efficiency tricks for long sequences.
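
As an example of automatic feature crossing, the second‑order term of a factorization machine (FM) scores every feature pair with the dot product of two learned latent vectors. The sketch below computes that term in two ways: pair by pair, and with the well‑known linear‑time identity. The function names and the toy inputs `x` (feature values) and `V` (one latent vector per feature) are assumptions for illustration; learning `V` is out of scope here.

```python
def fm_pairwise_naive(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j], pair by pair: O(n^2 * k)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, V):
    """Same quantity in O(n * k), using
    0.5 * sum_f [ (sum_i V[i][f] * x[i])^2 - sum_i (V[i][f] * x[i])^2 ]."""
    n, k = len(x), len(V[0])
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        total += s * s - s_sq
    return 0.5 * total


if __name__ == "__main__":
    x = [1.0, 0.0, 1.0, 0.5]
    V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.5]]
    print(fm_pairwise_naive(x, V), fm_pairwise_fast(x, V))
```

Because the latent vectors are shared across pairs, a cross between two features that rarely co‑occur can still receive a sensible weight, which is what manual cross statistics struggle with on sparse data.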

Author

Xie Yangyi – Tencent Application Algorithm Researcher, responsible for video recommendation algorithms, with extensive experience in NLP and search/recommendation.

