Overview of Ranking Algorithms in Recommendation Systems
This article reviews the evolution of ranking models in modern recommendation systems, covering traditional linear models, factorization machines, tree‑based GBDT+LR, and a range of deep learning architectures such as Wide&Deep, DeepFM, DCN, xDeepFM, DIN, as well as multi‑task frameworks like ESMM and MMOE, and finally illustrates their practical deployment in a live streaming platform.
Overview of Ranking Algorithms
Modern recommendation systems are typically divided into two stages: recall and ranking. The recall stage quickly filters millions of candidates down to thousands using low‑cost models, while the ranking stage applies more sophisticated features and models to produce the final top‑K results.
Over the past decade, ranking models have progressed rapidly: from simple Logistic Regression (LR), to Factorization Machines (FM) in 2010, to the GBDT+LR combination proposed by Facebook in 2014. Together these mark the first half of modern recommender development. Since 2015, deep neural networks (DNNs) have surged, bringing a wealth of architectures, feature-crossing techniques, and ideas that now dominate CTR prediction and recommendation tasks.
1. Traditional Models
1.1 LR
Before deep learning, LR was favored for its simplicity, speed, and interpretability, modeling the probability of a click as a weighted sum of features passed through a sigmoid function. However, its linear nature cannot capture non‑linear feature interactions, requiring extensive manual feature engineering.
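The LR scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the article's production code; the feature names and weight values are made up for the example.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def lr_predict(features, weights, bias):
    """Click probability: sigmoid of the weighted feature sum."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical features: [is_new_user, anchor_is_followed, hour_of_day / 24]
p = lr_predict([1.0, 0.0, 0.5], weights=[0.4, 1.2, -0.6], bias=-1.0)
print(round(p, 4))
```

Because the score is a plain weighted sum, an interaction such as "new user AND evening" only influences the prediction if someone adds it as its own feature, which is the manual feature engineering burden mentioned above.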
1.2 FM/FFM
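To address LR's limitation on feature crossing, polynomial models were proposed that explicitly learn one weight per feature pair. With sparse data, however, most pairs rarely co-occur, so their weights cannot be learned reliably, and the number of parameters grows quadratically with the number of features. Factorization Machines (FM) instead learn a latent vector for each feature and use the inner product of two vectors as the pair's weight, which reduces the parameter count and generalizes to unseen feature pairs while keeping training efficient. Field-aware FM (FFM) extends this by learning a separate latent vector per feature for each field it interacts with.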
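As a rough sketch of why FM stays efficient, the pairwise term can be computed in O(k·n) time using the standard sum-of-squares identity instead of looping over all pairs. The toy inputs below are assumptions for illustration only.

```python
def fm_score(x, w0, w, V):
    """FM score: bias + linear term + factorized pairwise term.

    x: feature values, w: linear weights, V[i]: latent vector of feature i.
    Uses the identity  sum_{i<j} <v_i, v_j> x_i x_j
      = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ],
    which costs O(k * n) instead of O(k * n^2).
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise

def fm_score_naive(x, w0, w, V):
    """Same score with the explicit double loop, kept only for checking."""
    n = len(x)
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            score += dot * x[i] * x[j]
    return score

# Toy example: 4 features, latent dimension k = 2 (values are arbitrary).
x = [1.0, 0.0, 1.0, 0.5]
w = [0.1, -0.2, 0.3, 0.05]
V = [[0.2, -0.1], [0.4, 0.3], [-0.3, 0.5], [0.1, 0.1]]
fast, slow = fm_score(x, 0.0, w, V), fm_score_naive(x, 0.0, w, V)
```

Both functions return the same value up to floating-point error; the first is the form normally used in practice.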
1.3 GBDT + LR
GBDT provides automatic feature engineering by growing decision trees whose leaf nodes represent high‑order feature combinations. These leaf indices are used as new sparse features concatenated with original ones and fed into an LR model, combining the strengths of both approaches.
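The leaf-index encoding can be illustrated without a GBDT library. In the sketch below the two "trees" are hand-written stand-ins with made-up split rules; in a real pipeline they would be learned by a GBDT trainer, and the feature names are assumptions.

```python
def tree_a(sample):
    """Hypothetical tree with 3 leaves; returns the index of the leaf reached."""
    if sample["age"] < 25:
        return 0
    return 1 if sample["watch_minutes"] < 30 else 2

def tree_b(sample):
    """Hypothetical tree with 2 leaves."""
    return 0 if sample["is_follower"] else 1

TREES = [(tree_a, 3), (tree_b, 2)]  # (tree function, number of leaves)

def leaf_features(sample):
    """One-hot encode the leaf reached in every tree and concatenate."""
    encoded = []
    for tree, n_leaves in TREES:
        one_hot = [0] * n_leaves
        one_hot[tree(sample)] = 1
        encoded.extend(one_hot)
    return encoded

sample = {"age": 31, "watch_minutes": 12, "is_follower": True}
# This vector would be concatenated with the original sparse features
# before being fed to the LR model.
lr_input = leaf_features(sample)
```

Each leaf corresponds to a conjunction of split conditions along its path, which is why a single one-hot position acts as a high-order crossed feature for LR.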
Summary of Traditional Models
The progression from manual rule‑based ranking to LR, then to FM for second‑order interactions, and finally to GBDT+LR for higher‑order automatic feature crossing outlines the early trajectory of recommender ranking models.
2. Deep Models
Since 2015, deep learning models have become prevalent and can be grouped into two categories: Wide&Deep‑style models that learn high‑order feature interactions automatically, and multi‑task learning models such as ESMM and MMOE.
2.1 Wide&Deep‑style Models
These models consist of a "wide" shallow component (often LR) that memorizes frequent low-order features, and a "deep" component (MLP) that generalizes via high-order interactions. Common building blocks include:
raw input -> embedding: Transform sparse features into dense vectors.
input_layer: Aggregate the embeddings.
input_layer -> output: Stack MLP layers, followed by a sigmoid (or two-class softmax) output for CTR prediction.
Key models:
Wide&Deep
Wide part memorizes high‑frequency low‑order features; deep part learns implicit, element‑wise high‑order interactions.
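A minimal forward pass shows how the two parts are combined: each produces a logit, and the sum goes through a single sigmoid so both are trained jointly. Layer sizes, parameter names, and values here are assumptions chosen to keep the sketch small.

```python
import math

def relu(v):
    return [max(0.0, a) for a in v]

def dense(v, W, b):
    """Fully connected layer: W has one row per output unit."""
    return [sum(wi * vi for wi, vi in zip(row, v)) + bi for row, bi in zip(W, b)]

def wide_and_deep(wide_x, deep_x, params):
    # Wide part: a linear model over (cross-product) sparse features.
    wide_logit = sum(w * x for w, x in zip(params["wide_w"], wide_x))
    # Deep part: an MLP over the concatenated dense embeddings.
    hidden = relu(dense(deep_x, params["W1"], params["b1"]))
    deep_logit = dense(hidden, params["W2"], params["b2"])[0]
    # The two logits are summed and passed through one sigmoid.
    return 1.0 / (1.0 + math.exp(-(wide_logit + deep_logit + params["bias"])))

# Hypothetical parameters: 2 wide features, 3-dim deep input, 2 hidden units.
params = {
    "wide_w": [0.5, -0.3],
    "W1": [[0.1, 0.2, -0.1], [0.0, -0.4, 0.3]],
    "b1": [0.0, 0.1],
    "W2": [[0.6, -0.2]],
    "b2": [0.0],
    "bias": -0.5,
}
p = wide_and_deep([1.0, 0.0], [0.3, -0.2, 0.8], params)
```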
DeepFM
Replaces the wide LR with FM to automatically capture second‑order interactions while sharing the embedding layer with the deep component.
DCN
Introduces a cross network in which each layer raises the order of explicit feature crossing by one, so interactions up to a chosen degree are expressed with parameters that grow only linearly with depth, avoiding the combinatorial explosion of explicit crossing.
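One way to see the linear parameter growth is to write out a cross layer in its commonly cited form, x_{l+1} = x_0 · (x_l^T w_l) + b_l + x_l. The sketch below assumes that form; the weights are arbitrary toy values.

```python
def cross_layer(x0, xl, w, b):
    """One cross layer: x_{l+1} = x0 * (xl . w) + b + xl."""
    scale = sum(a * c for a, c in zip(xl, w))  # scalar xl^T w
    return [x0_i * scale + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]

def cross_network(x0, layers):
    """Apply the cross layers in sequence; every layer re-uses the input x0."""
    x = x0
    for w, b in layers:
        x = cross_layer(x0, x, w, b)
    return x

x0 = [1.0, 2.0]
layers = [
    ([0.5, -1.0], [0.0, 0.0]),  # (w, b) of layer 1
    ([1.0, 1.0], [0.1, 0.1]),   # (w, b) of layer 2
]
out = cross_network(x0, layers)
```

Each layer holds only a weight vector and a bias vector of the input dimension d, so L layers add 2·d·L parameters in total.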
xDeepFM
Combines a Compressed Interaction Network (CIN) with a DNN to learn explicit high‑order vector‑level interactions while maintaining linear complexity.
DIN
Uses attention over a user's historical item embeddings to produce a weighted representation tailored to the current item, effectively performing automatic feature selection.
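The candidate-dependent pooling can be sketched as follows. This is a simplification: a dot product stands in for the small MLP "activation unit" that DIN uses to score each history item, and the weights are left unnormalized. The embeddings are toy values.

```python
def din_user_vector(history, candidate):
    """Weighted sum pooling of history embeddings.

    The weight of each history item depends on the candidate item, so the
    same user gets a different representation for different candidates.
    """
    # Simplified relevance score (assumption): dot product with the candidate.
    weights = [sum(h * c for h, c in zip(item, candidate)) for item in history]
    dim = len(candidate)
    pooled = [
        sum(wt * item[d] for wt, item in zip(weights, history))
        for d in range(dim)
    ]
    return pooled, weights

history = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # embeddings of past items
candidate = [1.0, 0.0]                           # embedding of the current item
user_vec, weights = din_user_vector(history, candidate)
```

History items unrelated to the candidate receive weights near zero and contribute little to the pooled vector, which is the "automatic feature selection" effect described above.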
2.2 Multi‑Task Models
Multi‑objective optimization is crucial in live‑streaming platforms where clicks, watch time, gifts, comments, follows, and shares are all important. Two representative models are ESMM and MMOE.
ESMM
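Models the entire impression space, jointly learning click-through rate (pCTR) and click-through & conversion rate (pCTCVR) on top of shared bottom-layer embeddings. The post-click conversion rate (pCVR) is not supervised directly; it is learned implicitly because the two tower outputs are multiplied as probabilities, pCTCVR = pCTR × pCVR. Training on all impressions rather than only on clicked ones mitigates sample-selection bias.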
Task    | Positive Sample    | Negative Sample
pCTR    | Click              | No click
pCTCVR  | Click & conversion | No conversion
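The labels in the table translate into a two-term loss computed on every impression. The sketch below assumes the usual ESMM formulation, in which the click tower and the conversion tower each output a probability and their product is supervised with the click & conversion label; the numeric inputs are made up.

```python
import math

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def esmm_loss(p_ctr, p_cvr, clicked, converted):
    """Loss for one impression; both terms are defined for all impressions."""
    p_ctcvr = p_ctr * p_cvr                  # pCTCVR = pCTR * pCVR
    ctr_loss = bce(p_ctr, clicked)           # label: click vs. no click
    ctcvr_loss = bce(p_ctcvr, clicked * converted)  # label: click & conversion
    return ctr_loss + ctcvr_loss, p_ctcvr

# An impression that was clicked but did not convert.
loss, p_ctcvr = esmm_loss(p_ctr=0.2, p_cvr=0.1, clicked=1, converted=0)
```

Note that the conversion tower never receives a loss of its own; it is trained only through the product.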
MMOE
Introduces multiple expert networks shared across tasks, plus one task-specific gating network per task (the "gate" mechanism) that weights the expert outputs, so each task draws different aspects from the shared embeddings. This reduces negative transfer when tasks are weakly related.
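A small forward pass illustrates the mechanism: every expert sees the same input, and each task mixes the expert outputs with its own softmax weights. Expert and gate shapes and all values below are assumptions, and the per-task towers that would consume the mixed vectors are omitted.

```python
import math

def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(a - m) for a in z]
    total = sum(e)
    return [a / total for a in e]

def mmoe_task_inputs(x, expert_weights, gate_weights):
    """Return one mixed representation per task.

    expert_weights: one weight matrix per shared expert (ReLU applied after).
    gate_weights: one matrix per task, with one row per expert.
    """
    experts = [[max(0.0, a) for a in matvec(W, x)] for W in expert_weights]
    dim = len(experts[0])
    outputs = []
    for G in gate_weights:
        gate = softmax(matvec(G, x))  # task-specific mixture over experts
        mixed = [sum(g * e[d] for g, e in zip(gate, experts)) for d in range(dim)]
        outputs.append(mixed)
    return outputs

x = [1.0, 2.0]
expert_weights = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [-1.0, 1.0]]]
gate_weights = [
    [[0.0, 0.0], [0.0, 0.0]],   # task 0: equal weight on both experts
    [[5.0, 0.0], [-5.0, 0.0]],  # task 1: almost all weight on expert 0
]
outputs = mmoe_task_inputs(x, expert_weights, gate_weights)
```

With these toy gates, task 0 averages the two experts while task 1 relies almost entirely on the first, showing how weakly related tasks can avoid interfering with each other.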
Summary of Deep Models
Deep models have shifted the focus from manual feature engineering to automatic high‑order interaction learning, with architectures that support both implicit (MLP) and explicit (CIN, cross layers) feature crossing, as well as multi‑task frameworks that balance diverse business objectives.
Practical Deployment in Huajiao Live
Huajiao Live adopts a variety of ranking models, including GBDT+LR, Wide&Deep, (x)DeepFM, DIN, ESMM, and MMOE. The offline pipeline uses Spark and HDFS to process user, anchor, and real-time features, generating a terabyte-scale dataset stored on HDFS.
Features include user profile (gender, age, device, region, interaction statistics), anchor profile (demographics, level, category, tags, live duration, gifts, followers), and real‑time signals (current viewership, gifts, chat activity, hotness, content type).
Training is performed on the private 360 cloud platform hbox for distributed deep‑model training. The final model architecture combines embedding layers, deep networks, and task‑specific towers.
Offline AUC results (×100): FM 78.9, Wide&Deep 84.5, DeepFM 84.7. Online, personalized recommendation increased average watch time by over 80% for popular channels.
Conclusion
This article provides a concise overview of widely used ranking models in the industry, noting that no single model is universally best; the optimal choice depends on the specific scenario and data characteristics. Researchers are encouraged to study the original papers for detailed formulas, hyper‑parameter settings, and engineering tricks.
"User interest in e‑commerce browsing exhibits multi‑modal distribution and partial activation, which motivated Alibaba to develop the DIN model to capture evolving user interests and achieve breakthrough performance."
Effective recommendation starts with a clear scenario, followed by model selection tailored to data patterns, rather than imposing a model a priori.
Huajiao Technology
The Huajiao Technology channel shares the latest Huajiao app tech on an irregular basis, offering a learning and exchange platform for tech enthusiasts.