
An Overview of Learning to Rank (LTR) Models: Point‑wise, Pair‑wise, List‑wise, and Generative Approaches

This article provides a comprehensive introduction to Learning to Rank (LTR), describing its four major categories—point‑wise, pair‑wise, list‑wise, and generative models—along with typical algorithms such as Wide & Deep, ESMM, RankNet, LambdaRank, LambdaMART, DLCM, and miRNN, and discusses their architectures, loss functions, and practical considerations in advertising and recommendation systems.

DataFunSummit

What is LTR? Learning to Rank (LTR) applies machine learning to ranking problems and is widely used in advertising, search, and recommendation. Published LTR work can be divided into four categories: point‑wise, pair‑wise, list‑wise, and generative models.

1. Point‑wise Models

1.1 Wide & Deep – Proposed by Google in 2016, this model combines a linear (wide) component that captures feature interactions with a deep neural network that learns dense embeddings, balancing memorization and generalization. The wide part uses a generalized linear model with cross‑product features, while the deep part is a feed‑forward network with ReLU activations. Optimization uses FTRL (L1‑regularized) for the wide part and AdaGrad for the deep part.
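A minimal forward pass illustrates how the two parts combine: the wide and deep logits are simply summed before the sigmoid. This is a toy sketch with random weights and made-up dimensions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only).
n_wide = 8    # binary cross-product features for the wide part
n_emb = 16    # concatenated dense embeddings fed to the deep part

x_wide = rng.integers(0, 2, n_wide).astype(float)
x_deep = rng.normal(size=n_emb)

# Wide part: generalized linear model over cross-product features.
w_wide = rng.normal(size=n_wide)

# Deep part: feed-forward network with ReLU activations.
W1, b1 = rng.normal(size=(n_emb, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
h = np.maximum(x_deep @ W1 + b1, 0.0)   # ReLU hidden layer
deep_logit = (h @ W2 + b2)[0]

# Joint prediction: the wide and deep logits are summed, then squashed.
logit = x_wide @ w_wide + deep_logit
p_click = 1.0 / (1.0 + np.exp(-logit))
```

In training, the two parts are optimized jointly but with different optimizers, as noted above (FTRL for the wide weights, AdaGrad for the deep weights).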

1.2 Entire Space Multi‑Task Model (ESMM) – An Alibaba model for post‑click conversion rate (CVR) estimation that addresses sample selection bias and data sparsity by jointly modeling CTR and CVR in a multi‑task framework. The architecture contains two sub‑networks (a CVR tower and a CTR tower) that share embeddings; the post‑view click‑and‑conversion rate is derived as pCTCVR = pCTR × pCVR, and the loss combines cross‑entropy terms for the CTR and CTCVR tasks computed over all exposure samples, so the CVR tower is trained over the entire exposure space rather than on clicked samples only.
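The key identity and loss can be sketched in a few lines. The tower outputs below are hypothetical numbers standing in for the two sub-networks; the point is that pCVR is never supervised directly, only through the pCTCVR product.

```python
import numpy as np

def bce(y, p, eps=1e-7):
    """Binary cross-entropy averaged over samples."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Hypothetical tower outputs over ALL exposure samples (not clicks only).
p_ctr = np.array([0.30, 0.10, 0.60, 0.05])   # CTR tower
p_cvr = np.array([0.20, 0.50, 0.10, 0.40])   # CVR tower (no direct loss)
p_ctcvr = p_ctr * p_cvr                      # pCTCVR = pCTR * pCVR

y_click = np.array([1, 0, 1, 0])             # click labels
y_click_convert = np.array([1, 0, 0, 0])     # click-and-convert labels

# Both loss terms are computed on the full exposure space, which is what
# sidesteps the sample selection bias of training CVR on clicked data.
loss = bce(y_click, p_ctr) + bce(y_click_convert, p_ctcvr)
```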

2. Pair‑wise Models

2.1 RankNet – Converts ranking into a binary classification problem on document pairs. It uses a sigmoid to model the probability that one document is more relevant than another and employs cross‑entropy loss with stochastic gradient descent for optimization.
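The pairwise probability and loss are compact enough to state directly. A minimal sketch, following the standard RankNet formulation with a `sigma` scale parameter:

```python
import math

def ranknet_pair_loss(s_i, s_j, S_ij, sigma=1.0):
    """Cross-entropy loss for one document pair with model scores s_i, s_j.
    S_ij = 1 if doc i is more relevant than j, -1 if less, 0 if equal."""
    # Modeled probability that i should rank above j.
    P_ij = 1.0 / (1.0 + math.exp(-sigma * (s_i - s_j)))
    P_bar = 0.5 * (1.0 + S_ij)  # target probability derived from S_ij
    eps = 1e-12
    return -(P_bar * math.log(P_ij + eps)
             + (1 - P_bar) * math.log(1 - P_ij + eps))
```

When the model already scores the more relevant document higher, the loss is small; scoring it lower is penalized heavily, and SGD on this loss updates the shared scoring network.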

2.2 LambdaRank – Extends RankNet by weighting the pairwise gradient (the “lambda” term) by the change in an evaluation metric, typically |ΔNDCG|, obtained by swapping the two documents. Because NDCG is non‑differentiable, LambdaRank specifies the gradients directly rather than differentiating a surrogate loss, steering training toward the ranking metric itself.
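A sketch of the lambda for one pair, assuming graded relevances, 0-based rank positions, and exponential gain; normalization by the ideal DCG is omitted here for brevity:

```python
import numpy as np

def dcg_discount(rank):
    """Position discount for a 1-based rank."""
    return 1.0 / np.log2(rank + 1)

def lambda_ij(s, rels, i, j, sigma=1.0):
    """Lambda gradient for a pair (i, j) where rels[i] > rels[j].
    s: current model scores; rels: graded relevances; i, j: 0-based
    positions in the current ranking. IDCG normalization is skipped."""
    gain = lambda r: 2.0 ** r - 1.0
    # |delta NDCG| (unnormalized): metric change from swapping i and j.
    delta = abs((gain(rels[i]) - gain(rels[j]))
                * (dcg_discount(i + 1) - dcg_discount(j + 1)))
    # RankNet gradient scaled by the metric change.
    return -sigma / (1.0 + np.exp(sigma * (s[i] - s[j]))) * delta
```

Pairs whose swap would move NDCG the most receive the largest gradient magnitude, which is what makes the update metric-aware.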

2.3 LambdaMART – Builds on LambdaRank by using Gradient Boosted Decision Trees (MART/GBDT) as the base learners. The model fits the lambda gradients instead of the negative gradients, and Newton’s method is used to compute leaf outputs, maximizing a utility function rather than minimizing loss.
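One boosting round can be sketched end to end: accumulate per-document lambdas and hessian-like weights over all pairs in a query's list, fit a tree to the lambdas, and set each leaf by a Newton step (sum of lambdas over sum of weights). The "tree" below is a single median-split stump on one feature, and IDCG normalization is again omitted; this is a toy illustration, not a production GBDT.

```python
import numpy as np

def lambdas_and_hessians(s, rels, sigma=1.0):
    """Per-document lambda (pseudo-gradient) and second-order weight
    for one query's list; simplified, without IDCG normalization."""
    n = len(s)
    lam, w = np.zeros(n), np.zeros(n)
    gain = lambda r: 2.0 ** r - 1.0
    disc = lambda pos: 1.0 / np.log2(pos + 2)  # pos is 0-based
    for i in range(n):
        for j in range(n):
            if rels[i] <= rels[j]:
                continue
            rho = 1.0 / (1.0 + np.exp(sigma * (s[i] - s[j])))
            delta = abs((gain(rels[i]) - gain(rels[j]))
                        * (disc(i) - disc(j)))
            lam[i] += sigma * rho * delta   # push doc i up
            lam[j] -= sigma * rho * delta   # push doc j down
            w[i] += sigma**2 * rho * (1 - rho) * delta
            w[j] += sigma**2 * rho * (1 - rho) * delta
    return lam, w

# One round with a depth-1 "tree": split on a toy feature x at the median,
# then apply a Newton step per leaf: sum(lambda) / sum(w).
x = np.array([0.1, 0.9, 0.4, 0.8])   # one feature per document
s = np.zeros(4)                      # current ensemble scores
rels = np.array([0, 2, 1, 2])        # graded relevance labels
lam, w = lambdas_and_hessians(s, rels)
thr = np.median(x)
for leaf in (x <= thr, x > thr):
    s[leaf] += lam[leaf].sum() / (w[leaf].sum() + 1e-9)
```

After this round the two highly relevant documents (which landed in the right leaf) score above the others, matching the intuition that the tree fits the lambda gradients rather than residuals of a loss.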

3. List‑wise Models

3.1 Deep Listwise Context Model (DLCM) – Uses a GRU to encode the entire top‑N list of documents, producing a contextual representation that re‑ranks the list. The model consists of three steps: (1) retrieve top‑N docs with a traditional LTR model, (2) encode them with a GRU to obtain a hidden state, and (3) apply a local ranking function that combines the GRU outputs and the hidden state. The loss is an Attention Rank loss that treats the ranking list as an attention distribution and uses cross‑entropy.
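Steps (2) and (3) can be sketched with a hand-rolled GRU cell in NumPy. The document feature vectors, dimensions, and the bilinear form of the local ranking function are all simplifying assumptions for illustration; the paper's φ is richer, but the data flow is the same: encode the list in reverse, then score each position against the final hidden state.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_h, n_docs = 4, 6, 5

# Hypothetical feature vectors of the top-N docs from the initial ranker.
X = rng.normal(size=(n_docs, d_in))

def init(shape):
    return rng.normal(scale=0.1, size=shape)

# GRU parameters: update gate z, reset gate r, candidate state.
Wz, Uz = init((d_in, d_h)), init((d_h, d_h))
Wr, Ur = init((d_in, d_h)), init((d_h, d_h))
Wh, Uh = init((d_in, d_h)), init((d_h, d_h))

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

h = np.zeros(d_h)
outputs = np.zeros((n_docs, d_h))
for t, x in enumerate(X[::-1]):          # encode from position N down to 1
    z = sigmoid(x @ Wz + h @ Uz)
    r = sigmoid(x @ Wr + h @ Ur)
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh)
    h = (1 - z) * h + z * h_cand
    outputs[n_docs - 1 - t] = h          # align outputs to original order

# Local ranking function: a bilinear score of each GRU output against the
# final hidden state (a simplification of the paper's phi).
W_phi = init((d_h, d_h))
scores = outputs @ W_phi @ h
reranked = np.argsort(-scores)           # re-ranked order of the top-N docs
```

Training with the Attention Rank loss would then treat the softmax over `scores` as an attention distribution and apply cross-entropy against relevance-derived attention targets.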

4. Generative Models

4.1 miRNN (Globally Optimized Mutual Influence Aware Ranking) – Formulates ranking as a sequence generation problem that maximizes expected GMV. It first extends item features with global context, then uses an RNN (or an attention‑augmented RNN) to model purchase probability, and applies beam search to find a near‑optimal ordering. The model incorporates position bias correction and attention mechanisms to capture long‑range dependencies.
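The beam search over orderings can be sketched with a toy purchase model. The `p_buy` function below is a hypothetical stand-in for the miRNN probability (a made-up position-decay and mutual-influence rule, not the paper's network); the search structure, however, is the standard one: extend each kept partial sequence by every unused item and retain the top-k by expected GMV.

```python
# Toy beam search for sequence ranking: choose an order of items that
# maximizes expected GMV = sum over positions of p(buy) * price.

prices = [50.0, 30.0, 80.0]
base_p = [0.10, 0.20, 0.05]

def p_buy(item, pos, shown):
    """Hypothetical purchase probability: decays with position (position
    bias) and with the number of items already shown (mutual influence)."""
    return base_p[item] * (0.8 ** pos) * (0.9 ** len(shown))

def beam_search(n_items, beam_width=2):
    beams = [((), 0.0)]  # (partial sequence, expected GMV so far)
    for pos in range(n_items):
        candidates = []
        for seq, gmv in beams:
            for item in range(n_items):
                if item in seq:
                    continue
                candidates.append(
                    (seq + (item,),
                     gmv + p_buy(item, pos, seq) * prices[item]))
        # Keep the top-k partial sequences by expected GMV.
        candidates.sort(key=lambda c: -c[1])
        beams = candidates[:beam_width]
    return beams[0]

best_seq, best_gmv = beam_search(3)
```

With a real miRNN, `p_buy` would condition on the RNN hidden state encoding all previously placed items, so the beam width trades off search quality against the cost of one RNN step per candidate extension.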

Throughout the article, equations for loss functions, gradients, and probability definitions are presented, and references to original papers (e.g., Wide & Deep, ESMM, RankNet, LambdaRank, LambdaMART, DLCM, miRNN) are provided.

Tags: machine learning, generative models, learning to rank, listwise, pairwise, pointwise, ranking models
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
