
Learning to Rank: From Regression to Search Ranking and Evaluation Methods

Learning to rank reframes search as a machine‑learning problem that optimizes document ordering rather than numeric prediction. It evaluates results with relevance metrics such as NDCG, scores documents with feature‑based functions, and compares point‑wise, pair‑wise (RankSVM), and list‑wise (ListNet) approaches, stressing that defining the right error and selecting good features matter more than the specific algorithm.

vivo Internet Technology

Learning to rank (LTR) treats search as a machine‑learning problem. This article explains why search differs from classic regression or classification tasks and how to evaluate LTR methods.

Measuring Search Quality

Unlike regression where error is the residual between actual and predicted values, search quality is measured by how close the returned order is to an ideal ranking. Metrics such as NDCG (Normalized Discounted Cumulative Gain) capture both relevance and position bias, ranging from 0 (worst) to 1 (best). Other metrics like ERR and MAP exist, but NDCG is used as the primary relevance indicator here.
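As a concrete illustration, NDCG can be computed in a few lines. This is a minimal sketch assuming the common exponential gain 2^grade − 1 and a log2 position discount; the example grades are hypothetical.

```python
import math

def dcg(grades):
    """Discounted cumulative gain: gain 2^grade - 1, discounted by log2 of position."""
    return sum((2**g - 1) / math.log2(i + 2) for i, g in enumerate(grades))

def ndcg(grades):
    """Normalize by the DCG of the ideal (descending-grade) ordering."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0

# A result list whose documents have grades [3, 2, 3, 0, 1] in returned order:
print(round(ndcg([3, 2, 3, 0, 1]), 3))
```

Note how swapping the second and third results (putting the grade-3 document higher) would raise the score: the discount makes mistakes near the top of the list cost more.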

Generating a Ranking Function with Machine Learning

Traditional regression learns a function f(x) that predicts a numeric target. In LTR we learn a ranking function f(d, q) that assigns a score to each document d for a query q. The goal is to maximize NDCG over a set of training queries.

Typical features for e‑commerce search include:

TF‑IDF score of the query term in the product title: titleScore(d, q)

TF‑IDF score in the product description: descScore(d, q)

Number of units sold: numSold(d)

A possible scoring formula learned from data is:

f(d, q) = 10 * titleScore(d, q) + 2 * descScore(d, q) + 5 * numSold(d)
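Applying a learned formula like this at query time just means scoring each candidate and sorting. A minimal sketch, with hypothetical feature values (numSold is assumed to be normalized to a small range so no single feature dominates):

```python
# Weights from the learned formula above.
WEIGHTS = {"titleScore": 10, "descScore": 2, "numSold": 5}

def f(features):
    """Linear scoring function: weighted sum of the feature values."""
    return sum(WEIGHTS[name] * value for name, value in features.items())

# Hypothetical feature values for two products on the query "dress shoes":
candidates = {
    "dress_shoes":     {"titleScore": 0.9, "descScore": 0.5, "numSold": 1.0},
    "meh_dress_shoes": {"titleScore": 0.6, "descScore": 0.7, "numSold": 0.2},
}

# Rank candidates by descending score.
ranked = sorted(candidates, key=lambda d: f(candidates[d]), reverse=True)
print(ranked)
```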

Three major families of LTR algorithms are discussed:

Point‑wise (single‑document) learning to rank

Each query‑document pair is given a relevance grade (e.g., 4 for highly relevant, 0 for irrelevant). The model predicts the grade with regression, minimizing the residual y - f(d, q). Because each document is scored in isolation, this approach ignores the relative ordering of documents and therefore often fails to capture position bias.
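The point‑wise setup can be sketched as plain least‑squares fitting. This is an illustrative toy, not a production recipe: the three feature vectors and grades are hypothetical, and stochastic gradient descent stands in for whatever regression solver one would actually use.

```python
# Point-wise LTR sketch: fit a linear model to relevance grades by
# minimizing the squared residual (y - f(d, q))^2 with gradient descent.
# Features are [titleScore, descScore, numSold]; values are hypothetical.
data = [
    ([0.9, 0.5, 1.0], 4),  # highly relevant document
    ([0.6, 0.7, 0.2], 2),  # somewhat relevant
    ([0.1, 0.2, 0.1], 0),  # irrelevant
]

w = [0.0, 0.0, 0.0]
lr = 0.1
for _ in range(2000):
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        resid = y - pred
        # Gradient step on the squared residual for this single document.
        w = [wi + lr * resid * xi for wi, xi in zip(w, x)]

# The fitted model should score more relevant documents higher.
scores = [sum(wi * xi for wi, xi in zip(w, x)) for x, _ in data]
print(scores)
```

Note that the loss never looks at two documents together, which is exactly why point‑wise models can get the grades roughly right yet still order documents poorly near the top of the list.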

List‑wise and Pair‑wise methods

These approaches directly optimize the ordering. List‑wise methods (e.g., ListNet) define a loss based on the probability of the entire permutation matching the ideal order. Pair‑wise methods (e.g., RankSVM) treat each document pair as a binary classification problem, encouraging the higher‑ranked document to receive a larger score.

ListNet Example (Python‑style pseudocode)

def error():
    error = 0
    for each query:
        for each doc in the query's results:
            error -= TopOneP(doc.grade) * log(TopOneP(f(doc, query)))
    return error

Here TopOneP denotes the probability that a document is ranked first given its score.
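The pseudocode above can be made concrete. In the original ListNet paper the top‑one probability is a softmax over the scores of one query's documents, and the loss is the cross‑entropy between the grade‑induced and model‑induced distributions; the grades and scores below are hypothetical.

```python
import math

def top_one_p(scores):
    """Top-one probabilities: softmax over one query's document scores."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listnet_loss(grades, model_scores):
    """Cross-entropy between the top-one distributions induced by the
    relevance grades and by the model's scores (lower is better)."""
    p_true = top_one_p(grades)
    p_model = top_one_p(model_scores)
    return -sum(pt * math.log(pm) for pt, pm in zip(p_true, p_model))

# Hypothetical grades and model scores for one query's three documents:
grades = [4, 2, 0]
good = listnet_loss(grades, [3.0, 1.5, 0.1])  # ordering matches the grades
bad = listnet_loss(grades, [0.1, 1.5, 3.0])   # ordering reversed
print(good < bad)  # the matching order yields the lower loss
```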

RankSVM

RankSVM constructs feature differences between document pairs and trains a linear SVM to classify which document should be ranked higher. For example, if dress_shoes sold 5,000 units and meh_dress_shoes sold 1,000, the pairwise training example is the feature difference 4000 with label 1, where label 1 means the first document of the pair should be ranked higher.
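The pairwise transform is easy to sketch. The pairs and feature values below are hypothetical, and a simple perceptron stands in for the linear SVM solver (same linear decision rule, no margin maximization):

```python
# RankSVM-style pairwise transform: each training example is the feature
# difference between two documents for the same query, labeled +1 if the
# first should rank higher. Features are [titleScore, unitsSold].
pairs = [
    ([0.9, 5000], [0.6, 1000]),  # dress_shoes vs meh_dress_shoes
    ([0.8, 3000], [0.2, 2500]),
]

# Emit both orientations of each pair so the two classes are balanced.
X, y = [], []
for better, worse in pairs:
    diff = [b - w for b, w in zip(better, worse)]
    X.append(diff); y.append(1)
    X.append([-d for d in diff]); y.append(-1)

# Perceptron training on the difference vectors (stand-in for the SVM).
w = [0.0, 0.0]
for _ in range(100):
    for xi, yi in zip(X, y):
        if yi * sum(wj * xj for wj, xj in zip(w, xi)) <= 0:
            w = [wj + yi * xj for wj, xj in zip(w, xi)]

def score(features):
    """The learned linear weights double as a per-document scoring function."""
    return sum(wj * xj for wj, xj in zip(w, features))

print(score([0.9, 5000]) > score([0.6, 1000]))
```

The useful property is that the classifier's weight vector is itself the ranking function: once trained on differences, scoring a single document is just the dot product with w.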

Both RankSVM and ListNet have trade‑offs: RankSVM is simple and fast but ignores position bias; ListNet captures position bias but is computationally heavier.

Conclusion

The key takeaway is that regardless of the chosen model, understanding what error to minimize and selecting appropriate features are more critical than the specific algorithm.

Tags: machine learning, search ranking, learning to rank, listwise, NDCG, pairwise, pointwise
Written by

vivo Internet Technology

Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.
