Coarse Ranking in Recommendation Systems: Architecture, Models, and Optimization
Coarse ranking bridges recall and fine ranking, trimming tens of thousands of candidates down to a few hundred or a few thousand. It rests on a three‑part framework — sample construction, ordinary and cross‑feature engineering, and deep models that have evolved from rule‑based strategies to lightweight MLPs — and relies on distillation, feature crossing, pruning, quantization, and bias mitigation to balance accuracy against strict latency constraints.
Introduction – Coarse ranking (粗排) sits between recall and fine ranking in a recommendation pipeline. It processes tens of thousands of candidate items from recall and outputs a few hundred to a few thousand items for fine ranking, representing a classic trade‑off between precision and latency.
1. Overall Architecture – The coarse‑ranking module receives a large candidate set from recall and produces a reduced set for fine ranking. In small‑scale recommendation pools, coarse ranking may be omitted.
2. Basic Framework
Coarse ranking is typically model‑based and consists of three parts: data samples, feature engineering, and deep models.
Data Samples
Training samples are constructed similarly to fine ranking: exposed and clicked items are positive, exposed but not clicked are negative. Because the candidate space is much larger than in fine ranking, the sample‑selection‑bias (SSB) problem is more severe.
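The labeling rule above can be sketched as follows; the log field names (`user_id`, `item_id`, `clicked`) are illustrative assumptions, not from the article:

```python
# Sketch of coarse-ranking sample construction from an exposure log.
# Exposed + clicked -> positive (1); exposed but not clicked -> negative (0).
# Field names are hypothetical.

def build_samples(exposure_log):
    return [
        (e["user_id"], e["item_id"], 1 if e["clicked"] else 0)
        for e in exposure_log
    ]

log = [
    {"user_id": "u1", "item_id": "i1", "clicked": True},
    {"user_id": "u1", "item_id": "i2", "clicked": False},
]
samples = build_samples(log)
```

Note that this only covers exposed items, which is exactly why the SSB problem discussed later arises.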
Feature Engineering
Features are divided into two categories to meet the strict 10‑20 ms latency requirement:
Ordinary features – user, context, and item attributes, similar to those used in fine ranking.
Cross features – interactions between user and item that improve accuracy but are expensive to compute and store, thus used cautiously (e.g., wide‑&‑deep style).
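A minimal sketch of the two feature categories; all feature names here are invented for illustration:

```python
# Ordinary features: independent user / context / item attributes.
user = {"age_bucket": "25-34", "city": "SZ"}
item = {"category": "phone", "price_bucket": "high"}
context = {"hour": 21}

ordinary = {**{f"user_{k}": v for k, v in user.items()},
            **{f"item_{k}": v for k, v in item.items()},
            **{f"ctx_{k}": v for k, v in context.items()}}

# Cross feature: a user x item interaction. More predictive, but it must be
# computed per (user, item) pair, which is costly when scoring tens of
# thousands of candidates per request -- hence the caution in the text.
cross = {"age_x_category": f'{user["age_bucket"]}&{item["category"]}'}
```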
Deep Models (Four Generations)
First generation: handcrafted rule‑based strategies using statistics such as CTR, CVR, price range, sales, etc.
Second generation: linear Logistic Regression (LR) models offering limited personalization.
Third generation: DSSM twin‑tower models that decouple user and item representations, in two serving variants: (1) both user and item vectors are precomputed offline, so inference is a simple inner product; (2) user vectors are computed online while item vectors remain offline, which keeps latency low while capturing fresh user behavior.
Fourth generation: lightweight MLP models (e.g., COLD) that use SE blocks, feature pruning, and network pruning to balance accuracy and latency.
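The twin‑tower serving pattern above can be sketched in a few lines. The random "towers" below are stand‑ins for trained encoders (real towers would be MLPs over user/item features), and the pool size and cutoff are illustrative:

```python
import numpy as np

# Minimal twin-tower (DSSM-style) scoring sketch. The tower functions are
# placeholders for trained encoders; shapes and sizes are assumptions.
rng = np.random.default_rng(0)

def user_tower(user_feats):
    return user_feats @ rng.standard_normal((8, 4))   # -> user embedding

def item_tower(item_feats):
    return item_feats @ rng.standard_normal((8, 4))   # -> item embeddings

# Offline: precompute embeddings for the whole item pool.
item_pool = rng.standard_normal((10000, 8))
item_emb = item_tower(item_pool)                      # (10000, 4)

# Online: one user embedding, then a single inner product per candidate.
user_emb = user_tower(rng.standard_normal(8))         # (4,)
scores = item_emb @ user_emb                          # (10000,)
top500 = np.argsort(-scores)[:500]                    # pass a few hundred to fine ranking
```

Because scoring reduces to one matrix‑vector product, latency stays nearly flat as the candidate set grows.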
3. Optimization Strategies
Accuracy Improvement
Fine‑ranking distillation – using a fine‑ranking model as a teacher to train the coarse‑ranking student.
Feature crossing – either handcrafted cross features (wide part) or model‑based crossing via FM/MLP.
Feature distillation – teacher uses both ordinary and cross features; student learns high‑order information from the teacher while using only ordinary features.
Wide‑&‑deep architecture – combines a twin‑tower deep part with a wide cross‑feature part.
Lightweight MLP (e.g., COLD) – achieves feature crossing within a single tower using SE blocks and pruning.
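The fine‑ranking distillation item above can be made concrete with a loss that mixes the hard click label with the teacher's score as a soft label. This is a generic sketch; the mixing weight `alpha = 0.5` is an assumption, not from the article:

```python
import numpy as np

# Distillation sketch: the coarse-ranking student fits both the click label
# (hard target) and the fine-ranking teacher's score (soft target).
# alpha balances the two terms; its value here is an assumption.

def distill_loss(student_logit, click_label, teacher_score, alpha=0.5):
    p = 1.0 / (1.0 + np.exp(-student_logit))   # student pCTR
    eps = 1e-7
    hard = -(click_label * np.log(p + eps)
             + (1 - click_label) * np.log(1 - p + eps))
    soft = -(teacher_score * np.log(p + eps)
             + (1 - teacher_score) * np.log(1 - p + eps))
    return alpha * hard + (1 - alpha) * soft

loss = distill_loss(student_logit=0.2, click_label=1.0, teacher_score=0.8)
```

Feature distillation (item three above) uses the same idea, except the teacher additionally sees cross features that the student never receives at serving time.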
Latency Reduction
Feature pruning – discard irrelevant features early (as in COLD).
Quantization & fixed‑point conversion – e.g., 32‑bit to 8‑bit.
Network pruning – remove redundant neurons or weights.
Model distillation – already covered under accuracy improvement.
Neural Architecture Search (NAS) – discover lighter yet effective architectures.
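The 32‑bit to 8‑bit conversion mentioned above can be sketched as simple symmetric per‑tensor quantization; production systems typically use per‑channel scales and calibration data, so treat this as a minimal illustration:

```python
import numpy as np

# Symmetric int8 weight quantization sketch: one scale per tensor.
# Storage shrinks 4x (float32 -> int8); the reconstruction error is
# bounded by the quantization step.

def quantize_int8(w):
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal((256, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```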
4. Sample‑Selection‑Bias (SSB) Issue
The large solution space of coarse ranking amplifies SSB because only exposed samples are used. One mitigation is to reuse fine‑ranking scores of unexposed items to provide additional supervision.
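The mitigation above can be sketched as augmenting the exposed training set with unexposed candidates labeled by the fine‑ranking model. The function names, the 0.5 down‑weighting of soft samples, and the `(item, label, weight)` layout are all illustrative assumptions:

```python
# SSB mitigation sketch: reuse fine-ranking scores of unexposed items as
# soft labels so the coarse model sees samples outside the exposed set.
# Names, weights, and tuple layout are hypothetical.

def make_training_set(exposed, unexposed, fine_rank_score):
    data = []
    for item_id, clicked in exposed:
        data.append((item_id, float(clicked), 1.0))            # hard label, full weight
    for item_id in unexposed:
        data.append((item_id, fine_rank_score(item_id), 0.5))  # soft label, down-weighted
    return data

data = make_training_set(
    exposed=[("i1", 1), ("i2", 0)],
    unexposed=["i3"],
    fine_rank_score=lambda item_id: 0.3,  # stand-in for the fine-ranking model
)
```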
Author
Yang‑Yi Xie, Research Engineer at Tencent, specializes in video recommendation algorithms, with extensive experience in NLP and search ranking.
For further reading, see the linked articles on recall, fine ranking, and related topics.
Tencent Cloud Developer