
Coarse Ranking in Recommendation Systems: Architecture, Models, and Optimization

Coarse ranking bridges recall and fine ranking by trimming tens of thousands of candidates down to a few hundred or a few thousand. It rests on a three-part framework: sample construction, ordinary and cross-feature engineering, and deep models that have evolved from rule-based strategies to lightweight MLPs. Techniques such as distillation, feature crossing, pruning, quantization, and bias mitigation balance accuracy against strict latency constraints.

Tencent Cloud Developer

Introduction – Coarse ranking (粗排) sits between recall and fine ranking in a recommendation pipeline. It processes tens of thousands of candidate items from recall and outputs a few hundred to a few thousand items for fine ranking, representing a classic trade‑off between precision and latency.

1. Overall Architecture – The coarse‑ranking module receives a large candidate set from recall and produces a reduced set for fine ranking. In small‑scale recommendation pools, coarse ranking may be omitted.

2. Basic Framework

Coarse ranking is typically model‑based and consists of three parts: data samples, feature engineering, and deep models.

Data Samples

Training samples are constructed similarly to fine ranking: exposed and clicked items are positive, exposed but not clicked are negative. Because the candidate space is much larger than in fine ranking, the sample‑selection‑bias (SSB) problem is more severe.
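The labeling rule above can be sketched in a few lines. This is a minimal illustration, and the log field names (`exposed`, `clicked`, `user_id`, `item_id`) are assumptions, not a real schema:

```python
# Minimal sketch: building coarse-ranking training samples from an
# impression log. Field names are illustrative assumptions.

def build_samples(impression_log):
    """Label exposed-and-clicked items 1, exposed-but-unclicked 0.

    Unexposed candidates never enter the training set, which is the
    root of the sample-selection-bias (SSB) problem discussed later.
    """
    samples = []
    for record in impression_log:
        if not record["exposed"]:
            continue  # unexposed items are silently dropped -> SSB
        label = 1 if record["clicked"] else 0
        samples.append((record["user_id"], record["item_id"], label))
    return samples
```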

Feature Engineering

Features are divided into two categories to meet the strict 10‑20 ms latency requirement:

Ordinary features – user, context, and item attributes, similar to those used in fine ranking.

Cross features – interactions between user and item that improve accuracy but are expensive to compute and store, thus used cautiously (e.g., wide‑&‑deep style).
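To make the two categories concrete, here is a hedged sketch of how they might be assembled; every feature name below is an illustrative assumption:

```python
# Sketch: ordinary features vs. cross features. All names are
# assumptions for illustration, not a production feature schema.

user = {"age_bucket": 3, "city": "SZ"}
item = {"category": "phone", "price_bucket": 5}
context = {"hour": 21}

# Ordinary features: attributes of one side only; cheap to fetch
# and cache, since they do not depend on the (user, item) pairing.
ordinary = {**{f"u_{k}": v for k, v in user.items()},
            **{f"i_{k}": v for k, v in item.items()},
            **{f"c_{k}": v for k, v in context.items()}}

# Cross features: explicit user x item interactions; more predictive,
# but computed per candidate pair, so they cost latency and storage.
cross = {"u_city_x_i_category": f'{user["city"]}_{item["category"]}'}
```

The asymmetry in cost is the point: ordinary features can be fetched once per request (user/context) or precomputed (item), while each cross feature multiplies work by the size of the candidate set.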

Deep Models (Four Generations)

First generation: handcrafted rule‑based strategies using statistics such as CTR, CVR, price range, sales, etc.

Second generation: linear Logistic Regression (LR) models offering limited personalization.

Third generation: DSSM twin‑tower models that decouple user and item representations. Two deployment variants are common: (1) both user and item vectors are precomputed and stored offline, so online inference reduces to a simple inner product; (2) user vectors are computed online to capture fresh context while item vectors remain offline, still keeping latency low.

Fourth generation: lightweight MLP models (e.g., COLD) that use SE blocks, feature pruning, and network pruning to balance accuracy and latency.
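The twin-tower serving path of the third generation can be sketched as follows, assuming both towers have already produced fixed-length embeddings (the function names here are illustrative):

```python
# Sketch of twin-tower (DSSM-style) coarse ranking at serving time.
# With item vectors precomputed offline, scoring each candidate is
# just one inner product per item.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coarse_rank(user_vec, item_index, top_k):
    """Score every candidate by inner product and keep the top_k.

    item_index: {item_id: precomputed item embedding}
    """
    scored = [(dot(user_vec, vec), item_id)
              for item_id, vec in item_index.items()]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:top_k]]
```

In production this brute-force loop is typically replaced by an approximate nearest-neighbor index, but the scoring function itself stays an inner product, which is what makes the architecture fit the latency budget.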

3. Optimization Strategies

Accuracy Improvement

Fine‑ranking distillation – using a fine‑ranking model as a teacher to train the coarse‑ranking student.

Feature crossing – either handcrafted cross features (wide part) or model‑based crossing via FM/MLP.

Feature distillation – teacher uses both ordinary and cross features; student learns high‑order information from the teacher while using only ordinary features.

Wide‑&‑deep architecture – combines a twin‑tower deep part with a wide cross‑feature part.

Lightweight MLP (e.g., COLD) – achieves feature crossing within a single tower using SE blocks and pruning.
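The fine-ranking distillation idea above can be expressed as a mixed loss: the student fits the ground-truth label and the teacher's predicted probability at the same time. This is a minimal sketch; the 0.5 mixing weight and the choice of cross-entropy against the teacher's soft label are assumptions, not a prescribed recipe:

```python
import math

# Hedged sketch of fine-ranking distillation for a binary CTR task:
# the coarse-ranking student is trained on a blend of the hard label
# and the fine-ranking teacher's soft prediction.

def distill_loss(label, teacher_p, student_p, alpha=0.5, eps=1e-7):
    """alpha * hard-label CE + (1 - alpha) * soft-label (teacher) CE."""
    def ce(target, p):
        p = min(max(p, eps), 1 - eps)  # clip for numerical stability
        return -(target * math.log(p) + (1 - target) * math.log(1 - p))
    return alpha * ce(label, student_p) + (1 - alpha) * ce(teacher_p, student_p)
```

A student that agrees with both the label and the teacher incurs a low loss; disagreeing with either pushes the loss up, which is how the teacher's higher-order knowledge (e.g., from cross features) leaks into the lighter student.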

Latency Reduction

Feature pruning – discard irrelevant features early (as in COLD).

Quantization & fixed‑point conversion – e.g., 32‑bit to 8‑bit.

Network pruning – remove redundant neurons or weights.

Model distillation – already covered under accuracy improvement.

Neural Architecture Search (NAS) – discover lighter yet effective architectures.
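The 32-bit-to-8-bit conversion mentioned above can be illustrated with symmetric per-tensor quantization. This is a simplified sketch; production serving stacks calibrate scales per tensor or per channel and use saturating integer arithmetic:

```python
# Sketch of symmetric 8-bit quantization of a float weight vector.
# One float scale maps int8 values back to the original range.

def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]  # ints in [-127, 127]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]
```

The quantized model stores one byte per weight instead of four, and integer multiplies are cheaper on most serving hardware; the price is a bounded rounding error of at most half a scale step per weight.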

4. Sample‑Selection‑Bias (SSB) Issue

The large solution space of coarse ranking amplifies SSB because only exposed samples are used. One mitigation is to reuse fine‑ranking scores of unexposed items to provide additional supervision.
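That mitigation can be sketched directly: unexposed candidates enter the training set with soft labels supplied by the fine-ranking model instead of being dropped. The function and argument names below are illustrative assumptions:

```python
# Hedged sketch of the SSB mitigation above: unexposed (user, item)
# pairs receive pseudo-labels from the fine-ranking model's score
# rather than being excluded from training.

def augment_with_unexposed(samples, unexposed, fine_rank_score):
    """samples: (user_id, item_id, hard_label) from exposed traffic.
    unexposed: (user_id, item_id) pairs that were never shown.
    fine_rank_score: callable returning a score in [0, 1];
    stands in for the deployed fine-ranking model.
    """
    augmented = list(samples)
    for user_id, item_id in unexposed:
        soft_label = fine_rank_score(user_id, item_id)
        augmented.append((user_id, item_id, soft_label))
    return augmented
```

The training distribution now covers part of the space the coarse ranker actually scores at serving time, rather than only the narrow slice that survived all downstream stages.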

Author

Yang‑Yi Xie, Research Engineer at Tencent, specializes in video recommendation algorithms with extensive experience in NLP and search‑ranking.

For further reading, see the linked articles on recall, fine ranking, and related topics.

Tags: Artificial Intelligence, model optimization, feature engineering, recommendation systems, coarse ranking
Written by Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
