Coarse Ranking in Recommendation Systems: Architecture, Models, and Optimization
Coarse ranking bridges recall and fine ranking, trimming tens of thousands of candidates down to a few hundred or a few thousand. It rests on a three‑part framework — sample construction, ordinary and cross‑feature engineering, and deep models that have evolved from rule‑based strategies to lightweight MLPs — and relies on distillation, feature crossing, pruning, quantization, and bias mitigation to balance accuracy against strict latency constraints.
Introduction – Coarse ranking (粗排) sits between recall and fine ranking in a recommendation pipeline. It processes tens of thousands of candidate items from recall and outputs a few hundred to a few thousand items for fine ranking, representing a classic trade‑off between precision and latency.
1. Overall Architecture – The coarse‑ranking module receives a large candidate set from recall and produces a reduced set for fine ranking. In small‑scale recommendation pools, coarse ranking may be omitted.
2. Basic Framework
Coarse ranking is typically model‑based and consists of three parts: data samples, feature engineering, and deep models.
Data Samples
Training samples are constructed similarly to fine ranking: exposed and clicked items are positive, exposed but not clicked are negative. Because the candidate space is much larger than in fine ranking, the sample‑selection‑bias (SSB) problem is more severe.
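The labeling rule above can be sketched as follows; the log field names (`user_id`, `item_id`, `clicked`) are illustrative assumptions, not from the article:

```python
# Sketch of coarse-ranking sample construction from an exposure log.
# Exposed + clicked -> positive (1); exposed but not clicked -> negative (0).
# Field names are hypothetical.

def build_samples(exposure_log):
    return [
        (e["user_id"], e["item_id"], 1 if e["clicked"] else 0)
        for e in exposure_log
    ]

log = [
    {"user_id": "u1", "item_id": "i1", "clicked": True},
    {"user_id": "u1", "item_id": "i2", "clicked": False},
]
samples = build_samples(log)
```

Note that this only covers exposed items, which is exactly why the SSB problem discussed later arises.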
Feature Engineering
Features are divided into two categories to meet the strict 10‑20 ms latency requirement:
Ordinary features – user, context, and item attributes, similar to those used in fine ranking.
Cross features – interactions between user and item that improve accuracy but are expensive to compute and store, thus used cautiously (e.g., wide‑&‑deep style).
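A minimal sketch of the two feature categories; all feature names here are invented for illustration:

```python
# Ordinary features: independent user / context / item attributes.
user = {"age_bucket": "25-34", "city": "SZ"}
item = {"category": "phone", "price_bucket": "high"}
context = {"hour": 21}

ordinary = {**{f"user_{k}": v for k, v in user.items()},
            **{f"item_{k}": v for k, v in item.items()},
            **{f"ctx_{k}": v for k, v in context.items()}}

# Cross feature: a user x item interaction. More predictive, but it must be
# computed per (user, item) pair, which is costly when scoring tens of
# thousands of candidates per request -- hence the caution in the text.
cross = {"age_x_category": f'{user["age_bucket"]}&{item["category"]}'}
```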
Deep Models (Four Generations)
First generation: handcrafted rule‑based strategies using statistics such as CTR, CVR, price range, sales, etc.
Second generation: linear Logistic Regression (LR) models offering limited personalization.
Third generation: DSSM twin‑tower models that decouple user and item representations, in two serving variants: (1) both user and item vectors are precomputed offline, so inference is a simple inner product; (2) user vectors are computed online while item vectors remain offline, which keeps latency low while capturing fresh user behavior.
Fourth generation: lightweight MLP models (e.g., COLD) that use SE blocks, feature pruning, and network pruning to balance accuracy and latency.
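The twin‑tower serving pattern above can be sketched in a few lines. The random "towers" below are stand‑ins for trained encoders (real towers would be MLPs over user/item features), and the pool size and cutoff are illustrative:

```python
import numpy as np

# Minimal twin-tower (DSSM-style) scoring sketch. The tower functions are
# placeholders for trained encoders; shapes and sizes are assumptions.
rng = np.random.default_rng(0)

def user_tower(user_feats):
    return user_feats @ rng.standard_normal((8, 4))   # -> user embedding

def item_tower(item_feats):
    return item_feats @ rng.standard_normal((8, 4))   # -> item embeddings

# Offline: precompute embeddings for the whole item pool.
item_pool = rng.standard_normal((10000, 8))
item_emb = item_tower(item_pool)                      # (10000, 4)

# Online: one user embedding, then a single inner product per candidate.
user_emb = user_tower(rng.standard_normal(8))         # (4,)
scores = item_emb @ user_emb                          # (10000,)
top500 = np.argsort(-scores)[:500]                    # pass a few hundred to fine ranking
```

Because scoring reduces to one matrix‑vector product, latency stays nearly flat as the candidate set grows.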
3. Optimization Strategies
Accuracy Improvement
Fine‑ranking distillation – using a fine‑ranking model as a teacher to train the coarse‑ranking student.
Feature crossing – either handcrafted cross features (wide part) or model‑based crossing via FM/MLP.
Feature distillation – teacher uses both ordinary and cross features; student learns high‑order information from the teacher while using only ordinary features.
Wide‑&‑deep architecture – combines a twin‑tower deep part with a wide cross‑feature part.
Lightweight MLP (e.g., COLD) – achieves feature crossing within a single tower using SE blocks and pruning.
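The fine‑ranking distillation item above can be made concrete with a loss that mixes the hard click label with the teacher's score as a soft label. This is a generic sketch; the mixing weight `alpha = 0.5` is an assumption, not from the article:

```python
import numpy as np

# Distillation sketch: the coarse-ranking student fits both the click label
# (hard target) and the fine-ranking teacher's score (soft target).
# alpha balances the two terms; its value here is an assumption.

def distill_loss(student_logit, click_label, teacher_score, alpha=0.5):
    p = 1.0 / (1.0 + np.exp(-student_logit))   # student pCTR
    eps = 1e-7
    hard = -(click_label * np.log(p + eps)
             + (1 - click_label) * np.log(1 - p + eps))
    soft = -(teacher_score * np.log(p + eps)
             + (1 - teacher_score) * np.log(1 - p + eps))
    return alpha * hard + (1 - alpha) * soft

loss = distill_loss(student_logit=0.2, click_label=1.0, teacher_score=0.8)
```

Feature distillation (item three above) uses the same idea, except the teacher additionally sees cross features that the student never receives at serving time.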
Latency Reduction
Feature pruning – discard irrelevant features early (as in COLD).
Quantization & fixed‑point conversion – e.g., 32‑bit to 8‑bit.
Network pruning – remove redundant neurons or weights.
Model distillation – already covered under accuracy improvement.
Neural Architecture Search (NAS) – discover lighter yet effective architectures.
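The 32‑bit to 8‑bit conversion mentioned above can be sketched as simple symmetric per‑tensor quantization; production systems typically use per‑channel scales and calibration data, so treat this as a minimal illustration:

```python
import numpy as np

# Symmetric int8 weight quantization sketch: one scale per tensor.
# Storage shrinks 4x (float32 -> int8); the reconstruction error is
# bounded by the quantization step.

def quantize_int8(w):
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal((256, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```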
4. Sample‑Selection‑Bias (SSB) Issue
The large solution space of coarse ranking amplifies SSB because only exposed samples are used. One mitigation is to reuse fine‑ranking scores of unexposed items to provide additional supervision.
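The mitigation above can be sketched as augmenting the exposed training set with unexposed candidates labeled by the fine‑ranking model. The function names, the 0.5 down‑weighting of soft samples, and the `(item, label, weight)` layout are all illustrative assumptions:

```python
# SSB mitigation sketch: reuse fine-ranking scores of unexposed items as
# soft labels so the coarse model sees samples outside the exposed set.
# Names, weights, and tuple layout are hypothetical.

def make_training_set(exposed, unexposed, fine_rank_score):
    data = []
    for item_id, clicked in exposed:
        data.append((item_id, float(clicked), 1.0))            # hard label, full weight
    for item_id in unexposed:
        data.append((item_id, fine_rank_score(item_id), 0.5))  # soft label, down-weighted
    return data

data = make_training_set(
    exposed=[("i1", 1), ("i2", 0)],
    unexposed=["i3"],
    fine_rank_score=lambda item_id: 0.3,  # stand-in for the fine-ranking model
)
```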
Author
Yang‑Yi Xie, Research Engineer at Tencent, specializes in video recommendation algorithms, with extensive experience in NLP and search ranking.
For further reading, see the linked articles on recall, fine ranking, and related topics.
Tencent Cloud Developer