Coarse Ranking in Recommenders: Key Strategies, Metrics & Optimizations
This article systematically reviews the coarse‑ranking stage of recommendation systems, comparing it with recall and fine‑ranking, defining evaluation metrics, detailing sample design, presenting two technical routes, and exploring optimization directions such as dual‑tower models, knowledge distillation, lightweight fully‑connected layers, multi‑objective and multi‑scenario modeling, followed by practical case studies and results.
Background
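Coarse‑ranking sits between recall and fine‑ranking. It filters the recalled candidates more accurately than the recall stage alone, and because fine‑ranking only sees what coarse‑ranking passes on, it sets an upper bound on fine‑ranking performance.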
Positioning
Differences from fine‑ranking: it scores far more candidates (thousands to about 10k vs. hundreds), works under a stricter latency budget, and needs to separate liked from disliked items.
Differences from recall: its candidate set is the fused output of the recall stage, it must also order items, and both stages suffer from sample‑selection bias (larger for recall).
Evaluation Metrics
A global Hitrate framework defines three groups of metrics.
Coarse→Fine loss: scene‑internal Hitrate@TopK, NDCG between coarse scores and fine‑ranking efficiency scores, AUC.
Recall→Coarse loss: scene‑external Hitrate@TopK.
Dislike discrimination: scene‑internal Hitrate@TopK on exposure‑without‑click (lower is better).
Samples are collected per request: fused layer outputs, exposed samples, click samples, non‑click samples, global exposure/click samples, and globally corrected exposure‑click samples.
Key offline/online metrics: scene‑click Hitrate@TopK, scene‑non‑click Hitrate@TopK, global click Hitrate@TopK, adjusted global click Hitrate@TopK, NDCG, AUC.
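The Hitrate@TopK and NDCG metrics listed above can be sketched as follows. This is a minimal illustration rather than the article's implementation: the function names, the dict-based inputs, and the use of fine-ranking scores as graded relevance for NDCG are all assumptions.

```python
import math

def hitrate_at_k(coarse_scores, target_items, k):
    """Share of target items (e.g. clicked items) found in the coarse top-k.

    coarse_scores: dict item_id -> coarse-ranking score (assumed input shape).
    target_items:  set of item ids to look for.
    """
    if not target_items:
        return 0.0
    top_k = set(sorted(coarse_scores, key=coarse_scores.get, reverse=True)[:k])
    return len(top_k & set(target_items)) / len(target_items)

def ndcg_at_k(coarse_scores, fine_scores, k):
    """NDCG of the coarse ordering, treating fine-ranking scores as relevance."""
    order = sorted(coarse_scores, key=coarse_scores.get, reverse=True)[:k]
    dcg = sum(fine_scores.get(item, 0.0) / math.log2(rank + 2)
              for rank, item in enumerate(order))
    ideal = sorted(fine_scores.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

Passing clicked items as `target_items` gives a click Hitrate (higher is better); passing exposed-but-not-clicked items gives the dislike-discrimination variant (lower is better).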
Sample Design
To mitigate stronger sample‑selection bias in coarse‑ranking, the following pools and methods are used.
Negative pool: non‑clicked exposures, all non‑conversion samples, low‑rank fine‑ranking items, recall samples excluding exposure.
Positive pool: clicked exposures, global click samples, delayed click samples (e.g., next‑day clicks).
Sampling methods: random sampling and hot‑item down‑sampling.
Three concrete composition schemes:
Positive: exposure‑click; Negative: exposure‑non‑click.
Positive: exposure‑click + global‑click‑corrected; Negative: exposure‑non‑click + randomly sampled non‑exposure recall samples.
Positive: exposure‑click + high‑rank fine‑ranking items; Negative: exposure‑non‑click + low‑rank fine‑ranking items.
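As a rough sketch of how such a composition could be assembled per request, the snippet below builds a simplified version of the second scheme: exposure-click positives, exposure-non-click negatives, and randomly sampled non-exposed recall items as extra negatives. The function name, arguments, and fixed seed are hypothetical, and the global-click-corrected positives are left out for brevity.

```python
import random

def build_samples(exposed, clicked, recalled, n_random_neg, seed=0):
    """Return (item_id, label) pairs for one request.

    exposed:      list of item ids shown to the user.
    clicked:      set of item ids the user clicked.
    recalled:     list of item ids in the fused recall output.
    n_random_neg: number of non-exposed recall items to add as negatives.
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    exposed_set = set(exposed)
    positives = [i for i in exposed if i in clicked]
    negatives = [i for i in exposed if i not in clicked]
    # Recall candidates that were never exposed serve as additional negatives.
    non_exposed = [i for i in recalled if i not in exposed_set]
    extra = rng.sample(non_exposed, min(n_random_neg, len(non_exposed)))
    return [(i, 1) for i in positives] + [(i, 0) for i in negatives + extra]
```

Drawing negatives from outside the exposed set is what brings the training distribution closer to the candidates coarse-ranking actually scores online.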
Technical Routes
Two modeling paradigms:
Listwise (set‑based): models the target set directly, interacts with fine‑ranking, lower stability.
Pointwise (value‑based): predicts conversion probability per item, higher controllability and independent iteration.
Pointwise is preferred for direct alignment with the final objective.
Development roadmap:
Quality‑score models (e.g., LR, XGBoost).
Deep vector‑inner‑product models such as dual‑tower or triple‑tower structures – fast online, low engineering overhead, limited cross‑feature handling.
Deep cross‑layer models (e.g., COLD framework) – richer cross features at the cost of latency and complexity.
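To illustrate why vector-inner-product models are fast online, here is a minimal sketch of dual-tower serving. It assumes item vectors are precomputed and stored in a lookup table and the user vector is computed once per request; the names `item_index` and `score_candidates`, and the sigmoid on top of the inner product, are assumptions for illustration.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score_candidates(user_vec, item_index, candidate_ids, top_k):
    """Score each candidate with one inner product and keep the top_k.

    user_vec:      output of the user tower for this request.
    item_index:    dict item_id -> precomputed item-tower vector.
    candidate_ids: fused recall results to be ranked.
    """
    scored = [(item_id, sigmoid(dot(user_vec, item_index[item_id])))
              for item_id in candidate_ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Because user and item features only meet in the final inner product, per-item cost is a single dot product, which is also why cross-feature handling is limited in this family.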
Optimization Directions
Dual‑Tower Enhancements
Insert SENet modules into both user and item towers to dynamically re‑weight important features, improving robustness to noise.
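A squeeze-and-excitation step over feature-field embeddings might look like the sketch below. It assumes mean pooling for the squeeze step and a two-layer bottleneck (ReLU, then sigmoid) for excitation; the weight matrices are passed in as plain nested lists, and all names are illustrative rather than taken from the article.

```python
import math

def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def senet_reweight(field_embs, w_reduce, w_expand):
    """Re-weight each feature field's embedding by a learned importance.

    field_embs: list of f embeddings, one per feature field.
    w_reduce:   (r x f) matrix, bottleneck projection.
    w_expand:   (f x r) matrix, back to one weight per field.
    """
    # Squeeze: summarize every field embedding as a single scalar.
    z = [sum(emb) / len(emb) for emb in field_embs]
    # Excitation: bottleneck MLP producing one weight per field.
    hidden = [max(0.0, h) for h in matvec(w_reduce, z)]
    weights = [1.0 / (1.0 + math.exp(-a)) for a in matvec(w_expand, hidden)]
    # Re-weight: scale each field embedding by its importance weight.
    reweighted = [[w * x for x in emb] for w, emb in zip(weights, field_embs)]
    return reweighted, weights
```

Fields with weights near zero contribute little to the tower input, which is one way to read the robustness-to-noise claim.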
Sequence Feature Learning
Upgrade the user tower with multi‑granularity behavior sequences and query semantics; use LSTM + multi‑head attention for real‑time sequences and pooling for long‑term sequences.
Parallel‑Tower (Inner‑Product Expressiveness)
Parallelize multiple sub‑models (MLP, DCN, FM, CIN) and concatenate their outputs before a final LR layer, enriching representation capacity.
Dual‑Tower Cross‑Enhancement
Add an augmented information vector (a_u) to each tower's input and train it with a mimic loss that updates the vector only on positive‑label samples, so that each tower receives signals from the other tower.
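The article does not spell out the mimic loss, so the following is only one plausible form: a squared error that pulls the side vector toward the other tower's output embedding, counted only for positive samples. The function name and the averaging over the batch are assumptions, and in a real training loop the target embedding would normally be treated as a constant (no gradient).

```python
def mimic_loss(aug_vecs, target_embs, labels):
    """Mean squared distance between side vectors and target embeddings.

    aug_vecs:    side-information vectors (e.g. a_u), one per sample.
    target_embs: output embeddings of the opposite tower, one per sample.
    labels:      1 for positive samples, 0 for negatives.
    Negative samples contribute zero, so only positives move the side vector.
    """
    total = 0.0
    for a, p, y in zip(aug_vecs, target_embs, labels):
        if y == 1:
            total += sum((ai - pi) ** 2 for ai, pi in zip(a, p))
    return total / len(labels)
```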
Knowledge Distillation
Use teacher‑student training where the teacher (fine‑ranking) is a more powerful model with privileged features, and the student (coarse‑ranking) is a dual‑tower model. Strategies include privileged‑feature distillation and model distillation.
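A common way to write the student objective in this setup is a hard-label loss plus a weighted soft-label loss against the teacher's prediction. The sketch below assumes binary cross-entropy for both terms and a hypothetical `distill_weight` hyperparameter; the article does not state which loss form it uses.

```python
import math

def bce(target, pred, eps=1e-7):
    """Binary cross-entropy; target may be a hard label or a soft probability."""
    pred = min(max(pred, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))

def student_loss(label, student_prob, teacher_prob, distill_weight=0.5):
    """Loss for one sample of the coarse-ranking (student) model.

    label:        observed click/conversion label (0 or 1).
    student_prob: dual-tower prediction.
    teacher_prob: fine-ranking prediction, used as a soft target.
    """
    hard = bce(label, student_prob)          # fit the real label
    soft = bce(teacher_prob, student_prob)   # imitate the teacher
    return hard + distill_weight * soft
```

The soft term lets the student benefit from privileged features it cannot compute itself within the coarse-ranking latency budget.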
Auto Feature & Structure Selection (AutoFAS)
Jointly select optimal coarse‑ranking features and architecture under latency constraints using feature masks and MixOp modules, guided by a combined loss of distillation, latency, and coarse‑ranking objectives.
Lightweight Fully‑Connected Layers
Adopt the COLD framework with SEBlock modules to compute feature importance, retain critical features, and accelerate inference via parallelism, quantization, and column‑wise computation.
Multi‑Objective Modeling
Three architectures are explored:
Shared‑parameter multi‑tower (separate user/item towers per objective with shared lower layers).
MMoE‑based dual‑tower where each tower uses a Multi‑Gate Mixture‑of‑Experts to differentiate objectives.
Unified user embedding shared across objectives, enabling joint estimation of conversion‑related goals.
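For the MMoE-based variant, the core mechanism is a per-objective softmax gate that mixes the outputs of shared experts. The sketch below uses single linear layers as experts and gates to keep it short; the function names, the dict of gate matrices keyed by objective, and the linear experts are simplifying assumptions.

```python
import math

def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mmoe_tower(x, expert_weights, gate_weights):
    """Produce one embedding per objective from a shared tower input.

    x:              tower input features.
    expert_weights: list of expert matrices (shared across objectives).
    gate_weights:   dict objective -> gate matrix (n_experts x len(x)).
    """
    expert_outs = [linear(w, x) for w in expert_weights]
    dim = len(expert_outs[0])
    outputs = {}
    for objective, gate_w in gate_weights.items():
        gate = softmax(linear(gate_w, x))  # one weight per expert
        outputs[objective] = [
            sum(g * out[d] for g, out in zip(gate, expert_outs))
            for d in range(dim)
        ]
    return outputs
```

Each objective (for example CTR and CVR) then takes the inner product of its own user-side and item-side embeddings.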
Online fusion formulas (linear addition, exponential multiplication, weighted variants) yield significant offline AUC gains and modest online DPV/UV improvements.
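The two basic fusion families can be written in a few lines. The weights and exponents below are hypothetical tuning parameters; the article does not give its actual values or exact formulas.

```python
def linear_fusion(ctr, cvr, w_ctr=1.0, w_cvr=1.0):
    """Linear addition: weighted sum of per-objective scores."""
    return w_ctr * ctr + w_cvr * cvr

def exponential_fusion(ctr, cvr, alpha=1.0, beta=1.0):
    """Weighted exponential multiplication: ctr^alpha * cvr^beta.

    Assumes both scores are probabilities in (0, 1].
    """
    return (ctr ** alpha) * (cvr ** beta)
```

With the multiplicative form, an item needs a reasonable score on every objective to rank well, whereas the additive form lets one strong objective compensate for a weak one.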
Multi‑Scenario Modeling
Challenges include scenario bias in user/item distributions. Solutions involve:
Scenario statistical features (CTR, CVR per scenario).
Cross‑features between users/items and scenarios.
Embedding scenario features as bias inputs or via dedicated sub‑networks.
Dynamic weighting: reshape scenario features to match each hidden layer’s dimension and multiply with intermediate activations.
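The dynamic-weighting idea in the last item can be sketched as a projection of the scenario embedding to the hidden layer's width, followed by an element-wise product. The linear projection matrix and the absence of any squashing function are assumptions; the article only says the scenario features are reshaped to match the layer dimension.

```python
def scenario_modulate(hidden, scenario_emb, proj):
    """Scale a hidden layer's activations by scenario-dependent weights.

    hidden:       activations of one hidden layer (length h).
    scenario_emb: embedding of the scenario features (length s).
    proj:         (h x s) matrix mapping the scenario embedding to length h.
    """
    weights = [sum(p * e for p, e in zip(row, scenario_emb)) for row in proj]
    if len(weights) != len(hidden):
        raise ValueError("projection must match the hidden layer width")
    # Element-wise product: each hidden unit is scaled per scenario.
    return [w * h for w, h in zip(weights, hidden)]
```

One projection per hidden layer would be needed, since each layer can have a different width.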
Meta‑learning approaches (M2M) use a Meta Unit to capture inter‑scenario relationships and a meta‑attention module for task correlation, enabling rapid adaptation to new scenarios. Two‑stage training (scenario‑supervised contrastive pre‑training followed by fine‑tuning) further refines scenario‑aware representations.
Practical Cases
Single‑objective dual‑tower coarse‑ranking (CTR) using exposure‑click vs. exposure‑non‑click samples: +6 % DPV, +3 % UV.
Triple‑tower multi‑objective coarse‑ranking (CTR + CVR) with weighted exponential multiplication: +5.6 % CTR AUC, +45 % CVR AUC offline, modest online gains.
Multi‑objective + scenario‑feature enhancements: +2 % AUC offline, +2.6 % DPV, +4.9 % UV online.
Conclusion
Coarse‑ranking is a critical lever for recommendation efficiency and offers many optimization avenues, from model architecture and knowledge distillation to multi‑objective and multi‑scenario strategies. Ongoing work will continue to refine these directions to further boost system performance.
DeWu Technology