How Multi‑Objective Optimization Boosted Taobao Search’s Coarse Ranking
This report details the multi‑stage architecture of Taobao’s main search, introduces a new global‑transaction hitrate metric, analyzes offline and online evaluation gaps, and presents a series of model, loss‑function, and sampling improvements that together lifted overall conversion by about one percent.
Background
Taobao main search is a typical multi‑stage retrieval system consisting of recall, coarse ranking (also called "sea‑selection"), and fine ranking stages. The recall stage outputs roughly 10^5 items, coarse ranking narrows this to about 10^3, and fine ranking finally selects the top‑10 items for exposure.
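The cascade described above can be sketched as two successive top-k cuts. This is only an illustration under stated assumptions: the function names, the scoring callables, and the default cutoffs are hypothetical stand-ins for the roughly 10^5 → 10^3 → 10 funnel, not the production pipeline.

```python
import heapq

def top_k(items, score_fn, k):
    """Return the k highest-scoring items, best first."""
    return heapq.nlargest(k, items, key=score_fn)

def search_funnel(recalled_items, coarse_score, fine_score,
                  coarse_k=1000, exposure_k=10):
    """Hypothetical two-stage cut: the cheap coarse model filters first,
    then the expensive fine model ranks only the survivors."""
    coarse_survivors = top_k(recalled_items, coarse_score, coarse_k)
    return top_k(coarse_survivors, fine_score, exposure_k)
```

The point of the sketch is that fine ranking can only choose among items that coarse ranking let through, which is why coarse-ranking quality caps the final result.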
Model Basics
The coarse‑ranking model differs from fine ranking mainly in its training samples, which include exposure samples (items that fine ranking surfaced to the user), non‑exposure samples (items that passed coarse ranking but were not surfaced by fine ranking), and random negative samples. These three types are concatenated into a listwise format, forming a single sample vector per request.
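A minimal sketch of how such a listwise sample might be assembled is shown here. The `ListwiseSample` container, its field names, and the input shapes are assumptions made for illustration; the source does not describe the actual sample schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ListwiseSample:
    item_ids: List[str]
    exposed: List[int]    # 1 if fine ranking surfaced the item
    clicked: List[int]    # 1 if the user clicked the item
    purchased: List[int]  # 1 if the user bought the item

def build_listwise_sample(exposures, non_exposures, random_negatives):
    """Concatenate the three sample types into one list.

    exposures: list of (item_id, clicked, purchased) tuples.
    non_exposures, random_negatives: lists of item ids; all labels are 0.
    """
    item_ids, exposed, clicked, purchased = [], [], [], []
    for item_id, was_clicked, was_purchased in exposures:
        item_ids.append(item_id)
        exposed.append(1)
        clicked.append(int(was_clicked))
        purchased.append(int(was_purchased))
    for item_id in list(non_exposures) + list(random_negatives):
        item_ids.append(item_id)
        exposed.append(0)
        clicked.append(0)
        purchased.append(0)
    return ListwiseSample(item_ids, exposed, clicked, purchased)
```

Keeping one label vector per objective over the same item list lets each objective reuse the same candidates with different positives.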
The loss function combines three objectives—exposure, click, and transaction—using a listwise softmax and adds a distillation loss that aligns coarse‑ranking scores with fine‑ranking scores.
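The combined loss could look roughly like the sketch below. Several choices here are assumptions rather than details from the source: a single score per item shared by all three objectives, a cross-entropy form for the distillation term, distillation restricted to items that have a fine-ranking score, and the `weights` tuple.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over one candidate list."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def listwise_softmax_loss(scores, labels):
    """Mean negative log-probability of the positives under a list-level softmax."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    if not positives:
        return 0.0  # this objective has no positive in the list
    log_probs = log_softmax(scores)
    return -sum(log_probs[i] for i in positives) / len(positives)

def distillation_loss(coarse_scores, fine_scores):
    """Cross-entropy from softmax(fine scores), the teacher, to softmax(coarse scores)."""
    teacher = [math.exp(lp) for lp in log_softmax(fine_scores)]
    student = log_softmax(coarse_scores)
    return -sum(t * lp for t, lp in zip(teacher, student))

def total_loss(scores, exposed, clicked, purchased, fine_scores,
               weights=(1.0, 1.0, 1.0, 1.0)):
    """fine_scores[i] is None where fine ranking did not score item i."""
    w_exp, w_clk, w_pay, w_distill = weights  # hypothetical weights
    scored = [i for i, f in enumerate(fine_scores) if f is not None]
    distill = 0.0
    if scored:
        distill = distillation_loss([scores[i] for i in scored],
                                    [fine_scores[i] for i in scored])
    return (w_exp * listwise_softmax_loss(scores, exposed)
            + w_clk * listwise_softmax_loss(scores, clicked)
            + w_pay * listwise_softmax_loss(scores, purchased)
            + w_distill * distill)
```

With uniform scores over four items and one positive, each listwise term equals log 4, which is a handy sanity check when wiring up a real implementation.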
New Evaluation Metric
To better assess coarse ranking, a global transaction hitrate metric was introduced, measuring both in‑scene and out‑of‑scene conversions. Evaluated at each stage of the recall → coarse ranking → fine ranking funnel, the metric showed that coarse ranking achieves a higher hitrate than fine ranking at cutoffs in the 10^3 to 10^4 range.
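One plausible reading of the hitrate metric is the share of a user's purchased items that appear in a stage's top-k output. The sketch below uses that reading; the exact definition, the function name, and the way in-scene and out-of-scene purchases are passed in are assumptions.

```python
def hitrate_at_k(ranked_item_ids, purchased_item_ids, k):
    """Fraction of purchased items found in the top-k of a ranked list."""
    purchased = set(purchased_item_ids)
    if not purchased:
        return 0.0
    top_k = set(ranked_item_ids[:k])
    return len(purchased & top_k) / len(purchased)
```

For example, in-scene and out-of-scene hitrate can be reported separately, and the global figure computed over their union:

```python
ranked = ["a", "b", "c", "d"]
in_scene = {"a"}            # bought through search
out_of_scene = {"c", "z"}   # bought through other channels
in_scene_hr = hitrate_at_k(ranked, in_scene, k=2)
out_of_scene_hr = hitrate_at_k(ranked, out_of_scene, k=3)
global_hr = hitrate_at_k(ranked, in_scene | out_of_scene, k=2)
```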
Offline Metric Analysis and Corrections
Metrics such as coarse‑ranking hitrate@10, NDCG between coarse and fine scores, and AUC were used to quantify coarse‑to‑fine loss and recall‑to‑coarse loss. Adjustments to negative sampling (inspired by Word2Vec) and sample augmentation with out‑of‑scene transactions improved out‑of‑scene hitrate by up to 1.04 pt.
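The source does not say which part of Word2Vec inspired the sampling change. One common reading is Word2Vec's smoothed negative sampling, where an item is drawn with probability proportional to its frequency raised to the power 0.75. The sketch below assumes that reading; the function names and the `item_counts` input are hypothetical.

```python
import random

def negative_sampling_probs(item_counts, power=0.75):
    """Smooth raw frequencies so popular items dominate less."""
    smoothed = {item: count ** power for item, count in item_counts.items()}
    total = sum(smoothed.values())
    return {item: weight / total for item, weight in smoothed.items()}

def sample_negatives(item_counts, n, power=0.75, seed=None):
    """Draw n negatives (with replacement) from the smoothed distribution."""
    rng = random.Random(seed)
    probs = negative_sampling_probs(item_counts, power)
    items = list(probs)
    return rng.choices(items, weights=[probs[i] for i in items], k=n)
```

Compared with sampling by raw frequency, the exponent gives rare items a larger share of the negatives while still favoring popular ones.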
Optimization Methods
Key optimizations included expanding distillation samples, aligning coarse‑ranking features with fine‑ranking features, introducing cross‑tower features, and experimenting with MLP layers. Distillation sample expansion yielded +0.65 pt NDCG and +0.3 pt out‑of‑scene hitrate.
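The NDCG figures quoted here measure agreement between coarse and fine scores. A minimal sketch of one way to compute it follows, assuming items are ordered by coarse score and the fine scores (taken to be non-negative) serve as gains; the function names are illustrative.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

def ndcg_coarse_vs_fine(coarse_scores, fine_scores, k=None):
    """NDCG of the coarse ordering, using fine scores as relevance gains."""
    order = sorted(range(len(coarse_scores)),
                   key=lambda i: coarse_scores[i], reverse=True)
    gains = [fine_scores[i] for i in order]
    ideal = sorted(fine_scores, reverse=True)
    if k is not None:
        gains, ideal = gains[:k], ideal[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A value of 1.0 means coarse ranking orders the items exactly as fine ranking would; lower values indicate coarse-to-fine inconsistency.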
Feature enhancements (user portrait and long‑term transaction sequence) added +0.4 pt hitrate, while cross‑tower features gave +0.2 pt hitrate offline but no clear online gain.
Loss‑function refinements that removed extra positive samples from the softmax denominator increased coarse‑to‑fine consistency (+0.63 pt NDCG) and raised out‑of‑scene hitrate by ~0.2 %.
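One way to read this refinement: when a list holds several positives, each positive should compete only against the negatives, not against the other positives. The sketch below implements that reading; the function name and the per-positive averaging are assumptions, not the confirmed formulation.

```python
import math

def masked_listwise_softmax_loss(scores, labels):
    """Listwise softmax loss where each positive's denominator contains
    only itself and the negatives, excluding the other positives."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    if not positives:
        return 0.0
    negative_scores = [s for s, y in zip(scores, labels) if y != 1]
    loss = 0.0
    for i in positives:
        candidates = [scores[i]] + negative_scores
        m = max(candidates)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in candidates))
        loss -= scores[i] - log_z
    return loss / len(positives)
```

With uniform scores, two positives, and two negatives, each positive's term is log 3 rather than the log 4 a plain softmax would give, so positives are no longer penalized for scoring close to one another.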
Further Findings
Analysis showed that increasing the number of items scored by fine ranking does not always improve online performance; scoring too many items can reduce global hitrate and overall conversion.
Conclusion
Through multi‑objective optimization, negative‑sample re‑weighting, feature alignment, and loss‑function tweaks, the coarse‑ranking stage achieved roughly a 1 % increase in overall transaction value, highlighting the importance of jointly optimizing recall‑to‑coarse and coarse‑to‑fine losses.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
