
Towards a Better Tradeoff between Effectiveness and Efficiency in Pre‑Ranking: A Learnable Feature‑Selection‑Based Approach

The authors introduce an interaction‑focused pre‑ranking model paired with a learnable, complexity‑aware feature‑selection technique (FSCD) that picks a compact feature set. Deployed in Alibaba’s search advertising system, it boosts offline AUC from 0.695 to 0.737, raises recall from 88 % to 95 %, and improves CTR and RPM, while keeping CPU usage and latency comparable to traditional vector‑dot models.

Alimama Tech

In large‑scale search, recommendation and advertising systems, a multi‑stage ranking architecture (recall, pre‑ranking, ranking, re‑ranking) is commonly used to meet ultra‑low latency constraints.

During the pre‑ranking stage, representation‑focused (RF) vector‑dot models are favored for efficiency but suffer from reduced effectiveness. The Alibaba Search Advertising ranking team proposes a novel pre‑ranking method that adopts an interaction‑focused (IF) architecture and introduces a learnable feature‑selection technique called FSCD (Feature Selection based on Complexity and Variational Dropout) to achieve a better trade‑off between efficiency and effectiveness.
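The architectural difference can be sketched as follows. This is a minimal illustration, not the production model: the dimensions, random weights, and single hidden layer are all hypothetical, chosen only to contrast the two scoring paths.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
user, item = rng.normal(size=d), rng.normal(size=d)

# Representation-focused (RF): user and item are encoded by separate
# towers, so item vectors can be precomputed offline; the online cost
# per candidate is a single dot product.
def rf_score(u, v):
    return float(u @ v)

# Interaction-focused (IF): user and item features are fed jointly
# through an MLP, so early feature interactions are modeled, at the
# price of running the network per (user, candidate) pair online.
W1 = rng.normal(size=(2 * d, 16)) * 0.1
b1 = np.zeros(16)
W2 = rng.normal(size=16) * 0.1

def if_score(u, v):
    h = np.maximum(0.0, np.concatenate([u, v]) @ W1 + b1)  # ReLU hidden layer
    return float(h @ W2)
```

The efficiency gap comes from what can be cached: the RF item tower runs once per item offline, while the IF network must run once per request for every candidate, which is why feature selection is needed to keep it affordable.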

The FSCD method assigns a learnable dropout factor to each feature domain, models the prior retention probability as a function of feature complexity, and incorporates both cross‑entropy loss for effectiveness and a complexity‑regularization term for efficiency. A relaxed continuous approximation of the Bernoulli distribution enables gradient‑based optimization.
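A heavily simplified sketch of this idea is below. It replaces the paper's sampled Bernoulli gates and relaxed (concrete) distribution with deterministic sigmoid gates, uses a logistic model instead of the production deep network, and the per-field complexity costs, regularization weight, and two-feature setup are all hypothetical; it only demonstrates how the complexity-weighted regularizer drives expensive, uninformative feature fields toward removal while cross-entropy keeps useful ones open.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))
# feature field 0 predicts the label; feature field 1 is pure noise
y = (x[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(float)

w, b = np.zeros(2), 0.0
alpha = np.zeros(2)            # learnable gate logit per feature field
c = np.array([1.0, 20.0])      # per-field complexity cost (hypothetical)
lam, lr = 0.02, 0.5            # regularization weight, learning rate

for _ in range(300):
    z = sigmoid(alpha)                 # relaxed retention probability
    p = sigmoid((x * z) @ w + b)       # prediction on gated features
    g = (p - y) / n                    # gradient of mean BCE w.r.t. logits
    grad_w = (x * z).T @ g
    grad_z = (x * w).T @ g
    # cross-entropy pulls useful gates open; the complexity term
    # lam * c_j * z_j pushes expensive gates shut
    grad_alpha = (grad_z + lam * c) * z * (1 - z)
    w -= lr * grad_w
    b -= lr * g.sum()
    alpha -= lr * grad_alpha

z = sigmoid(alpha)
keep = z > 0.5   # compact subset passed on to fine-tune the pre-ranking model
```

After training, the gate on the cheap, informative field stays open while the gate on the expensive noise field collapses, yielding the compact feature subset that the subsequent fine-tuning stage consumes.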

After FSCD selects a compact subset of features, the IF‑based pre‑ranking model is fine‑tuned using a standard loss. Because the selected feature set is a subset of the full ranking model’s features, the pre‑ranking pipeline can fully reuse the ranking pipeline’s offline sample generation, training, and inference resources, resulting in near‑zero additional storage and minimal extra computation.

Extensive online experiments on Alibaba’s search advertising platform show that the proposed IF pre‑ranking model, with the FSCD‑selected features, improves offline AUC from 0.695 to 0.737 and raises recall from 88 % to 95 %. Online metrics such as click‑through rate (CTR) and revenue per mille (RPM) also see significant gains while CPU usage and latency remain comparable to the baseline vector‑dot model.
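For reference, the offline AUC figures cited above measure the probability that a random positive example outscores a random negative one. A direct pairwise computation, with toy scores that are purely illustrative, looks like this:

```python
import numpy as np

def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked
    correctly, counting ties as half."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

labels = np.array([1, 0, 1, 0, 0])
scores = np.array([0.9, 0.2, 0.3, 0.4, 0.1])
print(auc(labels, scores))  # one of six pairs is mis-ordered -> 5/6
```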

In summary, the work demonstrates that incorporating interaction‑focused structures into pre‑ranking, together with a complexity‑aware learnable feature selection, can simultaneously boost effectiveness and maintain efficiency, and the solution has been deployed at Alibaba Search Advertising at scale.

Machine Learning · Search Advertising · effectiveness · efficiency · feature selection · pre‑ranking
Written by Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.
