
Explore‑and‑Exploit (EE) in JD Search: Bias Mitigation, Model Iteration, and Evaluation

The talk presents JD Search's Explore‑and‑Exploit (EE) module, detailing its bias‑mitigation pipeline (position, popularity, and exposure debiasing), model architecture upgrades with SVGP and causal inference, online A/B metrics, the offline evaluation system, and future research directions to improve search diversity and long‑term value.

DataFunTalk

Speaker: Lv Hao, JD Search Algorithm Expert. Platform: DataFunTalk.

EE (Explore & Exploit) is a key component of JD's search system: it improves product diversity and mitigates the Matthew effect caused by ranking bias.

The talk is organized around five core topics: the EE scenario iteration loop, model debias iteration, online A/B metrics, the offline evaluation system, and a concluding summary.

EE Scenario Iteration Loop

The EE pipeline proceeds from core positioning → online metrics → offline evaluation → model iteration, each step requiring EE‑specific upgrades.

Model Debias Iteration

1. Problem Background

EE aims to surface long‑tail items that could perform efficiently if given exposure. The main challenges are several biases that prevent fair exposure:

Position‑bias: items shown higher receive more clicks regardless of intrinsic quality.

Popularity‑bias: historically popular items dominate when multiple candidates have similar relevance.

Exposure‑bias: only a small subset of items is ever displayed, causing training‑inference distribution gaps.

2. Position‑bias Mitigation

The EE model adopts a two‑stage position‑debias scheme. During training, a position‑bias net encodes the effect of the displayed slot and is fused with the main network. During inference the bias net is masked, removing positional influence.

Two implementation ideas are explored:

Pos as feature: the position index is treated as an additional feature fed to the DNN.

Pos as tower: a separate tower predicts position influence; its output is combined with the main tower logits.
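The "pos as tower" idea above can be sketched in a few lines. This is a minimal illustration, not JD's implementation: the geometric decay stands in for a learned position‑bias sub‑network, and all names are hypothetical.

```python
def content_tower(item_features, weights):
    # Main tower: scores the item from content features only
    # (stand-in for the DNN's final logit).
    return sum(f * w for f, w in zip(item_features, weights))

def position_tower(position, decay=0.5):
    # Bias tower: models how much an impression slot inflates clicks.
    # A simple geometric decay stands in for a learned sub-network.
    return decay ** position

def score(item_features, weights, position=None):
    # Training: fuse the content logit with the position-bias logit.
    # Inference: position is None, so the bias tower is masked out
    # and only the position-independent content score remains.
    logit = content_tower(item_features, weights)
    if position is not None:
        logit += position_tower(position)
    return logit
```

During training the click label is explained jointly by content and slot; at serving time calling `score(features, weights)` without a position drops the slot effect entirely.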

3. Personalized Position‑bias

Standard bias‑net assumes a universal position preference. To capture user‑specific sensitivity, a personalized bias‑net incorporates static user profiles and dynamic behavior sequences, allowing the model to differentiate between “browsers” and “quick‑buyers”.

4. Popularity‑bias Mitigation

Two strategies are used:

Inverse propensity scoring (IPS) to re‑weight training samples.

Popularity‑weight decay on both user and item sides to reduce the dominance of hot items.
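The IPS strategy above can be sketched as follows. This is a generic illustration of inverse propensity weighting, with illustrative names and a clipping threshold the talk does not specify:

```python
def ips_weight(propensity, clip=10.0):
    # Inverse propensity score: down-weight samples from popular
    # (frequently exposed) items and up-weight rarely exposed ones.
    # Clipping keeps the variance of the estimator bounded.
    return min(1.0 / max(propensity, 1e-6), clip)

def weighted_loss(samples):
    # samples: list of (per-sample loss, exposure propensity) pairs.
    # Returns the self-normalized IPS estimate of the loss.
    total = sum(ips_weight(p) * loss for loss, p in samples)
    norm = sum(ips_weight(p) for _, p in samples)
    return total / norm
```

Self‑normalization (dividing by the summed weights) is a common variance‑reduction choice; an unnormalized estimator would divide by the sample count instead.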

Integration of SVGP and Position‑bias Net

The EE ranking model combines a DNN with a Sparse Variational Gaussian Process (SVGP) for uncertainty‑aware scoring. Two fusion methods are discussed:

Representation Fusion: concatenate or sum the position tower embedding with the main tower before SVGP.

Logit Fusion: add or multiply the position contribution directly to the final logit (e.g., Logit = f(content) + f(position)).
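The logit‑fusion variant, combined with SVGP's uncertainty output, might look like the sketch below. SVGP itself is far beyond a few lines; this assumes the GP head has already produced a predictive mean and standard deviation per item, and the UCB‑style bonus / Thompson draw are standard exploration choices rather than the talk's stated method.

```python
import random

def fused_logit(content_logit, position_logit=None):
    # Additive logit fusion: Logit = f(content) + f(position).
    # At inference the position term is absent (masked).
    return content_logit + (position_logit if position_logit is not None else 0.0)

def ee_score(mean, std, k=1.0, thompson=False):
    # Uncertainty-aware exploration score from the GP's predictive
    # distribution: either an optimistic UCB-style bonus (mean + k*std)
    # or a random Thompson draw from N(mean, std).
    if thompson:
        return random.gauss(mean, std)
    return mean + k * std
```

Items with similar predicted means but higher uncertainty get a larger exploration score, which is exactly the behavior EE wants for under‑exposed long‑tail candidates.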

Online A/B Metrics

The primary online metric is the "exploration success rate": the share of EE‑selected items that later achieve satisfactory traffic, clicks, and orders once promoted to full traffic. Supporting metrics include overall platform efficiency (UCVR, UV value), liquidity (item sell‑through), and result richness.
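The exploration success rate can be computed along these lines. The click/order thresholds here are illustrative assumptions; the talk does not give concrete cutoffs.

```python
def exploration_success_rate(explored_items, outcomes,
                             min_clicks=10, min_orders=1):
    # outcomes: dict item -> (clicks, orders) accumulated after the
    # item graduated from EE into full traffic.
    # An explored item counts as a success once it clears both
    # (hypothetical) thresholds in the full-traffic environment.
    if not explored_items:
        return 0.0
    successes = 0
    for item in explored_items:
        clicks, orders = outcomes.get(item, (0, 0))
        if clicks >= min_clicks and orders >= min_orders:
            successes += 1
    return successes / len(explored_items)
```
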

Offline Evaluation System

Because online metrics cannot be directly linked to offline AUC, a dedicated offline suite evaluates EE models on efficiency, long‑tail exploration strength, and uncertainty estimation, accelerating iteration cycles.
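One concrete way to measure "long‑tail exploration strength" offline is the share of top‑ranked slots given to long‑tail items. The quantile cutoff below is an assumption for illustration, not JD's published definition.

```python
def long_tail_share(ranked_items, item_popularity, tail_quantile=0.8):
    # Fraction of an offline ranking's slots occupied by long-tail
    # items, i.e. items whose popularity falls below the chosen
    # quantile of the catalog's popularity distribution.
    pops = sorted(item_popularity.values())
    cutoff = pops[int(tail_quantile * (len(pops) - 1))]
    tail = sum(1 for item in ranked_items if item_popularity[item] < cutoff)
    return tail / len(ranked_items)
```

Tracking this alongside efficiency metrics (e.g. AUC on held‑out clicks) shows whether a candidate EE model actually trades exposure toward the tail rather than merely reshuffling head items.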

Conclusion & Future Work

EE is essential for improving search diversity. By tackling position, popularity, and exposure biases through debiasing, causal inference, and SVGP integration, EE core metrics have shown significant gains while maintaining overall platform efficiency.

Future directions include:

Incorporating richer user exploration signals (explore‑net) and supervision losses.

Designing model structures and loss functions that directly optimize long‑term value.

Enhancing EE candidate generation and exploration mechanisms for end‑to‑end improvement.

For further discussion, contact [email protected].

Tags: Machine Learning · causal inference · search ranking · bias mitigation · explore‑exploit · online A/B testing · SVGP
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
