How JD’s Dynamic Re‑Ranking Model Boosted Search Relevance and Won SIGIR 2024

The author recounts how, by modeling user intent with a multi‑layer Gaussian‑based PODM‑MI framework and addressing a novel ‘sand‑glass’ bottleneck in RQ‑VAE semantic identifiers, JD’s search ranking achieved significant UCVR gains, annual order increases of over ten million, and a SIGIR 2024 paper acceptance.

JD Tech Talk

At JD, technology is portrayed not as cold code but as a bridge connecting consumers to a better life. The author describes using large models to empower intelligent recommendation and search scenarios, publishing four top‑conference papers, filing eight patents, and being recognized as an outstanding talent.

The transition from academia to industry highlighted the shift from seeking the "optimal solution" to the "most suitable solution" in a complex e‑commerce environment, where challenges such as dynamic user decision stages, ecosystem health, and engineering constraints at billion‑scale traffic cannot be addressed by textbook formulas.

Achieving SIGIR Acceptance with a Product Re‑Ranking Model

JD’s main‑site search optimization revealed that traditional algorithms over‑expose best‑selling items, sacrificing long‑tail exposure. The core issue is that users are in different decision stages: a vague "browsing" stage with diverse queries versus a clear "buying" stage requiring precise results.

The proposed solution, named PODM‑MI, introduces a three‑layer distribution‑modeling framework:

First layer: Gaussian modeling of user preference raises the accuracy weight when the query covariance shrinks (e.g., narrowing from "dress" to "blue floral dress") and raises the diversity weight when the covariance expands (e.g., "phone → Switch → range hood").

Second layer: Mutual‑information lower‑bound optimization links ranking diversity tightly to user preference, presenting related items while avoiding irrelevant results.

Third layer: A utility‑matrix fusion module dynamically balances product relevance and diversity during ranking.
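The interplay between the first and third layers can be illustrated with a minimal sketch. This is not the PODM‑MI implementation: it assumes a diagonal Gaussian over a user's recent query embeddings, a hypothetical squashing function `diversity_weight`, and a simple greedy re‑ranker in place of the utility‑matrix fusion module. All function names, embeddings, and scores below are made up for illustration.

```python
import math

def preference_spread(query_embeddings):
    """Mean per-dimension variance of recent query embeddings
    (trace of a diagonal Gaussian covariance divided by the dimension)."""
    n = len(query_embeddings)
    dim = len(query_embeddings[0])
    means = [sum(q[d] for q in query_embeddings) / n for d in range(dim)]
    variances = [
        sum((q[d] - means[d]) ** 2 for q in query_embeddings) / n
        for d in range(dim)
    ]
    return sum(variances) / dim

def diversity_weight(spread, scale=1.0):
    """Hypothetical mapping from spread to a weight in [0, 1):
    a small spread favors accuracy, a large spread favors diversity."""
    return spread / (spread + scale)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rerank(items, weight, k):
    """Greedy re-ranking. Each step picks the item maximizing
    (1 - weight) * relevance + weight * novelty, where novelty is
    1 minus the highest cosine similarity to items already selected."""
    selected = []
    remaining = list(items)
    while remaining and len(selected) < k:
        def utility(item):
            _, relevance, embedding = item
            if selected:
                novelty = 1.0 - max(cosine(embedding, s[2]) for s in selected)
            else:
                novelty = 1.0
            return (1.0 - weight) * relevance + weight * novelty
        best = max(remaining, key=utility)
        selected.append(best)
        remaining.remove(best)
    return [item[0] for item in selected]

# Toy data: (item id, relevance score, item embedding).
items = [
    ("dress_a", 0.90, [1.0, 0.0]),
    ("dress_b", 0.85, [0.99, 0.1]),
    ("switch", 0.60, [0.0, 1.0]),
]
narrow_queries = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]   # "buying" stage
broad_queries = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]   # "browsing" stage

w_narrow = diversity_weight(preference_spread(narrow_queries))
w_broad = diversity_weight(preference_spread(broad_queries))
top_narrow = rerank(items, w_narrow, k=2)
top_broad = rerank(items, w_broad, k=2)
```

With the toy data, the narrow query history yields a near‑zero diversity weight and the two dresses are kept together, while the broad history raises the weight enough for the dissimilar item to take the second slot.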

The approach delivered a notable increase in the UCVR metric, generating annual order growth exceeding ten million, and was accepted at SIGIR 2024.

First in the Industry to Identify a Technical Bottleneck

While building semantic identifiers (SIDs) for billions of products using a generative RQ‑VAE method, the team observed a "sand‑glass" distribution: codes cluster densely in the middle layer, which lowers codebook utilization and hinders model training.
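To make the utilization measurement concrete, the following sketch shows residual quantization with fixed codebooks and a per‑layer utilization count. It is a simplified stand‑in, not JD's pipeline: a real RQ‑VAE learns its codebooks jointly with an encoder and decoder, whereas the one‑dimensional codebooks, vectors, and function names here are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the code closest to vec in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )

def residual_quantize(vec, codebooks):
    """Return one code index per layer. Each layer quantizes the
    residual left over by the previous layer."""
    residual = list(vec)
    sid = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        sid.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(sid)

def layer_utilization(sids, codebooks):
    """Fraction of each layer's codes that appear in at least one SID."""
    return [
        len({sid[layer] for sid in sids}) / len(codebooks[layer])
        for layer in range(len(codebooks))
    ]

# Hypothetical three-layer codebooks over 1-D vectors.
codebooks = [
    [[0.0], [10.0]],
    [[-2.0], [0.0], [2.0]],
    [[-0.5], [0.5]],
]
vectors = [[12.4], [0.2], [9.6]]
sids = [residual_quantize(v, codebooks) for v in vectors]
utilization = layer_utilization(sids, codebooks)
```

A sand‑glass pattern would show up in such a report as a middle layer where a few codes absorb most items, which is why counting distinct and dominant codes per layer is a natural first diagnostic.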

Analysis showed that the second‑layer residuals become highly polarized, amplifying the long‑tail effect inherent in e‑commerce data. Two lightweight solutions were proposed:

Remove bottleneck nodes from the middle layer after full SID generation, alleviating the concentration issue.

Introduce an adaptive threshold to dynamically prune overly frequent nodes in the second layer, preserving overall distribution stability.
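The second idea can be sketched as follows, under stated assumptions. The source does not specify the threshold rule, so this example uses a hypothetical one (a code counts as overly frequent when it appears more than `factor` times the mean count), and it interprets pruning as dropping that second‑layer code from the identifier, which yields variable‑length SIDs. The function names and sample SIDs are illustrative only.

```python
from collections import Counter

def second_layer_counts(sids):
    """How often each second-layer code appears across all SIDs."""
    return Counter(sid[1] for sid in sids)

def adaptive_threshold(counts, factor=2.0):
    """Hypothetical rule: the limit scales with the mean code count,
    so it adapts to the size of the corpus."""
    mean_count = sum(counts.values()) / len(counts)
    return factor * mean_count

def prune_frequent_second_layer(sids, factor=2.0):
    """Drop the second-layer code from every SID whose second-layer
    code is above the adaptive limit. Other SIDs are left unchanged."""
    counts = second_layer_counts(sids)
    limit = adaptive_threshold(counts, factor)
    pruned = []
    for sid in sids:
        if counts[sid[1]] > limit:
            pruned.append(sid[:1] + sid[2:])
        else:
            pruned.append(sid)
    return pruned

# Toy SIDs in which second-layer code 5 dominates.
sids_before = [
    (0, 5, 1), (1, 5, 2), (2, 5, 3), (3, 5, 4), (4, 5, 0), (5, 5, 6),
    (0, 1, 2), (1, 2, 3), (2, 3, 4),
]
sids_after = prune_frequent_second_layer(sids_before)
```

In this toy set, code 5 appears six times against a mean of 2.25, so it exceeds the limit of 4.5 and is removed, while the three SIDs with rare second‑layer codes keep all three positions. A production version would also need to check that shortened identifiers do not collide.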

Experiments demonstrated significant offline recall improvements, enabling users to discover desired products faster.

The discovery exemplifies pure technical innovation—bridging business needs with systematic solutions, and advancing the frontier of generative search recommendation.

Diagram of PODM-MI framework
Sand‑glass distribution of SID codes