Generative Retrieval for E‑commerce Search: Lexical and Semantic ID Approaches

This article presents a comprehensive study of generative retrieval for large‑scale e‑commerce search, comparing lexical‑based and Semantic‑ID‑based methods, introducing a Query‑to‑MultiSpan framework, analyzing the sand‑glass distribution problem in residual quantization, and proposing heuristic and adaptive solutions to improve recall and efficiency.

JD Tech

The authors, Wang Hui‑mu and Li Ming‑ming, presented this work at DataFunSummit 2024. It introduces generative retrieval based on large language models as a way to address the challenges that traditional dual‑tower retrieval faces in massive product catalogs.

Background and challenges : Traditional recall pipelines struggle with both efficiency and semantic matching, especially for long‑tail queries. Two causes are named: the query and item towers interact only through a final similarity score, and the item index is costly to maintain.

Advantages of generative retrieval : It avoids the loss that accumulates across the stages of a cascaded pipeline, simplifies index management, benefits directly from stronger LLMs, and draws on the model's world knowledge for better personalization.

Lexical‑based approach : The authors propose a Preference‑Optimized Generative Retrieval framework that recasts the task from Query‑to‑Title to Query‑to‑MultiSpan. The model is trained with supervised fine‑tuning followed by Direct Preference Optimization (DPO), and at inference time constrained beam search generates concise spans that are then matched against an efficient FM‑index.
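To make the span‑matching step concrete, here is a minimal sketch of an FM‑index over product titles, written in pure Python. It is not the authors' implementation: the class name TitleFMIndex, the helper retrieve, the vote‑counting score, and the toy titles are all assumptions made for illustration, and the suffix array is built by naive sorting, which only suits small demos.

```python
import bisect
from collections import Counter


class TitleFMIndex:
    """Toy FM-index over a list of titles (hypothetical helper, demo-sized only)."""

    def __init__(self, titles, sep="\n", sentinel="\x00"):
        # Record where each title starts inside the concatenated text.
        self.starts, pos = [], 0
        for t in titles:
            self.starts.append(pos)
            pos += len(t) + len(sep)
        text = sep.join(titles) + sentinel
        self.n = len(text)
        # Naive suffix array: sort suffix start positions lexicographically.
        self.sa = sorted(range(self.n), key=lambda i: text[i:])
        # Burrows-Wheeler transform: the character preceding each sorted suffix.
        bwt = [text[i - 1] for i in self.sa]
        counts = Counter(text)
        # C[ch] = number of characters in the text strictly smaller than ch.
        self.C, total = {}, 0
        for ch in sorted(counts):
            self.C[ch] = total
            total += counts[ch]
        # occ[ch][i] = occurrences of ch in bwt[:i].
        self.occ = {ch: [0] * (self.n + 1) for ch in counts}
        for i, b in enumerate(bwt):
            for ch in counts:
                self.occ[ch][i + 1] = self.occ[ch][i] + (ch == b)

    def _range(self, span):
        # Backward search: narrow the suffix-array range one character at a time.
        lo, hi = 0, self.n
        for ch in reversed(span):
            if ch not in self.C:
                return 0, 0
            lo = self.C[ch] + self.occ[ch][lo]
            hi = self.C[ch] + self.occ[ch][hi]
            if lo >= hi:
                return 0, 0
        return lo, hi

    def count(self, span):
        lo, hi = self._range(span)
        return hi - lo

    def titles_containing(self, span):
        lo, hi = self._range(span)
        hits = {bisect.bisect_right(self.starts, self.sa[i]) - 1 for i in range(lo, hi)}
        return sorted(hits)


def retrieve(index, spans):
    """Score each title by how many generated spans it contains (assumed scoring)."""
    votes = Counter()
    for span in spans:
        for title_id in index.titles_containing(span):
            votes[title_id] += 1
    return votes


titles = ["apple iphone 15 pro case", "samsung galaxy s24 case", "apple watch band"]
index = TitleFMIndex(titles)
votes = retrieve(index, ["apple", "case"])
print(votes.most_common())
```

The same backward search can also serve as the constraint during beam search: a candidate continuation is kept only while count(...) of the partial span stays above zero, so the model can never emit a span that no title contains.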

Semantic‑ID‑based approach : They explore numeric semantic IDs (SID) generated via residual quantization, identifying a “sand‑glass” distribution where middle‑layer tokens become overly concentrated, causing sparsity and long‑tail performance degradation.
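As a rough illustration of how residual quantization turns an item embedding into a multi‑layer SID, the sketch below encodes a vector against a stack of codebooks: each layer picks the codeword nearest to the current residual and passes the remainder to the next layer. The function names, the two tiny hand‑written codebooks, and the 2‑dimensional vector are assumptions for the example; in practice the codebooks are learned from item embeddings and there are more layers and far more codewords.

```python
def nearest(codebook, vec):
    """Index of the codeword with the smallest squared distance to vec."""
    def sq_dist(code):
        return sum((a - b) ** 2 for a, b in zip(code, vec))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def rq_encode(vec, codebooks):
    """Return (sid, final_residual) for vec under a list of layer codebooks."""
    residual = list(vec)
    sid = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        sid.append(i)
        # Subtract the chosen codeword; the next layer quantizes what is left.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return tuple(sid), residual


def rq_decode(sid, codebooks):
    """Reconstruct the approximation as the sum of the chosen codewords."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(sid, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


# Hypothetical codebooks: a coarse first layer and a finer second layer.
codebooks = [
    [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]],
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
]
sid, residual = rq_encode([10.2, 0.9], codebooks)
approx = rq_decode(sid, codebooks)
print(sid, approx)
```

Because every layer only sees what the previous layers left over, the distribution of residuals changes from layer to layer, which is the setting in which the concentration described above arises.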

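Experimental findings : Experiments on models such as LLaMA, Qwen, and Baichuan show that the sand‑glass effect degrades overall retrieval quality. Consistent with the long‑tail degradation noted above, items routed through the dominant (head) middle‑layer tokens are retrieved well, while items under sparse tail tokens perform notably worse. Swapping token layers or removing the dominant middle‑layer tokens improves overall recall.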
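One simple way to check a set of SIDs for this pattern is to compare, layer by layer, how many distinct tokens are used and what share of items falls under the single most frequent token. The sketch below does that; the function name layer_stats and the five toy SIDs are made up for illustration and are not data from the study.

```python
from collections import Counter


def layer_stats(sids):
    """Per layer: (number of distinct tokens, share of items under the top token)."""
    stats = []
    for layer in range(len(sids[0])):
        freq = Counter(sid[layer] for sid in sids)
        top_count = freq.most_common(1)[0][1]
        stats.append((len(freq), top_count / len(sids)))
    return stats


# Toy SIDs (hypothetical): the middle layer is dominated by token 7.
toy_sids = [(1, 7, 3), (2, 7, 4), (5, 7, 9), (6, 2, 1), (8, 3, 5)]
stats = layer_stats(toy_sids)
for layer, (distinct, top_share) in enumerate(stats):
    print(f"layer {layer}: {distinct} distinct tokens, top token covers {top_share:.0%}")
```

A middle layer with few distinct tokens and a high top‑token share, sitting between two much flatter layers, is the narrow waist that the sand‑glass name refers to.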

Proposed solutions : Two methods are introduced – a heuristic removal of large routing nodes and an adaptive variable‑length token strategy (top‑K removal) – both effectively mitigating the sand‑glass issue and boosting recall metrics.
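One possible reading of the adaptive top‑K strategy is sketched below: find the K most frequent tokens in the middle layer and drop that token from every SID that carries one, so those SIDs become shorter while the rest keep their full length. The function name, the choice of layer index 1 as the middle layer, and the toy SIDs are assumptions; the summary does not specify how K is chosen or how the authors handle the resulting IDs.

```python
from collections import Counter


def drop_top_k_middle_tokens(sids, k, layer=1):
    """Remove the layer token from SIDs whose token is among the k most frequent."""
    freq = Counter(sid[layer] for sid in sids)
    dominant = {token for token, _ in freq.most_common(k)}
    shortened = [
        sid[:layer] + sid[layer + 1:] if sid[layer] in dominant else sid
        for sid in sids
    ]
    return shortened, dominant


# Toy SIDs (hypothetical): token 7 dominates the middle layer.
sids = [(1, 7, 3), (2, 7, 4), (5, 7, 9), (6, 2, 1), (8, 3, 5)]
variable_sids, dominant = drop_top_k_middle_tokens(sids, k=1)
print(dominant, variable_sids)
```

Dropping a token can make two previously distinct SIDs identical, for example when they differ only in a removed middle‑layer token, so a real system would need to check for such collisions after shortening.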

Future work : The authors aim to enhance SID representations, integrate multimodal features, and unify generative retrieval with ranking to reduce pipeline loss and improve real‑time search performance.

Team and contact : The work is conducted by JD.com’s Search Algorithm Team, with contact email [email protected] for interested collaborators.

Tags: AI, Large Language Models, information retrieval, e-commerce search, generative retrieval, semantic ID, lexical representation