How 1688 Reinvented E‑commerce Search with AI‑Powered Generative Retrieval

This article details Alibaba’s 1688 platform’s shift from traditional e‑commerce search to AI‑driven generative retrieval, covering the AI Deep Search 1.0 and 2.0 cascaded frameworks, multimodal capabilities, an end‑to‑end “model‑as‑search‑engine” approach, experimental results, challenges, and future directions.

DataFunSummit

Background

Traditional B2B e‑commerce search on 1688 suffers from cumbersome query‑to‑product flows, information overload, and a passive user experience. The goal is to shift to an AI‑driven "goods find people" paradigm that delivers a one‑stop "find‑and‑pick" solution.

Key Technical Challenges

Training effective models on massive long‑tail product catalogs.

Integrating generative recall with classic vector‑based retrieval.

Demonstrating real‑world efficiency and conversion gains.

AI Deep Search 1.0 – Cascading Framework

1. Query Understanding

The module performs three mappings: (1) scenario → parameters, (2) fragment → structured fields, and (3) user language → product language. A fine‑tuned QwenVL extracts structured fields from product detail pages; the fields are embedded and indexed.
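The extraction-and-index step can be sketched as a prompt-driven pipeline. This is a minimal illustration, not the production system: `call_vlm` is a stand-in for a real QwenVL call, and the field schema is an assumption for illustration.

```python
import json

# Hypothetical field schema; the real system extracts fields defined by 1688.
FIELD_SCHEMA = ["category", "material", "color", "min_order_quantity"]

PROMPT = (
    "Extract the following fields from the product page as JSON: "
    + ", ".join(FIELD_SCHEMA)
)

def call_vlm(prompt: str, page: dict) -> str:
    # Stand-in for the fine-tuned QwenVL call; a real system would send the
    # prompt plus the product-page screenshot. Here we fake a structured answer.
    return json.dumps({k: page.get(k, "unknown") for k in FIELD_SCHEMA})

def extract_fields(page: dict) -> dict:
    raw = call_vlm(PROMPT, page)
    fields = json.loads(raw)
    # Keep only schema fields so the downstream embedding index stays consistent.
    return {k: fields.get(k, "unknown") for k in FIELD_SCHEMA}

page = {"category": "dress", "color": "black", "title": "LBD wholesale"}
print(extract_fields(page))
```

In production each extracted field set would then be embedded and written to the vector index.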

2. Vector Recall

A dual‑tower Siamese network built on Alibaba’s open‑source GTE‑base (which supports inputs up to 8K tokens) encodes queries and product documents. Iterative hard‑negative mining progressively improves recall quality: dataset D0 trains model M0, which mines harder negatives to build D1, which trains M1, and so on.
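The mining step in that loop can be sketched as follows. This is a toy: the `encode_*` functions stand in for the two GTE-based towers (random linear maps here), and the point is the mining interface, not embedding quality.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two trained towers: random linear maps + L2 normalization.
W_q = rng.normal(size=(6, 4))
W_d = rng.normal(size=(6, 4))

def encode_query(x):
    v = x @ W_q
    return v / np.linalg.norm(v)

def encode_doc(x):
    v = x @ W_d
    return v / np.linalg.norm(v)

def mine_hard_negatives(query_vec, doc_vecs, positive_idx, k=2):
    # Hard negatives: the top-scoring documents that are NOT the positive.
    # These go into the next-round dataset (D0 -> M0 -> D1 -> M1 ...).
    scores = doc_vecs @ query_vec
    ranked = np.argsort(-scores)
    return [int(i) for i in ranked if i != positive_idx][:k]

docs = np.stack([encode_doc(rng.normal(size=6)) for _ in range(5)])
q = encode_query(rng.normal(size=6))
negs = mine_hard_negatives(q, docs, positive_idx=0)
print(negs)  # the two hardest negatives, excluding document 0
```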

3. Semantic Ranking

Multi‑granularity distillation aligns a large teacher model bge‑rerank‑v2‑gemma‑9b with a lightweight student model gte‑multilingual‑rerank‑base. Distillation signals are injected after each transformer block; optional rank‑model pruning balances latency and accuracy.
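A per-block alignment loss of this kind can be sketched numerically. The shapes, the uniform block mapping, and the loss weighting below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: a 12-block teacher and a 6-block student, hidden dim 16.
T_BLOCKS, S_BLOCKS, DIM = 12, 6, 16
teacher_hidden = [rng.normal(size=DIM) for _ in range(T_BLOCKS)]
student_hidden = [rng.normal(size=DIM) for _ in range(S_BLOCKS)]

def block_map(s_idx, t_blocks=T_BLOCKS, s_blocks=S_BLOCKS):
    # Uniformly map student block i to a teacher block (every 2nd block here).
    return (s_idx + 1) * t_blocks // s_blocks - 1

def distill_loss(teacher_hidden, student_hidden, t_score, s_score, alpha=0.5):
    # Hidden-state term: MSE between each student block and its mapped
    # teacher block (the "signal injected after each transformer block").
    hidden_term = sum(
        np.mean((student_hidden[i] - teacher_hidden[block_map(i)]) ** 2)
        for i in range(len(student_hidden))
    )
    # Output term: match the teacher's relevance score.
    score_term = (t_score - s_score) ** 2
    return alpha * hidden_term + (1 - alpha) * score_term

loss = distill_loss(teacher_hidden, student_hidden, t_score=0.9, s_score=0.4)
print(round(loss, 3))
```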

4. Summarization

The system streams XML‑structured results so the front‑end can render partial outputs instantly, improving perceived latency.
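The incremental-rendering idea can be sketched with a chunked stream and a renderer that emits each `<item>` as soon as its closing tag arrives. The tag names and chunk boundaries here are illustrative assumptions.

```python
import re

def backend_stream():
    # Stand-in for the LLM streaming XML-structured results in chunks.
    for chunk in ["<results><item>black ", "dress</item><item>wool ",
                  "coat</item></results>"]:
        yield chunk

def render_incrementally(stream):
    buf, rendered = "", []
    for chunk in stream:
        buf += chunk
        # Emit every item whose close tag has arrived; don't wait for </results>.
        for m in re.finditer(r"<item>(.*?)</item>", buf):
            rendered.append(m.group(1))
        # Drop everything up to the last completed item.
        last = buf.rfind("</item>")
        if last != -1:
            buf = buf[last + len("</item>"):]
    return rendered

print(render_incrementally(backend_stream()))  # → ['black dress', 'wool coat']
```

The front-end sees "black dress" after the second chunk instead of after the whole response, which is exactly the perceived-latency win described above.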

AI Deep Search 2.0 – Multi‑Turn & Multimodal Interaction

Multi‑turn conversation: a Query‑Rewrite Agent merges successive user inputs into a single refined query.

Dynamic routing: a BERT‑based intent router decides between fast (traditional) search and deep (generative) search based on query complexity.
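The routing decision can be sketched as a classifier over query complexity. The real router is a fine-tuned BERT model; the rule-based stub below only illustrates the interface, and its features and threshold are assumptions.

```python
def route(query: str) -> str:
    # Stand-in for a BERT intent classifier: score a few complexity signals.
    complexity = 0
    complexity += len(query.split()) > 6          # long, descriptive query
    complexity += "?" in query                    # conversational phrasing
    complexity += any(w in query for w in ("for", "recommend", "suitable"))
    return "deep_search" if complexity >= 2 else "fast_search"

print(route("black dress"))  # → fast_search
print(route("recommend a warm outfit suitable for Harbin in winter?"))  # → deep_search
```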

Multimodal retrieval: an image‑plus‑text pair is fed to Qwen2.5‑VL to generate a textual description, then to a GME multimodal embedding model to produce a joint vector for similarity search.
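The shape of that image-plus-text recall path can be sketched end to end. Everything below is a deterministic stub: `caption` stands in for the Qwen2.5‑VL call and `embed_text` for the GME embedding model; only the pipeline structure is the point.

```python
import numpy as np

def caption(image_vec):
    # Stand-in for the VLM call that describes the query image.
    return "black floral summer dress"

def embed_text(text, dim=8):
    # Deterministic toy text encoder: byte values, L2-normalized.
    b = (text.encode("utf-8") * dim)[:dim]
    v = np.frombuffer(bytes(b), dtype=np.uint8).astype(float)
    return v / np.linalg.norm(v)

def embed_joint(image_vec, text):
    # Toy joint embedding: concatenate image and text vectors, normalize.
    v = np.concatenate([image_vec, embed_text(text)])
    return v / np.linalg.norm(v)

def search(query_vec, catalog):
    # Cosine similarity search (all vectors are unit-normalized).
    scores = catalog @ query_vec
    return int(np.argmax(scores))

image = np.ones(4)
query = embed_joint(image, caption(image))
catalog = np.stack([embed_joint(np.zeros(4), "wool coat"),
                    embed_joint(np.ones(4), "black dress")])
print(search(query, catalog))  # → 1 (the matching image+text pair)
```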

End‑to‑End Generative Retrieval (Model‑as‑Search‑Engine)

The pipeline treats each product as a discrete codebook sequence and proceeds in three stages:

Product‑to‑codebook alignment: continued pre‑training of a large language model maps product tokens into the model’s semantic space.

Query‑to‑codebook generation: at inference the model directly decodes one or more codebook sequences that match the user query.

Prefix matching: generated codebooks are matched against an indexed catalog to retrieve the corresponding items.
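The prefix-matching stage can be sketched with a prefix index over the catalog's codebook sequences. The item IDs and code values below are made up for illustration.

```python
# Illustrative catalog: each item is identified by its codebook sequence.
CATALOG = {
    "item_101": (3, 7, 2),
    "item_102": (3, 7, 5),
    "item_203": (1, 4, 9),
}

def build_index(catalog):
    # Index every prefix of every item's code sequence.
    index = {}
    for item, codes in catalog.items():
        for i in range(1, len(codes) + 1):
            index.setdefault(codes[:i], []).append(item)
    return index

def retrieve(index, generated_codes):
    # Match the longest prefix of the generated sequence that exists in
    # the catalog; a partial prefix returns a broader candidate set.
    for i in range(len(generated_codes), 0, -1):
        hits = index.get(tuple(generated_codes[:i]))
        if hits:
            return hits
    return []

index = build_index(CATALOG)
print(retrieve(index, [3, 7]))     # → ['item_101', 'item_102']
print(retrieve(index, [3, 7, 5]))  # → ['item_102']
```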

Initial experiments with RQ‑VAE codebooks produced high‑entropy residual clusters that hurt causal modeling. Replacing RQ‑VAE with hierarchical K‑means clustering yielded cleaner prefix encodings and markedly better generation quality.
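The hierarchical K‑means alternative can be sketched as a two-level clustering tree, where an item's code is its path (level‑1 cluster, level‑2 sub-cluster). This is a minimal illustration with toy data and a plain K‑means, not the production quantizer.

```python
import numpy as np

rng = np.random.default_rng(3)

def kmeans(X, k, iters=20, seed=0):
    # Plain Lloyd's K-means; enough for a sketch.
    r = np.random.default_rng(seed)
    centers = X[r.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = X[assign == j].mean(axis=0)
    return assign, centers

def hierarchical_codes(X, k1=2, k2=2):
    # Level 1 clusters all embeddings; level 2 re-clusters inside each
    # level-1 cluster, giving each item a prefix-friendly code (c1, c2).
    top, _ = kmeans(X, k1)
    codes = np.zeros((len(X), 2), dtype=int)
    codes[:, 0] = top
    for c in range(k1):
        idx = np.where(top == c)[0]
        if len(idx) >= k2:
            sub, _ = kmeans(X[idx], k2)
            codes[idx, 1] = sub
    return codes

# Two well-separated blobs of toy "product embeddings".
X = rng.normal(size=(40, 4)) + np.repeat([[0, 0, 0, 0], [5, 5, 5, 5]], 20, axis=0)
codes = hierarchical_codes(X)
print(codes[:3])
```

Because codes are assigned top-down, items sharing a level‑1 code share a semantic neighborhood, which is what makes the prefixes meaningful for causal generation.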

Empirical results show strong performance on many queries (e.g., "women's little‑black‑dress"), but hallucinations appear on fine‑grained attributes (e.g., incorrect cartoon pattern), indicating current limitations.

Extending the approach to scenario‑based recommendation enables the model to generate complete shopping lists for complex intents such as "winter trip to Harbin – what should I wear?".

Future Directions

To break the embedding quality ceiling, a three‑stage, end‑to‑end quantization‑aware training scheme aligns the encoder, Q‑Former, and projection layers, allowing the codebook to surpass the limits of the raw embeddings and unlock higher retrieval performance.

While generative retrieval has achieved notable progress, further research is required to eliminate hallucinations, improve fine‑grained accuracy, and fully replace traditional retrieval pipelines.

AI · Large Language Model · E-commerce Search · Generative Retrieval
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
