
How LLMs Are Redefining Recommender Systems for JD Union Ads

This article surveys the impact of large language models on recommendation systems, outlines generative recommender architectures, discusses challenges of JD Union advertising, presents a semantic‑ID based solution with training and inference details, and reports offline and online experimental results.

JD Cloud Developers

Jun 13, 2024


Background

Large language models (LLMs) are profoundly reshaping natural language processing and offer new possibilities for other fields. Recommendation systems (RS) mitigate information overload, and integrating LLMs into RS is a promising research direction.

Generative Recommender System

A generative recommender system directly generates recommendations or recommendation‑related content without calculating a ranking score for each candidate item.

Traditional RS handle billions of items through multi‑stage pipelines (recall, coarse ranking, fine ranking, re‑ranking). Because of latency constraints, complex algorithms cannot be applied to all items, so each stage narrows the candidate set before a more expensive model runs. LLMs can simplify this pipeline by generating items directly, moving from a discriminative multi‑stage approach to a single‑stage generative one. They can also offer better generalization and stability, especially for cold‑start users and new domains.

JD Union Advertising

JD Union is an affiliate marketing platform that drives traffic via cost‑per‑sale (CPS) ads. Recommending to low‑activity users poses four challenges: data sparsity, cold start, scenario understanding, and maintaining diversity and novelty.

LLM‑Enhanced RS for JD Union

LLMs extract high‑quality textual representations and encode world knowledge, enabling better understanding of users, items, and context. They can alleviate data sparsity and cold‑start issues through zero‑ or few‑shot capabilities and transfer knowledge to unseen items and scenarios.

Four Core Stages of Generative RS

1. Item Representation – Items are represented by short token sequences (identifiers). Methods include:

Numeric ID: tokenized numeric identifiers, memory‑intensive and lacking semantics.

Textual Metadata: item titles or descriptions, rich in semantics but costly for long texts.

Semantic ID (SID): discretized vectors from models such as RQ‑VAE, combining semantic meaning with compact token sequences.

2. Model Input Representation – Consists of task description, user information (history and profile), and contextual/external information.

3. Model Training – Involves constructing text‑to‑text training pairs and optimizing conditional likelihood. Primary task: user → item identifier. Additional alignment tasks bridge SID and textual descriptions (SID↔text).

4. Model Inference – Generates item identifiers via autoregressive decoding. Two strategies:

Free generation: selects top‑K tokens from the full vocabulary, risking invalid identifiers.

Constrained generation: uses trie or FM‑index to restrict outputs to valid identifiers.
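
The difference between the two inference strategies can be made concrete with a minimal sketch of trie‑constrained beam search. Everything here is illustrative: `toy_scores` is a hypothetical stand‑in for the LLM's next‑token log‑probabilities, and the two‑token identifiers in `CATALOG` are made up. Scores are masked by the trie but not renormalized.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Token = str
Prefix = Tuple[Token, ...]


def build_trie(identifiers: Sequence[Sequence[Token]]) -> dict:
    """Index every valid identifier as a path in a nested-dict prefix tree."""
    root: dict = {}
    for ident in identifiers:
        node = root
        for tok in ident:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: dict, prefix: Sequence[Token]) -> List[Token]:
    """Tokens that keep `prefix` on a path to some valid identifier."""
    node = trie
    for tok in prefix:
        node = node.get(tok)
        if node is None:
            return []
    return list(node)


def constrained_beam_search(
    score_fn: Callable[[Prefix], Dict[Token, float]],
    trie: dict,
    length: int,
    beam_size: int,
) -> List[Tuple[Prefix, float]]:
    """Beam search that only expands tokens permitted by the trie."""
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]
    for _ in range(length):
        candidates = []
        for prefix, logp in beams:
            scores = score_fn(prefix)  # log-probabilities over the full vocabulary
            for tok in allowed_next(trie, prefix):
                candidates.append((prefix + (tok,), logp + scores.get(tok, -math.inf)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


# Hypothetical catalog of valid two-token identifiers.
CATALOG = [("<a_1>", "<b_1>"), ("<a_1>", "<b_3>"), ("<a_2>", "<b_2>")]


def toy_scores(prefix: Prefix) -> Dict[Token, float]:
    """Stand-in for a language model; ignores the prefix for simplicity."""
    probs = {"<a_1>": 0.6, "<a_2>": 0.4, "<b_1>": 0.5, "<b_2>": 0.35, "<b_3>": 0.15}
    return {tok: math.log(p) for tok, p in probs.items()}


results = constrained_beam_search(toy_scores, build_trie(CATALOG), length=2, beam_size=2)
```

With a beam size of 2 this returns (<a_1>, <b_1>) followed by (<a_2>, <b_2>). Under free generation the sequence (<a_1>, <b_2>) would outscore (<a_2>, <b_2>), but it is not in the catalog, so the trie filters it out.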

Representative Works

Practical Solution

Overall Design

The proposed framework adopts semantic‑ID item representation and aligns collaborative and textual signals through dual training tasks.

Key Modules

(1) Semantic ID Representation – Item titles are encoded with BERT‑base‑chinese (768‑dim) and Yi‑6B (4096‑dim). RQ‑VAE quantizes these vectors into token sequences. Collisions, where several items receive the same token sequence, are handled either by allowing the many‑to‑one mapping or by appending a random token to ensure uniqueness.
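
As a rough illustration of the quantization step, the sketch below performs the residual codebook lookup and the "append a random token" collision strategy. It assumes the codebooks are already trained and omits the RQ‑VAE encoder, decoder, and training loop. The two‑dimensional vectors, codebook values, `extra_size`, and the choice to append the extra token to every item (so that all identifiers have equal length) are assumptions made for the example, not details from the article.

```python
import random
from typing import Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], vec: Vector) -> int:
    """Index of the codebook entry closest to `vec` (squared Euclidean distance)."""
    def dist(code: Vector) -> float:
        return sum((c - v) ** 2 for c, v in zip(code, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def residual_quantize(vec: Vector, codebooks: Sequence[Sequence[Vector]]) -> Tuple[int, ...]:
    """Quantize level by level, passing the remaining residual to the next codebook."""
    residual: List[float] = list(vec)
    codes: List[int] = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(codes)


def assign_unique_sids(
    embeddings: Dict[str, Vector],
    codebooks: Sequence[Sequence[Vector]],
    extra_size: int = 8,   # hypothetical size of the disambiguation vocabulary
    seed: int = 0,
) -> Dict[str, Tuple[int, ...]]:
    """Append a random, not-yet-used extra index so colliding items stay distinct."""
    rng = random.Random(seed)
    taken: Dict[Tuple[int, ...], Set[int]] = {}
    sids: Dict[str, Tuple[int, ...]] = {}
    for item_id, vec in embeddings.items():
        codes = residual_quantize(vec, codebooks)
        used = taken.setdefault(codes, set())
        free = [i for i in range(extra_size) if i not in used]
        if not free:
            raise ValueError(f"more than {extra_size} items share codes {codes}")
        suffix = rng.choice(free)
        used.add(suffix)
        sids[item_id] = codes + (suffix,)
    return sids


# Toy data: two codebook levels over 2-d vectors.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.2, 0.0)],
]
EMBEDDINGS = {
    "sku_1": (1.0, 1.0),
    "sku_2": (1.05, 1.0),  # quantizes to the same codes as sku_1
    "sku_3": (0.2, 0.0),
}
sids = assign_unique_sids(EMBEDDINGS, CODEBOOKS)
```

Here `sku_1` and `sku_2` share the codes (1, 0) and differ only in the appended index, while `sku_3` maps to (0, 1). The many‑to‑one alternative would simply skip the suffix and let both items share one identifier.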

(2) Alignment Tasks – Two tasks are trained together: “Next Item Prediction” (user description → next item) and bidirectional SID↔title alignment, which bridges the gap between discrete IDs and natural language.

Offline and Online Experiments

Training Data

{
    "instruction": "The user is female, age 46‑55, married, unknown children status. Clicked items: <a_112><b_238><c_33><d_113>, <a_73><b_50><c_228><d_128>, ... What is the next likely click?",
    "response": "<a_96><b_113><c_49><d_174>"
}

Additional alignment examples map SID to titles and vice‑versa.
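
A minimal sketch of how such records might be assembled is shown below. The next‑item prompt follows the example above; the sliding‑window expansion of a click sequence and the wording of the two alignment prompts are assumptions for illustration, since the article does not give them.

```python
import json
from typing import Dict, List


def next_item_examples(profile: str, clicks: List[str], min_history: int = 1) -> List[Dict[str, str]]:
    """Turn one click sequence into (history -> next click) records."""
    examples = []
    for cut in range(min_history, len(clicks)):
        history = ", ".join(clicks[:cut])
        examples.append({
            "instruction": (
                f"The user is {profile}. Clicked items: {history}. "
                "What is the next likely click?"
            ),
            "response": clicks[cut],
        })
    return examples


def alignment_examples(sid: str, title: str) -> List[Dict[str, str]]:
    """Bidirectional SID <-> title records (prompt wording is hypothetical)."""
    return [
        {"instruction": f"What is the title of item {sid}?", "response": title},
        {"instruction": f'Which item has the title "{title}"?', "response": sid},
    ]


records = next_item_examples(
    "female, age 46-55, married, unknown children status",
    ["<a_112><b_238><c_33><d_113>", "<a_73><b_50><c_228><d_128>", "<a_96><b_113><c_49><d_174>"],
)
records += alignment_examples("<a_96><b_113><c_49><d_174>", "Example product title")

# One JSON object per line is a common format for fine-tuning data.
jsonl = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```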

Base Models and Training

Base models: Qwen1.5‑0.5B/1.8B/4B and Yi‑6B. New tokens for SIDs are added to the vocabulary, and the models are fine‑tuned on interaction data. Constrained decoding uses beam search with a beam size of 20.
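
The SID tokens to be added can be enumerated mechanically. The sketch below assumes four levels prefixed a to d, as in the training example, and a codebook size of 256 per level; the codebook size is a guess (the example indices all stay below 256) and is not stated in the article.

```python
import re
from typing import List, Tuple

LEVEL_PREFIXES = "abcd"   # four levels, matching the <a_..><b_..><c_..><d_..> example
CODEBOOK_SIZE = 256       # assumption: not given in the article


def sid_vocabulary(prefixes: str = LEVEL_PREFIXES, codebook_size: int = CODEBOOK_SIZE) -> List[str]:
    """All SID tokens, e.g. <a_0> ... <d_255>, to register as new vocabulary entries."""
    return [f"<{p}_{i}>" for p in prefixes for i in range(codebook_size)]


def parse_sid(text: str, prefixes: str = LEVEL_PREFIXES, codebook_size: int = CODEBOOK_SIZE) -> Tuple[int, ...]:
    """Parse a generated string into per-level indices, rejecting malformed output."""
    pattern = re.compile("".join(rf"<{re.escape(p)}_(\d+)>" for p in prefixes))
    match = pattern.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a well-formed SID: {text!r}")
    codes = tuple(int(g) for g in match.groups())
    if any(c >= codebook_size for c in codes):
        raise ValueError(f"index out of range in SID: {text!r}")
    return codes


vocab = sid_vocabulary()
```

In a Hugging Face transformers setup, a list like `vocab` would typically be registered with `tokenizer.add_tokens(vocab)` followed by `model.resize_token_embeddings(len(tokenizer))`; whether the authors used that toolkit is not stated.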

Evaluation

Offline metrics: HR@K and NDCG@K for K = 1, 5, 10. Online metric: UCTR.
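
For reference, the two offline metrics can be computed as in the sketch below. It assumes one ground‑truth next item per test case, in which case the ideal DCG is 1 and NDCG@K reduces to 1 / log2(rank + 1); the article does not spell out its exact evaluation protocol.

```python
import math
from typing import Dict, Sequence


def hit_rate_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """1.0 if the target appears among the top-k predictions, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def ndcg_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """NDCG with a single relevant item: 1 / log2(rank + 1) if found in the top k."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item == target:
            return 1.0 / math.log2(rank + 1)
    return 0.0


def evaluate(
    predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    ks: Sequence[int] = (1, 5, 10),
) -> Dict[str, float]:
    """Average HR@K and NDCG@K over all test cases."""
    if not targets or len(predictions) != len(targets):
        raise ValueError("predictions and targets must be non-empty and aligned")
    n = len(targets)
    report: Dict[str, float] = {}
    for k in ks:
        report[f"HR@{k}"] = sum(hit_rate_at_k(p, t, k) for p, t in zip(predictions, targets)) / n
        report[f"NDCG@{k}"] = sum(ndcg_at_k(p, t, k) for p, t in zip(predictions, targets)) / n
    return report


# Two toy test cases: the target is ranked 2nd in the first and 1st in the second.
report = evaluate([["x", "y", "z"], ["q", "r"]], ["y", "q"])
```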

Results

• Larger models achieve higher scores; 0.5B struggles with multi‑task data.

• Yi‑6B outperforms others, especially with instruction‑following fine‑tuning.

• Compared with collaborative baselines, Yi‑6B shows superior performance on sparse data.

• Online small‑traffic tests reveal generative models match or exceed traditional pipelines, improving UCTR by over 5% on certain pages, particularly in sparse‑data scenarios.

Future Optimization Directions

Future work includes improving data quality; fusing multimodal and collaborative signals; developing scalable SID training and inference frameworks; applying LoRA, mixed‑data strategies, model distillation, pruning, and quantization; and exploring query‑driven search‑recommendation integration and the generation of recommendation reasons.

We invite partners interested in advancing generative recommender technology to collaborate with us.
