How LLMs Are Redefining Recommender Systems for JD Union Ads
This article surveys the impact of large language models on recommendation systems, outlines generative recommender architectures, discusses challenges of JD Union advertising, presents a semantic‑ID based solution with training and inference details, and reports offline and online experimental results.
Background
Large language models (LLMs) are profoundly reshaping natural language processing and offer new possibilities for other fields. Recommendation systems (RS) mitigate information overload, and integrating LLMs into RS is a promising research direction.
Generative Recommender System
A generative recommender system directly generates recommendations or recommendation‑related content without calculating a ranking score for each candidate item.
Traditional RS handle billions of items through multi-stage pipelines (recall, coarse ranking, fine ranking, re-ranking). Because of latency constraints, complex algorithms cannot be applied to every item, so each stage narrows the candidate set before a heavier model runs. LLMs can simplify this pipeline by generating items directly, moving from a discriminative multi-stage approach to a single-stage generative one. They are also expected to generalize better and behave more stably, especially for cold-start users and new domains.
JD Union Advertising
JD Union is an affiliate marketing platform that drives traffic via CPS (cost-per-sale) ads. Recommending to low-activity users is hard for four reasons: data sparsity, cold start, scenario understanding, and the need to maintain diversity and novelty.
LLM‑Enhanced RS for JD Union
LLMs extract high‑quality textual representations and encode world knowledge, enabling better understanding of users, items, and context. They can alleviate data sparsity and cold‑start issues through zero‑ or few‑shot capabilities and transfer knowledge to unseen items and scenarios.
Four Core Stages of Generative RS
1. Item Representation – Items are represented by short token sequences (identifiers). Methods include:
Numeric ID: tokenized numeric identifiers, memory‑intensive and lacking semantics.
Textual Metadata: item titles or descriptions, rich in semantics but costly for long texts.
Semantic ID (SID): discretized vectors from models such as RQ‑VAE, combining semantic meaning with compact token sequences.
2. Model Input Representation – Consists of task description, user information (history and profile), and contextual/external information.
3. Model Training – Involves constructing text‑to‑text training pairs and optimizing conditional likelihood. Primary task: user → item identifier. Additional alignment tasks bridge SID and textual descriptions (SID↔text).
4. Model Inference – Generates item identifiers via autoregressive decoding. Two strategies:
Free generation: selects top‑K tokens from the full vocabulary, risking invalid identifiers.
Constrained generation: uses trie or FM‑index to restrict outputs to valid identifiers.
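The trie variant of constrained generation can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the `END` marker, the toy identifiers, and the `toy_scores` table that stands in for model logits are all assumptions. Greedy decoding is used for brevity; a beam search would apply the same mask at each step.

```python
from typing import Callable, List, Sequence

END = "<eos>"  # hypothetical end-of-identifier marker


def build_trie(identifiers: Sequence[Sequence[str]]) -> dict:
    """Build a nested-dict prefix tree over valid identifier token sequences."""
    root: dict = {}
    for tokens in identifiers:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: dict, prefix: Sequence[str]) -> List[str]:
    """Return the tokens that keep `prefix` on a path to a valid identifier."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []  # prefix is already invalid
        node = node[tok]
    return sorted(node.keys())


def constrained_greedy(trie: dict, score: Callable[[List[str], str], float]) -> List[str]:
    """Pick the best-scoring token at each step, restricted to the trie."""
    prefix: List[str] = []
    while True:
        candidates = allowed_next(trie, prefix)
        if not candidates:
            return prefix
        best = max(candidates, key=lambda tok: score(prefix, tok))
        if best == END:
            return prefix
        prefix.append(best)


# Toy catalogue of valid identifiers (assumed for illustration).
valid_ids = [("<a_1>", "<b_2>"), ("<a_1>", "<b_3>"), ("<a_4>", "<b_2>")]
trie = build_trie(valid_ids)

# Stand-in for model scores: free generation would emit <a_4><b_3>,
# which is not a valid identifier.
toy_scores = {"<a_4>": 0.9, "<a_1>": 0.5, "<b_3>": 0.8, "<b_2>": 0.3, END: 0.1}
result = constrained_greedy(trie, lambda prefix, tok: toy_scores[tok])
print(result)
```

In this toy setup the unconstrained top choice at the second step is not reachable from `<a_4>`, so the mask forces the decoder onto the only valid continuation.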
Practical Solution
Overall Design
The proposed framework adopts semantic‑ID item representation and aligns collaborative and textual signals through dual training tasks.
Key Modules
(1) Semantic ID Representation – Item titles are encoded with BERT-base-chinese (768-dim) and Yi-6B (4096-dim). RQ-VAE quantizes these vectors into token sequences. When several items collide on the same sequence, the conflict is handled either by accepting the many-to-one mapping or by appending a random token to make each identifier unique.
(2) Alignment Tasks – “Next Item Prediction” (user description → next item) and bidirectional SID↔title alignment to bridge the gap between discrete IDs and natural language.
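The quantization step behind the semantic IDs can be illustrated with a small residual-quantization sketch. It covers only the lookup side: a real RQ-VAE also trains an encoder, a decoder, and the codebooks, none of which appear here. The two-level codebooks, the 2-dimensional vectors, and the helper names are hypothetical. To keep the output reproducible, the sketch resolves collisions with a running counter rather than the random token the article mentions.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
LEVELS = "abcd"  # level prefixes, following the <a_..><b_..> token style


def nearest(codebook: Sequence[Vector], v: Vector) -> int:
    """Index of the codeword closest to v in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)),
    )


def residual_quantize(v: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize v level by level; each level encodes the previous residual."""
    residual = list(v)
    codes: List[int] = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def to_sid(codes: Sequence[int]) -> str:
    """Render code indices as a token string such as <a_1><b_1><c_0>."""
    return "".join(f"<{LEVELS[level]}_{c}>" for level, c in enumerate(codes))


def assign_sids(embeddings: Dict[str, Vector],
                codebooks: Sequence[Sequence[Vector]]) -> Dict[str, str]:
    """Assign a unique SID per item by appending a collision counter."""
    seen: Dict[Tuple[int, ...], int] = {}
    sids: Dict[str, str] = {}
    for item_id, vec in embeddings.items():
        codes = tuple(residual_quantize(vec, codebooks))
        suffix = seen.get(codes, 0)  # 0 for the first item with these codes
        seen[codes] = suffix + 1
        sids[item_id] = to_sid(list(codes) + [suffix])
    return sids


# Toy fixed codebooks and embeddings (assumed; real ones are learned).
codebooks = [
    [(0.0, 0.0), (10.0, 10.0)],          # coarse level
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],  # residual level
]
embeddings = {"item_A": (10.9, 10.1), "item_B": (11.0, 9.8), "item_C": (0.1, 0.9)}
sids = assign_sids(embeddings, codebooks)
print(sids)
```

Here `item_A` and `item_B` quantize to the same two codes, so only the final disambiguation token tells them apart, while their shared prefix still reflects that their embeddings are close.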
Offline and Online Experiments
Training Data
{
"instruction": "The user is female, age 46‑55, married, unknown children status. Clicked items: <a_112><b_238><c_33><d_113>, <a_73><b_50><c_228><d_128>, ... What is the next likely click?",
"response": "<a_96><b_113><c_49><d_174>"
}
Additional alignment examples map SID to titles and vice versa.
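Records in this shape can be assembled with a couple of small helpers. The sketch below is an assumption about how such data might be built: the function names and the wording of the alignment prompts are hypothetical, and only the `instruction`/`response` keys and the next-click phrasing come from the sample above.

```python
import json
from typing import Dict, List


def next_item_example(profile: str, history: List[str], target: str) -> Dict[str, str]:
    """Build one user -> next-item record from a profile text and clicked SIDs."""
    instruction = (
        f"{profile} Clicked items: {', '.join(history)}. "
        "What is the next likely click?"
    )
    return {"instruction": instruction, "response": target}


def alignment_examples(sid: str, title: str) -> List[Dict[str, str]]:
    """Build the two SID<->title alignment records (prompt wording is assumed)."""
    return [
        {"instruction": f"What is the title of item {sid}?", "response": title},
        {"instruction": f'Which item has the title "{title}"?', "response": sid},
    ]


# Toy inputs for illustration only.
record = next_item_example(
    "The user is female, age 46-55, married, unknown children status.",
    ["<a_112><b_238><c_33><d_113>", "<a_73><b_50><c_228><d_128>"],
    "<a_96><b_113><c_49><d_174>",
)
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Mixing both record types in one fine-tuning set is what lets the model tie the new SID tokens to the language it already knows.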
Base Models and Training
Base models: Qwen1.5‑0.5B/1.8B/4B and Yi‑6B. New tokens for SIDs are added, and interaction data are used for fine‑tuning. Beam search with size 20 is employed for constrained decoding.
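Adding SID tokens starts with enumerating them. The sketch below only builds the token list; the four levels match the sample identifiers above, but the codebook size of 256 is an assumption, since the article does not state it.

```python
from typing import List

LEVEL_NAMES = "abcd"  # level prefixes as they appear in the sample identifiers


def sid_vocabulary(levels: int, codebook_size: int) -> List[str]:
    """List every SID token: one token per (level, code index) pair."""
    return [
        f"<{LEVEL_NAMES[level]}_{index}>"
        for level in range(levels)
        for index in range(codebook_size)
    ]


# Assumed sizes: 4 levels with 256 codes each (not stated in the article).
new_tokens = sid_vocabulary(levels=4, codebook_size=256)
print(len(new_tokens), new_tokens[0], new_tokens[-1])

# With Hugging Face transformers (assumed tooling, not confirmed by the
# article), such a list is typically registered with
# tokenizer.add_tokens(new_tokens), followed by
# model.resize_token_embeddings(len(tokenizer)).
```

Registering each SID token as a single vocabulary entry keeps an identifier at four decoding steps instead of letting the tokenizer split it into many sub-word pieces.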
Evaluation
Offline metrics: hit rate (HR@K) and normalized discounted cumulative gain (NDCG@K) at K = 1, 5, 10. Online metric: UCTR.
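For next-item prediction with a single ground-truth item per user, the two offline metrics reduce to simple formulas. The sketch below uses the standard definitions and assumes that setting; the article does not spell out its exact evaluation protocol, and the function names are illustrative.

```python
import math
from typing import Dict, Sequence


def hit_rate_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """1.0 if the target appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def ndcg_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """With one relevant item, NDCG is 1 / log2(rank + 1) for a 1-based rank."""
    for pos, item in enumerate(ranked[:k]):
        if item == target:
            return 1.0 / math.log2(pos + 2)  # pos is 0-based
    return 0.0


def evaluate(predictions: Sequence[Sequence[str]], targets: Sequence[str],
             ks: Sequence[int] = (1, 5, 10)) -> Dict[str, float]:
    """Average HR@K and NDCG@K over all users."""
    n = len(targets)
    report: Dict[str, float] = {}
    for k in ks:
        pairs = list(zip(predictions, targets))
        report[f"HR@{k}"] = sum(hit_rate_at_k(p, t, k) for p, t in pairs) / n
        report[f"NDCG@{k}"] = sum(ndcg_at_k(p, t, k) for p, t in pairs) / n
    return report


# Two toy users: the first target is ranked 1st, the second is ranked 3rd.
predictions = [["x", "y", "z"], ["y", "x", "z"]]
targets = ["x", "z"]
report = evaluate(predictions, targets)
print(report)
```

HR@K only checks whether the clicked item made the list, while NDCG@K also rewards placing it near the top, which is why the two can diverge for the same model.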
Results
• Larger models achieve higher scores; 0.5B struggles with multi-task data.
• Yi-6B outperforms others, especially with instruction-following fine-tuning.
• Compared with collaborative baselines, Yi-6B shows superior performance on sparse data.
• Online small-traffic tests reveal generative models match or exceed traditional pipelines, improving UCTR by over 5% on certain pages, particularly in sparse-data scenarios.
Future Optimization Directions
Improve data quality, fuse multimodal and collaborative signals, develop scalable SID training/inference frameworks, apply LoRA, mixed‑data strategies, model distillation, pruning, quantization, and explore query‑driven search‑recommend integration and generation of recommendation reasons.
We invite partners interested in advancing generative recommender technology to collaborate with us.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
