One4All Generative Recommendation Framework for CPS Advertising
This article reviews recent advances in applying large language models to CPS (cost-per-sale) advertising recommendation, outlines business requirements and core technical challenges, proposes an extensible multi‑task generative framework with explicit intent perception and multi‑objective optimization, and presents offline and online performance gains along with future research directions.
After the remarkable achievements of large language models (LLMs) in natural language processing, the research community has been actively exploring how generative models can enhance search and recommendation systems. Existing work can be divided into two categories: (1) using LLMs for data and knowledge augmentation without modifying the model, and (2) directly adapting LLMs to model massive collaborative signals through pre‑training or fine‑tuning. The second category is a frontier direction for search‑recommendation scaling.
The JD Retail CPS algorithm team has conducted a series of works on generative recommendation, summarizing business needs and extracting core technical points. The main requirements include precise user intent perception, multi‑objective optimization to balance revenue and user activity, and compatibility with diverse scenarios and tasks.
Explicit Intent Perception for Controllable Product Recommendation – Traditional solutions either model user intent implicitly from behavior sequences, use trigger items for recall, or combine multiple networks (e.g., DIHN, DIAN, DEI2N, DUIN). These approaches have limitations in controllability and scalability. The proposed solution generates rich intent descriptions automatically, combines intent text with historical product semantic IDs as input, and predicts the target product ID, leveraging few‑shot prompting and chain‑of‑thought strategies with the Yanxi‑81B model, followed by self‑verification.
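The input construction described above can be sketched as follows. This is a hypothetical illustration, not JD's actual format: the function name `build_intent_input`, the special tokens, and the three-level semantic-ID encoding are all assumptions made for clarity.

```python
# Hypothetical sketch: a generated intent description is concatenated with the
# semantic IDs of the user's historical products, and the generative model is
# asked to predict the target product's semantic ID. Token formats are
# illustrative only.

def build_intent_input(intent_text: str, history_sids: list[list[int]]) -> str:
    """Serialize intent text plus historical semantic IDs into one prompt."""
    # Each product is a tuple of codebook indices, e.g. [12, 405, 7],
    # rendered as level-tagged tokens "<a_12><b_405><c_7>".
    levels = "abc"
    hist = "".join(
        "".join(f"<{levels[i]}_{code}>" for i, code in enumerate(sid))
        for sid in history_sids
    )
    return f"[INTENT] {intent_text} [HISTORY] {hist} [PREDICT]"

prompt = build_intent_input(
    "Looking for a lightweight trail-running shoe under 500 RMB",
    [[12, 405, 7], [3, 88, 291]],
)
```

Because the intent is carried as explicit text rather than an implicit embedding, it can be inspected, edited, or swapped at serving time, which is where the controllability claimed above comes from.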
Multi‑Objective Optimization of Recommendation Effect – Non‑LLM methods such as Shared Bottom, MMOE, PLE, and ESMM address multi‑task balancing, while LLM‑based methods like MORLHF, MODPO, and Reward Soups align multiple reward functions. The article introduces a Rewards‑in‑Context (RiC) framework that incorporates various rewards (click, purchase, price, commission) into supervised fine‑tuning, enabling offline training with multi‑reward data and online training with Pareto‑frontier augmentation.
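The core mechanic of Rewards‑in‑Context can be sketched in a few lines: reward values are serialized into the prompt during supervised fine‑tuning, so that at inference time the model can be steered by conditioning on desired reward levels near the Pareto frontier. The token format and reward names below are assumptions for illustration, matching the four rewards mentioned in the article.

```python
# Minimal sketch of Rewards-in-Context (RiC) conditioning. During SFT, each
# sample is prefixed with its observed rewards as text; at inference, the
# prefix instead encodes the *target* reward levels we want the model to hit.

def ric_prompt(user_context: str, rewards: dict[str, float]) -> str:
    """Prefix a context string with serialized reward tokens."""
    reward_str = " ".join(f"<{k}:{v:.2f}>" for k, v in sorted(rewards.items()))
    return f"{reward_str} {user_context}"

# Offline SFT: label the sample with the rewards actually observed in logs.
train_prompt = ric_prompt(
    "[HISTORY] <item_1><item_2> [PREDICT]",
    {"click": 1.0, "purchase": 0.0, "price": 0.35, "commission": 0.12},
)

# Online inference: condition on near-Pareto-optimal target rewards to trade
# off revenue (price, commission) against engagement (click, purchase).
infer_prompt = ric_prompt(
    "[HISTORY] <item_1><item_2> [PREDICT]",
    {"click": 1.0, "purchase": 1.0, "price": 0.90, "commission": 0.90},
)
```

The appeal of this design is that multi‑objective control requires no extra reward models at serving time; the trade‑off is expressed entirely through the conditioning prefix.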
One4All Generative Recommendation Framework – To meet the diverse CPS advertising scenarios, the framework integrates behavior and semantic understanding, supports a wide range of tasks, and improves system generalization. It also defines an online model update strategy that selects the best model based on CVR and CTR changes, as shown in the code snippet below.
Compare Model_DPO_T with Model_SFT_T and select the winner as Model_A:
if cvr improves and ctr drops by less than 10%:
    Model_A = Model_DPO_T
else:
    Model_A = Model_SFT_T

Compare Model_BEST_(T-1) with Model_A and deploy the winner online as Model_BEST_T:
if cvr improves and ctr drops by less than 10%:
    Model_BEST_T = Model_A
else:
    Model_BEST_T = Model_BEST_(T-1)

Results – Offline experiments show a 2–3× improvement in HitRate and NDCG for intent‑aware models. Online metrics such as SKUCTR, SKUCVR, same‑store orders, and commissions all see significant lifts (e.g., SKUCTR up more than 3%, SKUCVR up more than 7%). The framework now serves real‑time inference for more than 10 million daily UVs.
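For reference, the two offline metrics cited above are standard and easy to state precisely. A minimal implementation for the single‑target‑item case (function names are our own):

```python
import math

def hit_rate_at_k(target: str, ranked: list[str], k: int) -> float:
    """1.0 if the target item appears in the top-k ranked list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(target: str, ranked: list[str], k: int) -> float:
    """NDCG@k with one relevant item: IDCG = 1, so NDCG = 1 / log2(rank + 1)."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item == target:
            return 1.0 / math.log2(rank + 1)
    return 0.0

# Example: target ranked second out of three candidates.
hr = hit_rate_at_k("b", ["a", "b", "c"], 2)    # 1.0
ndcg = ndcg_at_k("b", ["a", "b", "c"], 3)      # 1 / log2(3) ≈ 0.631
```

A 2–3× lift in these metrics means the intent‑aware model places the ground‑truth item in the candidate list far more often, and at higher ranks, than the baseline.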
Future Outlook – The article highlights the need for more interactive recommendation systems that jointly model search and recommendation, and for multimodal understanding of rich image and video signals in the front‑end pipeline to further boost performance.
References to recent papers and industry reports are provided to contextualize the work.
JD Tech
Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.