
Reliable Advertising Creative Generation and Personalized Recommendation via Multimodal Feedback and Offline Representation

This article summarizes work by JD's advertising team on AI-generated ad creatives. A reliable multimodal feedback network raises the quality and coverage of generated ad images, a dataset of over one million human-annotated images supports training and evaluation, and creative ranking is improved with offline multimodal representations and a redesigned online architecture. Together these changes make ad personalization more precise and more scalable.

JD Tech

High-quality advertising creatives are crucial for e-commerce, yet manual design is costly and AI-generated images are often too poor to use without extensive human review. JD's advertising team addressed this with a reliable multimodal feedback network (RFNet) that simulates human inspection of generated images, substantially raising the share of usable images while preserving visual appeal.

The team also released RF1M, which it describes as the first dataset of its kind in the industry. It contains over one million human-annotated AI-generated ad images and is intended to support more realistic model training and evaluation.

For creative selection, a multimodal large language model extracts rich representations from both images and copy, improving the discriminative power and cold‑start performance of the ranking model. The selection task is split into element selection and combination selection stages, enabling the model to handle a vast variety of creative assets.
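The two-stage split described above can be illustrated with a minimal sketch. Everything here is hypothetical: the function names, the toy scores, and the pairing bonus are assumptions for illustration only, and in the real system the scores would come from the ranking model rather than fixed numbers. The point is only that shortlisting each element type first keeps the number of combinations to score small.

```python
from itertools import product


def select_elements(candidates, element_score, k):
    """Stage 1: keep the top-k candidates of each element type."""
    return {
        kind: sorted(items, key=element_score, reverse=True)[:k]
        for kind, items in candidates.items()
    }


def select_combination(shortlist, combo_score):
    """Stage 2: score every combination of the shortlisted elements."""
    kinds = sorted(shortlist)
    combos = (
        dict(zip(kinds, choice))
        for choice in product(*(shortlist[kind] for kind in kinds))
    )
    return max(combos, key=combo_score)


# Toy candidates: (identifier, hypothetical per-element score).
candidates = {
    "image": [("img_a", 0.9), ("img_b", 0.4), ("img_c", 0.7)],
    "copy": [("copy_x", 0.2), ("copy_y", 0.8), ("copy_z", 0.5)],
}


def element_score(item):
    return item[1]


def combo_score(combo):
    # Sum of element scores plus an assumed bonus for one good pairing.
    score = sum(element_score(item) for item in combo.values())
    if combo["image"][0] == "img_c" and combo["copy"][0] == "copy_y":
        score += 0.5
    return score


shortlist = select_elements(candidates, element_score, k=2)
best = select_combination(shortlist, combo_score)
print(best)
```

With three images and three copies there are nine possible pairs, but after stage 1 keeps two of each, stage 2 only scores four. The pairing bonus shows why a separate combination stage is still useful: the best pair need not consist of the individually best elements.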

To further enhance representation, the team built offline multimodal embeddings using MLLM techniques, incorporating explicit features (e.g., NER, background color, logos) and implicit cues (e.g., promotional status, target audience). Contrastive learning with MoCo v3 and a retrieval-based quality evaluation pipeline built on Faiss were used to refine these embeddings.
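As a rough illustration of the two ideas in this paragraph, the sketch below computes an InfoNCE-style contrastive loss and a simple retrieval check (recall@1) over toy embeddings. It is not the team's pipeline: the momentum encoder and loss symmetrization used by MoCo-style training are omitted, a brute-force nearest-neighbor search stands in for a vector index, and all names, vectors, and the temperature value are assumptions.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(queries, keys, temperature=0.2):
    """Mean InfoNCE loss: keys[i] is the positive for queries[i],
    every other key in the batch acts as a negative."""
    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [dot(qi, kj) / temperature for kj in k]
        m = max(logits)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[i]
    return total / len(q)


def recall_at_1(queries, keys):
    """Fraction of queries whose nearest key (by cosine) is their own pair."""
    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]
    hits = 0
    for i, qi in enumerate(q):
        nearest = max(range(len(k)), key=lambda j: dot(qi, k[j]))
        hits += nearest == i
    return hits / len(q)


# Toy embeddings: two views of two items (hypothetical values).
views_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]     # matching views are close
misaligned_b = [[0.1, 0.9], [0.9, 0.1]]  # matching views are far apart

print(info_nce(views_a, aligned_b), recall_at_1(views_a, aligned_b))
print(info_nce(views_a, misaligned_b), recall_at_1(views_a, misaligned_b))
```

The aligned embeddings give a lower loss and a higher recall than the misaligned ones, which is the sense in which a retrieval metric can serve as an offline quality check for learned representations.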

The online ranking architecture was redesigned to alleviate combinatorial explosion and data sparsity. Candidate creatives are now fed into the ranking model itself, and the objective was upgraded from pure CTR estimation to a list-wise formulation that jointly optimizes exposure and click prediction across all candidates.
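One possible reading of a list-wise objective that combines exposure and click prediction is sketched below. The article does not give the exact loss, so the form here is an assumption: a softmax cross-entropy over all candidate creatives for the one that was exposed, plus a binary cross-entropy on the click outcome of that creative. The function names and the two-head layout (`exposure_logits`, `click_logits`) are hypothetical.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def binary_cross_entropy(logit, label):
    """Numerically stable log loss for a single logit and a 0/1 label."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def listwise_loss(exposure_logits, click_logits, shown_index, clicked):
    """Assumed joint loss for one request.

    exposure term: cross-entropy of the softmax over all candidates
    click term:    log loss on the click outcome of the exposed creative
    """
    probs = softmax(exposure_logits)
    exposure_term = -math.log(probs[shown_index])
    click_term = binary_cross_entropy(click_logits[shown_index], clicked)
    return exposure_term + click_term


# Toy request with three candidate creatives (hypothetical logits).
exposure_logits = [2.0, 0.0, 0.0]
click_logits = [1.0, -1.0, 0.0]

loss_consistent = listwise_loss(exposure_logits, click_logits, shown_index=0, clicked=1)
loss_inconsistent = listwise_loss(exposure_logits, click_logits, shown_index=1, clicked=1)
print(loss_consistent, loss_inconsistent)
```

Because the exposure term normalizes over the whole candidate list, raising one creative's score necessarily lowers the others', which a per-item CTR loss does not enforce.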

Finally, a joint training paradigm separates the <user, sku> prediction from the creative ordering, allowing a lightweight creative model to be deployed online, reducing serving pressure while maintaining high relevance.
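To see why separating the two parts can lighten online serving, consider a minimal sketch under one assumed decomposition: the predicted logit is a `<user, sku>` term plus a creative-specific term. The article does not state that the model is additive, so this structure, along with the names and numbers below, is purely illustrative. Under that assumption the `<user, sku>` term is the same for every creative of a given ad, so the creative ordering can be computed from the creative term alone.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rank_creatives(creative_logits):
    """Lightweight online step: order creatives by their own logit only."""
    return sorted(creative_logits, key=creative_logits.get, reverse=True)


def full_ctr(base_logit, creative_logits):
    """Full prediction: <user, sku> logit plus the creative-specific logit."""
    return {c: sigmoid(base_logit + l) for c, l in creative_logits.items()}


# Hypothetical creative-specific logits for one ad.
creative_logits = {"c1": 0.3, "c2": -0.1, "c3": 0.8}
light_order = rank_creatives(creative_logits)

# Whatever the <user, sku> logit is, the ordering of creatives is unchanged,
# because the sigmoid is monotonic and the base term is shared.
for base_logit in (-3.0, 0.0, 2.0):
    ctr = full_ctr(base_logit, creative_logits)
    full_order = sorted(ctr, key=ctr.get, reverse=True)
    assert full_order == light_order

print(light_order)
```

In this reading, the heavier `<user, sku>` model is needed to estimate the absolute click probability, while only the small creative model has to run per candidate creative at serving time.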

These innovations collectively improve the coverage and quality of AI‑generated advertising materials, resolve data sparsity and combinatorial challenges, and boost online creative performance.

References:
[1] Parallel Ranking of Ads and Creatives in Real-Time Advertising Systems, AAAI 2024.
[2] Towards Reliable Advertising Image Generation Using Human Feedback, ECCV 2024.
[3] CBNet: A Plug-and-Play Network for Segmentation-Based Scene Text Detection, IJCV 2024.
[4] Generate E-commerce Product Background by Integrating Category Commonality and Personalized Style, ICASSP 2025.
