
How JD’s Advertising Lab Leverages Large‑Scale AI to Transform E‑Commerce Ads

JD's advertising research team combines deep learning, multimodal modeling, reinforcement‑learning auctions, and generative recommendation to boost ad relevance, improve long‑tail product exposure, and overcome large‑model inference challenges in a high‑traffic e‑commerce environment.

JD Cloud Developers

JD's Retail Advertising Department drives site-wide traffic monetization and marketing effectiveness; its R&D team applies cutting-edge AI algorithms to massive user and merchant data, empowering millions of merchants and billions of consumers.

Flow Value Estimation – Better Understanding of User‑Item‑Context

Query Intent Understanding

Query intent recognition parses user search queries into categories, correcting errors, extracting entities, and rewriting queries to provide accurate signals for downstream recall, relevance, and ranking. Key challenges include:

Generic terms spanning many categories (e.g., "fruit", "birthday gift")

Ambiguous terms with multiple intents (e.g., "Xiaomi" could refer to the grain millet or the phone brand)

Cold‑start long‑tail categories lacking exposure

Long‑tail queries with diverse expressions
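The ambiguity problem above can be made concrete with a toy sketch (the keyword lexicon here is hypothetical, not JD's actual system): a single query can legitimately map to several categories, so intent recognition must emit a set or distribution of candidate categories rather than one hard label.

```python
# Hypothetical keyword -> candidate-category lexicon, for illustration only.
LEXICON = {
    "xiaomi": ["grains", "mobile phones"],
    "fruit": ["fresh fruit", "fruit gift boxes", "fruit snacks"],
    "strawberry": ["fresh fruit"],
}

def candidate_categories(query: str):
    """Union of candidate categories for each recognized query token,
    preserving first-seen order."""
    cats = []
    for token in query.lower().split():
        for c in LEXICON.get(token, []):
            if c not in cats:
                cats.append(c)
    return cats

print(candidate_categories("xiaomi"))  # -> ['grains', 'mobile phones']
```

Downstream recall and ranking then disambiguate among these candidates using user and context signals.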

Generation‑Matching Model for Long‑Tail Training Data

A generative‑matching pipeline pre‑trains a query generator from SKU titles/attributes and a matching model that scores generated queries against original titles, filtering low‑quality queries. The generated queries, labeled with the SKU’s category, augment training data to balance long‑tail categories.

These synthetic queries also support query suggestion and rewriting tasks.
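The filtering step of the pipeline can be sketched as follows; the Jaccard token-overlap scorer below is a simple stand-in for the learned matching model, and the names and threshold are illustrative assumptions.

```python
# Sketch of the generation-matching filter: candidate queries produced by the
# generator are scored against the SKU title, and low-scoring queries are
# dropped before being used as long-tail training data.

def match_score(query: str, title: str) -> float:
    """Token-overlap (Jaccard) proxy for the learned query-title matcher."""
    q, t = set(query.lower().split()), set(title.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0

def filter_generated_queries(candidates, title, threshold=0.2):
    """Keep generated queries whose match score clears the threshold; kept
    queries would then be labeled with the SKU's category for augmentation."""
    return [q for q in candidates if match_score(q, title) >= threshold]

title = "organic fresh strawberries 500g box"
candidates = ["fresh strawberries", "organic fruit box", "phone charger"]
print(filter_generated_queries(candidates, title))
# -> ['fresh strawberries', 'organic fruit box']  (off-topic query filtered)
```

In the real pipeline, both the generator and the matcher are pre-trained models; the key idea is that the matcher acts as a quality gate on synthetic data.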

Prior Knowledge Injection for Medium‑Tail Recall

To break the feedback‑loop bias toward high‑click categories, the system injects prior knowledge such as category semantics, co‑occurrence graphs, and GCN‑based encoders, training a BERT‑GCN hybrid with a semi‑supervised loss that matches query‑category semantics.
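One GCN-style propagation step over the category co-occurrence graph can be sketched in a few lines (toy dimensions and values, assumed for illustration): each category's embedding is smoothed with its neighbors', so medium-tail categories inherit signal from related high-traffic ones.

```python
# One mean-aggregation propagation step with self-loops:
# h_i' = mean over {i} ∪ neighbors(i) of h_j.

def gcn_propagate(embeddings, adjacency):
    """Average each node's embedding with its graph neighbors' embeddings,
    including a self-loop so the node keeps part of its own representation."""
    n = len(embeddings)
    dim = len(embeddings[0])
    out = []
    for i in range(n):
        neighbors = [j for j in range(n) if adjacency[i][j] or i == j]
        h = [0.0] * dim
        for j in neighbors:
            for d in range(dim):
                h[d] += embeddings[j][d] / len(neighbors)
        out.append(h)
    return out

# Toy graph: categories 0 and 1 co-occur; category 2 is isolated.
emb = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(gcn_propagate(emb, adj))  # -> [[0.5, 0.5], [0.5, 0.5], [0.0, 0.0]]
```

The BERT side contributes query semantics while this graph side contributes the category prior; the semi-supervised loss ties the two representations together.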

Multimodal Content Understanding

Multimodal Representation in Recall

A dual-stream pipeline extracts text embeddings (BGE-large-zh-v1.5) from titles, brands, and categories, and visual embeddings (CLIP ViT-base) from product images. Contrastive learning aligns the modalities, and a Gate-GNN aggregates item-item graphs to produce a unified multimodal product vector for recall.
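The contrastive alignment step can be sketched with a CLIP-style InfoNCE loss (toy 2-D embeddings, assumed values; only the text-to-image direction is shown for brevity): matched text/image pairs are pulled together, mismatched pairs pushed apart.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(text_emb, image_emb, temperature=0.1):
    """Average cross-entropy of matching each text to its own image among all
    images in the batch; lower means better cross-modal alignment."""
    n = len(text_emb)
    loss = 0.0
    for i in range(n):
        logits = [cosine(text_emb[i], image_emb[j]) / temperature for j in range(n)]
        log_z = math.log(sum(math.exp(l) for l in logits))
        loss += log_z - logits[i]   # -log softmax probability of the true pair
    return loss / n

aligned = info_nce([[1, 0], [0, 1]], [[1, 0], [0, 1]])
misaligned = info_nce([[1, 0], [0, 1]], [[0, 1], [1, 0]])
print(aligned < misaligned)  # True: correctly paired modalities give lower loss
```

In production the loss is computed over large batches of real text/image embedding pairs, and the aligned vectors then feed the Gate-GNN aggregation.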

Multimodal Representation in Creative Selection

Self‑supervised vision models (DINO) generate robust image embeddings without requiring object masks, enabling fine‑grained creative ranking that captures visual appeal and high‑order structured information.
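As a minimal sketch of using such embeddings for creative ranking (the linear scoring head and all values below are hypothetical stand-ins, not JD's actual ranker): each candidate creative's image embedding is scored and candidates are sorted best-first.

```python
# Rank candidate ad creatives by a learned linear score over their image
# embeddings (assumed: embeddings come from a self-supervised model like DINO).

def rank_creatives(embeddings, weights):
    """Score each creative embedding with a linear head; return indices
    sorted from highest to lowest score."""
    scores = [(sum(w * e for w, e in zip(weights, emb)), idx)
              for idx, emb in enumerate(embeddings)]
    return [idx for _, idx in sorted(scores, reverse=True)]

creatives = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]  # toy image embeddings
head = [1.0, -0.5]                                 # hypothetical learned weights
print(rank_creatives(creatives, head))  # -> [1, 2, 0]
```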

Flow Selling Mechanism – ListVCG Reinforcement‑Learning Auction

ListVCG reformulates the combinatorial auction of selecting and ordering 4 ads from roughly 700 candidates (a 700-choose-4 space, too large to enumerate per request) as a reinforcement-learning problem, using an Actor-Critic architecture where the Actor samples candidate permutations and the Critic evaluates them with real-world feedback, iteratively improving the policy.
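The sampling idea can be sketched as follows (toy sizes and a hand-rolled critic, all assumed for illustration): the Actor sequentially samples ordered 4-ad slates instead of enumerating every permutation, and the Critic scores each sampled slate. Real training would additionally update both networks from auction feedback.

```python
import math, random

def sample_slate(logits, k, rng):
    """Actor: sequentially sample k distinct ads, softmax over the logits of
    the ads still remaining at each slot."""
    remaining = list(range(len(logits)))
    slate = []
    for _ in range(k):
        weights = [math.exp(logits[i]) for i in remaining]
        pick = rng.choices(remaining, weights=weights)[0]
        slate.append(pick)
        remaining.remove(pick)
    return slate

def critic_value(slate, values, position_discount=0.8):
    """Critic stand-in: position-discounted sum of per-ad value estimates."""
    return sum(values[ad] * position_discount ** pos for pos, ad in enumerate(slate))

rng = random.Random(0)
values = [0.1, 0.9, 0.4, 0.8, 0.2, 0.6]   # toy per-ad value estimates
logits = values                            # actor initialized from the values
slates = [sample_slate(logits, 4, rng) for _ in range(32)]
best = max(slates, key=lambda s: critic_value(s, values))
print(best)  # a high-value ordering found by sampling, not enumeration
```

The position discount captures that earlier slots earn more attention, which is why the auction is over ordered slates rather than unordered sets.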

Multi‑Agent RL for Bidding and Mechanism

Separate bidding and mechanism agents are co‑trained; the bidding agent predicts optimal bid ratios, while the mechanism agent learns allocation and pricing policies. Offline simulation, reward shaping, and curriculum RL address sparse reward and environment mismatch.

Generative Recommendation for Advertising

The pipeline quantizes high‑click product titles into semantic IDs using RQ‑VAE, expands the token vocabulary of a large language model, and fine‑tunes it on bidirectional translation tasks between semantic IDs and product text.
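The quantization step can be sketched in the spirit of RQ-VAE's residual quantization (toy 2-D codebooks with assumed values): at each level, the nearest codeword is chosen and the remaining residual is passed to the next, finer codebook, yielding the multi-level semantic ID.

```python
def nearest(vec, codebook):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])))

def residual_quantize(vec, codebooks):
    """Return the per-level code indices <c1><c2>... forming the semantic ID."""
    residual = list(vec)
    ids = []
    for cb in codebooks:
        idx = nearest(residual, cb)
        ids.append(idx)
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return ids

codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],     # coarse level
    [[0.0, 0.0], [0.2, -0.1]],    # finer level quantizes the residual
]
print(residual_quantize([1.2, 0.9], codebooks))  # -> [1, 1]
```

In the actual pipeline there are four levels (hence the four-tuple `<a_i><b_j><c_k><d_l>` IDs below), and the codebooks are learned jointly with the VAE encoder/decoder rather than fixed.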

Prompt: Please tell me the title of the product whose four‑tuple representation is {input_tuple}?
Input: <a_1><b_2><c_3><d_4>
Output: Huawei Mate60 Pro 16G+512GB White

Subsequent asymmetric prediction tasks generate the next product a user is likely to browse, using either semantic IDs or raw text as input.

Prompt: User’s historical product tuple sequence is {input_tuple1, …, input_tupleN}, predict the next product.
Input: <a_1><b_2><c_3><d_4>, …
Output: Huawei Mate60 Pro 16G+512GB White

Advertising Creative Generation

To improve AI‑generated ad creatives, the team proposes a Multimodal Reliable Feedback Network (RFNet) that automatically evaluates generated images, feeding back into a recursive generation loop. Consistent‑Condition regularization fine‑tunes diffusion models using RFNet scores, dramatically raising usable image rates. A 1M‑image labeled dataset (RF1M) supports training; the work appears at ECCV 2024.

Large‑Model Engineering for Advertising

Challenges include sub-100 ms latency budgets for models ranging from 0.5B to 72B parameters, high inference cost, and complex business pipelines. JD Advertising has deployed 1.5B-parameter models with token-cost efficiency, optimized hardware topology, chip-specific adaptations, distributed training, caching, and load balancing to support million-QPS workloads.

Tags: e-commerce, Large Models, multimodal, reinforcement learning, graph neural network, advertising AI
Written by

JD Cloud Developers

JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
