
Optimizing JD Advertising Retrieval Platform: Balancing Compute, Data Scale, and Iterative Efficiency

The article details how JD's advertising retrieval platform tackles the core challenge of balancing limited compute resources with massive data by optimizing compute allocation, improving model scoring efficiency, and enhancing iteration speed through distributed execution graphs, adaptive algorithms, and platform‑level infrastructure improvements.

JD Tech

JD's advertising retrieval platform serves billions of users and millions of merchants, requiring efficient matching of ads to user intent while handling an enormous product pool. The core difficulty lies in balancing limited compute capacity with massive data volume.

System Overview

The platform converts advertiser demands into a language the ad delivery system understands and performs an initial user–item–scene match. From a search space of billions of items, it returns hundreds, balancing user experience, advertiser goals, relevance, and platform revenue.

Core Technical Challenges

Scoring functions evolved from simple rule-based binary scores to deep-learning-driven vector scores, raising the computational cost per candidate. As models grow more complex, compute wasted on irrelevant candidates becomes increasingly significant.
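To make the cost gap concrete, here is a minimal sketch contrasting the two scoring generations. The function names and data are illustrative assumptions, not JD's actual API: a rule-based score is a cheap set lookup, while a vector score requires an inner product over learned embeddings for every candidate.

```python
import numpy as np

def rule_score(item: dict, query_terms: set) -> int:
    """Rule-based binary score: 1 if the item matches any query term."""
    return int(bool(item["terms"] & query_terms))

def vector_score(user_vec: np.ndarray, item_vec: np.ndarray) -> float:
    """Deep-learning-driven score: inner product of learned embeddings."""
    return float(user_vec @ item_vec)

item = {"terms": {"phone", "android"}}
print(rule_score(item, {"phone"}))   # → 1 (match), a constant-time set test

u = np.array([0.2, 0.5, 0.1])        # toy user embedding
v = np.array([0.4, 0.3, 0.9])        # toy item embedding
print(round(vector_score(u, v), 2))  # → 0.32, an O(d) multiply-add per candidate
```

At billions of candidates, the per-item O(d) cost of the vector score is exactly what makes compute allocation the central problem.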

Three Main Optimization Directions

1. Compute Allocation: direct compute and latency budget toward the compute-intensive stages.

2. Compute Optimization: improve relevance-scoring accuracy to raise the business value per unit of compute.

3. Iteration Efficiency: provide a one-stop experiment platform that accelerates feature rollout.

Main Line 1 – Distributed Execution Graph

To achieve adaptive compute distribution, JD built a data‑driven distributed execution graph that models dependencies between operators (OPs). The graph classifies OPs into no‑dependency, local‑dependency, and full‑dependency, enabling parallel execution where possible and reducing wasted compute. This upgrade cut retrieval latency by over 16%.
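The scheduling idea behind such a graph can be sketched as follows. This is a simplified, assumed OP model (the `Op` class and wave-based scheduler are illustrative, not JD's implementation): operators whose dependencies are satisfied run concurrently, while dependent operators wait only for their declared inputs.

```python
from concurrent.futures import ThreadPoolExecutor

class Op:
    """A graph operator with a name, declared dependencies, and a callable body."""
    def __init__(self, name, deps=(), fn=lambda inputs: None):
        self.name, self.deps, self.fn = name, tuple(deps), fn

def run_graph(ops):
    """Run each OP as soon as its dependencies finish; independent OPs run in parallel."""
    results = {}
    with ThreadPoolExecutor() as pool:
        pending = list(ops)
        while pending:
            ready = [op for op in pending if all(d in results for d in op.deps)]
            if not ready:
                raise RuntimeError("cyclic or unsatisfiable dependencies")
            futures = {op.name: pool.submit(op.fn, {d: results[d] for d in op.deps})
                       for op in ready}
            for op in ready:
                results[op.name] = futures[op.name].result()
                pending.remove(op)
    return results

# No-dependency OPs (fetch_user, fetch_ads) run in parallel;
# the full-dependency OP (score) waits for both.
ops = [Op("fetch_user", fn=lambda _: {"id": 7}),
       Op("fetch_ads", fn=lambda _: [1, 2, 3]),
       Op("score", ("fetch_user", "fetch_ads"),
          lambda deps: len(deps["fetch_ads"]))]
print(run_graph(ops)["score"])  # → 3
```

Executing independent OPs concurrently instead of serially is the mechanism behind the latency reduction described above.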

Main Line 2 – Adaptive Retrieval Engine

The engine evolved through four stages:

Stage 1: Dual‑tower ANN with basic tree indexes.

Stage 2: Real‑time data freshness, reducing material update latency to minutes.

Stage 3: Business‑aware hierarchical indexing that partitions vectors by user intent, dramatically shrinking candidate sets.

Stage 4: Full‑library PQ indexing and deep indexing (EM‑based) that break the dual‑tower constraint, allowing richer representations and joint user‑item modeling.
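To illustrate the PQ indexing mentioned in Stage 4, here is a toy product-quantization sketch: each vector is split into sub-vectors, each sub-space is quantized against its own small codebook, and only the codebook indices are stored. The codebooks here are random for brevity; a real PQ index trains them with k-means (as in libraries such as Faiss).

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, k = 8, 2, 16                         # dim, sub-spaces, centroids per sub-space
sub = d // m
codebooks = rng.normal(size=(m, k, sub))   # one codebook per sub-space (random here)

def pq_encode(x):
    """Return the nearest-centroid index in each sub-space."""
    codes = []
    for i in range(m):
        part = x[i * sub:(i + 1) * sub]
        dists = ((codebooks[i] - part) ** 2).sum(axis=1)
        codes.append(int(dists.argmin()))
    return codes

def pq_decode(codes):
    """Reconstruct an approximate vector from its codes."""
    return np.concatenate([codebooks[i][c] for i, c in enumerate(codes)])

x = rng.normal(size=d)
codes = pq_encode(x)      # m small integers instead of d floats
approx = pq_decode(codes)
```

Storing a handful of small codes per item instead of full-precision vectors is what makes indexing the full library feasible at this scale.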

These advances enable high‑throughput, low‑latency recall while supporting arbitrary business objectives (e.g., maximizing eCPM instead of CTR) by simple vector augmentation.
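One way such vector augmentation can work, sketched with made-up numbers: if pCTR is approximated by an inner product and eCPM = bid × pCTR, then scaling each item vector by its bid lets the same maximum-inner-product lookup rank by eCPM instead of CTR, with no retraining. This is an illustrative construction, not necessarily JD's exact formulation.

```python
import numpy as np

user = np.array([0.5, 0.5])            # toy user embedding
items = np.array([[0.9, 0.1],          # item 0: higher CTR
                  [0.4, 0.4]])         # item 1: lower CTR
bids = np.array([1.0, 2.0])            # item 1 pays more per click

ctr_scores = items @ user              # rank by pCTR ≈ <u, v>
ecpm_scores = (items * bids[:, None]) @ user   # augment: v' = bid * v

print(int(ctr_scores.argmax()))   # → 0: item 0 wins on CTR
print(int(ecpm_scores.argmax()))  # → 1: item 1 wins on eCPM
```

The retrieval engine itself is unchanged; only the indexed vectors are transformed, which is what makes the objective swap "simple".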

Main Line 3 – Platform Infrastructure

JD modularized the system into atomic operators (OPs) with clear I/O, traceability, and configurability. A new three‑layer configuration model (Key, Condition, Value) decouples business logic from system code, allowing one‑click A/B experiments on any setting. Debug and Trace modes provide real‑time, end‑to‑end visibility for developers and operators.
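A minimal sketch of the Key/Condition/Value idea, with assumed key names and structure (not JD's actual schema): each key maps to a list of (condition, value) rules, so an A/B experiment is just another condition entry rather than a code change.

```python
# Each key holds (condition, value) rules, checked in order;
# the empty condition {} acts as the default.
config = {
    "recall.candidate_limit": [
        ({"exp_group": "B"}, 800),   # experimental bucket
        ({}, 500),                   # default value
    ],
}

def resolve(key, context):
    """Return the first value whose condition is satisfied by the request context."""
    for condition, value in config[key]:
        if all(context.get(k) == v for k, v in condition.items()):
            return value
    raise KeyError(key)

print(resolve("recall.candidate_limit", {"exp_group": "B"}))  # → 800
print(resolve("recall.candidate_limit", {"exp_group": "A"}))  # → 500
```

Because the experiment lives entirely in configuration, flipping it on or off is a data change, which is what enables one-click A/B testing on any setting.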

Conclusion and Outlook

By integrating adaptive compute allocation, efficient retrieval algorithms, and a robust platform foundation, JD’s advertising retrieval system maximizes business value per compute unit. Future work will continue to push the limits of compute efficiency, retrieval performance, and rapid iteration.

Tags: distributed systems, advertising, deep learning, scalable architecture, search, ANN, compute optimization
Written by

JD Tech

Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.
