Optimizing JD Advertising Retrieval Platform: Balancing Compute, Data Scale, and Iterative Efficiency
This article details how JD's advertising retrieval platform tackles its core challenge of balancing limited compute resources against massive data. Three levers do the work: smarter compute allocation, more efficient model scoring, and faster iteration, built on a distributed execution graph, adaptive retrieval algorithms, and platform-level infrastructure improvements.
JD's advertising retrieval platform serves billions of users and millions of merchants, requiring efficient matching of ads to user intent while handling an enormous product pool. The core difficulty lies in balancing limited compute capacity with massive data volume.
System Overview
The platform converts advertiser demands into a form the ad delivery system understands and performs an initial user‑item‑scene match. From a search space of billions, it returns hundreds of items, balancing user experience, advertiser goals, relevance, and platform revenue.
Core Technical Challenges
Scoring functions evolved from simple rule‑based binary scores to deep‑learning‑driven vector scores, raising the computational cost per candidate. As models grow more complex, the compute wasted on irrelevant candidates becomes significant.
Three Main Optimization Directions
1. Compute Allocation: save compute and latency for compute‑intensive stages.
2. Compute Optimization: improve relevance scoring accuracy to increase business value per unit of compute.
3. Iteration Efficiency: provide a one‑stop experiment platform to accelerate feature rollout.
Main Line 1 – Distributed Execution Graph
To achieve adaptive compute distribution, JD built a data‑driven distributed execution graph that models dependencies between operators (OPs). The graph classifies OPs into no‑dependency, local‑dependency, and full‑dependency categories, enabling parallel execution wherever possible and reducing wasted compute. This upgrade cut retrieval latency by over 16%.
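The scheduling idea can be sketched in a few lines: OPs whose upstream dependencies are all satisfied run concurrently, and each completed wave unlocks its downstream OPs. This is a minimal illustration of dependency‑driven parallelism, not JD's actual engine; the `run_graph` function and its arguments are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_graph(ops, deps, execute):
    """Run OPs in dependency order, parallelizing where possible.

    ops     -- list of OP names
    deps    -- {op: set of upstream ops that must finish first}
    execute -- callable(op) that does the actual work
    """
    done = set()
    order = []  # completion order, for illustration
    with ThreadPoolExecutor() as pool:
        while len(done) < len(ops):
            # every OP whose dependencies are satisfied can run in this wave
            ready = [op for op in ops
                     if op not in done and deps.get(op, set()) <= done]
            if not ready:
                raise ValueError("cyclic dependency in execution graph")
            for op, _ in zip(ready, pool.map(execute, ready)):
                done.add(op)
                order.append(op)
    return order
```

A no‑dependency OP (e.g. logging) lands in the first wave alongside retrieval, while a full‑dependency OP waits for everything upstream, which is exactly where the latency savings come from.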
Main Line 2 – Adaptive Retrieval Engine
The engine evolved through four stages:
Stage 1: Dual‑tower ANN with basic tree indexes.
Stage 2: Real‑time data freshness, reducing material update latency to minutes.
Stage 3: Business‑aware hierarchical indexing that partitions vectors by user intent, dramatically shrinking candidate sets.
Stage 4: Full‑library PQ indexing and deep indexing (EM‑based) that break the dual‑tower constraint, allowing richer representations and joint user‑item modeling.
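To make Stage 4 concrete, here is a minimal product‑quantization (PQ) sketch: each vector is split into sub‑vectors, each sub‑vector is quantized against a small codebook, and vectors are stored as short codes. The codebooks here are trained with a few naive k‑means steps purely for illustration; all function names are assumptions, and production PQ indexes are considerably more sophisticated.

```python
import numpy as np

def train_codebooks(X, n_sub, k, iters=10, seed=0):
    """Train one k-centroid codebook per sub-vector block of X."""
    rng = np.random.default_rng(seed)
    books = []
    for S in np.split(X, n_sub, axis=1):            # (n, d/n_sub) blocks
        C = S[rng.choice(len(S), k, replace=False)]  # random init
        for _ in range(iters):                       # naive k-means
            assign = np.argmin(((S[:, None] - C) ** 2).sum(-1), axis=1)
            for j in range(k):
                if np.any(assign == j):
                    C[j] = S[assign == j].mean(axis=0)
        books.append(C)
    return books

def encode(X, books):
    """Compress each vector to one centroid index per sub-space."""
    codes = [np.argmin(((S[:, None] - C) ** 2).sum(-1), axis=1)
             for S, C in zip(np.split(X, len(books), axis=1), books)]
    return np.stack(codes, axis=1)                   # (n, n_sub) codes

def decode(codes, books):
    """Reconstruct approximate vectors from their codes."""
    return np.concatenate([C[codes[:, i]]
                           for i, C in enumerate(books)], axis=1)
```

Storing codes instead of raw floats is what makes full‑library indexing affordable: a 2‑sub‑space, 16‑centroid scheme stores two small integers per vector while keeping distances approximately comparable.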
These advances enable high‑throughput, low‑latency recall while supporting arbitrary business objectives (e.g., maximizing eCPM instead of CTR) by simple vector augmentation.
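One way such vector augmentation can work (an illustrative sketch, not necessarily JD's exact scheme): if the dual‑tower inner product approximates pCTR, scaling each item vector by its bid makes the same maximum‑inner‑product search rank by bid × pCTR, an eCPM proxy, with no change to the retrieval engine itself.

```python
import numpy as np

def augment_for_ecpm(item_vecs, bids):
    # Scale each item vector by its bid so that
    # <user, bid * item> = bid * <user, item> ≈ bid * pCTR.
    return item_vecs * np.asarray(bids)[:, None]

def top_k(user_vec, item_vecs, k):
    # Standard maximum-inner-product retrieval.
    scores = item_vecs @ user_vec
    return np.argsort(-scores)[:k]
```

The retrieval code is identical before and after augmentation; only the indexed vectors change, which is what makes the objective swap "simple."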
Main Line 3 – Platform Infrastructure
JD modularized the system into atomic operators (OPs) with clear I/O, traceability, and configurability. A new three‑layer configuration model (Key, Condition, Value) decouples business logic from system code, allowing one‑click A/B experiments on any setting. Debug and Trace modes provide real‑time, end‑to‑end visibility for developers and operators.
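The (Key, Condition, Value) model can be sketched as a rule lookup: each key maps to a list of (condition, value) pairs, the first condition matching the request context wins, and an empty condition acts as the default. The `resolve` function and the config keys below are hypothetical, but they show why flipping a rule is a pure config change and needs no code deploy for an A/B test.

```python
def resolve(config, key, ctx, default=None):
    """Return the value of the first rule whose condition matches ctx."""
    for cond, value in config.get(key, []):
        if all(ctx.get(field) == want for field, want in cond.items()):
            return value
    return default

config = {
    "recall.top_n": [
        ({"exp_group": "B"}, 800),  # experiment bucket gets a larger pool
        ({}, 500),                  # empty condition = catch-all default
    ],
}
```

Because conditions can key on any context field (experiment bucket, traffic source, scene), the same mechanism covers one‑click A/B experiments on any setting.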
Conclusion and Outlook
By integrating adaptive compute allocation, efficient retrieval algorithms, and a robust platform foundation, JD’s advertising retrieval system maximizes business value per compute unit. Future work will continue to push the limits of compute efficiency, retrieval performance, and rapid iteration.
JD Tech