JD Retail's Jiushu Business Analytics Platform: AI‑Driven Solutions for Retail
The article introduces JD Retail's Jiushu Business Analytics Platform, detailing how AI, big‑data, and distributed‑training technologies address fragmented retail scenarios, high deployment barriers, large‑scale application difficulties, and cost concerns through specialized frameworks, fault‑tolerant training, and advanced cluster optimization.
Traditional industries generate massive amounts of data that contain valuable but hidden information; artificial‑intelligence algorithms can mine this data to guide decisions, make data use more intelligent, and boost efficiency. JD Retail uses AI to improve the shopping experience, marketing efficiency, and technical empowerment.
The Jiushu Business Analytics Platform serves JD Retail's core algorithmic services, tackling four main challenges: (1) fragmented scenario requirements across recommendation, advertising, risk control, etc.; (2) high algorithm deployment barriers due to complex distributed tasks; (3) difficulty scaling to massive data and models; (4) high talent, compute (GPU), and storage costs.
To solve these, Jiushu provides a full‑stack capability from infrastructure to business algorithms, including multiple self‑developed frameworks: 9N‑Deep (deep learning), 9N‑FL (federated learning), 9N‑RL (reinforcement learning), 9N‑OL (online learning), and the large‑scale graph computing framework Galileo.
1. Large‑Scale Graph Computing – Galileo addresses data scale, modeling difficulty, slow joint training, and memory consumption by supporting graphs with billions of nodes and edges and by offering distributed graph representation, computation, and storage.
2. Training Component Fault Tolerance – The platform unifies data management, introduces a custom checkpoint mechanism, and leverages Kubernetes for automatic recovery, dramatically improving training success rates and reducing engineer time and hardware waste.
3. Cluster Optimization – The platform manages heterogeneous resources (CPU, GPU, storage, network) via Kubernetes, applying strategies such as single‑task affinity, heterogeneous mixed deployment, hot‑node anti‑affinity, and multi‑strategy mixing, achieving a training speedup of up to 57% at JD Retail.
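The checkpoint‑based recovery described in item 2 can be illustrated with a minimal sketch. It is not Jiushu's actual mechanism, which the article does not detail: the function names, the JSON state layout, and the `fail_at` parameter are all hypothetical, and the "training" is a toy counter. The idea shown is only that state is written atomically at intervals, so a restarted worker (for example, one relaunched by Kubernetes) resumes from the last checkpoint instead of from step zero.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename within the same directory


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if none exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "weight": 0.0}


def train(path, total_steps, checkpoint_every=10, fail_at=None):
    """Toy training loop; fail_at (hypothetical) simulates a worker crash."""
    state = load_checkpoint(path)  # resume if a checkpoint is present
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated worker failure")
        state["weight"] += 0.5  # stand-in for one optimizer update
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

If a run crashes at step 25 with `checkpoint_every=10`, the next call to `train` with the same path picks up at step 20, so only five steps of work are repeated.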
Additional innovations include adaptive quota control for fair resource allocation and a HostNetwork mode for large‑scale training that bypasses container network bottlenecks, delivering a performance gain of roughly 20%.
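As a rough illustration of the HostNetwork mode, the sketch below builds a Kubernetes Pod manifest as a Python dictionary. The `hostNetwork` and `dnsPolicy` fields are standard Pod spec fields; everything else (the function name, the pod and container names, the image) is a placeholder assumed for this example, and the article does not show how Jiushu actually configures its training pods.

```python
import json


def training_pod_manifest(name, image, use_host_network=True):
    """Build a Pod manifest for a training worker (illustrative only)."""
    spec = {
        # Let Kubernetes restart the container if the trainer exits with an error.
        "restartPolicy": "OnFailure",
        "containers": [{"name": "trainer", "image": image}],
    }
    if use_host_network:
        # Share the node's network namespace instead of the container network.
        spec["hostNetwork"] = True
        # DNS policy recommended for pods that run with hostNetwork enabled.
        spec["dnsPolicy"] = "ClusterFirstWithHostNet"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    # Hypothetical pod name and image, used only to show the output shape.
    manifest = training_pod_manifest("trainer-0", "example.com/trainer:latest")
    print(json.dumps(manifest, indent=2))
```

One trade-off of this mode is that pods on the same node share the host's port space, so worker ports must be assigned without collisions.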
Future plans for Jiushu involve expanding algorithm solutions across all retail business lines, intelligent cluster upgrades, continual performance breakthroughs, and enhanced usability for algorithm developers.
JD Tech
Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.