Design and Implementation of JD.com's Multi‑Active Distributed Architecture

This article details JD.com's multi-active distributed architecture, covering its evolution from single‑data‑center to multi‑region deployments, network design, leaf‑spine topology, data consistency mechanisms, application scheduling, monitoring, and disaster recovery strategies that enhance high availability and user experience.

JD Retail Technology

This article traces JD.com’s journey from a single Beijing data center to a multi‑region, multi‑active architecture. It highlights the business drivers behind that move: high availability, disaster‑recovery capability, and a scalable network design, all of which become more pressing as a company passes through its startup, growth, and mature phases.

The need for geographically distributed active‑active sites stems from the limited resources of a single IDC (cabinet space, bandwidth, power) and the desire to improve user experience across China’s three major carriers. The solution places services close to users, avoids cross‑network traffic, and uses BGP, global server load balancing (GSLB), and intelligent DNS to route each user to the most suitable site.
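As a rough illustration of carrier‑aware routing, the sketch below picks a site for a user by preferring a site that peers directly with the user's carrier, and then one in the user's region. The site names, carrier labels, and the `pick_site` function are hypothetical and are not taken from JD.com's GSLB or DNS configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    region: str
    carriers: frozenset  # carriers this site reaches without crossing networks


# Hypothetical sites, for illustration only.
SITES = [
    Site("north-1", "north", frozenset({"telecom", "unicom", "mobile"})),
    Site("east-1", "east", frozenset({"telecom", "mobile"})),
    Site("south-1", "south", frozenset({"telecom", "unicom"})),
]


def pick_site(user_region, user_carrier, sites=SITES):
    """Prefer a same-carrier site first, then a same-region site.

    Ties are broken by list order, so the first site acts as the default.
    """
    def score(site):
        same_carrier = user_carrier in site.carriers
        same_region = site.region == user_region
        return (same_carrier, same_region)

    return max(sites, key=score)


if __name__ == "__main__":
    print(pick_site("east", "unicom").name)
```

Ranking carrier match above region match reflects the stated goal of avoiding cross‑network traffic; a real resolver would also weigh measured latency and site load.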

The network backbone adopts a leaf‑spine architecture interconnected with BGP at Layer 3, which removes the scaling limits of a large Layer‑2 domain. Each POD consists of roughly 200 racks, and PODs are interconnected via high‑speed edge links. This design supports horizontal scaling and Docker‑native automation through JDOS.
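The scaling properties of a leaf‑spine fabric follow from simple arithmetic: every leaf connects to every spine, so any two leaves have as many equal‑cost paths as there are spines. The sketch below computes these figures for one POD. It assumes one leaf switch per rack, and the spine count, server count, and link speeds are made‑up inputs rather than JD.com's actual numbers.

```python
from fractions import Fraction


def leaf_spine_summary(spines, leaves, servers_per_leaf, server_gbps, uplink_gbps):
    """Summarize a two-tier leaf-spine fabric where each leaf links to every spine."""
    downlink = servers_per_leaf * server_gbps  # server-facing bandwidth per leaf
    uplink = spines * uplink_gbps              # spine-facing bandwidth per leaf
    return {
        "fabric_links": spines * leaves,       # one link per (leaf, spine) pair
        "equal_cost_paths": spines,            # leaf-to-leaf paths, one via each spine
        "oversubscription": Fraction(downlink, uplink),
        "servers": leaves * servers_per_leaf,
    }


if __name__ == "__main__":
    # Hypothetical POD: 200 racks (one leaf each), 4 spines.
    print(leaf_spine_summary(spines=4, leaves=200, servers_per_leaf=40,
                             server_gbps=10, uplink_gbps=40))
```

Adding a spine raises both the path count and the uplink bandwidth of every leaf, which is what makes the fabric horizontally scalable.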

Data consistency rests on two rules and two products. Each dataset is kept in no more than three copies, spread across at least two centers. Replication is asynchronous and near‑real‑time, handled by Tube, JD’s proprietary high‑speed synchronization product (based on capturing the row‑based‑replication, or RBR, binary log), and by the JIMDB cache. Critical services rely on these layers so that developers do not have to deal with disaster recovery and cross‑center latency themselves.
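The placement rule (three copies, at least two centers) can be expressed as a small check. The following is a minimal sketch with hypothetical function names and a simple round‑robin policy; it does not describe how Tube or JIMDB actually place data.

```python
from itertools import cycle, islice


def place_replicas(centers, copies=3, min_centers=2):
    """Assign `copies` replicas to centers round-robin.

    Round-robin guarantees at least `min_centers` distinct centers
    whenever enough centers and copies are available.
    """
    if len(set(centers)) < min_centers or copies < min_centers:
        raise ValueError("not enough centers or copies to satisfy the rule")
    unique = list(dict.fromkeys(centers))  # de-duplicate, keep order
    return list(islice(cycle(unique), copies))


def is_valid_placement(placement, copies=3, min_centers=2):
    """True if the placement has exactly `copies` replicas in >= `min_centers` centers."""
    return len(placement) == copies and len(set(placement)) >= min_centers


if __name__ == "__main__":
    print(place_replicas(["center-a", "center-b"]))
```

With only two centers available, one center holds two copies, which still tolerates the loss of either center.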

Application deployment follows a closed‑loop model: services are locally deployed with dependencies satisfied, traffic is balanced across the three regional centers, and scheduling uses HTTPDNS and GSLB to direct users to the best‑performing site. The architecture also supports CDN‑like “just‑in‑time” user placement for seamless failover.
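One way to read the closed‑loop model is "resolve every dependency inside the caller's own center, and leave it only when there is no local instance". The sketch below shows that lookup order against a hypothetical in‑memory registry; the service names, addresses, and the `resolve` function are illustrative assumptions.

```python
def resolve(service, local_center, registry):
    """Return (center, instance) for a dependency, preferring the local center.

    `registry` maps service name -> {center name -> list of instance addresses}.
    """
    by_center = registry.get(service, {})
    local = by_center.get(local_center, [])
    if local:
        return local_center, local[0]
    # Fallback: a cross-center call, taken only when no local instance exists.
    for center in sorted(by_center):
        if by_center[center]:
            return center, by_center[center][0]
    raise LookupError(f"no instance of {service!r} in any center")


if __name__ == "__main__":
    registry = {
        "cart": {"north": ["10.0.0.1"], "east": ["10.1.0.1"]},
        "coupon": {"north": [], "south": ["10.2.0.7"]},
    }
    print(resolve("cart", "east", registry))
    print(resolve("coupon", "north", registry))
```

A production resolver would also load‑balance across local instances instead of always taking the first one.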

Monitoring spans nationwide network quality APIs feeding GSLB decisions, as well as fine‑grained intra‑center health checks on devices, links, and servers. In the event of catastrophic failures (power, cooling, network), an emergency multi‑active plan enables rapid traffic switchover while preserving data integrity through cache‑based consistency.
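To make the link between health checks and traffic switchover concrete, the sketch below recomputes each site's share of traffic after removing sites that fail their checks. The weights, site names, and the `traffic_shares` function are assumptions for illustration, not JD.com's actual switchover logic.

```python
from fractions import Fraction


def traffic_shares(base_weights, healthy):
    """Redistribute traffic over healthy sites in proportion to their base weights.

    `base_weights` maps site -> positive integer weight.
    `healthy` maps site -> bool; sites missing from it count as unhealthy.
    """
    live = {s: w for s, w in base_weights.items() if healthy.get(s, False) and w > 0}
    if not live:
        raise RuntimeError("no healthy site available")
    total = sum(live.values())
    return {s: Fraction(live.get(s, 0), total) for s in base_weights}


if __name__ == "__main__":
    weights = {"north": 1, "east": 1, "south": 2}
    status = {"north": True, "east": False, "south": True}
    for site, share in traffic_shares(weights, status).items():
        print(site, share)
```

When a whole center fails, its share drops to zero and the remaining centers absorb it, so each surviving center needs spare capacity for that case.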

In summary, JD.com’s multi‑active strategy improves high availability, disaster resilience, and user experience by reducing cross‑center calls, enforcing closed‑loop read/write principles, guaranteeing eventual data consistency, and applying tailored solutions per business need.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
