
How JD Built the World’s Largest Kubernetes Cluster to Support Trillion‑Scale E‑commerce Transactions

This article describes JD's experience running Kubernetes at massive scale, detailing the JDOS 2.0 platform, custom DNS and load balancing, the Archimedes scheduler, API and controller optimizations, and operational lessons learned from running tens of thousands of nodes in production.

JD Retail Technology

Over the past year, Kubernetes has become a de facto infrastructure standard. JD has operated its next-generation container engine, JDOS 2.0, since the end of 2016, migrating from OpenStack to a Kubernetes-based PaaS that powers all of its online services, middleware, databases, and offline big-data jobs.

On June 28, JD’s infrastructure director Bao Yongcheng presented at Rancher Labs’ Container Day 2018, sharing how JD built the world’s largest Kubernetes cluster to support trillion‑level e‑commerce transactions.

JD’s data‑center architecture follows Google’s model, abstracting load balancing, DNS, and the Kubernetes API into a unified layer, and deploying multiple Kubernetes clusters per site, each managing three physical pods of roughly 10,000 nodes.

To handle the scale, JD built its own distributed DNS on top of etcd, exposing a RESTful API that integrates with Kubernetes watch streams; it was later replaced by a high‑performance DBTK service capable of 8 million queries per second, far outperforming BIND 9 and CoreDNS.
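The watch-driven design means queries are answered from a local record table that is kept in sync by change events, rather than by round-tripping to a backing store. The sketch below illustrates that idea with a simplified event model; the class and event shapes are illustrative assumptions, not JD's actual DNS implementation.

```python
# A minimal sketch of a watch-driven DNS record cache. The (type, name, ip)
# event tuple is a stand-in for real etcd/Kubernetes watch events.

class DnsCache:
    """In-memory name -> set-of-IPs table kept in sync by watch events."""

    def __init__(self):
        self.records = {}

    def apply(self, event):
        """Apply one watch event of the form (type, name, ip)."""
        etype, name, ip = event
        if etype == "PUT":
            self.records.setdefault(name, set()).add(ip)
        elif etype == "DELETE":
            ips = self.records.get(name, set())
            ips.discard(ip)
            if not ips:
                self.records.pop(name, None)

    def resolve(self, name):
        """Answer a query from the local table -- no upstream round trip."""
        return sorted(self.records.get(name, set()))


cache = DnsCache()
for ev in [("PUT", "svc.jd.local", "10.0.0.1"),
           ("PUT", "svc.jd.local", "10.0.0.2"),
           ("DELETE", "svc.jd.local", "10.0.0.1")]:
    cache.apply(ev)

print(cache.resolve("svc.jd.local"))  # -> ['10.0.0.2']
```

Because every resolver holds a full local copy, read throughput scales with the number of DNS replicas rather than with the capacity of the central store.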

JD also rewrote the load-balancer and controller logic, keeping large ConfigMaps out of etcd and applying aggressive caching, while enforcing strict testing before enabling any controller to prevent hidden inter-controller side effects.
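The caching idea can be sketched as a read-through cache placed in front of the expensive backing read (the API server or etcd), so that repeated lookups of the same object never reach the store until the entry expires. The class, TTL, and key names below are illustrative assumptions, not JD's actual code.

```python
import time

class ReadThroughCache:
    """A TTL read-through cache in front of a slow backing read."""

    def __init__(self, fetch, ttl=30.0, clock=time.monotonic):
        self.fetch = fetch      # function: key -> value (the expensive read)
        self.ttl = ttl
        self.clock = clock
        self.entries = {}       # key -> (value, expiry)

    def get(self, key):
        now = self.clock()
        hit = self.entries.get(key)
        if hit and hit[1] > now:
            return hit[0]       # served from cache, no backend traffic
        value = self.fetch(key)
        self.entries[key] = (value, now + self.ttl)
        return value


# Hypothetical usage: count how many reads actually reach the backend.
backend_reads = []

def fetch(key):
    backend_reads.append(key)
    return f"config-for-{key}"

cache = ReadThroughCache(fetch, ttl=30.0)
for _ in range(1000):
    cache.get("lb-rules")

print(len(backend_reads))  # -> 1: only the first read hit the backend
```

A thousand lookups collapse into a single backend read, which is the kind of reduction that keeps etcd and the API server stable at tens of thousands of nodes.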

Recognizing that vanilla Kubernetes cannot meet JD’s resource‑utilization goals, the team introduced the Archimedes scheduler, which tightly packs workloads, enables local “rebuild” of containers to avoid scheduler bottlenecks, and prioritizes jobs based on business‑criticality.
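The tight-packing behavior can be sketched with a most-allocated scoring function: among feasible nodes, prefer the one that ends up fullest after placement, instead of spreading load evenly. The node model, CPU-only resources, and function names below are illustrative assumptions, not the actual Archimedes implementation.

```python
def pack_score(node_free_cpu, node_cap_cpu, pod_cpu):
    """Most-allocated scoring: prefer the node that ends up fullest.
    Returns -1 when the pod does not fit."""
    if pod_cpu > node_free_cpu:
        return -1.0
    used_after = node_cap_cpu - (node_free_cpu - pod_cpu)
    return used_after / node_cap_cpu


def place(nodes, pod_cpu):
    """Pick the feasible node with the highest packing score.
    nodes: {name: (free_cpu, capacity_cpu)}; returns None if nothing fits."""
    best, best_score = None, -1.0
    for name, (free, cap) in sorted(nodes.items()):
        score = pack_score(free, cap, pod_cpu)
        if score > best_score:
            best, best_score = name, score
    return best


nodes = {"node-a": (30.0, 64.0), "node-b": (4.0, 64.0)}
print(place(nodes, 2.0))  # -> node-b: the pod is packed onto the fuller node
```

A spread-oriented default would favor the emptier node-a; packing onto node-b instead drains mostly-idle machines, which is what lifts cluster-wide utilization.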

Operational practices include reducing cluster complexity (a "subtraction" approach), avoiding API overload by keeping ConfigMaps out of etcd, and implementing a custom eviction system that respects pod priority, tolerations, and replica counts.
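The eviction constraints described above can be sketched as a victim-selection pass: evict lowest-priority pods first, skip pods that tolerate the pressure, and never drop a workload below its replica floor. All field names and the pod schema here are illustrative assumptions, not JD's actual eviction system.

```python
def pick_eviction_victims(pods, need):
    """Choose up to `need` pods to evict under node pressure.

    pods: list of dicts with keys: name, app, priority, replicas,
          min_replicas, tolerates_pressure (bool).
    """
    remaining = {p["app"]: p["replicas"] for p in pods}
    victims = []
    for pod in sorted(pods, key=lambda p: p["priority"]):  # lowest first
        if len(victims) == need:
            break
        if pod["tolerates_pressure"]:
            continue  # toleration shields this pod from eviction
        if remaining[pod["app"]] - 1 < pod["min_replicas"]:
            continue  # would violate the app's replica floor
        victims.append(pod["name"])
        remaining[pod["app"]] -= 1
    return victims


# Hypothetical usage: low-priority batch pods go first; the critical web
# service is already at its replica floor and is never touched.
pods = [
    {"name": "batch-1", "app": "batch", "priority": 1,
     "replicas": 3, "min_replicas": 1, "tolerates_pressure": False},
    {"name": "batch-2", "app": "batch", "priority": 1,
     "replicas": 3, "min_replicas": 1, "tolerates_pressure": False},
    {"name": "web-1", "app": "web", "priority": 100,
     "replicas": 2, "min_replicas": 2, "tolerates_pressure": False},
]
print(pick_eviction_victims(pods, need=3))  # -> ['batch-1', 'batch-2']
```

Even when asked for three victims, the pass returns only two: evicting web-1 would take the web service below its minimum replica count.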

The presentation also highlights challenges such as API load spikes, node heartbeat reliability, and the need for external health‑check systems, as well as insights on CPU over‑provisioning, mixed online/offline workload scheduling, and the importance of modularizing Kubernetes components (CNI, CRI, CSI) for large‑scale stability.

In summary, JD’s experience shows that while Kubernetes provides a powerful foundation, massive production environments require extensive customizations, dedicated scheduling, and careful operational safeguards to achieve high availability and efficient resource usage.

Cloud Native, Kubernetes, Scheduling, Container Orchestration, JDOS, Large-Scale Clusters
Written by

JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
