How JD’s Order Fulfillment Center Scales to Millions of Orders During Mega‑Sales
This article explains how JD.com’s Order Fulfillment Center (OFC) was built, re-engineered, and continuously optimized to handle massive order volumes during major sales events. It covers the architecture, the migration from .NET to Java, distributed task queues, flow control, and the operational practices that keep the system reliable and scalable.
OFC Importance
During the 2014 "618" shopping festival, JD faced unprecedented order volumes that required a robust system to transform user orders into production orders for warehouses. The Order Fulfillment Center (OFC) links the user‑ordering front‑end with downstream warehouse systems, handling order splitting, transfer, and downstream transmission.
OFC was formed in 2003 as JD’s business grew, and by 2011 a dedicated team handled data transmission between dozens of upstream and downstream systems, including non‑customer orders such as procurement and supplier orders.
Formation of OFC
Rapid growth of JD’s e‑commerce platform created many new subsystems. A small team was created to transfer order data to warehouses and to route non‑customer orders to appropriate business systems, establishing OFC.
Early on, system inconsistencies caused orders to stall at this stage, requiring deep knowledge of upstream and downstream business processes to diagnose and resolve issues.
Technical Refactoring
.NET to Java
The original .NET systems were rewritten in Java to align with JD’s technology strategy. After a month of development, the new Java version was rolled out gradually, switching traffic region by region, and was fully launched by February 2012.
OFC's processing falls into three stages:
1. Order splitting: dividing a user order into sub-orders based on warehouse distribution.
2. Order transfer: moving split orders to downstream systems based on inventory and other attributes.
3. Order downstream and feedback: invoking warehouse services for pre-packing and production, then sending order data back to the front-end.
211 Order Fulfillment Rate Improvement Project
To cut the time from order placement to warehouse processing to under five minutes, JD upgraded eleven systems: order transaction, pipeline, split, transfer, task, OFC, pre-sorting, label, tax, invoicing, and WMS. The effort took 5,066 person-hours and introduced technologies such as ZooKeeper, CXF timeouts, Log4j configuration across multiple Tomcat instances, Oracle Exadata, and MySQL partitioning.
Key outcome: a new OCS service for order amount calculation. It decouples the calculation from the split logic and persists the allocation results, and more than twenty downstream systems now use it.
SOP Hinge Order Project
When a shopping cart contains both JD‑self‑operated and POP merchant items, a single checkout is required. The team built a split service to meet sub‑second TP99 requirements, employing in‑memory processing, reduced external dependencies, asynchronous handling, and graceful degradation.
A deduplication mechanism using a “repeat‑check” database prevented duplicate order submissions caused by network timeouts.
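The repeat-check idea can be sketched as follows. This is a minimal illustration, not JD's implementation: the class `RepeatCheck`, the method `submit`, and the request-key format are hypothetical, and an in-memory map stands in for the database table the article mentions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory stand-in (assumption) for the "repeat-check" table. A real
// system would use a unique key in a database so the check survives restarts.
final class RepeatCheck {
    private final ConcurrentMap<String, String> seen = new ConcurrentHashMap<>();

    /**
     * Returns the order id recorded for this request key. The first caller
     * registers its candidate id; a retry with the same key (for example
     * after a network timeout) gets the original id back instead of
     * creating a second order.
     */
    String submit(String requestKey, String candidateOrderId) {
        String existing = seen.putIfAbsent(requestKey, candidateOrderId);
        return existing != null ? existing : candidateOrderId;
    }

    /** Number of distinct submissions recorded so far. */
    int recordedCount() {
        return seen.size();
    }
}
```

The atomic `putIfAbsent` is what makes the check safe when the original request and its retry arrive at the same time.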
Transfer Architecture Upgrade
The transfer system was overhauled to make business and data processing asynchronous, parallelize data handling, cache hot data, and implement traffic smoothing to protect downstream systems during peaks.
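One simple way to picture traffic smoothing is a buffer that absorbs bursts and releases a bounded number of orders per scheduling tick. The sketch below assumes that model; `PeakSmoother`, `accept`, and `releaseTick` are illustrative names, and the article does not say how JD's transfer system actually smooths traffic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical smoothing buffer: bursts are accepted immediately, but at
// most maxPerTick items are handed to the downstream system per tick.
final class PeakSmoother<T> {
    private final Queue<T> buffer = new ArrayDeque<>();
    private final int maxPerTick;

    PeakSmoother(int maxPerTick) {
        if (maxPerTick <= 0) {
            throw new IllegalArgumentException("maxPerTick must be positive");
        }
        this.maxPerTick = maxPerTick;
    }

    /** Accepts an item without blocking the upstream caller. */
    synchronized void accept(T item) {
        buffer.add(item);
    }

    /** Called once per tick by a scheduler; returns at most maxPerTick items in arrival order. */
    synchronized List<T> releaseTick() {
        List<T> batch = new ArrayList<>();
        while (batch.size() < maxPerTick && !buffer.isEmpty()) {
            batch.add(buffer.poll());
        }
        return batch;
    }

    /** Items still waiting; a useful value to monitor during peaks. */
    synchronized int backlog() {
        return buffer.size();
    }
}
```

With `maxPerTick` set to 2, a burst of five orders leaves the buffer over three ticks as batches of 2, 2, and 1, so the downstream system never sees more than two per tick.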
Operational improvements included configurable rollout, risk mitigation tools, and a focus on reducing unnecessary workflow steps.
Operations and Maintenance
Ticket volume dropped from thousands to dozens per day after system stabilisation, reflecting improved health and mature operational processes.
From 618 to Double‑11
Order volumes grew exponentially, demanding a shift from database‑centric control to a workflow‑centric control plane, using state machines to ensure data consistency and reduce bottlenecks.
Supporting Massive Order Processing
Scalability is achieved by horizontal scaling of core services, cluster‑level expansion, and distributed deployment, enabling rapid capacity increases for major sales events.
Resolving Data Consistency Issues
A unified main‑process flow and state machine now orchestrate order handling across logistics and finance domains, reducing data divergence and allowing focused investment in core systems.
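A state machine of this kind can be sketched as a table of allowed transitions. The state names below are assumptions derived from the three stages described earlier (splitting, transfer, downstream and feedback); the article does not list JD's actual states.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical order flow: each state lists the only states it may move to,
// so an out-of-order update is rejected instead of corrupting the record.
final class OrderFlow {
    enum State { RECEIVED, SPLIT, TRANSFERRED, SENT_TO_WAREHOUSE, FED_BACK }

    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.RECEIVED, EnumSet.of(State.SPLIT));
        ALLOWED.put(State.SPLIT, EnumSet.of(State.TRANSFERRED));
        ALLOWED.put(State.TRANSFERRED, EnumSet.of(State.SENT_TO_WAREHOUSE));
        ALLOWED.put(State.SENT_TO_WAREHOUSE, EnumSet.of(State.FED_BACK));
        ALLOWED.put(State.FED_BACK, EnumSet.noneOf(State.class));
    }

    private State current = State.RECEIVED;

    State current() {
        return current;
    }

    /** Moves to the next state if the transition is allowed; returns false and changes nothing otherwise. */
    boolean advance(State next) {
        if (!ALLOWED.get(current).contains(next)) {
            return false;
        }
        current = next;
        return true;
    }
}
```

Because every system updates an order only through `advance`, a late or duplicated message (say, a warehouse callback for an order that has not yet been transferred) is refused at one place rather than reconciled afterwards.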
Supporting Operations
Key operational practices include high‑availability core systems, proactive monitoring of queue depth and throughput, rapid incident localisation, and a SOA governance platform to map system dependencies and enforce SLAs.
Big Data Beginnings
General Principles
Order processing differs from transaction processing: occasional latency is acceptable, but average throughput must handle massive data volumes while ensuring consistency.
System Protection
Selective flow control and traffic shaping protect downstream services from overload, complemented by unified capacity monitoring and fast‑reject mechanisms.
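A fast-reject mechanism can be approximated with a concurrency cap: when all permits are taken, the caller gets an immediate fallback instead of queuing behind a slow downstream service. This is a sketch under that assumption; `FastRejectGuard` and its methods are hypothetical names, not part of JD's system.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical guard that caps the number of in-flight downstream calls.
final class FastRejectGuard {
    private final Semaphore permits;

    FastRejectGuard(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /**
     * Runs the downstream call if capacity is free. If not, returns the
     * rejected value at once without waiting.
     */
    <T> T call(Supplier<T> downstream, T rejected) {
        if (!permits.tryAcquire()) {
            return rejected;
        }
        try {
            return downstream.get();
        } finally {
            permits.release();
        }
    }

    /** Permits currently free; suitable for exporting to capacity monitoring. */
    int freeCapacity() {
        return permits.availablePermits();
    }
}
```

The non-blocking `tryAcquire()` is the key choice here: a blocking `acquire()` would pile up threads during a peak, which is the overload the guard is meant to prevent.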
Distributed System Design
Each processing group can be scaled horizontally or as a cluster, with independent or collective deployment. A distributed task queue (Redis) shards tasks, processes them in parallel, and routes failures to an exception queue for retry, ensuring no data loss.
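The shard, process, and retry cycle can be sketched without Redis. In the example below, in-memory queues stand in for the Redis-backed queues the article mentions, tasks are plain strings, and every name (`ShardedTaskQueue`, `drainShard`, `retryFailures`) is an assumption for illustration. It is single-threaded for clarity; in a parallel setup each shard would be owned by its own worker.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

// Hypothetical sharded queue: tasks are spread over shards by hash, and
// any task whose handler fails is kept in an exception queue for retry.
final class ShardedTaskQueue {
    private final List<Queue<String>> shards = new ArrayList<>();
    private final Queue<String> exceptionQueue = new ArrayDeque<>();

    ShardedTaskQueue(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayDeque<>());
        }
    }

    int shardCount() {
        return shards.size();
    }

    /** Routes a task to a shard by hash; floorMod keeps the index non-negative. */
    void enqueue(String taskId) {
        shards.get(Math.floorMod(taskId.hashCode(), shards.size())).add(taskId);
    }

    /** Drains one shard and returns how many tasks succeeded. */
    int drainShard(int shard, Predicate<String> handler) {
        return process(shards.get(shard), shards.get(shard).size(), handler);
    }

    /** Gives each task currently in the exception queue one more attempt. */
    int retryFailures(Predicate<String> handler) {
        return process(exceptionQueue, exceptionQueue.size(), handler);
    }

    int pendingFailures() {
        return exceptionQueue.size();
    }

    // Polls exactly `count` tasks; a false result or an exception sends the
    // task to the exception queue, so nothing is dropped.
    private int process(Queue<String> source, int count, Predicate<String> handler) {
        int succeeded = 0;
        for (int i = 0; i < count; i++) {
            String task = source.poll();
            boolean ok;
            try {
                ok = handler.test(task);
            } catch (RuntimeException e) {
                ok = false;
            }
            if (ok) {
                succeeded++;
            } else {
                exceptionQueue.add(task);
            }
        }
        return succeeded;
    }
}
```

A task leaves the structure only after its handler reports success, which is the property behind the "no data loss" claim. A production version would also need durable storage and a retry limit.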
The task engine uses ZooKeeper for dynamic workflow configuration. It automatically throttles or accelerates throughput based on real-time health signals and prioritises high-value orders.
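The combination of prioritisation and health-based throttling might look like the sketch below. It assumes that "high-value" means a larger order amount and that health arrives as a simple healthy/unhealthy signal; both are assumptions, as are the names `AdaptiveDispatcher`, `reportHealth`, and `nextBatch`. The configuration source is not modelled here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical dispatcher: highest-amount orders go first, and the batch
// size halves on an unhealthy signal and doubles on a healthy one.
final class AdaptiveDispatcher {
    private static final class Task {
        final String orderId;
        final long amountCents;

        Task(String orderId, long amountCents) {
            this.orderId = orderId;
            this.amountCents = amountCents;
        }
    }

    private final PriorityQueue<Task> queue = new PriorityQueue<>(
            Comparator.comparingLong((Task t) -> t.amountCents).reversed());
    private final int minBatch;
    private final int maxBatch;
    private int batchSize;

    AdaptiveDispatcher(int minBatch, int maxBatch) {
        if (minBatch < 1 || maxBatch < minBatch) {
            throw new IllegalArgumentException("need 1 <= minBatch <= maxBatch");
        }
        this.minBatch = minBatch;
        this.maxBatch = maxBatch;
        this.batchSize = maxBatch;
    }

    void submit(String orderId, long amountCents) {
        queue.add(new Task(orderId, amountCents));
    }

    /** Throttles on an unhealthy signal, accelerates on a healthy one, within bounds. */
    void reportHealth(boolean downstreamHealthy) {
        batchSize = downstreamHealthy
                ? Math.min(maxBatch, batchSize * 2)
                : Math.max(minBatch, batchSize / 2);
    }

    /** Returns up to batchSize order ids, highest amount first. */
    List<String> nextBatch() {
        List<String> ids = new ArrayList<>();
        while (ids.size() < batchSize && !queue.isEmpty()) {
            ids.add(queue.poll().orderId);
        }
        return ids;
    }

    int batchSize() {
        return batchSize;
    }
}
```

Halving and doubling is only one possible policy; it backs off quickly when the downstream system struggles and recovers in a few healthy ticks.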
System deployment diagrams illustrate the layered, horizontally scalable architecture supporting JD’s massive e‑commerce order flow.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
