
How JD.com Scaled POP Order Elasticsearch to Handle Billions of Orders

This article analyzes the challenges of JD.com's POP order Elasticsearch storage—including data skew, oversized shards, frequent updates, and high maintenance costs—and details the multi‑layered architectural redesign that introduced tenant isolation, dual‑hash routing, differentiated shard strategies, and a dual‑active physical foundation to achieve high performance, scalability, and availability.

JD Retail Technology

Introduction

With JD.com’s rapid business growth, the number of platform merchants and order volume surged dramatically, especially after the 2025 push into local life services. The POP order Elasticsearch (ES) storage faced severe pressure, prompting an urgent system architecture upgrade.

Current System Overview

The POP order ES system originally served merchant fulfillment queries (merchant‑facing rather than consumer‑facing). It consisted of write services that consume various upstream messages (order pipelines, binlog, OFW changes, reconciliation messages) to update ES, and read services that provide merchant‑level order lists and details.

Initially, only paid, fulfillable orders were indexed, but over time additional order types (pre‑sale, split, etc.) and unpaid orders were added, increasing data diversity. Open API demand also grew, with callers expecting to retrieve more data through fewer interfaces, so the index became heterogeneous, incorporating invoice, promise, after‑sale, and rating statuses and putting further pressure on ES storage.

Core Pain Points

Data skew: Merchant‑level routing caused a few large merchants (e.g., JD Xi Self‑Operated) to occupy up to 25% of total order volume, resulting in TB‑scale shards and performance volatility.

Growing data volume: Shard sizes exceeded ES’s recommended 50–100 GB by up to 50×, with some shards over 1 TB.

Increasing update frequency: Up to 300k updates per minute, causing update conflicts and latency.

High maintenance cost: Twice‑yearly migration of data older than six months from hot to cold clusters required complex DUCC changes, data comparison, and gray‑scale traffic switching.

Solution Overview

The redesign combines tenant‑level isolation, dual‑hash routing, differentiated shard strategies, and dual‑active physical infrastructure to build a high‑performance, highly‑scalable, and highly‑available order retrieval platform.

3.1 Resolving Data Skew

Physical isolation: Deploy dedicated clusters for large merchants, creating separate indices with shard counts tuned to their volume, and add virtual routing logic to direct large‑merchant traffic to these clusters.

Flexible routing strategy: Extend the proxy layer to support order‑level routing (instead of merchant‑level) for large‑merchant clusters, ensuring even data distribution across shards; a sketch of this routing decision follows.
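A minimal sketch of how such a proxy‑layer routing decision might look, assuming a configurable set of large merchants; the names LARGE_MERCHANTS and choose_route, and the cluster aliases, are illustrative rather than taken from JD.com's implementation:

```python
# Illustrative proxy-layer routing decision (names and cluster aliases are hypothetical).

LARGE_MERCHANTS = {"merchant_jd_xi"}  # merchants that get dedicated clusters

def choose_route(merchant_id: str, order_id: str) -> dict:
    """Return the target cluster alias and the ES routing key for one order document."""
    if merchant_id in LARGE_MERCHANTS:
        # Dedicated cluster: route by order ID so one huge merchant's orders
        # spread evenly across that cluster's shards.
        return {"cluster": f"pop-large-{merchant_id}", "routing": order_id}
    # Ordinary merchants keep merchant-level routing, so a merchant's orders
    # stay on the same shard and merchant queries touch only that shard.
    return {"cluster": "pop-hot", "routing": merchant_id}
```

Every read and write for an order would then pass the chosen routing value to Elasticsearch, keeping the two paths consistent.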

3.2 Addressing Oversized Shards

ES recommends single shard sizes of 50‑100 GB and a maximum of 100 nodes per cluster. To stay within these limits, the hot cluster’s ordinary merchant data was expanded from one to three ES clusters, with the proxy layer hashing merchant IDs to distribute data evenly.
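A sketch of how the proxy layer might hash merchant IDs across the three hot clusters; the endpoint list and the MD5‑based hash are assumptions for illustration (the hash must be stable across processes, which rules out Python's salted built‑in hash()):

```python
import hashlib

# Hypothetical endpoints of the three hot clusters holding ordinary-merchant data.
HOT_CLUSTERS = [
    "http://pop-hot-1:9200",
    "http://pop-hot-2:9200",
    "http://pop-hot-3:9200",
]

def pick_hot_cluster(merchant_id: str) -> str:
    """Map a merchant ID to one of the hot clusters with a stable hash."""
    digest = hashlib.md5(merchant_id.encode("utf-8")).hexdigest()
    return HOT_CLUSTERS[int(digest, 16) % len(HOT_CLUSTERS)]
```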

3.3 Reducing Frequent Update Pressure

The system originally used ES’s optimistic lock update flow (read version → set fields → save). To lower update load, a buffering layer aggregates messages before applying them, decreasing the number of ES writes and mitigating conflicts.
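A minimal sketch of such a buffering layer, assuming change messages arrive as (order_id, partial document) pairs and using the elasticsearch‑py bulk helper; the endpoint, index name, and flush policy are illustrative:

```python
from collections import defaultdict

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://pop-hot-1:9200")  # hypothetical endpoint
buffer = defaultdict(dict)                   # order_id -> merged partial doc

def on_message(order_id: str, changes: dict) -> None:
    """Merge successive changes to the same order instead of writing each one to ES."""
    buffer[order_id].update(changes)

def flush() -> None:
    """Apply all buffered changes in one bulk request: one update per order."""
    actions = [
        {"_op_type": "update", "_index": "pop_orders", "_id": oid, "doc": doc}
        for oid, doc in buffer.items()
    ]
    if actions:
        helpers.bulk(es, actions)
    buffer.clear()

# A consumer loop would drain the message queue into on_message() and call
# flush() once per interval, turning many per-field updates into far fewer
# ES write operations and reducing version conflicts.
```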

3.4 Automating Data Maintenance

Two‑stage migration was implemented:

Stage 1: Use reindex to move data, then replay archival messages to catch up, cutting migration time from ~20 days to ~10 days.

Stage 2: Create a daily archival ES index as a cache, store daily order changes, and run a scheduled task to batch read, compare, write, and delete original data, achieving fully automated migration without manual intervention.
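A sketch of what the stage‑2 scheduled task could look like, assuming a per‑day archival index acting as the cache and a numeric version field used for comparison; the cluster endpoints, index names, and field name are illustrative (elasticsearch‑py 8.x‑style calls):

```python
from elasticsearch import Elasticsearch, NotFoundError, helpers

hot = Elasticsearch("http://pop-hot-1:9200")    # hypothetical hot cluster
cold = Elasticsearch("http://pop-cold-1:9200")  # hypothetical cold cluster

def archive_day(day: str) -> None:
    """Move one day's aged orders from the hot cluster to the cold cluster, then clean up."""
    archive_index = f"pop_orders_archive_{day}"  # daily archival index used as a cache
    moved_ids = []
    # 1. Batch-read the day's archived order changes.
    for hit in helpers.scan(hot, index=archive_index, query={"query": {"match_all": {}}}):
        doc = hit["_source"]
        # 2. Compare with any existing cold copy so newer data is never overwritten.
        try:
            existing = cold.get(index="pop_orders_cold", id=hit["_id"])
            if existing["_source"].get("version", 0) >= doc.get("version", 0):
                continue
        except NotFoundError:
            pass
        # 3. Write the order into the cold cluster.
        cold.index(index="pop_orders_cold", id=hit["_id"], document=doc)
        moved_ids.append(hit["_id"])
    # 4. Delete the migrated orders from the hot cluster.
    if moved_ids:
        helpers.bulk(
            hot,
            ({"_op_type": "delete", "_index": "pop_orders", "_id": oid} for oid in moved_ids),
        )
```

Run on a schedule, a task of this shape is what lets the migration proceed without manual DUCC changes or gray‑scale traffic switching.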

3.5 Final Architecture

The final solution delivers a high‑performance, highly‑scalable, and highly‑available enterprise‑grade order search and analysis platform, fully supporting explosive business growth.

Appendix

4.1 Challenges of Maintaining Ultra‑Large Clusters

Elasticsearch’s peer‑to‑peer architecture requires every node to store the full cluster state. When node count exceeds 100, broadcasting the cluster state leads to network storms, slow convergence, and increased risk of Full GC pauses.

4.2 ES Update Mechanics

Updates follow a “delete + insert” model:

Read: Retrieve the old document (or full source for partial updates).

Soft delete: Mark the old document as deleted in the segment.

Insert: Write the new document as a fresh entry with a new version.

This process incurs significant CPU for re‑analysis and indexing, not just I/O.
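For concreteness, a partial update looks like a small patch from the client's point of view, yet Elasticsearch rewrites the whole document internally as described above. A small illustration using the elasticsearch‑py 8.x client (endpoint, index, and field names are illustrative):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://pop-hot-1:9200")  # hypothetical endpoint

# The client sends only the changed field...
es.update(index="pop_orders", id="order-123", doc={"fulfillment_status": "SHIPPED"})

# ...but internally Elasticsearch:
#   1. reads the current _source of order-123,
#   2. merges the partial doc into it,
#   3. soft-deletes the old document in its segment,
#   4. re-analyzes and indexes the merged document under a new version.
# Every "small" update therefore pays the full re-indexing cost of the document.
```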

4.3 Pressures from Frequent Updates

Disk I/O and segment merge storms: Each update creates a small segment; merges generate heavy I/O and can throttle writes.

Query performance degradation: Deleted documents (“tombstones”) remain until merged, increasing scan overhead and memory usage.

Cache invalidation: Frequent segment changes invalidate filter caches, leading to low cache hit rates.

GC pressure: Update operations generate many temporary objects, causing frequent Young GC and risking Full GC pauses.

Tags: Elasticsearch, Scaling, Order Management, Data Partitioning