Big Data 11 min read

Evolution and Optimization of JD Daojia Order Center Elasticsearch Cluster

This article details how JD Daojia's order center migrated from a simple MySQL‑backed system to a high‑throughput Elasticsearch cluster, describing each architectural phase, performance tuning measures, dual‑cluster real‑time backup, version upgrades, data synchronization strategies, and the key pitfalls encountered such as deep pagination and FieldData memory issues.

Architect's Tech Stack

JD Daojia's order center faced massive read‑heavy traffic, prompting a shift from MySQL to Elasticsearch to handle large‑scale order queries.

Initial stage: The cluster was deployed on elastic cloud with default settings, leading to single‑point failures and resource contention.

Isolation stage: To stop other workloads on the shared hosts from competing with the cluster for resources, the most resource-intensive nodes were moved to dedicated physical machines, which improved stability.

Node replica tuning: Each ES node was placed on its own physical server, and each shard went from 1 primary + 1 replica to 1 primary + 2 replicas. With more copies of every shard available to serve searches, query throughput increased.
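The replica change above can be illustrated with a minimal sketch, assuming the cluster is adjusted through Elasticsearch's update index settings API (`PUT /<index>/_settings`) and its `number_of_replicas` setting. The helper names are hypothetical, and the snippet only builds the request body rather than contacting a cluster.

```python
import json


def replica_settings(replicas: int) -> dict:
    """Build the request body for PUT /<index>/_settings (index name not shown)."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return {"index": {"number_of_replicas": replicas}}


def copies_per_shard(replicas: int) -> int:
    """One primary plus its replicas: the copies that can serve a search."""
    return 1 + replicas


if __name__ == "__main__":
    # Moving from 1 primary + 1 replica to 1 primary + 2 replicas.
    print(json.dumps(replica_settings(2)))
    print(copies_per_shard(1), "->", copies_per_shard(2))
```

Extra replicas also cost disk space and indexing work, so a change like this trades write overhead for read capacity, which suits a read-heavy order workload.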

Master-slave adjustment: A standby cluster was introduced for failover. Every write goes to both the primary and the backup cluster (dual write), and a switch held in ZooKeeper decides which cluster serves query traffic.
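A minimal sketch of the dual-write and traffic-switch idea follows. Everything in it is an assumption made for illustration: `FakeCluster` is an in-memory stand-in for an Elasticsearch cluster, `TrafficSwitch` stands in for a flag that would live in ZooKeeper, and `OrderSearchGateway` is a hypothetical name, not the team's actual component.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeCluster:
    """In-memory stand-in for an Elasticsearch cluster (hypothetical)."""
    name: str
    docs: dict = field(default_factory=dict)

    def index(self, order_id: str, doc: dict) -> None:
        self.docs[order_id] = doc

    def get(self, order_id: str) -> Optional[dict]:
        return self.docs.get(order_id)


@dataclass
class TrafficSwitch:
    """Stand-in for a flag stored in ZooKeeper: which cluster serves reads."""
    active: str = "primary"


class OrderSearchGateway:
    """Writes go to both clusters; reads follow the switch."""

    def __init__(self, primary: FakeCluster, backup: FakeCluster,
                 switch: TrafficSwitch) -> None:
        self.clusters = {"primary": primary, "backup": backup}
        self.switch = switch

    def write(self, order_id: str, doc: dict) -> None:
        # Dual write: the same document is indexed into every cluster.
        for cluster in self.clusters.values():
            cluster.index(order_id, doc)

    def read(self, order_id: str) -> Optional[dict]:
        # Only the currently active cluster answers queries.
        return self.clusters[self.switch.active].get(order_id)


if __name__ == "__main__":
    switch = TrafficSwitch()
    gateway = OrderSearchGateway(FakeCluster("primary"), FakeCluster("backup"), switch)
    gateway.write("order-1", {"status": "PAID"})
    switch.active = "backup"  # simulate a failover
    print(gateway.read("order-1"))
```

Because both clusters already hold the data, failing over only requires flipping the switch; no data has to be copied at the moment of failure.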

Current real‑time dual‑cluster stage: The primary cluster was upgraded from ES 1.7 to 6.x, while the backup cluster stores recent hot data (≈10% of primary size). During upgrades, the backup temporarily serves all queries to ensure zero downtime.

Data synchronization: Two approaches were considered: listening to the MySQL binlog, and writing to ES directly through its API from the order service. The team chose direct API writes for their simplicity and low latency, and added a compensation worker that retries failed updates so that ES and MySQL stay eventually consistent.
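The retry behaviour of such a compensation worker can be sketched as below. This is an illustration under stated assumptions, not the team's implementation: `CompensationWorker`, its in-memory queue, and the `max_attempts` limit are all hypothetical, and `write_fn` stands for whatever call writes one order document to ES.

```python
from collections import deque
from typing import Callable


class CompensationWorker:
    """Queues failed ES writes and retries them later (hypothetical sketch)."""

    def __init__(self, write_fn: Callable[[str, dict], None],
                 max_attempts: int = 3) -> None:
        self.write_fn = write_fn
        self.max_attempts = max_attempts
        self.pending = deque()  # (order_id, doc, attempts so far)
        self.dead = []          # tasks that exhausted their attempts

    def sync(self, order_id: str, doc: dict) -> bool:
        """Try the direct write; on failure, hand the task to the retry queue."""
        try:
            self.write_fn(order_id, doc)
            return True
        except Exception:
            self.pending.append((order_id, doc, 1))
            return False

    def run_once(self) -> int:
        """Retry each queued task once; return how many succeeded."""
        succeeded = 0
        for _ in range(len(self.pending)):
            order_id, doc, attempts = self.pending.popleft()
            try:
                self.write_fn(order_id, doc)
                succeeded += 1
            except Exception:
                if attempts + 1 >= self.max_attempts:
                    self.dead.append((order_id, doc))  # needs manual follow-up
                else:
                    self.pending.append((order_id, doc, attempts + 1))
        return succeeded
```

A production worker would likely persist its queue and re-read the latest row from MySQL before retrying, so that a delayed retry cannot overwrite newer data with a stale document; both points are omitted here for brevity.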

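Pitfalls encountered: (1) Queries that need strictly real-time results still go to MySQL, because a newly indexed document only becomes searchable in ES after the next index refresh. (2) Deep pagination should be avoided: to serve a page at a large offset, every shard has to collect and sort all hits up to that offset, and the coordinating node then merges the results from all shards. (3) FieldData, which is loaded into the JVM heap, caused memory pressure and query timeouts. The team resolved this by switching to Doc Values, which are built at index time, stored on disk, and read through the operating system's file cache rather than the heap.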
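The cost of deep pagination can be made concrete with a little arithmetic. The sketch below assumes the standard `from` + `size` behaviour, where each shard returns `from + size` candidates to the coordinating node, and it shows the request body for `search_after`, an alternative available in Elasticsearch 5.0 and later. The field names `create_time` and `order_id` are hypothetical; the article does not say which alternative the team adopted.

```python
from typing import Optional, Sequence


def from_size_cost(offset: int, size: int, shards: int) -> tuple:
    """Return (hits collected per shard, hits merged by the coordinating node)."""
    per_shard = offset + size
    return per_shard, per_shard * shards


def search_after_body(size: int, sort_fields: Sequence[str],
                      last_sort_values: Optional[Sequence] = None) -> dict:
    """Build a search body that resumes after the last hit of the previous page.

    sort_fields should end with a unique tiebreaker so the ordering is stable.
    """
    body = {"size": size, "sort": [{name: "desc"} for name in sort_fields]}
    if last_sort_values is not None:
        if len(last_sort_values) != len(sort_fields):
            raise ValueError("need one sort value per sort field")
        body["search_after"] = list(last_sort_values)
    return body


if __name__ == "__main__":
    # Page starting at offset 10,000 with 10 hits per page on 5 shards.
    print(from_size_cost(10_000, 10, 5))
    # Next page, continuing from the sort values of the last hit seen.
    print(search_after_body(10, ["create_time", "order_id"], [1546300800000, "A1001"]))
```

With `search_after`, each shard only needs to collect `size` hits past the cursor, so the cost per page stays flat no matter how far the reader has scrolled. The trade-off is that pages can only be walked in order; jumping to an arbitrary page number is not possible.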

Overall, continuous architectural iteration driven by rapid business growth has resulted in a scalable, high‑performance, and highly available Elasticsearch solution for order data.

Tags: dual write, performance optimization, Elasticsearch, DocValues, data synchronization, order system, cluster architecture

Written by

Architect's Tech Stack

Java backend, microservices, distributed systems, containerized programming, and more.
