Implementing Data Heterogeneity for JD Daojia Order Fulfillment: Architecture, Canal Integration, and Lessons Learned
This article examines JD Daojia's order fulfillment system, detailing the challenges of high‑volume prompt‑sound queries, the division of responsibilities among Redis, MySQL, and Elasticsearch, the adoption of Canal for asynchronous data replication, deployment practices with Kafka and Zookeeper, and the key operational lessons learned.
Prompt Sound Business Background
The order fulfillment system of JD Daojia involves multiple parties (users, merchants, logistics) and a series of steps from payment to delivery. Merchants rely on a prompt-sound feature that audibly alerts them to new orders, but the high-volume queries that drive it cause Elasticsearch CPU spikes and service degradation during peak periods.
Underlying Data Source Responsibilities
Different storage components serve distinct roles:
Redis: stores and queries batch tasks, using a Zset (sorted set) for recent-task retrieval; not used for complex queries.
MySQL: persists order data, split into hot (active orders) and cold (historical orders) databases, with master-slave replication for read scaling.
Elasticsearch: handles the majority of query load, with three clusters (HOT, FULL, and a dedicated Remind cluster) to isolate prompt-sound traffic.
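The Zset pattern above can be illustrated with a small in-memory stand-in for Redis's ZADD / ZRANGEBYSCORE: the score is the task's creation timestamp and the member is the task id, so fetching "recent tasks" is a range query over scores. This is a hedged sketch of the pattern only; the class and method names mimic Redis commands but are not the article's actual code.

```python
import bisect

class RecentTaskIndex:
    """In-memory stand-in for a Redis sorted set (ZADD / ZRANGEBYSCORE):
    score = task creation timestamp, member = task id."""

    def __init__(self):
        self._entries = []  # kept sorted as (score, task_id) tuples

    def zadd(self, score, task_id):
        # Drop any stale entry for the same member, then insert in score order.
        self._entries = [(s, t) for s, t in self._entries if t != task_id]
        bisect.insort(self._entries, (score, task_id))

    def zrangebyscore(self, min_score, max_score):
        lo = bisect.bisect_left(self._entries, (min_score, ""))
        hi = bisect.bisect_right(self._entries, (max_score, "\uffff"))
        return [t for _, t in self._entries[lo:hi]]

# Example: index three batch tasks, then fetch those from the last 5 minutes.
idx = RecentTaskIndex()
now = 1_700_000_000  # fixed timestamp for reproducibility
idx.zadd(now - 600, "task-old")
idx.zadd(now - 120, "task-a")
idx.zadd(now - 30, "task-b")
recent = idx.zrangebyscore(now - 300, now)
print(recent)  # → ['task-a', 'task-b']
```

With real Redis the same shape is `ZADD tasks <timestamp> <task_id>` plus `ZRANGEBYSCORE tasks <min> <max>`, which is why Zset fits recent-task retrieval but not complex queries.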
Data Write Complexity Issue
Standing up the dedicated prompt-sound ES cluster added yet another write target for every order change, increasing write complexity and prompting an evaluation of heterogeneous-data middleware. Selection criteria included community activity, availability, and product maturity, and Canal emerged as the preferred tool.
Canal Overview and Practice
Canal captures MySQL binlog changes, filters events, and forwards them to a store, which then pushes data to downstream systems such as Kafka. The workflow consists of three steps: Load&Store (binlog extraction), Send&Ack (delivery to Kafka), and Update MetaInfo (synchronizing offsets in Zookeeper).
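The three-step loop can be sketched as a toy pipeline: buffered events are delivered to a sink, and the consumed offset (the "meta info" that would live in Zookeeper) advances only after an ack, so an unacked event is resent on the next pass. All names here are illustrative, not Canal's real API.

```python
class MiniPipeline:
    """Toy model of the three-step loop: Load&Store buffers binlog events,
    Send&Ack delivers them to a sink (Kafka in the article), and
    Update MetaInfo advances the checkpoint only after a successful ack."""

    def __init__(self, sink):
        self.store = []   # buffered binlog events (Load&Store)
        self.offset = 0   # persisted meta info (would live in Zookeeper)
        self.sink = sink  # downstream delivery, e.g. a Kafka producer

    def load(self, events):
        self.store.extend(events)

    def send_and_ack(self):
        # Deliver everything past the checkpoint; advance the offset per ack.
        for event in self.store[self.offset:]:
            if not self.sink(event):
                break            # no ack: offset frozen, event resent later
            self.offset += 1     # Update MetaInfo only after the ack

delivered = []
fail_once = {"e3"}

def sink(event):
    if event in fail_once:
        fail_once.discard(event)  # reject this event on its first attempt
        return False
    delivered.append(event)
    return True

p = MiniPipeline(sink)
p.load(["e1", "e2", "e3", "e4"])
p.send_and_ack()            # stops at e3, checkpoint frozen at offset 2
p.send_and_ack()            # retry resumes from the checkpoint, no loss
print(p.offset, delivered)  # → 4 ['e1', 'e2', 'e3', 'e4']
```

The key property this models is that a crash or failed delivery never advances the checkpoint, which is what makes replay from Zookeeper-stored offsets possible later in the article.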
Canal High Availability
Deployer HA relies on Zookeeper ephemeral nodes and retry mechanisms.
MySQL HA requires GTID mode to ensure consistent binlog positions across master and slave.
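Concretely, GTID mode is a MySQL server setting; a configuration along these lines (MySQL 5.7+) gives binlog consumers like Canal transaction identifiers that stay consistent across a master-slave failover. The values shown are illustrative, not the article's actual deployment.

```ini
# my.cnf fragment — settings binlog-based replication typically relies on
[mysqld]
server_id                = 1          # must be unique per instance
log_bin                  = mysql-bin  # binlog must be enabled
binlog_format            = ROW        # Canal parses row-based events
gtid_mode                = ON         # tag each transaction with a GTID
enforce_gtid_consistency = ON         # reject statements unsafe under GTID
```

With GTIDs, a consumer can resume by transaction identity rather than by file-and-position coordinates, which differ between master and slave.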
Deployment Practice
Two Deployer instances provide HA for data transfer, while Kafka buffers binlog events before they are consumed by adapters and written to the Remind ES cluster. Order IDs are hashed to maintain ordering within Kafka partitions, and Zookeeper stores Canal metadata for persistence.
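The order-ID hashing above can be sketched in a few lines: a stable hash of the order id picks the partition, so every binlog event for the same order lands on the same Kafka partition and is consumed in order. The partition count and function name are illustrative; `crc32` stands in for whatever hash the real producer uses (note that Python's built-in `hash()` is salted per process and would not be stable).

```python
import zlib

NUM_PARTITIONS = 8  # illustrative partition count for the binlog topic

def partition_for(order_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an order id to a Kafka partition with a stable hash, so all
    events for one order share a partition and keep their order."""
    return zlib.crc32(order_id.encode("utf-8")) % num_partitions

# Repeated calls for the same order always agree on the partition.
p1 = partition_for("order-10001")
print(p1 == partition_for("order-10001"))  # → True
```

Kafka only guarantees ordering within a single partition, which is why keying by order id (rather than round-robin) matters for fulfillment-state updates.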
Practical Issues & Summary
Issue 1 – Kafka Unavailability: caused ES data gaps and delayed order fulfillment; resolved by restoring Kafka and replaying missing data from Zookeeper checkpoints.
Issue 2 – Deployer Failure: automatic failover to standby Deployer prevented service interruption, highlighting the need for multi‑machine, multi‑region deployment for true HA.
Key takeaways: monitoring and alerting are essential for distributed pipelines; a fallback write path is necessary when exhaustive error scenarios cannot be enumerated; and redundancy at both machine and service levels is critical for high availability.
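The "fallback write path" takeaway can be sketched as a degrade-on-failure wrapper: if the async pipeline (Kafka) is unavailable, fall back to a direct synchronous write so no data is lost. The function names are illustrative placeholders, not the article's codebase.

```python
def write_with_fallback(event, send_to_kafka, write_directly):
    """Try the async pipeline first; on broker failure, degrade to a
    direct write (e.g. straight to ES) instead of dropping the event."""
    try:
        send_to_kafka(event)
        return "async"
    except ConnectionError:
        write_directly(event)
        return "fallback"

# Simulate a broker outage: the Kafka path raises, the fallback catches it.
es_store = []

def kafka_down(event):
    raise ConnectionError("broker unavailable")

path = write_with_fallback({"order_id": 1}, kafka_down, es_store.append)
print(path, es_store)  # → fallback [{'order_id': 1}]
```

The fallback path trades throughput for durability, which is acceptable precisely because it only activates when the primary path is already down.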
Dada Group Technology
Sharing insights and experiences from Dada Group's R&D department on product refinement and technology advancement, connecting with fellow geeks to exchange ideas and grow together.