
Design and Implementation of JD Daojia Open Platform Message System (BMQ)

This article explains the architecture, reliability mechanisms, dynamic configuration, monitoring, and alerting strategies of JD Daojia Open Platform's Business Message Queue (BMQ), illustrating how the system handles bidirectional communication, fault isolation, and scalable message processing for merchants.

Dada Group Technology

JD Daojia Open Platform is an integration platform that provides services to merchants and third‑party developers, acting as a bridge between merchant data and the platform's delivery data.

The platform offers many callable APIs and requires merchants to provide interfaces for the platform to invoke, such as new order creation, order status changes, store information updates, and promotion approvals.

While the platform‑provided APIs are relatively stable, merchant‑provided notification interfaces are far less predictable. Service downtime, network interruptions, and retry strategies therefore had to be considered from the start of the messaging system design.

Core Design of the Messaging System

The Business Message Queue (BMQ) is built on JD's internal MQ middleware, whose topics are consumed by both internal business systems and the open platform. The open platform converts each message into a standard or non‑standard format and then sends it to merchants.

JD's self‑developed MQ provides high availability and data reliability. It supports automatic retry, configurable push frequency and retry duration, and delivery without message loss.

When a merchant's endpoint fails, continuous retries can cause backlog that affects other merchants. To isolate impact, the platform separates key‑account (KA) merchants from regular merchants, giving KA merchants dedicated channels that can be expanded as needed.

However, non‑KA merchants can still affect each other as traffic grows. To further isolate, each message type gets its own retry channel, ensuring problematic messages do not block normal processing.
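The isolation rules above can be sketched as a small routing function. This is a minimal illustration, not BMQ's actual code: the class `ChannelRouter`, the channel names (`ka.`, `retry.`, `regular`), and the choice to keep a KA merchant's retries on its own dedicated channel are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical router that picks a channel name for one delivery attempt. */
final class ChannelRouter {
    private final Set<String> kaMerchants;

    ChannelRouter(Set<String> kaMerchants) {
        this.kaMerchants = new HashSet<>(kaMerchants);
    }

    /**
     * KA merchants always use their dedicated channel (assumption: retries too).
     * For regular merchants, a failed attempt moves to the retry channel of its
     * message type, so it cannot block first-time deliveries on the shared channel.
     */
    String route(String merchantId, String messageType, boolean previousAttemptFailed) {
        if (kaMerchants.contains(merchantId)) {
            return "ka." + merchantId;
        }
        if (previousAttemptFailed) {
            return "retry." + messageType;
        }
        return "regular";
    }
}
```

Keeping one retry channel per message type means that a merchant failing on one kind of message delays only retries of that same type.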

In practice, a scenario arose where the regular channel was heavily backlogged while the retry channel remained empty. The HTTP client timeout was set to 3 seconds, so slow merchant responses (~2 seconds) still completed within the timeout and were not treated as failures. Those messages never moved to the retry channel and instead slowed down the regular channel.

To address this, a degradation channel was added. Real‑time monitoring (Kafka + Flink) tracks response times per merchant per message; if thresholds are exceeded, messages are routed to the degradation channel instead of the regular channel. Once normal behavior resumes, the degradation flag is cleared.
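The degradation decision can be illustrated with a small in-memory tracker. The production system uses Kafka and Flink for this; the sketch below only shows the idea. The class name `DegradationTracker`, the fixed-size window, and the use of an average response time as the metric are assumptions for the example, not details from the platform.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker: flags a merchant/message-type pair as degraded when it is slow. */
final class DegradationTracker {
    private final int windowSize;
    private final long thresholdMillis;
    private final Map<String, Deque<Long>> samples = new HashMap<>();

    DegradationTracker(int windowSize, long thresholdMillis) {
        this.windowSize = windowSize;
        this.thresholdMillis = thresholdMillis;
    }

    /** Records one response time and returns whether the pair is degraded afterwards. */
    boolean record(String merchantId, String messageType, long responseMillis) {
        Deque<Long> window = samples.computeIfAbsent(key(merchantId, messageType), k -> new ArrayDeque<>());
        window.addLast(responseMillis);
        if (window.size() > windowSize) {
            window.removeFirst(); // keep only the most recent samples
        }
        return exceedsThreshold(window);
    }

    /** Current flag; it clears by itself once recent responses are fast again. */
    boolean isDegraded(String merchantId, String messageType) {
        Deque<Long> window = samples.get(key(merchantId, messageType));
        return window != null && exceedsThreshold(window);
    }

    private boolean exceedsThreshold(Deque<Long> window) {
        if (window.size() < windowSize) {
            return false; // not enough data to judge
        }
        long sum = 0;
        for (long value : window) {
            sum += value;
        }
        return sum / window.size() > thresholdMillis;
    }

    private static String key(String merchantId, String messageType) {
        return merchantId + "|" + messageType;
    }
}
```

A sender would consult `isDegraded` before publishing and pick the degradation channel when it returns true. The sketch is not thread-safe and keeps all state in one process.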

The solution combines three channel types (regular, retry, degradation) with specific strategies to handle most real‑world scenarios while remaining extensible.

Message Order Guarantee

The platform does not enforce strict ordering itself; it relies on whatever ordering the underlying MQ provides. Because retries can deliver the same message more than once, merchants must implement deduplication on their side.
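On the merchant side, deduplication can be as simple as remembering which messages have already been processed. The sketch below assumes every message carries a unique identifier (called `messageId` here, a hypothetical name) and keeps the seen set in memory; a real receiver would more likely persist it and expire old entries.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical merchant-side wrapper that processes each message ID at most once. */
final class DeduplicatingHandler {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();
    private final Consumer<String> delegate;

    DeduplicatingHandler(Consumer<String> delegate) {
        this.delegate = delegate;
    }

    /** Returns true if the payload was processed, false if the ID was a duplicate. */
    boolean handle(String messageId, String payload) {
        if (!seen.add(messageId)) {
            return false; // already handled: acknowledge without reprocessing
        }
        try {
            delegate.accept(payload);
            return true;
        } catch (RuntimeException e) {
            seen.remove(messageId); // allow a later retry to process it
            throw e;
        }
    }
}
```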

Alert Mechanism

Although the core design mitigates many risks, challenges remain, such as promptly notifying merchants of interface failures and residual cross‑impact among non‑KA merchants. An alert system based on statistical thresholds sends SMS notifications to merchant owners and platform engineers when continuous failures exceed defined limits.

When multiple merchants cause aggregate failures, a diagnostic tool allows operators to pinpoint the most problematic merchant and its failing interfaces, enabling quick disabling of problematic messages.
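The kind of aggregation such a diagnostic tool performs might look like the following. This is only a sketch under assumptions: `FailureDiagnostics` and its methods are hypothetical names, and failures are counted in memory rather than read from the platform's real statistics.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical failure counter grouped by merchant and by interface. */
final class FailureDiagnostics {
    // merchantId -> (interface name -> failure count)
    private final Map<String, Map<String, Integer>> failures = new HashMap<>();

    void recordFailure(String merchantId, String interfaceName) {
        failures.computeIfAbsent(merchantId, k -> new HashMap<>())
                .merge(interfaceName, 1, Integer::sum);
    }

    /** Merchant with the highest total failure count, or null if nothing was recorded. */
    String worstMerchant() {
        String worst = null;
        int worstTotal = 0;
        for (Map.Entry<String, Map<String, Integer>> entry : failures.entrySet()) {
            int total = 0;
            for (int count : entry.getValue().values()) {
                total += count;
            }
            if (total > worstTotal) {
                worstTotal = total;
                worst = entry.getKey();
            }
        }
        return worst;
    }

    /** Copy of the per-interface failure counts for one merchant. */
    Map<String, Integer> failuresByInterface(String merchantId) {
        Map<String, Integer> result = new HashMap<>();
        Map<String, Integer> counts = failures.get(merchantId);
        if (counts != null) {
            result.putAll(counts);
        }
        return result;
    }
}
```

An operator view would call `worstMerchant()` first and then `failuresByInterface(...)` to decide which message to disable.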

Operators can also manually stop alerts while keeping retry mechanisms active to avoid unnecessary SMS costs.
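The threshold alert and the manual mute described in this section can be sketched together. The names below (`FailureAlerter`, `onResult`, `mute`) are hypothetical, and firing exactly once when the failure streak reaches the threshold is an assumption chosen to avoid repeated SMS messages; the article does not state the platform's exact rule.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical alert rule based on consecutive delivery failures per merchant. */
final class FailureAlerter {
    private final int threshold;
    private final Map<String, Integer> consecutiveFailures = new HashMap<>();
    private final Set<String> muted = new HashSet<>();

    FailureAlerter(int threshold) {
        this.threshold = threshold;
    }

    /** Returns true when this delivery result should trigger an SMS alert. */
    boolean onResult(String merchantId, boolean success) {
        if (success) {
            consecutiveFailures.remove(merchantId); // a success resets the streak
            return false;
        }
        int count = consecutiveFailures.merge(merchantId, 1, Integer::sum);
        // Alert once, at the moment the streak reaches the threshold, unless muted.
        return count == threshold && !muted.contains(merchantId);
    }

    /** Muting only suppresses alerts; failures are still counted and retries continue elsewhere. */
    void mute(String merchantId) {
        muted.add(merchantId);
    }

    void unmute(String merchantId) {
        muted.remove(merchantId);
    }
}
```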

Dynamic Message Publishing

Previously, adding a new message required extensive Java code changes, leading to high maintenance cost. To simplify, the platform introduced a dynamic configuration approach where standard messages are defined via configurable rules (field mapping, defaults, simple logic) through a UI.
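To make the rule-based idea concrete, here is a minimal sketch of field mapping with defaults. The rule shape (`sourceField`, `targetField`, `defaultValue`) and the class names are assumptions for illustration; the platform's real rule format is not described in this article.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical rule: copy one internal field to a merchant-facing field, with a default. */
final class FieldRule {
    final String sourceField;
    final String targetField;
    final Object defaultValue; // may be null, meaning "no default"

    FieldRule(String sourceField, String targetField, Object defaultValue) {
        this.sourceField = sourceField;
        this.targetField = targetField;
        this.defaultValue = defaultValue;
    }
}

/** Applies configured rules to an internal message instead of hand-written Java per message. */
final class RuleBasedConverter {
    static Map<String, Object> convert(Map<String, Object> internalMessage, List<FieldRule> rules) {
        Map<String, Object> merchantMessage = new LinkedHashMap<>();
        for (FieldRule rule : rules) {
            Object value = internalMessage.get(rule.sourceField);
            if (value == null) {
                value = rule.defaultValue; // fall back to the configured default
            }
            if (value != null) {
                merchantMessage.put(rule.targetField, value);
            }
        }
        return merchantMessage;
    }
}
```

With rules stored as data, adding a message type becomes a configuration change rather than a code change, which is the point of the dynamic approach.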

Field configuration details are shown below.

The management UI supports add/modify/delete operations and dynamically subscribes or unsubscribes MQ topics without writing listeners, enabling fast rollback if needed.

Server‑side code for dynamic subscription is as follows:

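// An empty ips field broadcasts the command to every machine; a non-empty value limits it to the listed machines.
// Caveat: if getIps() returns a String, contains() is a substring match, so "10.0.0.1" also matches "10.0.0.11".
if (StringUtils.isEmpty(dynamicLoadingMessage.getIps()) || dynamicLoadingMessage.getIps().contains(IPUtil.getIp())) {
  // type == 0: load (subscribe to) the topic; any other value: unload (unsubscribe from) it.
  if (dynamicLoadingMessage.getType() == 0) {
    // Drop any existing subscription before subscribing, so a repeated load command re-subscribes cleanly.
    openPlatformBMQEngine.getMessageConsumer().unsubscribe(dynamicLoadingMessage.getTopic());
    openPlatformBMQEngine.getMessageConsumer().subscribe(dynamicLoadingMessage.getTopic(), commonBMQListener);
    // Successful operations are logged at info level rather than error level.
    LOGGER.info("DynamicLoadingListener->onMessage->subscribe->dynamicLoadingMessage:{}", JsonUtils.toJson(dynamicLoadingMessage));
  } else {
    openPlatformBMQEngine.getMessageConsumer().unsubscribe(dynamicLoadingMessage.getTopic());
    LOGGER.info("DynamicLoadingListener->onMessage->unsubscribe->dynamicLoadingMessage:{}", JsonUtils.toJson(dynamicLoadingMessage));
  }
}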
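One detail worth illustrating is the host check. If the `ips` value is a plain string, a substring test can match the wrong machine, and an exact comparison against each listed address avoids that. The helper below is a hypothetical sketch; in particular, the comma separator is an assumption, since the article does not say how multiple addresses are encoded.

```java
import java.util.Arrays;

/** Hypothetical helper: does a dynamic-loading command target this host? */
final class HostTargeting {
    static boolean targetsHost(String ips, String localIp) {
        if (ips == null || ips.trim().isEmpty()) {
            return true; // empty list means broadcast to every machine
        }
        // Compare whole addresses so "10.0.0.1" does not match "10.0.0.11".
        return Arrays.stream(ips.split(","))
                .map(String::trim)
                .anyMatch(localIp::equals);
    }
}
```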

After this processing, the publishing workflow is visualized below.

Summary of Dynamic BMQ Features

1. Platform‑wide messaging system.

2. Eliminates custom integration by using rule‑based dynamic parsing, providing a unified entry point for new messages.

3. Low‑risk integration: avoids manual code changes and deployment, reducing development, testing, and release effort from ~2 days to about 1 minute, eventually allowing non‑developers to onboard messages.

Conclusion

This article introduced how the BMQ system ensures stability, usability, and simplicity, combining practical solutions with theoretical considerations to meet current needs and future growth. Ongoing optimization will continue to improve the platform.

Accumulation, summarization, and optimization are the paths to further advancement.

It is an honor to join the JD Daojia Open Platform team and grow together.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
