Design and Implementation of JD Daojia Open Platform Message System (BMQ)
This article explains the architecture, reliability mechanisms, dynamic configuration, monitoring, and alerting strategies of JD Daojia Open Platform's Business Message Queue (BMQ), illustrating how the system handles bidirectional communication, fault isolation, and scalable message processing for merchants.
JD Daojia Open Platform is an integration platform that provides services to merchants and third‑party developers, acting as a bridge between merchant data and the platform's delivery data.
The platform offers many callable APIs and requires merchants to provide interfaces for the platform to invoke, such as new order creation, order status changes, store information updates, and promotion approvals.
While the platform‑provided APIs are relatively stable, merchant‑provided notification interfaces are uncertain; issues like service downtime, network interruptions, and retry strategies must be considered from the start of the messaging system design.
Core Design of the Messaging System
The Business Message Queue (BMQ) is built on JD's internal MQ middleware. Internal business systems publish to it, and the open platform subscribes to the topics it needs. The open platform converts each message to either the standard or a non‑standard format and then pushes it to the merchant.
JD's self‑developed MQ provides high availability and data reliability. It supports automatic retry, configurable push frequency and retry duration, and delivery without message loss.
When a merchant's endpoint fails, continuous retries can cause backlog that affects other merchants. To isolate impact, the platform separates key‑account (KA) merchants from regular merchants, giving KA merchants dedicated channels that can be expanded as needed.
However, non‑KA merchants can still affect each other as traffic grows. To further isolate, each message type gets its own retry channel, ensuring problematic messages do not block normal processing.
In practice, a scenario arose where the regular channel was heavily backlogged while the retry channel remained empty. The HTTP client timeout was set to 3 seconds, so slow merchant responses (about 2 seconds) still counted as successes. These messages were never moved to the retry channel, yet each one held up consumption of the regular channel.
To address this, a degradation channel was added. Real‑time monitoring (Kafka + Flink) tracks response times per merchant per message; if thresholds are exceeded, messages are routed to the degradation channel instead of the regular channel. Once normal behavior resumes, the degradation flag is cleared.
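The set and clear logic of the degradation flag can be sketched as below. This is a minimal in-memory illustration, not the platform's Kafka + Flink job: the class name `DegradationTracker`, the moving-average rule, the threshold, and the window size are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the real-time monitor: flags slow (merchant, message type) pairs. */
final class DegradationTracker {
    private final long slowThresholdMillis; // assumed threshold, e.g. 1500 ms
    private final int windowSize;           // number of recent samples kept per key
    private final Map<String, Deque<Long>> samples = new HashMap<>();
    private final Map<String, Boolean> degraded = new HashMap<>();

    DegradationTracker(long slowThresholdMillis, int windowSize) {
        this.slowThresholdMillis = slowThresholdMillis;
        this.windowSize = windowSize;
    }

    /** Records one response time and re-evaluates the flag once the window is full. */
    synchronized void record(String merchantId, String messageType, long responseMillis) {
        String key = merchantId + "|" + messageType;
        Deque<Long> window = samples.computeIfAbsent(key, k -> new ArrayDeque<>());
        window.addLast(responseMillis);
        if (window.size() > windowSize) {
            window.removeFirst(); // keep only the most recent samples
        }
        if (window.size() == windowSize) {
            double average = window.stream().mapToLong(Long::longValue).average().orElse(0);
            // Set the flag when the merchant is slow, clear it when responses recover.
            degraded.put(key, average > slowThresholdMillis);
        }
    }

    /** Returns true if messages for this merchant and type should use the degradation channel. */
    synchronized boolean isDegraded(String merchantId, String messageType) {
        return degraded.getOrDefault(merchantId + "|" + messageType, false);
    }
}
```

With a threshold below the HTTP timeout, a merchant that answers in about 2 seconds is flagged even though none of its requests fail, which is the case the retry channel could not catch.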
The solution combines three channel types (regular, retry, degradation) with specific strategies to handle most real‑world scenarios while remaining extensible.
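As a rough sketch of how the three channel types might be combined, the routing decision could look like the following. The class `ChannelRouter`, the precedence of retry over degradation, and the topic naming scheme are assumptions for illustration; the article does not specify them.

```java
import java.util.Locale;

/** Hypothetical routing rule combining the regular, retry, and degradation channels. */
final class ChannelRouter {
    enum Channel { REGULAR, RETRY, DEGRADATION }

    /**
     * @param previousAttemptFailed true if an earlier push of this message failed or timed out
     * @param merchantDegraded      true if monitoring currently flags this merchant as slow
     */
    static Channel route(boolean previousAttemptFailed, boolean merchantDegraded) {
        if (previousAttemptFailed) {
            return Channel.RETRY;        // failed messages leave the normal path
        }
        if (merchantDegraded) {
            return Channel.DEGRADATION;  // slow merchants do not block healthy ones
        }
        return Channel.REGULAR;
    }

    /** Builds a topic name per tier, message type, and channel (naming scheme is assumed). */
    static String topicFor(String messageType, boolean keyAccount, Channel channel) {
        String tier = keyAccount ? "ka" : "common";
        return tier + "." + messageType + "." + channel.name().toLowerCase(Locale.ROOT);
    }
}
```

For example, `ChannelRouter.topicFor("newOrder", false, ChannelRouter.route(true, false))` yields `common.newOrder.retry` under this naming assumption, reflecting one retry channel per message type.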
Message Order Guarantee
The platform does not enforce strict ordering itself; it relies on the ordering guarantees of the underlying MQ. Because retries can deliver the same message more than once, merchants must implement deduplication on their side.
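On the merchant side, deduplication could be sketched as follows. The example assumes every notification carries a unique identifier (called `messageId` here, a hypothetical name) and uses an in-memory set; a production receiver would more likely rely on a durable store such as a database unique key.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical merchant-side receiver that processes each message ID at most once. */
final class IdempotentReceiver {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    /**
     * Runs the business logic only for message IDs not seen before.
     *
     * @return true if the message was processed now, false if it was a duplicate
     */
    boolean handle(String messageId, Runnable businessLogic) {
        if (!seen.add(messageId)) {
            return false; // duplicate delivery: acknowledge without reprocessing
        }
        try {
            businessLogic.run();
            return true;
        } catch (RuntimeException e) {
            seen.remove(messageId); // processing failed, so let a later retry through
            throw e;
        }
    }
}
```

Returning normally for a duplicate lets the merchant endpoint answer with success, so the platform stops retrying a message that was already handled.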
Alert Mechanism
Although the core design mitigates many risks, challenges remain, such as promptly notifying merchants of interface failures and residual cross‑impact among non‑KA merchants. An alert system based on statistical thresholds sends SMS notifications to merchant owners and platform engineers when continuous failures exceed defined limits.
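A threshold-based alert on continuous failures might be sketched like this. The class name `FailureAlertCounter` and the choice to fire once when the count reaches the limit are assumptions; the real statistics and SMS delivery are not described in detail here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical counter that signals an alert after N consecutive failed pushes per merchant. */
final class FailureAlertCounter {
    private final int threshold;
    private final Map<String, Integer> consecutiveFailures = new HashMap<>();

    FailureAlertCounter(int threshold) {
        this.threshold = threshold;
    }

    /**
     * Records the outcome of one push.
     *
     * @return true exactly when the consecutive-failure count reaches the threshold,
     *         so a caller sends one alert per failure streak rather than one per failure
     */
    synchronized boolean onResult(String merchantId, boolean success) {
        if (success) {
            consecutiveFailures.remove(merchantId); // any success ends the streak
            return false;
        }
        int failures = consecutiveFailures.merge(merchantId, 1, Integer::sum);
        return failures == threshold;
    }
}
```

Firing only at the moment the threshold is reached keeps the number of SMS messages bounded while the endpoint stays down.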
When multiple merchants cause aggregate failures, a diagnostic tool allows operators to pinpoint the most problematic merchant and its failing interfaces, enabling quick disabling of problematic messages.
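The kind of ranking such a diagnostic tool performs can be illustrated with a small aggregation. The class `FailureDiagnostics` and the input shape (one `{merchantId, interfaceName}` pair per failed push) are hypothetical and only meant to show the idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that ranks merchant interfaces by number of failed pushes. */
final class FailureDiagnostics {
    /**
     * @param failures one entry per failed push, as {merchantId, interfaceName}
     * @return "merchantId/interfaceName" keys with their failure counts, most failures first
     */
    static List<Map.Entry<String, Integer>> rank(List<String[]> failures) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] failure : failures) {
            counts.merge(failure[0] + "/" + failure[1], 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        // Sort by count descending, then by key so the order is deterministic.
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return ranked;
    }
}
```

The first entry of the result identifies the merchant and interface an operator would look at, and possibly disable, first.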
Operators can also manually stop alerts while keeping retry mechanisms active to avoid unnecessary SMS costs.
Dynamic Message Publishing
Previously, adding a new message required extensive Java code changes, leading to high maintenance cost. To simplify, the platform introduced a dynamic configuration approach where standard messages are defined via configurable rules (field mapping, defaults, simple logic) through a UI.
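To make the rule-based idea concrete, here is a minimal sketch of converting a message with configured field mappings and defaults. The classes `FieldRule` and `RuleBasedConverter` and the flat string-to-string message shape are assumptions; the platform's actual rule format is not shown in this article.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical rule: copy one source field to a target field, with an optional default. */
final class FieldRule {
    final String sourceField;  // field name in the internal MQ message
    final String targetField;  // field name in the message sent to the merchant
    final String defaultValue; // used when the source field is absent; may be null

    FieldRule(String sourceField, String targetField, String defaultValue) {
        this.sourceField = sourceField;
        this.targetField = targetField;
        this.defaultValue = defaultValue;
    }
}

/** Hypothetical converter that applies configured rules instead of hand-written Java per message. */
final class RuleBasedConverter {
    static Map<String, String> convert(Map<String, String> source, List<FieldRule> rules) {
        Map<String, String> target = new LinkedHashMap<>();
        for (FieldRule rule : rules) {
            String value = source.get(rule.sourceField);
            if (value == null) {
                value = rule.defaultValue; // fall back to the configured default
            }
            if (value != null) {
                target.put(rule.targetField, value); // skip fields with no value and no default
            }
        }
        return target;
    }
}
```

Because the rules are plain data, they can be stored and edited through a UI, and a new message type only needs a new rule list rather than a code change and deployment.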
Field configuration details are shown below.
The management UI supports add/modify/delete operations and dynamically subscribes or unsubscribes MQ topics without writing listeners, enabling fast rollback if needed.
Server‑side code for dynamic subscription is as follows:
// If the ips field is empty, the message is broadcast and every machine handles it;
// if it is not empty, only the listed machines execute it.
if (StringUtils.isEmpty(dynamicLoadingMessage.getIps()) || dynamicLoadingMessage.getIps().contains(IPUtil.getIp())) {
    // type == 0: the MQ topic should be loaded (subscribed); any other value: it should be unloaded.
    if (dynamicLoadingMessage.getType() == 0) {
        // Drop any existing subscription first, then subscribe again with the common listener.
        openPlatformBMQEngine.getMessageConsumer().unsubscribe(dynamicLoadingMessage.getTopic());
        openPlatformBMQEngine.getMessageConsumer().subscribe(dynamicLoadingMessage.getTopic(), commonBMQListener);
        LOGGER.info("DynamicLoadingListener->onMessage->subscribe->dynamicLoadingMessage:{}", JsonUtils.toJson(dynamicLoadingMessage));
    } else {
        openPlatformBMQEngine.getMessageConsumer().unsubscribe(dynamicLoadingMessage.getTopic());
        LOGGER.info("DynamicLoadingListener->onMessage->unsubscribe->dynamicLoadingMessage:{}", JsonUtils.toJson(dynamicLoadingMessage));
    }
}
After this processing, the publishing workflow is visualized below.
Summary of Dynamic BMQ Features
1. Platform‑wide messaging system.
2. Eliminates custom integration by using rule‑based dynamic parsing, providing a unified entry point for new messages.
3. Low‑risk integration: avoids manual code changes and deployment, reducing development, testing, and release effort from ~2 days to about 1 minute, eventually allowing non‑developers to onboard messages.
Conclusion
This article introduced how the BMQ system ensures stability, usability, and simplicity, combining practical solutions with theoretical considerations to meet current needs and future growth. Ongoing optimization will continue to improve the platform.
Accumulation, summarization, and optimization are the paths to further advancement.
It is an honor to join the JD Daojia Open Platform team and grow together.
Dada Group Technology
Sharing insights and experiences from Dada Group's R&D department on product refinement and technology advancement, connecting with fellow geeks to exchange ideas and grow together.
