Designing a Scalable After‑Sales System: Architecture, Distributed Locks, and ES Sync

This article explains the capabilities, positioning, and three‑tier architecture of JD Daojia's after‑sales system, detailing how it handles multi‑endpoint requests, distributed locking, promotion‑aware split data, Elasticsearch synchronization, combined return logistics, and accurate refund processing.

dbaplus Community

Introduction

By reading this article you will understand the essential capabilities an after‑sales system should provide, its role within the upstream and downstream ecosystem, the basic system architecture, and solutions to common business‑scenario problems.

Core Value

The JD Daojia after‑sales system operates as a reverse‑flow service tightly coupled with the JD Daojia business domain, covering four major scenarios: refund, return, exchange, and repair. It also supports appeal and arbitration for users and merchants and supplies reverse‑flow monetary data to billing and settlement systems.

The after‑sales system depends on several upstream and downstream services, as illustrated in the following diagram.

System Architecture

The system follows a classic three‑tier architecture. The application layer provides three entry points, one per user identity; the service layer offers business and data support; and the data layer uses MySQL, Redis, and Elasticsearch. Middleware includes an RPC framework, a ZooKeeper‑based configuration center, distributed worker tasks, and JMQ messaging, alongside unified monitoring and log collection.

Business Forms

After a forward order is fulfilled, users can initiate after‑sale requests for missing, wrong, or defective items. Requests can originate from the user app, merchant portal, or customer‑service console. The flow varies based on responsibility: user‑initiated requests are routed to the merchant or customer service for review, merchant‑initiated requests are approved automatically, and requests initiated by customer service follow the same path as user requests.

1. Applying for After‑Sale

Challenges

Concurrent multi‑endpoint operations may cause duplicate applications.

Retrieving split‑information for after‑sale items.

To prevent duplicate applications, a Redis distributed lock is used. The lock key combines a prefix with the order number, includes an expiration time to avoid deadlocks, waits for a configurable timeout, and uses a UUID token to ensure the lock is released only by its owner.
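The locking scheme described above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: `InMemoryRedis` is a hypothetical stand‑in that mimics only the two Redis behaviours the lock relies on (`SET key value NX PX ttl`, and an atomic compare‑and‑delete that would normally be a Lua script), and the key prefix, default timeouts, and function names are assumptions.

```python
import time
import uuid


class InMemoryRedis:
    """Hypothetical stand-in for a Redis client, for illustration only."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry as monotonic seconds)

    def set_nx_px(self, key, value, ttl_ms):
        """Mimics SET key value NX PX ttl_ms: set only if absent or expired."""
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False  # key is held and has not expired yet
        self._data[key] = (value, now + ttl_ms / 1000.0)
        return True

    def delete_if_equals(self, key, value):
        """Mimics an atomic 'delete only if the stored value matches'."""
        current = self._data.get(key)
        if current is None or current[1] <= time.monotonic():
            return False
        if current[0] != value:
            return False
        del self._data[key]
        return True


LOCK_PREFIX = "aftersale:apply:lock:"  # assumed prefix, not from the article


def acquire_lock(client, order_id, ttl_ms=5000, wait_ms=1000, poll_ms=50):
    """Try to lock an order; return the owner token, or None on timeout."""
    token = uuid.uuid4().hex  # identifies this owner when releasing
    deadline = time.monotonic() + wait_ms / 1000.0
    while True:
        if client.set_nx_px(LOCK_PREFIX + str(order_id), token, ttl_ms):
            return token
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_ms / 1000.0)


def release_lock(client, order_id, token):
    """Release the lock only if `token` still owns it."""
    return client.delete_if_equals(LOCK_PREFIX + str(order_id), token)
```

The expiration bounds how long a crashed holder can block other endpoints, and the token check keeps a slow caller whose lock has already expired from deleting a lock that now belongs to someone else.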

After acquiring the lock, the system assembles after‑sale order details from order and split data. It stores user, order, and merchant information in the main after‑sale table, copies selected SKU details into the after‑sale product table, and fetches split data using sku_promotionType (product + promotion type). For complex promotions, additional dimensions such as price and weight are added (e.g., sku_promotionType_price and sku_promotionType_price_weight).
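As a rough sketch of how such composite keys might be built and used, assuming split rows are plain dictionaries and prices are integer cents (the field names other than `sku` and `promotionType` and both helper functions are hypothetical):

```python
def split_key(sku, promotion_type, price=None, weight=None):
    """Build the lookup key for split data.

    Base key:     sku_promotionType
    With price:   sku_promotionType_price
    With weight:  sku_promotionType_price_weight
    """
    if weight is not None and price is None:
        raise ValueError("weight is only used together with price")
    parts = [str(sku), str(promotion_type)]
    if price is not None:
        parts.append(str(price))
    if weight is not None:
        parts.append(str(weight))
    return "_".join(parts)


def index_split_rows(rows, with_price=False, with_weight=False):
    """Index split rows by composite key so after-sale items can look them up."""
    index = {}
    for row in rows:
        key = split_key(
            row["sku"],
            row["promotionType"],
            row["price"] if with_price else None,
            row["weight"] if with_weight else None,
        )
        index[key] = row
    return index
```

Adding a dimension only when a promotion needs it keeps simple orders on the short key while still separating rows that share a SKU and promotion type but differ in price or weight.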

Examples illustrate handling multiple A‑type items with different promotion types, bundle promotions with varying prices, and weight‑based refunds for pick‑error orders.

2. After‑Sale Review

As the volume of after‑sale orders grew, complex list queries caused slow MySQL performance and frequent alerts. The solution was to synchronize after‑sale data to Elasticsearch for list queries while still using MySQL for detail retrieval.

A feature flag controls whether queries hit MySQL or ES. Historical data is batch‑synced to ES using primary‑key ranges, and new data is streamed via MQ. Consistency is ensured by re‑reading the MySQL record for each MQ message and overwriting the ES document, guaranteeing the latest state regardless of processing order.

Potential data lag from MQ delays is mitigated by front‑end state validation before any write‑back operation.

Add the switch (off) → batch sync historical data → turn the switch on → incremental MQ sync → count verification.

Consume MQ → query MySQL by primary key → full‑field upsert to ES.

Handle MQ delay by validating state before critical operations.
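The synchronization steps above can be sketched as follows. This is a simplified model under stated assumptions: plain dictionaries stand in for the MySQL table and the ES index, and all function names and the message shape are hypothetical.

```python
def primary_key_ranges(min_id, max_id, batch_size):
    """Yield inclusive (start, end) primary-key ranges for the batch sync."""
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield start, end
        start = end + 1


def handle_change_message(message, mysql_rows, es_docs):
    """Apply one MQ change notification to the search index.

    The message is treated only as a hint that a row changed. The handler
    re-reads the current row by primary key and overwrites the whole document,
    so a late or reordered message cannot write stale fields.
    """
    pk = message["id"]
    row = mysql_rows.get(pk)   # stands in for SELECT ... WHERE id = pk
    if row is None:
        return                 # assumption: nothing to sync for a missing row
    es_docs[pk] = dict(row)    # stands in for a full-document upsert


def state_still_matches(pk, expected_status, mysql_rows):
    """Check the authoritative state before a critical operation."""
    row = mysql_rows.get(pk)
    return row is not None and row["status"] == expected_status
```

Because every message results in the same "copy the current row" action, processing messages twice or out of order converges on the same document, and the final status check covers the window in which the list view may still show an older state.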

3. Return Logistics (Combined Shipping)

When multiple after‑sale return requests belong to the same order, a single logistics order should be created to reduce cost and improve user experience. A worker scans pending returns and, 10 minutes before the earliest scheduled pickup, triggers a combine task.

The task gathers all return‑eligible after‑sale orders for the same order, selects the one with the nearest pickup time, aggregates user info, product list, and total weight, and creates a shipment via an idempotent API (re‑calls return the same shipment number).

Failure handling records a retry flag; after exceeding a maximum retry count, the task is skipped and an alert is raised.

Combine worker scans pending returns → triggers combine task.

Task selects earliest pickup → assembles combined shipment.

Shipment API is idempotent.

Failures are logged, retried, and eventually suppressed with alerts.
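A minimal sketch of the combine task, assuming pickup times are epoch seconds, weights are in grams, and the order number serves as the idempotency key. `ReturnRequest`, `MAX_RETRIES`, and the `create_shipment` and `alert` callbacks are hypothetical names, not part of the original system.

```python
from dataclasses import dataclass


@dataclass
class ReturnRequest:
    after_sale_id: str
    order_id: str
    pickup_time: int       # epoch seconds (assumed representation)
    skus: list
    weight_grams: int
    retry_count: int = 0


MAX_RETRIES = 3            # assumed limit
TRIGGER_WINDOW = 10 * 60   # combine 10 minutes before the earliest pickup


def is_due(requests, now):
    """True once the earliest pickup is at most 10 minutes away."""
    return min(r.pickup_time for r in requests) - now <= TRIGGER_WINDOW


def build_combined_shipment(requests):
    """Merge the return requests of ONE order into a single shipment payload."""
    earliest = min(requests, key=lambda r: r.pickup_time)
    return {
        "idempotency_key": earliest.order_id,  # assumption: one shipment per order
        "pickup_time": earliest.pickup_time,
        "after_sale_ids": sorted(r.after_sale_id for r in requests),
        "skus": [sku for r in requests for sku in r.skus],
        "total_weight_grams": sum(r.weight_grams for r in requests),
    }


def run_combine_task(requests, create_shipment, alert):
    """Create the shipment; count failures and alert once retries run out."""
    if any(r.retry_count >= MAX_RETRIES for r in requests):
        alert(requests[0].order_id)  # skip the task and raise an alert
        return None
    try:
        return create_shipment(build_combined_shipment(requests))
    except Exception:
        for r in requests:
            r.retry_count += 1       # record the failure for the next scan
        return None
```

Since the shipment call is idempotent, the worker can safely rerun the task after a timeout or a partial failure without creating a second logistics order.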

4. Refund Processing

To guarantee refund accuracy, the system applies a distributed lock per after‑sale order, preventing duplicate refund audits. Each after‑sale order can be refunded only once, and lock acquisition failure aborts the operation with a business‑level error.

Line‑item validation checks that the submitted SKU list matches the original order quantities, preventing over‑refunds caused by duplicate SKU entries.

Finally, a ledger‑level amount check ensures the refund does not exceed the remaining refundable balance. Refund results are received asynchronously via MQ, updating the refund status and notifying downstream systems.

For example, a submission that repeats the same SKU entry, which line‑item validation must catch:

skuList: [{"skuCount":1,"skuName":"skuA","promotionType":"1"},{"skuCount":1,"skuName":"skuA","promotionType":"1"}]
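The two refund checks might look roughly like this. It is a sketch under assumptions: quantities are keyed by (skuName, promotionType), amounts are integer cents, and both function names are hypothetical.

```python
from collections import Counter


def validate_sku_list(sku_list, order_quantities):
    """Reject a submission whose summed quantities exceed what was ordered.

    Quantities are summed per (skuName, promotionType) first, so a SKU that
    appears twice in the list cannot slip past a per-entry check.
    """
    requested = Counter()
    for item in sku_list:
        requested[(item["skuName"], item["promotionType"])] += item["skuCount"]
    for key, count in requested.items():
        if count > order_quantities.get(key, 0):
            return False
    return True


def validate_refund_amount(amount_cents, paid_cents, already_refunded_cents):
    """Ledger check: the refund must fit in the remaining refundable balance."""
    return 0 < amount_cents <= paid_cents - already_refunded_cents
```

With an order that contains one unit of skuA, a list holding two single‑unit skuA entries sums to two and is rejected, even though each entry looks valid on its own.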

Conclusion

The reverse‑flow after‑sale business depends on forward orders, and as forward‑order scenarios expand, the after‑sale system must continuously evolve. Ongoing challenges include the lack of a dedicated gateway and fragmented business logic. Future work aims to use a template engine for intelligent, configurable logic.
