How JD Revolutionized Coupon Search with a Stream‑Batch Unified Architecture
This article analyzes JD's end‑to‑end upgrade of its retail coupon search infrastructure, detailing the business drivers, data‑skew challenges, the shift from dual KV and batch pipelines to a unified stream‑batch model built on Apache Doris, and the resulting performance, resource and stability gains across multiple scenarios.
Background
Coupon promotion data is a critical component of JD's search index. Each coupon batch ID binds a large set of items (SKU, category, vendor, etc.), and building the coupon index for product recall requires aggregating those items by batch ID. Unlike regular product data, coupon data cannot be processed item by item: the binding relationship between coupon batches and massive item sets forces every update to be re-aggregated across the full binding set.
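As a toy illustration of that aggregation step, the sketch below groups hypothetical binding records by coupon batch ID to form one index document per coupon. The record fields are assumptions for illustration, not JD's actual schema.

```python
from collections import defaultdict

# Hypothetical binding records: (coupon_batch_id, item) pairs.
# Field names are illustrative only, not JD's actual schema.
bindings = [
    (1001, {"sku": 501, "category": "phones", "vendor": "A"}),
    (1001, {"sku": 502, "category": "phones", "vendor": "B"}),
    (2002, {"sku": 777, "category": "laptops", "vendor": "C"}),
]

def build_coupon_index(records):
    """Aggregate bound items by coupon batch ID into one index document each."""
    grouped = defaultdict(list)
    for batch_id, item in records:
        grouped[batch_id].append(item)
    # Each entry becomes a coupon index document used for product recall.
    return {bid: {"batch_id": bid, "items": items} for bid, items in grouped.items()}

print(build_coupon_index(bindings)[1001])  # -> the two items bound to batch 1001
```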
Business Requirements and Pain Points
Two parallel pipelines (full and incremental) caused high resource consumption and frequent data inconsistency.
Full‑batch processing required wide‑table joins and aggregations, taking up to 10 hours while consuming 30k CPU cores and 40 TB of storage, which severely limited iteration speed.
The KV‑based storage layer suffered from data skew, large keys, and high latency, leading to missed updates and timeouts during large‑scale coupon releases.
During major sales events, a single coupon could bind up to 6 billion SKUs, creating massive data skew and storage pressure.
Architecture Evolution
Unified Stream‑Batch Design
JD evaluated Apache Doris as a replacement for the KV engine. Doris offers high‑throughput batch processing, low latency, and linear horizontal scalability, making it suitable for both point queries and massive analytical workloads. By storing raw coupon details in Doris tables, JD could perform micro‑batch queries that replace thousands of KV point reads, achieving minute‑level data freshness.
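As a rough sketch of that micro‑batch pattern: Doris speaks the MySQL protocol, so an ordinary MySQL client works. The host, table, and column names below are assumptions rather than JD's deployment; the point is that one ranged scan replaces thousands of KV point reads.

```python
import pymysql  # Doris is MySQL-protocol compatible, so a plain MySQL client suffices

# Connection details and the coupon_item_detail schema are assumptions.
conn = pymysql.connect(host="doris-fe.example.com", port=9030,
                       user="reader", password="...", database="coupon")

def fetch_micro_batch(batch_ids):
    """One ranged scan instead of len(batch_ids) x N individual KV point reads."""
    placeholders = ",".join(["%s"] * len(batch_ids))
    sql = (f"SELECT batch_id, sku_id, category_id, vendor_id, update_time "
           f"FROM coupon_item_detail WHERE batch_id IN ({placeholders})")
    with conn.cursor() as cur:
        cur.execute(sql, batch_ids)
        return cur.fetchall()

rows = fetch_micro_batch([1001, 2002, 3003])  # recently updated coupon batches
```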
Technical Choices
Key decisions included:
Using Doris as the core storage for index data instead of a KV store.
Leveraging Doris' sequence column to preserve message ordering (a minimal table sketch follows this list).
Adopting a three‑module pipeline: data adaptation (message topics and Hive tables → Doris), ingestion (where the Doris sequence column enforces ordering before update messages are emitted), and aggregation/outbound (aggregation by coupon batch ID).
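For reference, a Doris UNIQUE KEY table with a sequence column might be declared as below. The schema, bucketing, and replication settings are assumptions; the `function_column.sequence_col` property (available in recent Doris releases) is what lets the row with the newest `update_time` win even when messages arrive out of order.

```python
# Minimal sketch, reusing `conn` from the earlier example. Table name, columns,
# bucketing, and replication are assumptions. With the sequence column, the row
# carrying the largest update_time wins, so a late-arriving stale message
# cannot overwrite a fresher one.
DDL = """
CREATE TABLE IF NOT EXISTS coupon_item_detail (
    batch_id    BIGINT   NOT NULL,
    sku_id      BIGINT   NOT NULL,
    status      TINYINT,
    update_time DATETIME NOT NULL
)
UNIQUE KEY(batch_id, sku_id)
DISTRIBUTED BY HASH(batch_id) BUCKETS 32
PROPERTIES (
    "function_column.sequence_col" = "update_time",
    "replication_num" = "3"
)
"""

with conn.cursor() as cur:
    cur.execute(DDL)
```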
Design Details
The unified pipeline supports two deployment modes for the aggregation module:
Point‑query mode: Flink processes incremental updates in real time.
Batch mode: workers prepare the full‑batch data, reusing the same aggregation code so the full‑batch and incremental paths stay unified (see the schematic below).
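A schematic of that unification, with function and message shapes invented for illustration (none of these names come from JD's codebase):

```python
# Schematic only: both deployment modes funnel into one aggregation core.

def aggregate_coupon(batch_id, fetch_items):
    """Shared core: rebuild one coupon's index document from its bound items."""
    items = fetch_items(batch_id)  # e.g. a micro-batch Doris query
    return {"batch_id": batch_id, "size": len(items), "items": items}

def on_update_message(msg, fetch_items, emit):
    """Point-query mode: a Flink-style handler reacting to one update message."""
    emit(aggregate_coupon(msg["batch_id"], fetch_items))

def run_full_batch(all_batch_ids, fetch_items, emit):
    """Batch mode: workers sweep every batch ID through the identical core."""
    for batch_id in all_batch_ids:
        emit(aggregate_coupon(batch_id, fetch_items))
```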
Benefits of the Stream‑Batch Refactor
Integration efficiency improved by 50 %.
Full‑batch and incremental data now share the same Doris engine, eliminating data inconsistency.
Full‑batch build time was reduced from 10 hours to 2 hours, and data freshness improved from daily to hour‑level, saving ~30k core‑hours per day.
Coupon data is no longer truncated: previously each coupon's bindings were capped at 300k items, whereas the full binding set is now indexed, enhancing the user experience.
Data hot‑spot issues resolved, achieving balanced load.
Generalizable Architecture
The same framework has been applied to other JD scenarios, such as LBS local recommendation pools and order search data pipelines, demonstrating its reusability across domains.
Stability Enhancements
Simplified coupon outbound flow by removing intermediate services and persisting directly to CFS.
Adjusted JVM parameters and added back‑pressure and memory‑control mechanisms, stabilizing the unified full/incremental service (a simple back‑pressure sketch follows this list).
Implemented selective coupon ID updates to reduce update frequency.
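The article does not spell out the back‑pressure mechanism; one common shape for back‑pressure with bounded memory is a blocking bounded queue between the read and apply stages. The sketch below is an assumption along those lines, not JD's implementation.

```python
import queue
import threading

# A bounded queue yields natural back-pressure: when the downstream consumer
# lags, put() blocks the upstream producer instead of letting in-flight
# messages pile up in memory until a full GC (or OOM) hits.
updates = queue.Queue(maxsize=10_000)

def consumer():
    while True:
        msg = updates.get()
        # ... apply the coupon update downstream ...
        updates.task_done()

threading.Thread(target=consumer, daemon=True).start()

for i in range(100_000):          # stand-in for the message-topic reader
    updates.put({"batch_id": i})  # blocks when the queue is full -> back-pressure
updates.join()                    # wait until all queued updates are applied
```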
Performance Results
Full‑batch migration time was cut by ~96 % (27 h → 1 h).
Resolved large‑coupon production bottlenecks during the 618 promotion.
Removed dependencies on Hadoop and Hive, shortening the data chain and improving stability.
Machine load dropped >80 %, eliminating full GC spikes.
Incremental processing time during the Double 12 promotion fell by >96 %; average latency is now a few dozen seconds, and large‑coupon incremental delay was reduced by 97 %.
CPU load on Doris decreased by >60 %; memory usage down >50 %; query volume reduced by 90 %.
Key Metrics
Resource consumption on the online side decreased by over 90 % after moving to incremental processing. Disk usage per shard fell from 14 GB to a few hundred MB, and shard execution time dropped from 1 h to 5 min. Overall system stability improved dramatically, with no observable CPU spikes.
Conclusion
Through a comprehensive redesign of the coupon search data pipeline, JD achieved minute‑level updates for billion‑scale coupon data while cutting compute and storage consumption by more than 90 %. The solution has been deployed across main site, vertical sites, and delivery services, and serves as an industry‑level reusable practice for large‑scale data processing.