
Optimizing JD.com Flash Sale Product Pool: Architecture Upgrade and Performance Tuning

This article details how JD.com’s flash‑sale system tackled rapid product‑pool growth by analyzing JVM memory issues, redesigning the architecture with double‑buffered updates, local LRU caching, system splitting, and Bloom‑filter integration, resulting in significant performance and stability improvements for large‑scale promotions.

JD Retail Technology

JD.com’s flash‑sale channel has experienced explosive growth in both product count and user traffic, prompting a need to expand the product pool beyond its original capacity and posing challenges to the existing system architecture.

To address the risk of product‑pool explosion, the flash‑sale backend team launched a dedicated expansion project, completing a ten‑million‑item pool upgrade before the 618 promotion. The following sections share the optimization experience.

Background

The flash‑sale channel consists of two parts: the core service serving end‑users and the product‑tagging service that maintains the product pool for downstream services such as product detail pages and shopping carts.

The original architecture cached the entire product pool in memory (JIMDB, a Redis‑like distributed cache) and used ZooKeeper notifications to refresh local caches. While this worked when the product count was modest, the rapid increase caused severe heap growth, frequent Minor and Full GCs, and noticeable performance spikes.

Problem Analysis

During peak promotions, the heap expanded quickly, Minor GC could not reclaim the new space, and Full GC became frequent, consuming CPU and causing interface latency spikes. Heap‑object histograms before and after Full GC showed a massive increase in temporary String objects (over 1.2 GB) generated by full‑pool updates.

Investigation revealed that the JVM's dynamic age calculation was promoting large batches of temporary objects to the old generation well before they reached the configured -XX:MaxTenuringThreshold of 6, causing rapid old‑gen growth.

Attempts to solve the issue by adjusting young‑generation size, switching to G1, or manually setting tenuring thresholds proved ineffective.

Optimization Solutions

1. Double‑Buffer Timed Hash Update: Instead of full‑pool overwrites, products are hashed into buckets by SKU and updated at bucket granularity. A double‑buffer design marks changed buckets as dirty and swaps rebuilt buckets in only at scheduled intervals, so reads always hit a consistent buffer while update frequency and memory churn drop.
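As a rough illustration of the bucketed double‑buffer idea, here is a minimal sketch; the class name, bucket count, and data layout are hypothetical, not JD's actual code. Writes are staged into a standby buffer keyed by dirty bucket, and a timed task swaps the rebuilt buckets into the active array:

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Sketch: SKUs hash into N buckets, each an immutable snapshot map.
// Writers stage changes per bucket ("dirty" standby buffer); a scheduled
// flush rebuilds only dirty buckets and swaps them in atomically, so
// readers never see a half-updated pool and full-pool overwrites are avoided.
public class DoubleBufferedPool {
    private static final int BUCKETS = 64; // illustrative bucket count

    private final AtomicReferenceArray<Map<String, String>> active =
            new AtomicReferenceArray<>(BUCKETS);
    private final ConcurrentMap<Integer, Map<String, String>> pending =
            new ConcurrentHashMap<>(); // standby buffer: dirty buckets only

    public DoubleBufferedPool() {
        for (int i = 0; i < BUCKETS; i++) active.set(i, Map.of());
    }

    private static int bucketOf(String sku) {
        return Math.floorMod(sku.hashCode(), BUCKETS);
    }

    // Write side: stage a change; the bucket is now dirty.
    public void stage(String sku, String tag) {
        int b = bucketOf(sku);
        pending.compute(b, (k, m) -> {
            Map<String, String> next =
                    (m != null) ? m : new HashMap<>(active.get(b));
            next.put(sku, tag);
            return next;
        });
    }

    // Timed task (in production, on a scheduler): swap in dirty buckets only.
    public void flush() {
        for (Iterator<Map.Entry<Integer, Map<String, String>>> it =
                pending.entrySet().iterator(); it.hasNext(); ) {
            Map.Entry<Integer, Map<String, String>> e = it.next();
            active.set(e.getKey(), Map.copyOf(e.getValue()));
            it.remove();
        }
    }

    // Read side: always served from the active buffer.
    public String get(String sku) {
        return active.get(bucketOf(sku)).get(sku);
    }
}
```

Because staged changes stay out of the read path until `flush()` runs, the pool is rebuilt at bucket granularity rather than as one giant transient object graph.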

2. Local LRU Cache Integration: To break the memory ceiling, a local LRU cache (Caffeine) was introduced for hot items, while the full pool remains in JIMDB. Caffeine’s W‑TinyLFU algorithm provides high hit rates with low memory overhead.
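The production system uses Caffeine; as a stdlib‑only stand‑in that illustrates the same "small bounded hot‑item cache in front of JIMDB" idea, a LinkedHashMap in access‑order mode gives a plain LRU with a hard size cap (class name and sizing are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stdlib-only LRU stand-in for the hot-item cache. A miss would fall
// through to JIMDB in the real system; Caffeine's W-TinyLFU admission
// policy achieves better hit rates than plain LRU at similar memory cost.
public class HotItemCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public HotItemCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true → LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the least-recently-used entry
    }
}
```

The Caffeine equivalent would be built with `Caffeine.newBuilder().maximumSize(n).build()`, which keeps the same bounded‑size contract while handling eviction concurrently.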

3. System Split: The core service and tagging service were decoupled into separate deployments, allowing independent scaling and resource isolation. The core service keeps a reduced cache, while the tagging service adopts the new cache strategy.

4. Bloom Filter for Cache Penetration: To avoid repeated JIMDB lookups for non‑flash‑sale SKUs, a Bloom filter stores valid SKU identifiers, allowing the local cache to return empty placeholders for invalid keys, reducing cache miss storms.
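A minimal stdlib Bloom‑filter sketch of this idea follows; the class name, sizing, and hash scheme are illustrative, not the filter library JD used. A SKU that was never inserted is rejected with certainty (no false negatives), so those lookups never reach JIMDB; false positives are possible but only cost one extra remote lookup:

```java
import java.util.BitSet;

// Minimal Bloom filter over SKU ids. Valid ids are inserted at load time;
// mightContain() == false means the id is definitely not in the pool,
// so the caller can return an empty placeholder without touching JIMDB.
public class SkuBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public SkuBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive k bit indexes from two base hashes (Kirsch–Mitzenmacher scheme).
    private int index(String sku, int i) {
        int h1 = sku.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String sku) {
        for (int i = 0; i < hashes; i++) bits.set(index(sku, i));
    }

    public boolean mightContain(String sku) {
        for (int i = 0; i < hashes; i++)
            if (!bits.get(index(sku, i))) return false;
        return true;
    }
}
```

Sizing is the usual trade‑off: more bits and hash functions lower the false‑positive rate at the cost of memory, and the filter must be rebuilt (or versioned) as the valid SKU set changes.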

Optimization Effects

After the architecture upgrade, extensive single‑machine, gray‑release, and full‑scale load tests confirmed stable performance. Compared with previous promotions, the new system showed a 90 % reduction in 99.9th‑percentile latency, a dramatic drop in GC frequency, and the ability to support tens of millions of products with horizontal scalability.

Conclusion

The flash‑sale product‑pool expansion project succeeded by redesigning the update mechanism, splitting services, adopting a hybrid cache (JIMDB + Caffeine + Bloom filter), and fine‑tuning JVM behavior, thereby achieving higher capacity, better performance, and improved stability for future large‑scale promotions.

Java · JVM · Performance Optimization · System Architecture · Caching · GC Tuning
Written by

JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
