How Transparent Multilevel Cache (TMC) Eliminates Hotspot Bottlenecks in High‑Traffic E‑Commerce

The article explains Youzan’s Transparent Multilevel Cache (TMC), detailing its architecture, hotspot detection, local caching, consistency mechanisms, and real‑world performance gains during flash‑sale events, showing how it reduces cache pressure and improves latency for Java‑based services.

Code Ape Tech Column

What Is TMC?

Transparent Multilevel Cache (TMC) is a cache‑as‑a‑service solution built by Youzan’s PaaS team to provide a unified, multi‑level caching layer for internal applications.

Why Build TMC?

E‑commerce merchants frequently run flash‑sale or promotion campaigns that create sudden "cache hotspot" traffic: a small number of keys receive massive request bursts, overwhelming the distributed cache cluster, consuming bandwidth, and destabilising services.

Hotspot events are unpredictable in timing, type, and product.

During a hotspot, a few hot keys generate a flood of cache requests that can saturate the network and degrade application stability.

TMC was created to discover hotspots automatically and to serve requests for hot keys from an application‑level local cache before they reach the distributed cache cluster.

Pain Points of Traditional Multilevel Caches

Fast and accurate hotspot detection.

Ensuring data consistency between the local cache and the downstream distributed cache.

Providing visibility into local‑cache hit rates and hotspot keys for validation.

Achieving transparent integration with minimal intrusion to existing applications.

Overall Architecture

The architecture consists of three layers:

Storage Layer : Provides basic KV storage using different back‑ends (Codis, Zankv, Aerospike) according to business needs.

Proxy Layer : Offers a unified cache entry point and routing for horizontally sharded data.

Application Layer : Supplies a unified client with built‑in hotspot detection and local caching, transparent to business logic.

The article focuses on the application‑layer client’s hotspot detection and local caching features.

Transparent Local Cache Integration

Java services can use either RedisTemplate from the standard spring.data.redis package or RedisClient from the Youzan‑provided youzan.framework.redis package. In both cases the client ultimately creates a JedisPool and a Jedis instance that communicates with the proxy layer.

TMC modifies the native JedisPool and Jedis classes so that during pool initialization the Hermes‑SDK (which implements hotspot detection and local caching) is also initialized.

When a key is requested, the client first asks Hermes‑SDK whether the key is a hotspot. If it is, the value is returned from the local cache without contacting the cache cluster; otherwise the request is forwarded to the cluster. All key‑access events are asynchronously reported to the Hermes server cluster for hotspot analysis.
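The lookup path described above can be sketched in a few lines of Java. This is a minimal illustration, not the Hermes‑SDK API: the class HotKeyAwareLookup, its method names, and the clusterLoader function (standing in for the Jedis call to the proxy layer) are all hypothetical, and access reporting is reduced to a counter.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical sketch of a hotspot-aware read path; names are illustrative only.
final class HotKeyAwareLookup {
    private final Set<String> hotKeys = ConcurrentHashMap.newKeySet();
    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    private final AtomicLong reportedAccesses = new AtomicLong();
    // Stands in for the request that would go to the cache cluster.
    private final Function<String, String> clusterLoader;

    HotKeyAwareLookup(Function<String, String> clusterLoader) {
        this.clusterLoader = clusterLoader;
    }

    // Assumed to be called whenever a new hotspot list is pushed to this instance.
    void updateHotKeys(Set<String> pushed) {
        hotKeys.retainAll(pushed);
        hotKeys.addAll(pushed);
        localCache.keySet().retainAll(pushed); // drop entries that are no longer hot
    }

    String get(String key) {
        // Placeholder for asynchronous access reporting: only counts the event.
        reportedAccesses.incrementAndGet();
        if (!hotKeys.contains(key)) {
            return clusterLoader.apply(key); // non-hot keys always go to the cluster
        }
        // Hot keys are loaded from the cluster once, then served locally.
        return localCache.computeIfAbsent(key, clusterLoader);
    }

    long reportedAccesses() {
        return reportedAccesses.get();
    }
}
```

With a loader that counts its invocations, two reads of a hot key reach the cluster once, while two reads of a non‑hot key reach it twice.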

For Java services, switching to a specific version of the Jedis JAR enables hotspot detection and local caching without any code changes.

Module Breakdown

Jedis‑Client : Direct entry point for Java applications; API identical to native Jedis.

Hermes‑SDK : Encapsulates hotspot detection and local caching logic.

Hermes Server Cluster : Receives access events, performs hotspot analysis, and pushes hotspot key lists to SDK instances.

Cache Cluster : Consists of proxy and storage layers, providing the distributed cache service.

Infrastructure : Etcd cluster and Apollo configuration center supply cluster‑wide configuration and push capabilities.

Basic Workflow

Key Retrieval : The client asks Hermes‑SDK if the key is a hotspot. Hot keys are served from the local cache; non‑hot keys are fetched from the cache cluster via a callback.

Key Expiration : When set(), del() or expire() is called, the client notifies Hermes‑SDK via invalid(). For hotspot keys, the local cache entry is invalidated immediately, and the invalidation event is broadcast through etcd to other SDK nodes for eventual consistency.

Hotspot Discovery : Hermes servers collect access events, run a sliding‑window aggregation every 3 seconds, and push the top‑N hotspot key list to SDKs via etcd.

Configuration Loading : Both SDK and server read runtime parameters (e.g., thresholds, black/white lists, etcd addresses) from Apollo.
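The key‑expiration step of this workflow can be illustrated with a small in‑memory model. It is only a sketch under stated assumptions: InvalidationBus is a hypothetical stand‑in for the etcd broadcast, LocalHotCache is a hypothetical stand‑in for one SDK instance, and the method invalid() simply mirrors the notification name used above.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory stand-in for the etcd broadcast channel.
final class InvalidationBus {
    private final List<LocalHotCache> nodes = new CopyOnWriteArrayList<>();

    void register(LocalHotCache node) {
        nodes.add(node);
    }

    // Delivers the invalidation to every registered node except the sender.
    void broadcast(LocalHotCache origin, String key) {
        for (LocalHotCache node : nodes) {
            if (node != origin) {
                node.onRemoteInvalidation(key);
            }
        }
    }
}

// Hypothetical model of the local cache held by one SDK instance.
final class LocalHotCache {
    private final Set<String> hotKeys = ConcurrentHashMap.newKeySet();
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    private final InvalidationBus bus;

    LocalHotCache(InvalidationBus bus) {
        this.bus = bus;
    }

    void markHot(String key) {
        hotKeys.add(key);
    }

    // Only hot keys are ever stored locally.
    void put(String key, String value) {
        if (hotKeys.contains(key)) {
            entries.put(key, value);
        }
    }

    String peek(String key) {
        return entries.get(key);
    }

    // Called after a write such as set(), del() or expire() on this instance.
    void invalid(String key) {
        if (!hotKeys.contains(key)) {
            return; // nothing cached locally, nothing to broadcast
        }
        entries.remove(key);      // local entry is dropped immediately
        bus.broadcast(this, key); // other instances catch up when they receive this
    }

    void onRemoteInvalidation(String key) {
        entries.remove(key);
    }
}
```

In this model the broadcast is synchronous, so the other instances are updated at once; with a real broadcast channel there would be a delay, which is why consistency across instances is only eventual.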

Stability Measures

Asynchronous reporting of access events using rsyslog to avoid blocking business threads.

Dedicated thread pool with bounded queue for the communication module, isolating I/O from business execution.

Local cache size limited to 64 MB (LRU) to prevent JVM heap overflow.
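One way to keep a local cache under a fixed memory budget with LRU eviction is sketched below. This is an assumption about how such a limit could be enforced, not a description of the TMC implementation: ByteBoundedLruCache is a hypothetical class, and it counts only the bytes of the values, ignoring keys and object overhead.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache bounded by the total size of its values in bytes.
final class ByteBoundedLruCache {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, byte[]> entries = new LinkedHashMap<>(16, 0.75f, true);

    ByteBoundedLruCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized byte[] get(String key) {
        return entries.get(key); // also marks the entry as recently used
    }

    synchronized void put(String key, byte[] value) {
        byte[] previous = entries.put(key, value);
        if (previous != null) {
            usedBytes -= previous.length;
        }
        usedBytes += value.length;
        // Evict least recently used entries until the budget is respected.
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            usedBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

A 64 MB budget would correspond to new ByteBoundedLruCache(64L * 1024 * 1024). A single value larger than the budget is evicted as soon as it is inserted.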

Consistency Guarantees

Only hotspot keys are cached locally; the majority of keys remain in the distributed cache.

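When a hotspot key changes, Hermes‑SDK invalidates the local entry immediately and broadcasts the event via etcd. The instance that made the change therefore stays strongly consistent with the cache cluster, while the other instances become consistent eventually, once they receive the broadcast.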

Hotspot Discovery Process

Data Collection

Hermes‑SDK writes key‑access events to rsyslog, which forwards them to Kafka. Each Hermes server node consumes the Kafka stream in real time.

Event fields: appName, uniqueKey, sendTime, weight.
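The event and its non‑blocking hand‑off can be modelled roughly as follows. The four field names come from the list above, but their Java types are assumptions, and NonBlockingReporter is a hypothetical in‑process queue that only illustrates the idea of never blocking the business thread; in TMC the transport is rsyslog and Kafka.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Field names follow the article; the types are assumptions.
final class KeyAccessEvent {
    final String appName;
    final String uniqueKey;
    final long sendTime; // assumed to be epoch milliseconds
    final int weight;

    KeyAccessEvent(String appName, String uniqueKey, long sendTime, int weight) {
        this.appName = appName;
        this.uniqueKey = uniqueKey;
        this.sendTime = sendTime;
        this.weight = weight;
    }
}

// Hypothetical reporter: a bounded queue that drops events instead of blocking.
final class NonBlockingReporter {
    private final BlockingQueue<KeyAccessEvent> queue;
    private final AtomicLong dropped = new AtomicLong();

    NonBlockingReporter(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Returns immediately; a full queue costs an event, never a stalled caller.
    boolean report(KeyAccessEvent event) {
        boolean accepted = queue.offer(event);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // A background sender would call this and forward the batch downstream.
    List<KeyAccessEvent> drain(int maxEvents) {
        List<KeyAccessEvent> batch = new ArrayList<>();
        queue.drainTo(batch, maxEvents);
        return batch;
    }

    long droppedCount() {
        return dropped.get();
    }
}
```

Dropping events under pressure trades a little counting accuracy for the guarantee that reporting cannot slow down request handling.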

Sliding Window (Hotness Window)

For each app and each key, the server maintains a time wheel of 10 slots. Each slot records the number of accesses during one 3‑second interval, so the wheel as a whole represents a 30‑second sliding window.
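A time wheel of this shape can be sketched as a fixed array that is reused as time advances. HotnessWindow and its method names are hypothetical, and the handling of late or stale events is an assumption made for the sketch; only the slot count and slot width come from the text.

```java
// Hypothetical per-key time wheel: 10 slots of 3 seconds each.
final class HotnessWindow {
    static final int SLOTS = 10;
    static final long SLOT_MILLIS = 3_000;

    private final long[] counts = new long[SLOTS];
    // Index of the 3-second interval that each slot currently holds.
    private final long[] slotEpoch = new long[SLOTS];

    synchronized void record(long timestampMillis, long weight) {
        long epoch = timestampMillis / SLOT_MILLIS;
        int index = (int) (epoch % SLOTS);
        if (epoch < slotEpoch[index]) {
            return; // event is older than the data already in this slot: ignore it
        }
        if (epoch > slotEpoch[index]) {
            slotEpoch[index] = epoch; // slot held an expired interval: reuse it
            counts[index] = 0;
        }
        counts[index] += weight;
    }

    // Sum of all slots that still fall inside the 30-second window ending now.
    synchronized long hotness(long nowMillis) {
        long currentEpoch = nowMillis / SLOT_MILLIS;
        long total = 0;
        for (int i = 0; i < SLOTS; i++) {
            if (currentEpoch - slotEpoch[i] < SLOTS) {
                total += counts[i];
            }
        }
        return total;
    }
}
```

Because expired slots are skipped when reading and overwritten when writing, no background task is needed to clear old counts.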

Aggregation

Every 3 seconds a mapping task aggregates the per‑slot counts into a total hotness value for each key and stores the result in Redis as a sorted set.

Hotspot Detection

The detection node reads the latest aggregation, selects the top‑N keys exceeding the hotness threshold, and pushes the hotspot list to SDK instances.
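The selection step amounts to a filter, a sort, and a limit. The sketch below assumes the aggregated hotness values are available as an in‑memory map rather than read from the Redis sorted set, and HotspotSelector, threshold, and topN are hypothetical names for the configured values.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical selection of hotspot keys from aggregated hotness values.
final class HotspotSelector {
    static List<String> select(Map<String, Long> hotnessByKey, long threshold, int topN) {
        return hotnessByKey.entrySet().stream()
                // Keep only keys whose hotness exceeds the threshold.
                .filter(e -> e.getValue() > threshold)
                // Hottest first; ties are broken by key so the result is deterministic.
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(topN)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Applying both a threshold and a cap means a quiet period yields an empty list, while a broad surge cannot push more than topN keys into each local cache.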

Feature Summary

Real‑Time

Events are reported in real time via rsyslog + Kafka; the 3‑second mapping task ensures that a newly emerging hotspot is detected within at most 3 seconds.

Accuracy

The sliding‑window aggregation accurately reflects recent access distribution, providing reliable hotness scores.

Scalability

Hermes server nodes are stateless, so the cluster scales horizontally by adding nodes as Kafka partitions are added. The sliding‑window and aggregation logic are multithreaded and scale with the number of apps.

Practical Results

Kuaishou Merchant Campaign

During a short‑lived product promotion, cache request volume and local‑cache hit volume both rose sharply, with a local‑cache hit rate approaching 80 %.

Double‑11 (Singles’ Day) Sample Applications

The graphs in the original article show higher request QPS alongside lower response time (RT) for core services, indicating that TMC’s local cache offloads pressure from the distributed cache and improves latency.

Future Outlook

TMC already serves product, logistics, inventory, marketing, user, gateway, and messaging modules, with more applications being onboarded. Configuration options such as hotspot thresholds, hotspot key count, and black/white lists allow fine‑tuning for different business scenarios.
