How Transparent Multilevel Cache (TMC) Supercharges Java Application Performance
This article explains the design, architecture, and implementation of Transparent Multilevel Cache (TMC), a solution that adds application‑level hotspot detection and local caching to reduce cache hot‑spot pressure, improve consistency, and boost performance for Java services in high‑traffic scenarios.
TMC Overview
TMC (Transparent Multilevel Cache) is a comprehensive caching solution developed by Youzan PaaS to provide applications with hotspot detection, local caching, and cache‑hit statistics on top of a generic distributed cache such as CodisProxy + Redis or Youzan's own zanKV.
Why TMC?
E‑commerce merchants frequently run flash sales and promotions that produce unpredictable cache hotspots: a handful of hot keys attract massive request volumes, saturate network bandwidth, and threaten service stability. TMC discovers these hotspot keys automatically and serves requests for them from an application‑level local cache before they reach the distributed cache.
Multilevel Cache Pain Points
Fast and accurate hotspot detection.
Data consistency between local and distributed caches.
Visibility of local cache hit rates and hotspot keys.
Transparent integration with minimal intrusion.
TMC Architecture
The architecture consists of three layers:
Storage layer: provides KV storage (Codis, zanKV, Aerospike, etc.).
Proxy layer: unified cache entry and routing for applications.
Application layer: client library with built‑in hotspot detection and local cache, transparent to business logic.
Local Cache in TMC
Transparent Integration
Java services using either spring.data.redis.RedisTemplate or youzan.framework.redis.RedisClient ultimately create a Jedis object via JedisPool. TMC wraps the native JedisPool and Jedis classes, injecting hotspot detection and local caching through the Hermes‑SDK without code changes.
Overall Structure
Jedis‑Client : direct interface to the cache server.
Hermes‑SDK : SDK that implements hotspot detection and local caching.
Hermes Server Cluster : receives access events, performs hotspot analysis, and pushes hotspot keys to SDKs.
Cache Cluster : proxy and storage layers providing distributed cache services.
Infrastructure : etcd cluster and Apollo configuration center for cluster push and unified configuration.
Basic Workflow
When an application requests a key, the Jedis‑Client asks Hermes‑SDK whether the key is a hotspot.
If it is a hotspot, the value is fetched from the local cache in Hermes‑SDK; otherwise the request is forwarded to the cache cluster.
Each key access event is asynchronously reported to the Hermes server via rsyslog → Kafka.
Key expiration events trigger invalid() in Hermes‑SDK, which invalidates the local entry and broadcasts the event through etcd for cluster‑wide consistency.
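The read path described above can be sketched as follows. This is a minimal illustration, not the Hermes‑SDK code: the class name HotspotAwareClient, the markHot method, and the remoteGet function (standing in for the Jedis call to the cache cluster) are assumptions made for the example, and asynchronous event reporting and the etcd broadcast are left out.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for the SDK side of the lookup path.
class HotspotAwareClient {
    // Hotspot keys; in the real system this list is pushed by the Hermes server.
    private final Set<String> hotKeys = ConcurrentHashMap.newKeySet();
    // Local cache holding values for hotspot keys only.
    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    // Assumption: stands in for the Jedis request to the cache cluster.
    private final Function<String, String> remoteGet;

    HotspotAwareClient(Function<String, String> remoteGet) {
        this.remoteGet = remoteGet;
    }

    // Assumption: called when the server pushes a new hotspot key.
    void markHot(String key) {
        hotKeys.add(key);
    }

    String get(String key) {
        if (!hotKeys.contains(key)) {
            // Not a hotspot: forward to the cache cluster, never cache locally.
            return remoteGet.apply(key);
        }
        String cached = localCache.get(key);
        if (cached != null) {
            return cached; // local hit, no network round trip
        }
        String value = remoteGet.apply(key);
        if (value != null) {
            localCache.put(key, value);
        }
        return value;
    }

    // Mirrors the invalid() step: drop the local entry for the key.
    void invalid(String key) {
        localCache.remove(key);
    }
}
```

In this sketch a hot key costs one remote read, after which reads are served locally until invalid() is called for that key.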
Hotspot Detection Process
Data Collection : Hermes‑SDK reports key access events to Kafka.
Hotness Sliding Window : each key maintains a 10‑slot time wheel; each slot holds the access count for one 3‑second interval, so the wheel covers the most recent 30 seconds.
Hotness Aggregation : every 3 seconds a mapping task aggregates the sliding‑window counts and stores the result in Redis sorted sets.
Hotspot Detection : the server cluster periodically selects top‑N keys exceeding a threshold and pushes the hotspot list to SDKs via etcd.
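The sliding window and top‑N selection can be illustrated with a small in‑memory sketch. It assumes a single process and plain Java collections in place of Kafka, Redis sorted sets, and etcd; the class names HotnessWindow and HotspotDetector and their methods are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical 10-slot time wheel: each slot counts accesses in one 3-second interval.
class HotnessWindow {
    static final int SLOTS = 10;
    static final long SLOT_MILLIS = 3_000L;

    private final long[] counts = new long[SLOTS];
    private final long[] slotIds = new long[SLOTS]; // absolute interval each bucket currently holds

    synchronized void record(long nowMillis) {
        long slotId = nowMillis / SLOT_MILLIS;
        int idx = (int) (slotId % SLOTS);
        if (slotIds[idx] != slotId) {
            // The wheel has wrapped around: reuse the bucket for the new interval.
            slotIds[idx] = slotId;
            counts[idx] = 0;
        }
        counts[idx]++;
    }

    // Sum of the counts that fall inside the most recent 30 seconds.
    synchronized long heat(long nowMillis) {
        long current = nowMillis / SLOT_MILLIS;
        long sum = 0;
        for (int i = 0; i < SLOTS; i++) {
            if (slotIds[i] <= current && current - slotIds[i] < SLOTS) {
                sum += counts[i];
            }
        }
        return sum;
    }
}

// Hypothetical detector: keeps one window per key and picks the hottest keys.
class HotspotDetector {
    private final Map<String, HotnessWindow> windows = new ConcurrentHashMap<>();

    void onAccess(String key, long nowMillis) {
        windows.computeIfAbsent(key, k -> new HotnessWindow()).record(nowMillis);
    }

    // Keys whose heat exceeds the threshold, hottest first, at most topN of them.
    List<String> detect(long nowMillis, long threshold, int topN) {
        return windows.entrySet().stream()
                .map(e -> Map.entry(e.getKey(), e.getValue().heat(nowMillis)))
                .filter(e -> e.getValue() > threshold)
                .sorted((a, b) -> Long.compare(b.getValue(), a.getValue()))
                .limit(topN)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Calling detect on a fixed schedule (every 3 seconds in the article's setup) and pushing the returned list to clients would correspond to the last two steps. Accesses older than 30 seconds no longer contribute to a key's heat, so a key cools down on its own once traffic stops.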
Stability
Asynchronous event reporting using rsyslog avoids blocking business threads.
Dedicated thread pools with bounded queues isolate I/O from business execution.
Local cache size is capped at 64 MB with LRU eviction, so hotspot data cannot exhaust the JVM heap.
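A size‑capped LRU cache of the kind described in the last point might look like the sketch below. It is an assumption‑laden illustration rather than TMC's implementation: ByteBoundedLruCache is a hypothetical name, and only the length of each value's byte array is counted against the budget, which ignores keys and object overhead.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache bounded by the total size of its values in bytes.
class ByteBoundedLruCache {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder = true keeps entries ordered from least to most recently used.
    private final LinkedHashMap<String, byte[]> entries = new LinkedHashMap<>(16, 0.75f, true);

    ByteBoundedLruCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized byte[] get(String key) {
        return entries.get(key); // also marks the entry as most recently used
    }

    synchronized void put(String key, byte[] value) {
        byte[] old = entries.put(key, value);
        if (old != null) {
            usedBytes -= old.length;
        }
        usedBytes += value.length;
        // Evict least recently used entries until the budget is respected.
        // A value larger than the whole budget ends up evicting itself.
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            usedBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    synchronized void remove(String key) {
        byte[] old = entries.remove(key);
        if (old != null) {
            usedBytes -= old.length;
        }
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

With the limit from the text, the cache would be created as new ByteBoundedLruCache(64L * 1024 * 1024).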
Consistency
Only hotspot keys are cached locally; the majority of keys remain in the distributed cache.
A hotspot key update synchronously invalidates the local entry on the instance that made the change, which keeps that instance strongly consistent.
Invalidations are broadcast via etcd to achieve eventual consistency across all SDK instances.
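The two‑step invalidation can be sketched with an in‑process publish/subscribe object standing in for etcd. Everything here is hypothetical (the InvalidationBus and LocalHotCache names, the synchronous publish call); in a real deployment the broadcast crosses the network and arrives later on other instances, which is why the cluster is only eventually consistent.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process stand-in for the etcd broadcast channel.
class InvalidationBus {
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(String key) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(key);
        }
    }
}

// Hypothetical per-instance local cache that listens for invalidations.
class LocalHotCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    private final InvalidationBus bus;

    LocalHotCache(InvalidationBus bus) {
        this.bus = bus;
        // Drop the local copy whenever any instance announces a change.
        bus.subscribe(key -> entries.remove(key));
    }

    void put(String key, String value) {
        entries.put(key, value);
    }

    String get(String key) {
        return entries.get(key);
    }

    // Called on the instance that changed the key.
    void invalid(String key) {
        entries.remove(key); // step 1: synchronous local invalidation
        bus.publish(key);    // step 2: tell the other instances
    }
}
```

The publishing instance also receives its own message in this sketch; removing an already absent key is harmless, so the handler is idempotent.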
Hotspot Discovery
Overall Flow
Data collection → sliding window → aggregation → detection → push to SDK.
Real‑World Effects
During a Kuaishou live‑stream promotion, TMC recorded a local cache hit rate of nearly 80 %. With that share of requests served locally, cache request latency dropped, achievable QPS rose, and response times stayed low despite the traffic spike.
Feature Summary
Real‑time : access events reach the server through rsyslog + Kafka in near real time, and sliding‑window aggregation runs every 3 seconds, so a new hotspot can be detected within roughly one 3‑second cycle.
Accuracy : time‑wheel based sliding window provides precise recent access distribution.
Scalability : Hermes server nodes are stateless and can be horizontally scaled with Kafka partitions; sliding‑window and aggregation are multithreaded per app.
Future Outlook
TMC already serves product, logistics, inventory, marketing, user, and gateway modules, with more applications being onboarded. Configuration flexibility allows tuning of hotspot thresholds, detection counts, and black‑/white‑lists for optimal results.
