Hotspot Detection and Local Cache Framework for High‑Traffic Applications
The presented hotspot detection and local‑cache framework combines the HeavyKeeper streaming top‑k algorithm with decay‑based burst detection, zero‑code SDK integration, and a whitelist‑enabled LRU cache. With only a few megabytes of memory it achieves hit rates of up to 85% and dramatically reduces Redis load in high‑traffic applications.
In many Internet applications, data access follows the 80/20 rule: the top 20% of data accounts for over 80% of traffic. Early in a product’s lifecycle a single database can handle the load, but as traffic grows a database alone becomes a bottleneck and a high‑performance cache is introduced. Because the cache size is much smaller than the database, it can serve most hot data, yet during popular events or sudden spikes the cache can still be overwhelmed.
The presentation first describes existing approaches:
Small‑table broadcast: each instance caches the whole table locally and updates it periodically. This reduces read latency but requires custom local‑cache code in every service.
Active pre‑warming: real‑time traffic statistics identify top‑k live streams and push them to instances for local caching. This catches many hot streams but suffers from stale statistics and duplicated implementation effort.
To address these issues a unified hotspot detection and governance framework was built. The framework consists of two modules—hotspot detection and local cache—integrated into the cache SDK, providing:
Real‑time hotspot monitoring.
Automatic identification of bursty hotspots and their insertion into local cache.
Efficient local‑cache policies that increase hit rate without extra memory.
Whitelist support for pre‑registered activity IDs.
Zero‑code integration via configuration.
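As a sketch of what zero‑code enablement could look like, the following options block is hypothetical; the field names are invented for illustration and are not the actual SDK's:

```go
package main

import "fmt"

// Options is a hypothetical configuration block for the cache SDK.
// All field names here are illustrative, not the real SDK's API.
type Options struct {
	HotspotEnabled bool     // switch on HeavyKeeper-based detection
	TopK           int      // number of hot keys to track
	LocalCacheMB   int      // LRU memory budget; a few MB suffices
	LocalTTLMs     int      // short TTL bounds local-cache staleness
	Whitelist      []string // pre-registered activity IDs, cached unconditionally
}

func main() {
	// Enabling the framework is meant to be a pure configuration change;
	// no caching code is added to the business service itself.
	opts := Options{HotspotEnabled: true, TopK: 100, LocalCacheMB: 4, LocalTTLMs: 1000}
	fmt.Printf("%+v\n", opts)
}
```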
Hotspot Detection Theory
The core problem is computing top‑k items from a data stream under limited memory. Traditional hashmap‑plus‑heap works only for small datasets. Three streaming algorithms are examined:
3.1 Lossy Counting
Lossy Counting divides the stream into equal‑size windows determined by a desired error rate. Within each window it counts frequencies, then decrements all counters by one and discards zeros, limiting memory usage while preserving high‑frequency items. After processing all windows, the remaining items are sorted to obtain the top‑k.
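The window‑and‑decay loop can be sketched as follows; the function and its parameters are illustrative, not from the talk, and the final sort of the survivors is omitted:

```go
package main

import "fmt"

// lossyCount processes the stream in fixed-size windows. At each
// window boundary every counter is decremented by one and zeroed
// entries are discarded, bounding memory while items whose true
// frequency is high enough survive the repeated decay.
func lossyCount(stream []string, window int) map[string]int {
	counts := make(map[string]int)
	for i, item := range stream {
		counts[item]++
		if (i+1)%window == 0 { // end of window: decay and evict
			for k := range counts {
				counts[k]--
				if counts[k] <= 0 {
					delete(counts, k)
				}
			}
		}
	}
	return counts
}

func main() {
	stream := []string{"a", "a", "b", "a", "c", "a", "d", "a", "b", "a"}
	// "a" dominates the stream, so it survives both window decays;
	// the rare items are evicted at the window boundaries.
	fmt.Println(lossyCount(stream, 5)) // map[a:4]
}
```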
3.2 Count‑Min Sketch
Count‑Min Sketch (CMS) uses a two‑dimensional integer array of width w and depth d , together with d independent hash functions. For each incoming element n , the element is hashed by each function to obtain an index idx = h_i(n) % w and the corresponding cell is incremented. The estimated count of an element is the minimum of its d counters:
f[j] = min_k CMS[k][h_k(j)]

To retrieve the top‑k, a min‑heap is maintained alongside the sketch and updated whenever an element's new estimate exceeds the heap's minimum.
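A minimal sketch of the update and query paths follows; seeding FNV with the row number to derive the d hash functions is an assumption of this sketch, not the talk's choice:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// cms is a Count-Min Sketch: a d x w counter matrix with one hash
// function per row; the estimate is the minimum over the d rows.
type cms struct {
	w, d  int
	table [][]uint32
}

func newCMS(w, d int) *cms {
	t := make([][]uint32, d)
	for i := range t {
		t[i] = make([]uint32, w)
	}
	return &cms{w: w, d: d, table: t}
}

// idx derives row i's slot by seeding FNV-1a with the row number.
func (c *cms) idx(row int, item string) int {
	h := fnv.New32a()
	h.Write([]byte{byte(row)})
	h.Write([]byte(item))
	return int(h.Sum32() % uint32(c.w))
}

// Add increments one counter per row for the item.
func (c *cms) Add(item string) {
	for i := 0; i < c.d; i++ {
		c.table[i][c.idx(i, item)]++
	}
}

// Estimate returns the minimum of the d counters, an upper bound
// on the true frequency (collisions can only inflate counts).
func (c *cms) Estimate(item string) uint32 {
	min := c.table[0][c.idx(0, item)]
	for i := 1; i < c.d; i++ {
		if v := c.table[i][c.idx(i, item)]; v < min {
			min = v
		}
	}
	return min
}

func main() {
	s := newCMS(1024, 4)
	for i := 0; i < 5; i++ {
		s.Add("hot-key")
	}
	fmt.Println(s.Estimate("hot-key")) // prints 5: the only key added, so no collisions inflate it
}
```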
3.3 HeavyKeeper
HeavyKeeper improves on CMS by storing both a hash fingerprint and a counter in each cell. When an incoming element's fingerprint differs from the one stored in its cell, the cell's counter is probabilistically decayed rather than incremented, using the decay probability P_decay = b^C (b < 1), where C is the current counter value; when the counter reaches zero, the new element takes over the cell. This suppresses the over‑estimation caused by collisions while preserving high‑frequency items.
3.4 Theoretical Summary
Comparing the three algorithms shows HeavyKeeper achieves higher accuracy for the same memory footprint, so the framework adopts HeavyKeeper as its core top‑k engine.
Implementation
The framework defines a Topk interface:
// Topk is the top-k algorithm interface.
type Topk interface {
	// Add increments item's count by incr and reports whether item is in the top-k.
	Add(item string, incr uint32) bool
	// List returns all current top-k items.
	List() []Item
	// Expelled returns a channel that streams items expelled from the top-k.
	Expelled() <-chan Item
}

The concrete HeavyKeeper implementation resides in the open‑source repository go‑kratos/aegis.
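To make the contract concrete, here is a toy implementation of the interface: an exact map‑and‑sort stand‑in, not HeavyKeeper, with no expulsion logic on its channel. The real engine in go‑kratos/aegis works under a fixed memory budget instead:

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a key with its observed count.
type Item struct {
	Key   string
	Count uint32
}

// mapTopk keeps exact, unbounded counts and sorts on demand; it is
// only a stand-in to illustrate the Topk interface's contract.
type mapTopk struct {
	k        int
	counts   map[string]uint32
	expelled chan Item // never written to in this toy version
}

func newMapTopk(k int) *mapTopk {
	return &mapTopk{k: k, counts: make(map[string]uint32), expelled: make(chan Item, 16)}
}

// Add increments the item and reports whether it is now in the top-k.
func (t *mapTopk) Add(item string, incr uint32) bool {
	t.counts[item] += incr
	for _, it := range t.List() {
		if it.Key == item {
			return true
		}
	}
	return false
}

// List returns the current top-k items, highest count first.
func (t *mapTopk) List() []Item {
	items := make([]Item, 0, len(t.counts))
	for k, c := range t.counts {
		items = append(items, Item{Key: k, Count: c})
	}
	sort.Slice(items, func(i, j int) bool { return items[i].Count > items[j].Count })
	if len(items) > t.k {
		items = items[:t.k]
	}
	return items
}

func (t *mapTopk) Expelled() <-chan Item { return t.expelled }

func main() {
	tk := newMapTopk(2)
	tk.Add("a", 5)
	tk.Add("b", 3)
	tk.Add("c", 1)
	fmt.Println(tk.Add("a", 1)) // true: "a" is in the top-2
	fmt.Println(tk.Add("c", 1)) // false: "c" (count 2) still trails "b" (count 3)
}
```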
Performance Optimizations
When the counter C exceeds 256, the decay probability is approximated by b^256 to avoid expensive exponentiation.
For C < 256, a pre‑computed lookup table of b^i (i = 0..255) is used.
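One way to realize this pair of optimizations is a precomputed table that clamps at its last entry; the names here are illustrative, as is the base value:

```go
package main

import (
	"fmt"
	"math"
)

const decayBase = 0.925 // the b in P_decay = b^C; value chosen for illustration

// decayTable caches b^i for i = 0..255 so the hot path never calls
// math.Pow; larger counters clamp to the table's last entry, which
// is already vanishingly small.
var decayTable [256]float64

func init() {
	for i := range decayTable {
		decayTable[i] = math.Pow(decayBase, float64(i))
	}
}

// decayProb returns the decay probability for a counter value,
// using only a table lookup.
func decayProb(count uint32) float64 {
	if count > 255 {
		return decayTable[255]
	}
	return decayTable[count]
}

func main() {
	fmt.Println(decayProb(0), decayProb(10), decayProb(1000))
}
```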
Burst Hotspot Detection Optimization
To make the system responsive to sudden spikes, the historical frequencies are uniformly decayed every second: C_i = C_i / n (n > 1). With n = 2 , a previously dominant item’s count quickly drops, allowing a new bursty item to surface in the top‑k within one second.
Experimental results show that adding the decay factor dramatically improves burst detection speed and local‑cache hit rate, while keeping overall request‑share accuracy.
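The periodic uniform decay with n = 2 can be sketched as follows; the type and method names are illustrative, not the framework's:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// decayer halves every counter on a fixed tick (n = 2), so a
// formerly dominant key loses half its weight each second and a
// bursty newcomer can overtake it within a tick or two.
type decayer struct {
	mu     sync.Mutex
	counts map[string]uint32
}

func (d *decayer) Add(key string) {
	d.mu.Lock()
	d.counts[key]++
	d.mu.Unlock()
}

// decay halves all counters and evicts those that reach zero.
func (d *decayer) decay() {
	d.mu.Lock()
	for k := range d.counts {
		d.counts[k] /= 2
		if d.counts[k] == 0 {
			delete(d.counts, k)
		}
	}
	d.mu.Unlock()
}

// run triggers the decay on a background ticker, e.g. every second.
func (d *decayer) run(interval time.Duration) {
	go func() {
		for range time.Tick(interval) {
			d.decay()
		}
	}()
}

func main() {
	d := &decayer{counts: map[string]uint32{"old-hot": 1024}}
	for i := 0; i < 3; i++ {
		d.decay() // three ticks: 1024 -> 512 -> 256 -> 128
	}
	for i := 0; i < 200; i++ {
		d.Add("new-burst")
	}
	fmt.Println(d.counts["new-burst"] > d.counts["old-hot"]) // prints true: 200 > 128
}
```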
Local Cache
When a hotspot is identified, the key is admitted to an LRU cache; non‑hot keys are rejected to preserve memory for hot items. The LRU implementation is standard, but the admission check is custom:
if ok := topk.Add(key, 1); ok {
	lru.Add(key, value, ttl)
}

A whitelist allows pre‑registered activity IDs to bypass detection and be cached immediately:
if ok := topk.Add(key, 1); ok {
	lru.Add(key, value, ttl)
	return
}
if ok := inWhiteList(key); ok {
	lru.Add(key, value, ttl)
	return
}

Cache SDK Integration
The hotspot detection and local cache modules are packaged into the cache SDK so that existing Redis client code only needs a configuration flag to enable them. Example excerpt from the Redis SDK:
func (r *Redis) Do(ctx context.Context, cmd string, args ...interface{}) (reply interface{}, err error) {
	key := format(cmd, args...)
	if readCmd(cmd) {
		if resp, ok := r.getFromLocalCache(key); ok {
			return resp, nil
		}
		resp, err := r.getFromRemote(cmd, args...)
		if err != nil {
			return nil, err
		}
		r.updateLocalCache(key, resp)
		return resp, nil
	}
	reply, err = r.setRemote(cmd, args...)
	if writeCmd(cmd) {
		r.deleteLocalCache(key)
	}
	return reply, err
}

Updating the local cache incorporates hotspot admission:
func (r *Redis) updateLocalCache(key string, resp interface{}) {
	var added bool
	if r.topk != nil {
		added = r.topk.Add(key, 1)
	}
	if added || r.inWhiteList(key) {
		r.lru.Add(key, resp, r.localTTL)
	}
}

Business Practice
In Bilibili’s production environment, hot videos and topics dominate traffic. After deploying the hotspot detection module, a few megabytes of local memory achieve a 35% cache hit rate during peak hours, and up to 85% during super‑hot events, dramatically reducing Redis load.
Conclusion
Hotspot problems can be mitigated by detecting and specially handling hot keys. The presented framework combines streaming top‑k algorithms, decay‑based burst detection, and a lightweight LRU cache with whitelist support, offering a practical solution for high‑availability services. For scenarios requiring strict consistency, the detection module can still be used for proactive governance without caching.
References:
Lossy Counting: https://micvog.files.wordpress.com/2015/06/approximate_freq_count_over_data_streams_vldb_2002.pdf
Count‑Min Sketch: http://dimacs.rutgers.edu/~graham/pubs/papers/cmencyc.pdf
HeavyKeeper: https://www.usenix.org/system/files/conference/atc18/atc18-gong.pdf
Bilibili Tech