
Hotspot Detection and Local Cache Framework for High‑Traffic Applications

This hotspot detection and local-cache framework combines the HeavyKeeper streaming top-k algorithm with decay-based burst detection, a whitelist-enabled LRU cache, and zero-code SDK integration, allowing a few megabytes of local memory to reach hit rates of up to 85% and dramatically reduce Redis load in high-traffic applications.

Bilibili Tech

In many Internet applications, data access follows the 80/20 rule: the top 20% of data accounts for over 80% of traffic. Early in a product’s lifecycle a single database can handle the load, but as traffic grows a database alone becomes a bottleneck and a high‑performance cache is introduced. Because the cache size is much smaller than the database, it can serve most hot data, yet during popular events or sudden spikes the cache can still be overwhelmed.

The article first reviews two existing approaches:

Small-table broadcast: each instance caches the whole table locally and updates it periodically. This reduces read latency but requires custom local-cache code per service.

Active pre-warming: real-time traffic statistics identify top-k live streams and push them to instances for local caching. This catches many hot streams but suffers from stale statistics and duplicated implementation effort.

To address these issues a unified hotspot detection and governance framework was built. The framework consists of two modules—hotspot detection and local cache—integrated into the cache SDK, providing:

Real‑time hotspot monitoring.

Automatic identification of bursty hotspots and their insertion into local cache.

Efficient local‑cache policies that increase hit rate without extra memory.

Whitelist support for pre‑registered activity IDs.

Zero‑code integration via configuration.

Hotspot detection theory

The core problem is computing top‑k items from a data stream under limited memory. Traditional hashmap‑plus‑heap works only for small datasets. Three streaming algorithms are examined:

3.1 Lossy Counting

Lossy Counting divides the stream into equal-size windows whose width is determined by the desired error rate ε (window size ⌈1/ε⌉). Within each window it counts frequencies; at each window boundary it decrements every counter by one and discards counters that reach zero, bounding memory while preserving high-frequency items. After the stream has been processed, the surviving items are sorted to obtain the top-k.

3.2 Count‑Min Sketch

Count-Min Sketch (CMS) uses a two-dimensional integer array of width w and depth d, together with d independent hash functions. For each incoming element n, every hash function yields an index idx = h_i(n) % w and the corresponding cell CMS[i][idx] is incremented. The estimated count of an element is the minimum of its d counters:

f[j] = min_k CMS[k][h_k(j)]

To retrieve top‑k, a min‑heap is maintained alongside the sketch, updating it whenever a new count exceeds the heap’s minimum.

3.3 HeavyKeeper

HeavyKeeper improves CMS by storing both a hash fingerprint and a counter in each cell. When a hash collision occurs, the counter is probabilistically decayed instead of simply incremented, using the decay probability P_decay = b^C (b < 1) , where C is the current counter value. This reduces over‑estimation caused by collisions while preserving high‑frequency items.

3.4 Theoretical Summary

Comparing the three algorithms shows HeavyKeeper achieves higher accuracy for the same memory footprint, so the framework adopts HeavyKeeper as its core top‑k engine.

Implementation

The framework defines a Topk interface:

// Topk algorithm interface.
type Topk interface {
    // Add adds an item and reports whether it is in the top-k.
    Add(item string, incr uint32) bool
    // List returns all top-k items.
    List() []Item
    // Expelled returns a channel of items expelled from the top-k.
    Expelled() <-chan Item
}

The concrete HeavyKeeper implementation resides in the open‑source repository go‑kratos/aegis .

Performance Optimizations

When the counter C exceeds 256, the decay probability is approximated by b^256 to avoid expensive exponentiation.

For C < 256, a pre‑computed lookup table of b^i (i = 0..255) is used.

Burst Hotspot Detection Optimization

To make the system responsive to sudden spikes, the historical frequencies are uniformly decayed every second: C_i = C_i / n (n > 1). With n = 2 , a previously dominant item’s count quickly drops, allowing a new bursty item to surface in the top‑k within one second.

Experimental results show that adding the decay factor dramatically improves burst detection speed and local‑cache hit rate, while keeping overall request‑share accuracy.

Local Cache

When a hotspot is identified, the key is admitted to an LRU cache; non‑hot keys are rejected to preserve memory for hot items. The LRU implementation is standard, but the admission check is custom:

if ok := topk.Add(key, 1); ok {
    lru.Add(key, value, ttl)
}

A whitelist allows pre‑registered activity IDs to bypass detection and be cached immediately:

if ok := topk.Add(key, 1); ok {
    lru.Add(key, value, ttl)
    return
}
if ok := inWhiteList(key); ok {
    lru.Add(key, value, ttl)
    return
}

Cache SDK Integration

The hotspot detection and local cache modules are packaged into the cache SDK so that existing Redis client code only needs a configuration flag to enable them. Example excerpt from the Redis SDK:

func (r *Redis) Do(ctx context.Context, cmd string, args ...interface{}) (reply interface{}, err error) {
    key := format(cmd, args...)
    if readCmd(cmd) {
        if resp, ok := r.getFromLocalCache(key); ok {
            return resp, nil
        }
        resp, err := r.getFromRemote(cmd, args...)
        r.updateLocalCache(key, resp)
        return resp, err
    }
    r.setRemote(cmd, args...)
    if writeCmd(cmd) {
        r.deleteLocalCache(key)
    }
    return nil, nil
}

Updating the local cache incorporates hotspot admission:

func (r *Redis) updateLocalCache(key string, resp interface{}) {
    var added bool
    if r.topk != nil {
        added = r.topk.Add(key, 1)
    }
    if added || r.inWhiteList(key) {
        r.lru.Add(key, resp, r.localTTL)
    }
}

Business Practice

In Bilibili’s production environment, hot videos and topics dominate traffic. After deploying the hotspot detection module, a few megabytes of local memory achieve a 35% cache hit rate during peak hours, and up to 85% during super‑hot events, dramatically reducing Redis load.

Conclusion

Hotspot problems can be mitigated by detecting and specially handling hot keys. The presented framework combines streaming top‑k algorithms, decay‑based burst detection, and a lightweight LRU cache with whitelist support, offering a practical solution for high‑availability services. For scenarios requiring strict consistency, the detection module can still be used for proactive governance without caching.

References:

Lossy Counting: https://micvog.files.wordpress.com/2015/06/approximate_freq_count_over_data_streams_vldb_2002.pdf

Count‑Min Sketch: http://dimacs.rutgers.edu/~graham/pubs/papers/cmencyc.pdf

HeavyKeeper: https://www.usenix.org/system/files/conference/atc18/atc18-gong.pdf

Tags: stream processing, cache optimization, distributed caching, HeavyKeeper, hotspot detection, top-k algorithms