How BigCache Achieves Ultra‑Fast In‑Memory Caching in Go

This article explains how the Go library BigCache uses sharding, efficient hash functions, and memory‑optimised data structures to deliver high‑concurrency, low‑latency caching while avoiding GC bottlenecks, illustrated with practical code examples and performance benchmarks.


Introduction

BigCache is a high‑performance in‑memory cache library designed for scenarios that require high concurrency and low latency. The article explores its performance‑optimisation techniques, including sharding, efficient hash algorithms, and lock usage, with references to the source code.

Segmented Locking

From a user's perspective, a cache behaves like a large hashtable that stores key/value pairs. A simple map[string][]byte protected by a sync.RWMutex works only when performance requirements are modest, because under heavy concurrency the single lock turns every write into a bottleneck. BigCache therefore adopts the same approach as Java's ConcurrentHashMap: it splits the large hashtable into many small shards, each guarded by its own lock. The same technique appears in MongoDB sharding and in Go's third‑party concurrent‑map package.

type BigCache struct {
    shards    []*cacheShard
    shardMask uint64 // Shards - 1; only valid because Shards is a power of two
    config    Config
}

func newBigCache(ctx context.Context, config Config, clock clock) (*BigCache, error) {
    // config.Shards must be a power of two (validated elsewhere),
    // so shardMask can replace the modulo with a bitwise AND.
    cache := &BigCache{
        shards:    make([]*cacheShard, config.Shards),
        shardMask: uint64(config.Shards - 1),
        config:    config,
        // ... other fields omitted for brevity
    }
    for i := 0; i < config.Shards; i++ {
        // onRemove is a removal callback built from the config (omitted here)
        cache.shards[i] = initNewShard(config, onRemove, clock)
    }
    return cache, nil
}
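
newBigCache is internal; applications go through the exported constructor instead. A minimal usage sketch, assuming the current v3 module path and the New(ctx, config) signature:

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/allegro/bigcache/v3"
)

func main() {
    // DefaultConfig uses 1024 shards and the given entry lifetime.
    cache, err := bigcache.New(context.Background(), bigcache.DefaultConfig(10*time.Minute))
    if err != nil {
        panic(err)
    }

    _ = cache.Set("my-key", []byte("my-value"))

    if v, err := cache.Get("my-key"); err == nil {
        fmt.Println(string(v)) // my-value
    }
}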

For every operation, BigCache computes hash(key) % N, where N is the number of shards, to select the shard that owns the key. When N is a power of two, the modulo can be replaced by a single bitwise AND, since x mod N = x & (N - 1), which avoids a comparatively expensive division.
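
Inside the library, shard selection comes down to a one-line lookup along these lines (a sketch of BigCache's internal shard-lookup helper):

// getShard picks the shard responsible for a hashed key.
// Because the shard count is a power of two, hashedKey & shardMask
// is equivalent to hashedKey % len(shards).
func (c *BigCache) getShard(hashedKey uint64) *cacheShard {
    return c.shards[hashedKey&c.shardMask]
}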

Choosing an Appropriate Hash Algorithm

A good hash algorithm should produce well‑distributed values, be fast, and avoid extra heap allocations that would add GC pressure. BigCache's default hash implementation is fnv64a (64‑bit FNV‑1a), which allocates nothing and works entirely with simple integer operations.

type fnv64a struct{}

const (
    // FNV-1a 64-bit offset basis and prime.
    offset64 = 14695981039346656037
    prime64  = 1099511628211
)

// Sum64 hashes the key byte by byte: XOR the byte in, then multiply by the prime.
func (f fnv64a) Sum64(key string) uint64 {
    var hash uint64 = offset64
    for i := 0; i < len(key); i++ {
        hash ^= uint64(key[i])
        hash *= prime64
    }
    return hash
}
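
The no‑allocation claim is easy to verify with testing.AllocsPerRun; a small sketch of a test placed in the same package as the hasher (the test itself is ours, not the library's):

import "testing"

func TestFnv64aAllocations(t *testing.T) {
    var h fnv64a
    allocs := testing.AllocsPerRun(1000, func() {
        _ = h.Sum64("some-cache-key")
    })
    if allocs != 0 {
        t.Fatalf("expected 0 allocations per Sum64 call, got %v", allocs)
    }
}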

Memory Optimisation

In Go, the garbage collector has to scan every element of a map whose keys or values contain pointers during the mark phase. When a cache holds millions of such objects, this scanning adds significant latency. Benchmarks comparing a map with pointer values (map[int]*int) against a map with plain integers (map[int]int) show the former incurring roughly a 4‑second GC pause for mark/scan, while the latter's pause is only about 80 ms.

// Package-level maps keep the data reachable for the GC across iterations.
var (
    pointMap   map[int]*int
    noPointMap map[int]int
)

func BenchmarkPointMap(b *testing.B) {
    pointMap = make(map[int]*int)
    for i := 0; i < 10_000_000; i++ { // 10 million entries
        pointMap[i] = &i
    }
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        delete(pointMap, i)
        pointMap[i] = &i
    }
}

func BenchmarkNoPointMap(b *testing.B) {
    noPointMap = make(map[int]int)
    for i := 0; i < 10_000_000; i++ {
        noPointMap[i] = i
    }
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        delete(noPointMap, i)
        noPointMap[i] = i
    }
}

Running the benchmarks with tracing confirms that the pointer‑based map spends over 2 % of wall time in GC, while the plain‑int map spends only about 0.04 %.
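
The effect can also be observed without the tracer. A rough sketch that builds a pointer-heavy map, forces a collection, and reads the most recent stop-the-world pause plus the overall GC CPU fraction from runtime.MemStats (absolute numbers vary widely across Go versions and hardware):

package main

import (
    "fmt"
    "runtime"
)

func main() {
    // A map whose values are pointers forces the GC to trace every entry.
    m := make(map[int]*int, 10_000_000)
    for i := 0; i < 10_000_000; i++ {
        v := i
        m[i] = &v
    }

    runtime.GC() // force a full collection

    var stats runtime.MemStats
    runtime.ReadMemStats(&stats)
    // PauseNs is a circular buffer; the most recent pause sits at (NumGC+255)%256.
    fmt.Printf("last STW pause: %d ns\n", stats.PauseNs[(stats.NumGC+255)%256])
    fmt.Printf("GC CPU fraction: %.4f\n", stats.GCCPUFraction)

    runtime.KeepAlive(m)
}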

[Figure: GC mark/scan timeline for the two benchmarks]
[Figure: benchmark results]

Solutions to Pointer‑Induced GC Overhead

Use off‑heap memory with manual Malloc() / Free() to bypass the runtime GC (risking memory leaks).

Adopt freecache, which stores keys and values in a ring buffer, eliminating pointers and achieving near‑zero GC overhead.
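
For reference, a minimal freecache sketch (assuming github.com/coocood/freecache; the 100 MB buffer size and 60‑second TTL are arbitrary choices):

package main

import (
    "fmt"

    "github.com/coocood/freecache"
)

func main() {
    // All entries live inside one pre-allocated ring buffer, so the GC sees
    // a single large allocation instead of millions of small objects.
    cache := freecache.NewCache(100 * 1024 * 1024) // 100 MB

    key := []byte("user:42")
    _ = cache.Set(key, []byte("cached-value"), 60) // expire after 60 s

    if v, err := cache.Get(key); err == nil {
        fmt.Println(string(v)) // cached-value
    }
}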

BigCache’s Internal Design

BigCache stores serialized cache entries in a pre‑allocated byte slice. The hash of the key is used as the key of a map[uint64]uint32, whose value holds the entry's offset in that byte slice; since neither the map nor the byte slice contains pointers, the GC has essentially nothing to trace. Deleting an entry removes its index from the map and zeroes the corresponding bytes in the queue, leaving "holes" that are later reclaimed by a background cleanup routine.

type cacheShard struct {
    hashmap map[uint64]uint32 // hashed key -> offset of the entry in the byte queue
    entries queue.BytesQueue  // pointer-free storage for serialized entries
    lock    sync.RWMutex
    // ... other fields omitted
}

func (s *cacheShard) set(key string, hashedKey uint64, entry []byte) error {
    ts := uint64(s.clock.Epoch()) // timestamp wrapped into the entry header below
    s.lock.Lock()
    if prevIdx := s.hashmap[hashedKey]; prevIdx != 0 {
        // An older entry exists for this hash: blank it out and drop its index.
        if prevEntry, err := s.entries.Get(int(prevIdx)); err == nil {
            resetHashFromEntry(prevEntry)
            delete(s.hashmap, hashedKey)
        }
    }
    // ... handle eviction, wrap key/ts/entry into a byte slice,
    //     and push it to the queue, which returns its offset idx
    s.hashmap[hashedKey] = uint32(idx)
    s.lock.Unlock()
    return nil
}
queue.BytesQueue is a dynamically growing byte array: when a []byte is added, its data is copied to the tail. Deleting an element only clears its bytes, creating gaps that are compacted later.
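
To make the idea concrete, here is a deliberately simplified toy, not BigCache's actual code, combining the two building blocks: a pointer-free offset map and a single byte slice holding length-prefixed values.

package main

import (
    "encoding/binary"
    "fmt"
)

// toyCache mimics a single BigCache shard: an offset index with no pointers
// plus one growing byte slice, so the GC has almost nothing to trace.
type toyCache struct {
    index map[uint64]uint32 // hash -> offset into data
    data  []byte            // [4-byte length][payload][4-byte length][payload]...
}

func (c *toyCache) set(hash uint64, value []byte) {
    offset := uint32(len(c.data))
    var hdr [4]byte
    binary.LittleEndian.PutUint32(hdr[:], uint32(len(value)))
    c.data = append(c.data, hdr[:]...)
    c.data = append(c.data, value...)
    c.index[hash] = offset
}

func (c *toyCache) get(hash uint64) ([]byte, bool) {
    offset, ok := c.index[hash]
    if !ok {
        return nil, false
    }
    n := binary.LittleEndian.Uint32(c.data[offset : offset+4])
    return c.data[offset+4 : offset+4+n], true
}

func main() {
    c := &toyCache{index: make(map[uint64]uint32)}
    c.set(12345, []byte("hello"))
    if v, ok := c.get(12345); ok {
        fmt.Println(string(v)) // hello
    }
}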

Additional Details

BigCache does not support per‑key expiration: every entry's lifetime is governed by the global config.LifeWindow setting, which cannot be overridden for individual keys.
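
For illustration, a sketch of a Config that makes the global expiration explicit (the field names follow the library's Config struct; the values are arbitrary):

package main

import (
    "context"
    "time"

    "github.com/allegro/bigcache/v3"
)

func main() {
    config := bigcache.Config{
        Shards:      1024,             // must be a power of two
        LifeWindow:  10 * time.Minute, // global lifetime applied to every entry
        CleanWindow: 5 * time.Minute,  // how often the background sweep runs
    }

    cache, err := bigcache.New(context.Background(), config)
    if err != nil {
        panic(err)
    }
    _ = cache // there is no way to set a different TTL per key
}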

Tags: memory optimization, sharding, Go, caching, bigcache
Written by Radish, Keep Going!