Deep Dive into FreeCache: Architecture, Zero GC, High Concurrency, and Approximate LRU
This article analyzes the Go‑implemented FreeCache library, detailing its project features, internal data structures, zero‑GC design, segment‑based locking for high‑concurrency thread‑safe access, the set operation workflow, index expansion, and its approximate LRU eviction strategy.
In low‑latency, high‑concurrency systems, a local in‑process cache is often indispensable. FreeCache is a Go in‑process cache library whose characteristics have earned it production deployments. This article walks through its source code to understand how it works.
Project URL and Features
Project address: https://github.com/coocood/freecache
Stores hundreds of millions of entries
Zero GC overhead
High‑concurrency thread‑safe access
Pure Go implementation
Supports key expiration
Approximate LRU eviction algorithm
Strict memory usage limit
Iterator support
Internal Data Structures
FreeCache achieves zero GC and high concurrency by dividing the cache into 256 segments, each protected by its own mutex. The main structures are shown below:
const (
	segmentCount    = 256
	segmentAndOpVal = 255
)

type Cache struct {
	locks    [segmentCount]sync.Mutex
	segments [segmentCount]segment
}
type segment struct {
	rb            RingBuf // ring buffer that stores the actual entry data
	segId         int     // hashVal & 255
	hitCount      int64
	missCount     int64
	entryCount    int64
	totalCount    int64 // total number of entries in the ring buffer
	totalTime     int64 // sum of entry access timestamps, used for LRU calculation
	totalEvacuate int64
	totalExpired  int64
	overwrites    int64
	vacuumLen     int64      // remaining free capacity of the ring buffer
	slotLens      [256]int32 // current length of each slot index
	slotCap       int32      // capacity of each slot; expansion is triggered when a slot reaches it
	slotsData     []entryPtr // shared index slice backing all 256 slots
}
type RingBuf struct {
	begin int64  // logical start position of the ring buffer
	end   int64  // logical end position of the ring buffer
	data  []byte // raw entry storage (fixed capacity)
	index int    // write position within data
}

The diagram (not reproduced here) shows that each segment contains 256 index slots and a shared RingBuf for the actual entry storage.
Why High‑Concurrency Thread‑Safe Access Works
For set/get/del operations, FreeCache computes a 64‑bit hash of the key using xxhash, derives segId = hashVal & 255 to locate the segment, and locks only that segment. Because locking is per‑segment, different segments can be accessed concurrently.
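The segment-selection step can be sketched end to end. One caveat: FreeCache uses xxhash, but this sketch substitutes the standard library's hash/fnv so it runs without external dependencies; hashVal and segID are illustrative names, not the library's API.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const segmentAndOpVal = 255

// hashVal stands in for freecache's 64-bit xxhash; FNV-1a from the
// standard library is used here only to keep the sketch dependency-free.
func hashVal(key []byte) uint64 {
	h := fnv.New64a()
	h.Write(key)
	return h.Sum64()
}

// segID picks the segment from the low 8 bits of the hash, so each key
// maps to exactly one of the 256 per-segment mutexes.
func segID(key []byte) uint64 {
	return hashVal(key) & segmentAndOpVal
}

func main() {
	fmt.Println(segID([]byte("user:42")) < 256) // true: only one of 256 locks is taken
}
```

Because two keys contend only when they land in the same segment, up to 256 operations can proceed in parallel.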
const (
	segmentCount    = 256
	segmentAndOpVal = 255
)

type Cache struct {
	locks    [segmentCount]sync.Mutex
	segments [segmentCount]segment
}

type segment struct {
	rb    RingBuf
	segId int
	// ... other fields omitted for brevity
}

Why the GC Overhead Is Near Zero
The cache's heap footprint amounts to only 512 pointers: each of the 256 segments holds one slotsData slice and one RingBuf data slice. All entry data and index metadata live inside these pre-allocated slices rather than as individual heap objects, so the garbage collector has almost nothing to scan and GC overhead stays near zero.
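The idea can be illustrated with a toy flat store. flatStore, append, and get below are hypothetical names, not freecache's API, and the sketch omits what the real RingBuf adds (wrap-around, begin/end offsets, entry headers); it only shows how many entries can share a single GC-visible pointer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// flatStore keeps every entry inside one []byte, so the garbage
// collector scans a single pointer no matter how many entries are
// appended (freecache's per-segment RingBuf applies the same idea).
type flatStore struct {
	data []byte
}

// append writes [keyLen][valLen][key][val] and returns the entry offset.
func (s *flatStore) append(key, val []byte) int {
	off := len(s.data)
	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:], uint32(len(key)))
	binary.LittleEndian.PutUint32(hdr[4:], uint32(len(val)))
	s.data = append(s.data, hdr[:]...)
	s.data = append(s.data, key...)
	s.data = append(s.data, val...)
	return off
}

// get decodes the entry stored at off.
func (s *flatStore) get(off int) (key, val []byte) {
	kl := int(binary.LittleEndian.Uint32(s.data[off:]))
	vl := int(binary.LittleEndian.Uint32(s.data[off+4:]))
	key = s.data[off+8 : off+8+kl]
	val = s.data[off+8+kl : off+8+kl+vl]
	return
}

func main() {
	var s flatStore
	off := s.append([]byte("k1"), []byte("hello"))
	k, v := s.get(off)
	fmt.Printf("%s=%s\n", k, v) // k1=hello
}
```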
Set Operation Flow
The set process first checks whether the key already exists. It uses the segment ID and slot ID to locate the appropriate index slot, then performs a binary search (O(log n)) on the ordered hash16 values.
// Determine whether the key already exists in the index slice
slotId := uint8(hashVal >> 8)
hash16 := uint16(hashVal >> 16)
slot := seg.getSlot(slotId)
idx, match := seg.lookup(slot, hash16, key)

If the entry exists and its allocated capacity is sufficient, FreeCache updates the entry in place (O(1)). Otherwise, the old entry is marked as deleted and a new entry is appended to the tail of the RingBuf (also O(1)). When the key does not exist, the index is inserted into the sorted slot slice, which may require shifting existing entries (O(n)) and possibly expanding the slot slice (capacity doubles).
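The lookup step just shown can be modeled in a few lines. This is a simplified sketch, with keys kept inline for brevity; the real library instead follows entryPtr.offset into the ring buffer to compare keys, since equal hash16 values can collide.

```go
package main

import (
	"fmt"
	"sort"
)

// entryPtr as in the article, plus the key stored inline for this sketch.
type entryPtr struct {
	hash16 uint16
	key    string
}

// lookup binary-searches a slot kept sorted by hash16, then scans
// forward over equal hash16 values until the key matches. On a miss it
// returns the index where the new entry should be inserted.
func lookup(slot []entryPtr, hash16 uint16, key string) (idx int, match bool) {
	idx = sort.Search(len(slot), func(i int) bool { return slot[i].hash16 >= hash16 })
	for idx < len(slot) && slot[idx].hash16 == hash16 {
		if slot[idx].key == key {
			return idx, true
		}
		idx++
	}
	return idx, false
}

func main() {
	slot := []entryPtr{{10, "a"}, {20, "b"}, {20, "c"}, {30, "d"}}
	fmt.Println(lookup(slot, 20, "c")) // 2 true
	fmt.Println(lookup(slot, 25, "x")) // 3 false (insertion point)
}
```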
// Insert an index entry into the slot slice
func (seg *segment) insertEntryPtr(slotId uint8, hash16 uint16, offset int64, idx int, keyLen uint16) {
	if seg.slotLens[slotId] == seg.slotCap {
		seg.expand()
	}
	seg.slotLens[slotId]++
	atomic.AddInt64(&seg.entryCount, 1)
	slot := seg.getSlot(slotId)
	copy(slot[idx+1:], slot[idx:])
	slot[idx].offset = offset
	slot[idx].hash16 = hash16
	slot[idx].keyLen = keyLen
}
func (seg *segment) expand() {
	newSlotData := make([]entryPtr, seg.slotCap*2*256)
	for i := 0; i < 256; i++ {
		off := int32(i) * seg.slotCap
		copy(newSlotData[off*2:], seg.slotsData[off:off+seg.slotLens[i]])
	}
	seg.slotCap *= 2
	seg.slotsData = newSlotData
}

Approximate LRU Eviction Process
FreeCache does not implement a strict LRU. It maintains the average access timestamp of all entries ( totalTime / totalCount ) and evicts the entry at the head of the ring buffer if its accessTime is less than or equal to this average. This yields approximately LRU behavior but can occasionally evict relatively recent entries. Entries that fail the check are re-appended at the tail (pushed back); to avoid an endless loop when no entry satisfies the condition, the head entry is forcibly evicted after more than five consecutive push-backs.
leastRecentUsed := int64(oldHdr.accessTime) <= atomic.LoadInt64(&seg.totalTime)/atomic.LoadInt64(&seg.totalCount)

Key Expiration and Cleanup Strategy
Expired keys are not removed proactively. When a key expires, it remains in the index until a subsequent get/touch or eviction pass marks the index as deleted. The actual entry data is only reclaimed when the RingBuf needs space and the eviction process removes the marked entry, freeing capacity for new data.
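Putting the eviction and expiration rules together, here is a hedged sketch of the decision applied to the entry at the ring-buffer head when space is needed. entryHdr and shouldEvict are simplified stand-ins for the library's internals, and the timestamps are illustrative.

```go
package main

import "fmt"

// entryHdr keeps only the header fields relevant to eviction.
type entryHdr struct {
	accessTime uint32 // last access timestamp
	expireAt   uint32 // 0 means no expiration
	deleted    bool   // marked deleted by a prior overwrite/delete
}

// shouldEvict mirrors the rules described above: the head entry is
// reclaimed if it is already deleted, expired, no newer than the
// average access time, or if too many entries in a row were pushed back.
func shouldEvict(hdr entryHdr, now uint32, totalTime, totalCount int64, consecutive int) bool {
	expired := hdr.expireAt != 0 && hdr.expireAt < now
	leastRecentUsed := int64(hdr.accessTime) <= totalTime/totalCount
	return hdr.deleted || expired || leastRecentUsed || consecutive > 5
}

func main() {
	// Average access time in this example: 1000 / 10 = 100.
	fmt.Println(shouldEvict(entryHdr{accessTime: 90}, 50, 1000, 10, 0))  // true: older than average
	fmt.Println(shouldEvict(entryHdr{accessTime: 150}, 50, 1000, 10, 0)) // false: newer than average
	fmt.Println(shouldEvict(entryHdr{accessTime: 150}, 50, 1000, 10, 6)) // true: forced after repeated push-backs
}
```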
360 Smart Cloud