Designing a High‑Performance In‑Memory Cache: Structures, Locks, and Go Concurrency
This article explores the fundamentals of building a high‑performance in‑memory cache, covering the relationship between caches and KV stores, various cache types, core data structures such as hash tables, lock strategies, rehash techniques, memory management, and network models, with practical examples and Go‑based concurrency designs.
Preface
Cache is the most used tool in backend development because performance is a key characteristic. As a result, caching has become ubiquitous; almost every backend system adds a cache when facing performance problems.
Cache and KVDB
Cache and KVDB often appear together; when a KVDB is fast enough it can serve as a cache. Redis is commonly used as a cache, essentially treating a KVDB as a cache. Memcached, on the other hand, is a pure cache without persistence and only supports simple string key/value pairs.
Redis now provides many data structures and persistence, so most modern caches refer to Redis.
Cache Types
A cache provides faster access than the underlying data, which may be KVDB data, file data, or remote database data. Common cache categories are:
Database‑type cache: a KV store that mirrors data from a remote database.
File‑type cache: local files that cache remote data.
Memory‑type cache: in‑memory storage of hot data.
This article focuses on a simple memory‑type cache that only supports string keys and values.
Designing a Cache
Performance is the primary concern; design must consider three aspects: underlying data structure, memory management, and network model.
Underlying Data Structure
Cache typically stores key‑value pairs; hash tables are preferred because of O(1) lookup. A good hash function is essential to avoid collisions. The basic implementation is an open‑chain hash table (array + linked list).
Closed‑chain hash tables use a circular array where collisions are resolved by probing the next slot.
Locking: reads and writes require synchronization. For low write intensity a single read‑write lock per table may suffice; for higher write rates, finer‑grained locks per bucket or per group of buckets can reduce contention. Example: each bucket gets its own read‑write lock, increasing memory usage but improving performance.
Modern CPUs support lock‑free CAS operations; Go provides the atomic package. CAS can reduce lock overhead but adds complexity and may still suffer from thread‑switch issues.
String vs Integer Keys
String key comparison is slower than integer comparison. A common optimization stores a secondary integer hash value with each node; lookups compare the integer first, falling back to full string comparison only on a match.
Rehashing
When the load factor grows, rehashing is required to expand the bucket array. Strategies include:
Full rehash: recompute all keys into a new table (simple but causes service downtime).
Incremental migration: allocate a larger bucket array B, gradually move entries from old array A to B while serving requests, marking moved entries as invalid in A.
Lazy allocation: start with array A, and when usage reaches a threshold (e.g., 80%), begin migration to B.
These methods keep the service available at the cost of extra hash computations.
Memory Management
Frequent writes and deletions cause repeated allocation and deallocation. Using a custom memory pool reduces overhead compared to the language runtime allocator. For maximum performance, a system language like C is preferred; in Go, CGO can be used to manage memory manually.
Network Model
Network I/O also impacts cache performance. Memcached uses libevent with a master‑worker thread model; Redis uses epoll‑based multiplexing. A lightweight TCP server can be built with a fixed number of goroutine workers, each handling connections from a channel, or with a “fire‑and‑go” model where each accepted connection spawns a new goroutine.
Another, more “golang‑style” model simply launches a goroutine for each accepted connection.
Goroutine Scheduling
Go’s goroutine scheduler combines event‑driven callbacks and thread‑like execution: a single OS thread runs many goroutine code fragments, yielding the CPU on I/O or explicit calls like runtime.Gosched(). This reduces context‑switch overhead compared to OS threads while preserving a familiar programming model.
Examples show that a CPU‑bound function that never yields can starve other goroutines.
Conclusion
The author notes that the code is not yet complete, but the discussion covered cache fundamentals, data structures, locking, rehashing, memory pooling, and network models, providing a foundation for building a high‑performance in‑memory cache.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
