Mastering Redis: Core Data Structures, Persistence, Clustering and Advanced Cache Strategies
This comprehensive guide explains Redis fundamentals, its rich data types, differences from Memcached, thread model, persistence mechanisms (AOF, RDB, hybrid), memory eviction policies, clustering for high availability, cache design patterns, distributed locking, Redlock algorithm, and practical techniques for identifying and safely deleting big keys.
What Is Redis?
Redis is an in-memory data store that serves all reads and writes directly from memory, which gives it very low latency. It is widely used for caching, message queues, distributed locks and other high-performance scenarios.
Data Types
Redis provides native data structures:
String (binary‑safe, O(1) length lookup)
Hash
List
Set
Sorted Set (Zset)
Bitmap (since 2.2)
HyperLogLog (since 2.8)
GEO (since 3.2)
Stream (since 5.0)
Each individual command executes atomically because a single thread processes commands one at a time.
Redis vs. Memcached
Redis supports richer data types; Memcached only stores simple key‑value strings.
Redis offers persistence; Memcached does not.
Redis has native clustering; Memcached relies on client‑side sharding.
Redis provides pub/sub, Lua scripting, transactions; Memcached lacks these features.
Why Use Redis as a MySQL Cache?
Redis’s high performance and high concurrency make it ideal for caching frequently accessed MySQL data, reducing disk I/O and speeding up response times.
Benefit 1 – High Performance: Accessing data from memory is orders of magnitude faster than reading from disk.
Benefit 2 – High Concurrency: A single Redis instance can handle >100 000 QPS, far surpassing MySQL’s single‑node capacity.
Internal Implementations of Core Types
String Implementation
Strings are stored using SDS (Simple Dynamic String), which records length, supports binary data, provides O(1) length retrieval, and safely expands on concatenation.
SDS can store both text and binary data.
Length lookup is O(1) because the length is stored explicitly.
Safe concatenation prevents buffer overflows by auto‑expanding.
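The three SDS properties above can be illustrated with a toy model. This is a minimal Python sketch, not Redis's C implementation: the class name `SimpleDynamicString` and its fields are hypothetical, and the growth rule (double while small, add 1 MB once large) is modeled on the SDS preallocation policy as an assumption.

```python
class SimpleDynamicString:
    """Toy model of an SDS: explicit length plus spare capacity."""

    MAX_PREALLOC = 1024 * 1024  # assumed cap on extra preallocation (1 MB)

    def __init__(self, initial: bytes = b""):
        self.buf = bytearray(initial)  # backing storage; may hold unused space
        self.len = len(initial)        # number of bytes actually in use

    @property
    def alloc(self) -> int:
        """Total capacity currently allocated."""
        return len(self.buf)

    def __len__(self) -> int:
        # O(1): the length is stored, so no scan for a terminator is needed
        return self.len

    def append(self, data: bytes) -> None:
        needed = self.len + len(data)
        if needed > self.alloc:
            # Grow before writing, so the copy can never overflow the buffer
            if needed < self.MAX_PREALLOC:
                new_alloc = needed * 2
            else:
                new_alloc = needed + self.MAX_PREALLOC
            self.buf.extend(bytes(new_alloc - self.alloc))
        self.buf[self.len:needed] = data
        self.len = needed

    def value(self) -> bytes:
        """Return only the bytes in use (binary-safe, may contain \\x00)."""
        return bytes(self.buf[:self.len])
```

Because the length is tracked separately, embedded zero bytes are ordinary data, and repeated appends reuse the spare capacity instead of reallocating each time.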
List Implementation
Older versions used a linked list or ziplist; since Redis 3.2 all lists are stored as quicklist, a linked list whose nodes are ziplists (listpacks from Redis 7.0).
Before 3.2, if the list has < 512 elements and each element is < 64 bytes, a ziplist is used.
Before 3.2, any other list uses a doubly-linked list.
From 3.2 onward, quicklist replaces both structures.
Hash Implementation
Hashes are stored as a ziplist when the number of fields < 512 and each field/value < 64 bytes; otherwise a hash table is used.
Ziplist for small hashes.
Hash table for larger hashes.
Set Implementation
Sets use an integer set when all members are integers and the count < 512; otherwise a hash table.
Integer set for pure integer collections.
Hash table for mixed or larger sets.
Sorted Set (Zset) Implementation
Zsets use a ziplist for small sorted sets; otherwise a skiplist, paired with a hash table for O(1) score lookup by member.
Ziplist when < 128 elements and each element < 64 bytes.
Skiplist for larger sorted sets.
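The size thresholds described for hashes, sets and sorted sets can be written down as simple predicates. The sketch below only mirrors the numbers stated in the text (512/64 for hashes, 512 for integer sets, 128/64 for zsets) and uses the text's strict `<` comparisons; the function names are hypothetical, and a real server reads these limits from its configuration.

```python
from typing import Union

Member = Union[int, bytes]


def _size(item: Member) -> int:
    """Approximate the stored size of a member in bytes."""
    return len(str(item)) if isinstance(item, int) else len(item)


def hash_encoding(fields: dict) -> str:
    """Small hashes stay compact; anything else becomes a hash table."""
    small = len(fields) < 512 and all(
        _size(k) < 64 and _size(v) < 64 for k, v in fields.items()
    )
    return "ziplist" if small else "hashtable"


def set_encoding(members: set) -> str:
    """Pure-integer sets under the limit use an integer set."""
    all_ints = all(isinstance(m, int) for m in members)
    return "intset" if all_ints and len(members) < 512 else "hashtable"


def zset_encoding(members: list) -> str:
    """Small sorted sets stay compact; larger ones use a skiplist."""
    small = len(members) < 128 and all(_size(m) < 64 for m in members)
    return "ziplist" if small else "skiplist"
```

For instance, `set_encoding({1, 2, 3})` returns `"intset"`, while adding a single non-integer member switches the result to `"hashtable"`.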
Thread Model
Redis processes client commands in a single main thread (event loop). Since 2.6 it also starts two background I/O (BIO) threads, one for closing files and one for AOF fsync. Since 4.0 a third BIO thread handles asynchronous memory release (lazy free).
Background threads act as consumers of dedicated task queues:
BIO_CLOSE_FILE – closes files.
BIO_AOF_FSYNC – syncs AOF to disk.
BIO_LAZY_FREE – frees memory in the background.
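The producer/consumer arrangement behind these task queues can be sketched in a few lines. This is an illustrative Python model, not Redis code: `start_bio_workers`, the handler callbacks and the `None` shutdown sentinel are assumptions made for the example; only the three job-type names come from the text.

```python
import queue
import threading

JOB_TYPES = ("BIO_CLOSE_FILE", "BIO_AOF_FSYNC", "BIO_LAZY_FREE")


def _worker(jobs: queue.Queue, handler) -> None:
    """Consume jobs from one queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop this worker
            jobs.task_done()
            break
        handler(job)
        jobs.task_done()


def start_bio_workers(handlers: dict):
    """Start one consumer thread per job type; return (queues, threads)."""
    queues = {}
    threads = []
    for job_type in JOB_TYPES:
        jobs = queue.Queue()
        thread = threading.Thread(
            target=_worker, args=(jobs, handlers[job_type]), daemon=True
        )
        thread.start()
        queues[job_type] = jobs
        threads.append(thread)
    return queues, threads
```

The main thread plays the producer: it only enqueues a job (for example `queues["BIO_LAZY_FREE"].put(big_object)`) and returns immediately, so the slow work never stalls command processing.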
Why Single‑Threaded Redis Is So Fast
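Benchmarks show a single Redis instance can process more than 100,000 operations per second. Three things contribute: all data resides in memory, the single command thread avoids lock contention and context switches, and I/O multiplexing (such as epoll) lets that one thread serve many client connections.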
Persistence
Append‑Only File (AOF)
Every write command is appended to an AOF log. Three fsync policies control when data is flushed to disk:
# sync after every write
appendfsync always
# sync once per second
appendfsync everysec
# let the OS decide when to flush
appendfsync no
AOF rewrite (BGREWRITEAOF) creates a compact file: instead of replaying the old log, it writes the shortest sequence of commands that rebuilds the current dataset.
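The idea behind AOF rewrite can be shown with a toy command log. This is a simplified Python sketch that assumes only three commands (`SET`, `INCR`, `DEL`) and string values; the function names `replay` and `rewrite` are hypothetical and the real rewrite runs in a child process against the live dataset.

```python
def replay(log):
    """Apply SET/INCR/DEL commands in order and return the resulting dataset."""
    data = {}
    for cmd, key, *args in log:
        if cmd == "SET":
            data[key] = args[0]
        elif cmd == "INCR":
            data[key] = str(int(data.get(key, "0")) + 1)
        elif cmd == "DEL":
            data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {cmd}")
    return data


def rewrite(log):
    """Return a compact log: one SET per key that still exists."""
    return [("SET", key, value) for key, value in replay(log).items()]
```

A log of five commands such as `SET a 1`, `INCR a`, `INCR a`, `SET b x`, `DEL b` collapses to the single command `SET a 3`, and replaying either log yields the same dataset.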
RDB Snapshots
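RDB creates a point-in-time binary dump of the dataset. SAVE blocks the main thread; BGSAVE forks a child process that writes the snapshot, so the main thread keeps serving requests. Each save directive below triggers an automatic BGSAVE once at least the given number of changes has occurred within the given number of seconds:
save 900 1
save 300 10
save 60 10000
Hybrid Persistence (Redis 4.0+)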
During AOF rewrite Redis writes an RDB‑style snapshot as the first part of the new AOF file, followed by incremental AOF commands. This combines fast restart (RDB) with minimal data loss (AOF).
Memory Eviction
When maxmemory is reached Redis can either refuse writes (policy noeviction) or evict keys. Eviction policies:
volatile-random, volatile-ttl, volatile-lru, volatile-lfu (only keys with an expire).
allkeys-random, allkeys-lru, allkeys-lfu (any key).
LRU vs. LFU
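Redis approximates LRU: it samples a few keys (maxmemory-samples, 5 by default) and evicts the least recently used key among them, which avoids maintaining a global LRU list. Since Redis 4.0 it also offers LFU, which reuses each object's 24-bit access field: the high 16 bits store the last decrement time and the low 8 bits store a logarithmic access counter.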
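Both ideas, sampled LRU and a logarithmic LFU counter, fit in a short sketch. The Python below is a simplified model under stated assumptions: `pick_lru_victim` samples once and ignores the server's internal candidate pool, and `lfu_log_incr` follows the commonly documented increment rule with an initial value of 5 and a log factor of 10, without the time-based decay. The function names are hypothetical.

```python
import random

LFU_INIT_VAL = 5     # assumed starting counter for a new key
LFU_LOG_FACTOR = 10  # assumed default of the lfu-log-factor setting


def pick_lru_victim(last_access: dict, samples: int = 5, rng=random):
    """Sample a few keys and return the one that has been idle the longest."""
    candidates = rng.sample(sorted(last_access), min(samples, len(last_access)))
    return min(candidates, key=lambda key: last_access[key])


def lfu_log_incr(counter: int, rng=random) -> int:
    """Probabilistically bump an 8-bit counter; high counters grow more slowly."""
    if counter >= 255:
        return 255  # saturated: the counter fits in 8 bits
    base = max(counter - LFU_INIT_VAL, 0)
    probability = 1.0 / (base * LFU_LOG_FACTOR + 1)
    return counter + 1 if rng.random() < probability else counter
```

Because the increment probability shrinks as the counter rises, a thousand hits move the counter only a little past its initial value, which is how 8 bits can rank keys with very different access rates.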
High Availability and Clustering
Redis provides three HA mechanisms:
Master‑Slave Replication – asynchronous replication, read‑only slaves.
Sentinel – monitors masters, performs automatic failover.
Redis Cluster – sharding across 16384 hash slots (slot = CRC16(key) mod 16384); each node holds a subset of slots.
Cluster slot assignment can be automatic (even distribution) or manual via CLUSTER ADDSLOTS.
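How a key maps to one of the 16384 slots can be reproduced outside the server. The sketch below computes CRC16 (the XMODEM variant, polynomial 0x1021) and applies the hash-tag rule, under which only the text between the first `{` and the following `}` is hashed when that text is non-empty. The function names are hypothetical.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: bytes) -> int:
    """Return the cluster hash slot (0..16383) for a key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # non-empty tag: hash only the tag
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, land in the same slot, which is what makes multi-key operations on them possible in a cluster.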
Cache Design Patterns
Cache Avalanche, Breakdown, Penetration
• Avalanche: many keys expire simultaneously – mitigate by adding random jitter to TTL or keeping hot data warm via background refresh.
• Breakdown (hot‑key miss): protect hot keys with a mutex (e.g., SETNX) or keep them non‑expiring and refresh proactively.
• Penetration: requests for non‑existent data – block malicious queries, cache empty results, or use a Bloom filter to pre‑filter.
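Two of the mitigations above, TTL jitter and a Bloom filter in front of the cache, are small enough to sketch. The Python below is an illustration only: `ttl_with_jitter`, the `BloomFilter` class, its default sizes and the SHA-256-based hashing are all assumptions chosen for the example, not a production design.

```python
import hashlib
import random


def ttl_with_jitter(base_seconds: int, spread_seconds: int, rng=random) -> int:
    """Spread expirations so keys written together do not expire together."""
    return base_seconds + rng.randint(0, spread_seconds)


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions by salting the item with an index
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

A request whose id fails `might_contain` is certain not to exist and can be rejected before it reaches the cache or the database; an id that passes still needs the normal lookup.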
Cache Update Strategies
Cache-Aside (lazy loading) – on a read the application checks the cache first, falls back to the DB on a miss and then populates the cache; on a write it updates the DB and then deletes (or updates) the cache entry.
Read‑Through / Write‑Through – cache sits between app and DB, automatically loading and persisting data.
Write‑Back – writes only to cache and flushes asynchronously; rarely used with Redis.
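The cache-aside flow is the one most often paired with Redis and MySQL, so a small model may help. In this sketch two plain dictionaries stand in for the cache and the database; the class name `CacheAside` and its methods are hypothetical, and TTLs, concurrency and error handling are deliberately left out.

```python
class CacheAside:
    """Cache-aside with dictionaries standing in for Redis and MySQL."""

    def __init__(self, cache: dict, db: dict):
        self.cache = cache  # stand-in for the cache
        self.db = db        # stand-in for the database

    def read(self, key):
        # 1. Try the cache first
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss, fall back to the database and populate the cache
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value) -> None:
        # Update the database first, then invalidate the cached copy
        self.db[key] = value
        self.cache.pop(key, None)
```

Deleting the entry on write, rather than updating it, means the next read reloads the fresh value from the database.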
Distributed Lock
Acquire the lock with:
SET lock_key unique_value NX PX 10000
Release it safely with a Lua script that deletes the key only if the stored value matches:
-- KEYS[1] is the lock key; ARGV[1] is the unique value set when the lock was acquired
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
Redlock (Multi-Node Distributed Lock)
Client attempts to acquire the same lock on N independent Redis nodes. If it succeeds on a majority (N/2+1) and the total acquisition time is less than the lock TTL, the lock is considered held. Unlocking is performed on all nodes using the same Lua script.
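The majority and timing checks can be traced with simulated nodes. The Python below is a sketch under stated assumptions: `FakeNode` is a hypothetical in-memory stand-in for an independent Redis node, the clock is passed in as a function so the example is deterministic, and the clock-drift allowance of the full algorithm is omitted.

```python
import uuid


class FakeNode:
    """In-memory stand-in for one independent node (hypothetical helper)."""

    def __init__(self, up: bool = True):
        self.up = up
        self.store = {}  # key -> (value, expires_at_ms)

    def set_nx_px(self, key, value, ttl_ms, now_ms) -> bool:
        """Mimic SET key value NX PX ttl: succeed only if the key is free."""
        if not self.up:
            return False
        current = self.store.get(key)
        if current and current[1] > now_ms:
            return False
        self.store[key] = (value, now_ms + ttl_ms)
        return True

    def delete_if_equal(self, key, value) -> int:
        """Same compare-and-delete rule as the Lua unlock script."""
        if self.up and self.store.get(key, (None,))[0] == value:
            del self.store[key]
            return 1
        return 0


def redlock_release(nodes, key, token) -> None:
    """Unlock on every node, including ones where acquisition failed."""
    for node in nodes:
        node.delete_if_equal(key, token)


def redlock_acquire(nodes, key, ttl_ms, clock_ms):
    """Return (token, remaining_validity_ms) on success, else (None, 0)."""
    token = uuid.uuid4().hex
    start = clock_ms()
    acquired = sum(node.set_nx_px(key, token, ttl_ms, start) for node in nodes)
    elapsed = clock_ms() - start
    quorum = len(nodes) // 2 + 1
    if acquired >= quorum and elapsed < ttl_ms:
        return token, ttl_ms - elapsed
    redlock_release(nodes, key, token)  # clean up partial acquisitions
    return None, 0
```

With five nodes the quorum is three, so the lock survives two unavailable nodes; a failed attempt releases whatever it did acquire so other clients are not blocked until the TTL expires.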
Handling Big Keys
As a rule of thumb, big keys are strings larger than 10 KB or collections with more than 5,000 elements. They can block the server, generate large network traffic, and cause memory imbalance.
Finding big keys:
redis-cli --bigkeys (run on a replica to avoid blocking the master).
Iterate with SCAN and use MEMORY USAGE (Redis 4.0+).
Analyze RDB files with third‑party tools such as RdbTools.
Deleting safely:
Delete in batches (e.g., HSCAN + HDEL, LTRIM, SSCAN + SREM, ZREMRANGEBYRANK).
Use asynchronous deletion with UNLINK (Redis 4.0+).
Enable lazy-free options (lazyfree-lazy-eviction, lazyfree-lazy-expire, lazyfree-lazy-server-del) to avoid blocking the main thread.
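Batch deletion of a large hash with HSCAN and HDEL might look like the sketch below. It assumes a client object whose `hscan` returns a `(cursor, fields)` pair and whose `hdel` returns the number of fields removed, which is the shape redis-py exposes; `FakeHashClient` is a hypothetical in-memory stand-in so the example runs without a server.

```python
def delete_big_hash(client, key, batch: int = 100) -> int:
    """Remove a large hash in small HSCAN + HDEL steps; return fields removed."""
    removed = 0
    cursor = 0
    while True:
        cursor, fields = client.hscan(key, cursor=cursor, count=batch)
        if fields:
            removed += client.hdel(key, *fields)
        if cursor == 0:  # a zero cursor means the scan is complete
            break
    return removed


class FakeHashClient:
    """Tiny in-memory stand-in (hypothetical) mimicking hscan/hdel."""

    def __init__(self, hashes: dict):
        self.hashes = hashes

    def hscan(self, key, cursor=0, count=10):
        fields = list(self.hashes.get(key, {}).items())
        page = dict(fields[:count])
        next_cursor = 1 if len(fields) > count else 0
        return next_cursor, page

    def hdel(self, key, *fields) -> int:
        target = self.hashes.get(key, {})
        return sum(1 for f in fields if target.pop(f, None) is not None)
```

Each round trip touches at most `batch` fields, so no single command holds the main thread for long. Against a real server, calling `UNLINK` on the key is the simpler option on Redis 4.0+.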
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
