Mastering Redis: Core Data Structures, Persistence, Clustering and Advanced Cache Strategies

This comprehensive guide explains Redis fundamentals, its rich data types, differences from Memcached, thread model, persistence mechanisms (AOF, RDB, hybrid), memory eviction policies, clustering for high availability, cache design patterns, distributed locking, Redlock algorithm, and practical techniques for identifying and safely deleting big keys.

ITPUB

Aug 24, 2023

What Is Redis?

Redis is an in-memory data store that serves all reads and writes directly from memory, which gives it very low latency and high throughput. It is widely used for caching, message queues, distributed locks and other high-performance scenarios.

Data Types

Redis provides native data structures:

String (binary‑safe, O(1) length lookup)

Hash

List

Set

Sorted Set (Zset)

Bitmap (since 2.2)

HyperLogLog (since 2.8)

GEO (since 3.2)

Stream (since 5.0)

Each individual command executes atomically because a single thread executes commands one at a time.

Redis vs. Memcached

Redis supports richer data types; Memcached only stores simple key‑value strings.

Redis offers persistence; Memcached does not.

Redis has native clustering; Memcached relies on client‑side sharding.

Redis provides pub/sub, Lua scripting, transactions; Memcached lacks these features.

Why Use Redis as a MySQL Cache?

Redis’s high performance and high concurrency make it ideal for caching frequently accessed MySQL data, reducing disk I/O and speeding up response times.

Benefit 1 – High Performance: Accessing data from memory is orders of magnitude faster than reading from disk.

Benefit 2 – High Concurrency: A single Redis instance can typically handle more than 100,000 QPS, far more than a single MySQL node.

Internal Implementations of Core Types

String Implementation

Strings are stored using SDS (Simple Dynamic String), which records length, supports binary data, provides O(1) length retrieval, and safely expands on concatenation.

SDS can store both text and binary data.

Length lookup is O(1) because the length is stored explicitly.

Safe concatenation prevents buffer overflows by auto‑expanding.
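The three SDS properties above can be illustrated with a toy model. This is a minimal sketch in Python, not the Redis C structure: the class name SimpleDynamicString and the doubling growth policy are assumptions made for illustration only.

```python
class SimpleDynamicString:
    """Toy model of an SDS: an explicit length plus an allocated buffer."""

    def __init__(self, initial: bytes = b"") -> None:
        self.length = len(initial)      # bytes in use
        self.buf = bytearray(initial)   # allocated storage

    @property
    def alloc(self) -> int:
        """Number of bytes currently allocated."""
        return len(self.buf)

    def __len__(self) -> int:
        # O(1): the length is stored, so no scan for a terminator is needed.
        return self.length

    def append(self, data: bytes) -> None:
        needed = self.length + len(data)
        if needed > self.alloc:
            # Grow before copying so the write cannot overflow the buffer.
            # Doubling is an illustrative policy, not Redis's exact rule.
            self.buf.extend(bytes(needed * 2 - self.alloc))
        self.buf[self.length:needed] = data
        self.length = needed

    def value(self) -> bytes:
        return bytes(self.buf[:self.length])


s = SimpleDynamicString(b"hello")
s.append(b"\x00world")  # an embedded NUL byte is stored like any other byte
print(len(s), s.value())
```

Because the length is tracked separately from the content, a zero byte in the middle of the value does not truncate it, which is what "binary-safe" means here.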

List Implementation

Before Redis 3.2, a list used one of two encodings; since 3.2 every list is stored as a quicklist, a linked list whose nodes are ziplists.

Before 3.2, a ziplist is used if the list has < 512 elements and each element is < 64 bytes (the defaults of list-max-ziplist-entries and list-max-ziplist-value).

Before 3.2, a doubly-linked list is used otherwise.

From 3.2 onward, quicklist replaces both structures.

Hash Implementation

Hashes are stored as a ziplist when the number of fields is < 512 and every field name and value is < 64 bytes; otherwise a hash table is used. Both thresholds are defaults, configurable via hash-max-ziplist-entries and hash-max-ziplist-value.

Ziplist for small hashes.

Hash table for larger hashes.

Set Implementation

Sets use an intset (integer set) when all members are integers and the count is < 512 (the default of set-max-intset-entries); otherwise a hash table is used.

Intset for pure integer collections.

Hash table for mixed or larger sets.

Sorted Set (Zset) Implementation

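Zsets use a ziplist for small sorted sets; otherwise a skiplist, paired with a hash table so that a member's score can be looked up in O(1).

Ziplist when < 128 elements and each element is < 64 bytes (the defaults of zset-max-ziplist-entries and zset-max-ziplist-value).

Skiplist for larger sorted sets.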
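The size rules for hashes, sets and sorted sets can be summarized as small decision functions. The sketch below is illustrative Python, not Redis source: the function names are hypothetical, the thresholds are the defaults quoted in the text, and the strict "fewer than" comparisons mirror the article's wording (the exact boundary behaviour in a given Redis version is an assumption).

```python
def hash_encoding(mapping: dict) -> str:
    """Encoding for a hash given as {field_bytes: value_bytes}."""
    items = list(mapping.keys()) + list(mapping.values())
    small = len(mapping) < 512 and all(len(x) < 64 for x in items)
    return "ziplist" if small else "hashtable"


def set_encoding(members: set) -> str:
    """Encoding for a set whose members are ints or bytes."""
    all_ints = all(isinstance(m, int) for m in members)
    return "intset" if all_ints and len(members) < 512 else "hashtable"


def zset_encoding(scores: dict) -> str:
    """Encoding for a sorted set given as {member_bytes: score}."""
    small = len(scores) < 128 and all(len(m) < 64 for m in scores)
    return "ziplist" if small else "skiplist"


print(hash_encoding({b"name": b"redis"}))
print(set_encoding({1, 2, 3}))
print(zset_encoding({b"m%d" % i: float(i) for i in range(200)}))
```

On a live server, OBJECT ENCODING key reports the encoding actually chosen for a key.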

Thread Model

Redis executes client commands in a single main thread driven by an event loop, but the server process as a whole is not single-threaded. Since 2.6 it starts two background I/O (BIO) threads, one for closing files and one for AOF fsync. Since 4.0 a third background thread handles asynchronous memory release (lazyfree).

Background threads act as consumers of dedicated task queues:

BIO_CLOSE_FILE – closes files.

BIO_AOF_FSYNC – syncs AOF to disk.

BIO_LAZY_FREE – frees memory in the background.
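The producer/consumer arrangement described above can be sketched with one queue and one worker thread per job type. This is an illustrative Python model, not Redis code: the payload strings are hypothetical placeholders, and the workers only record the jobs instead of performing real file or memory operations.

```python
import queue
import threading

JOB_TYPES = ("BIO_CLOSE_FILE", "BIO_AOF_FSYNC", "BIO_LAZY_FREE")
queues = {name: queue.Queue() for name in JOB_TYPES}
completed = []                       # (job_type, payload) pairs done by workers
completed_lock = threading.Lock()


def worker(job_type: str) -> None:
    """Consume jobs from the queue dedicated to this job type."""
    q = queues[job_type]
    while True:
        payload = q.get()
        if payload is None:          # sentinel: shut the worker down
            q.task_done()
            return
        with completed_lock:
            completed.append((job_type, payload))
        q.task_done()


threads = [threading.Thread(target=worker, args=(name,)) for name in JOB_TYPES]
for t in threads:
    t.start()

# The "main thread" only enqueues work and returns immediately.
queues["BIO_CLOSE_FILE"].put("old-aof-file")
queues["BIO_AOF_FSYNC"].put("current-aof-file")
queues["BIO_LAZY_FREE"].put("big-hash-object")

for q in queues.values():
    q.put(None)                      # ask each worker to stop
for t in threads:
    t.join()

print(sorted(completed))
```

The point of the design is that slow operations leave the command-processing thread: enqueueing is cheap, while the expensive work happens elsewhere.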

Why Single‑Threaded Redis Is So Fast

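Benchmarks show a single Redis instance can process more than 100,000 operations per second. The main reasons are that all data resides in memory, that single-threaded command execution avoids lock contention and context switches, and that I/O multiplexing lets one thread serve many client connections.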

Persistence

Append‑Only File (AOF)

Every write command is appended to an AOF log. Three fsync policies control when data is flushed to disk:

# sync after every write
appendfsync always
# sync once per second (the default)
appendfsync everysec
# let the OS decide when to flush
appendfsync no

AOF rewrite (BGREWRITEAOF) creates a compact file: a child process reads the current dataset and writes the minimal set of commands needed to rebuild it, instead of replaying the old log.
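The compaction effect of an AOF rewrite can be shown on a tiny command log. The sketch below is a simplified Python model that assumes only three commands (SET, DEL, RPUSH); the replay helper exists just to build the demo dataset, whereas Redis rewrites from the dataset it already holds in memory.

```python
def replay(log):
    """Apply a small subset of commands (SET, DEL, RPUSH) to an empty dataset."""
    data = {}
    for cmd, key, *args in log:
        if cmd == "SET":
            data[key] = args[0]
        elif cmd == "DEL":
            data.pop(key, None)
        elif cmd == "RPUSH":
            data.setdefault(key, []).extend(args)
    return data


def rewrite(data):
    """Emit one command per key that rebuilds the current dataset."""
    commands = []
    for key, value in data.items():
        if isinstance(value, list):
            commands.append(("RPUSH", key, *value))
        else:
            commands.append(("SET", key, value))
    return commands


aof = [
    ("SET", "counter", "1"),
    ("SET", "counter", "2"),
    ("RPUSH", "jobs", "a"),
    ("RPUSH", "jobs", "b"),
    ("SET", "tmp", "x"),
    ("DEL", "tmp"),
]
compact = rewrite(replay(aof))
print(len(aof), "->", len(compact), compact)
```

Six logged commands collapse to two, and replaying the compact log yields the same dataset as replaying the original.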

RDB Snapshots

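RDB creates a point-in-time binary dump of the dataset. SAVE runs in the main thread and blocks it until the dump finishes. BGSAVE forks a child process that writes the dump while the main thread keeps serving requests. Each save directive below triggers an automatic BGSAVE once at least the given number of keys has changed within the given number of seconds (for example, save 900 1 means at least 1 change in 900 seconds):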

save 900 1
save 300 10
save 60 10000
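The way these three save points combine can be expressed as a single predicate: a snapshot is due as soon as any one rule is satisfied. This is a minimal Python sketch; the function name and parameters are hypothetical, and the strict comparison on elapsed time is an assumption about boundary handling.

```python
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]  # (seconds, changes)


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    save_points=SAVE_POINTS):
    """Return True if any save point is satisfied."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


print(should_snapshot(61, 10000))   # busy instance: the 60-second rule fires
print(should_snapshot(100, 500))    # not enough changes for 60 s, too early for 300 s
print(should_snapshot(901, 1))      # quiet instance: the 900-second rule fires
```

The rules are ordered from "rare writes, long interval" to "heavy writes, short interval", so a busier instance snapshots more often.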

Hybrid Persistence (Redis 4.0+)

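During an AOF rewrite, Redis writes an RDB-format snapshot as the first part of the new AOF file, followed by the AOF commands that arrived while the rewrite was running. This combines fast restart (RDB) with minimal data loss (AOF). The behaviour is controlled by the aof-use-rdb-preamble option.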

Memory Eviction

When maxmemory is reached, Redis can either refuse writes (policy noeviction) or evict keys according to one of these policies:

volatile-random, volatile-ttl, volatile-lru, volatile-lfu (consider only keys that have an expiration set).

allkeys-random, allkeys-lru, allkeys-lfu (consider any key).

LRU vs. LFU

Redis approximates LRU by sampling a few keys and evicting the least recently used key among the sample, which avoids maintaining a full LRU list. Since Redis 4.0 it also offers LFU, which tracks access frequency in a 24-bit per-object field split into a 16-bit last-decrement timestamp and an 8-bit logarithmic counter.
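Both ideas can be sketched in a few lines. The Python below is an illustrative model under stated assumptions: last_access is a hypothetical dict of key to last-access time, and lfu_log_incr is modelled on the probabilistic counter approach, with log_factor=10 and init_val=5 assumed as defaults rather than taken from the text.

```python
import random


def evict_sampled_lru(last_access, sample_size, rng):
    """Sample some keys and evict the one that has been idle the longest."""
    sample = rng.sample(sorted(last_access), min(sample_size, len(last_access)))
    victim = min(sample, key=last_access.__getitem__)
    del last_access[victim]
    return victim


def lfu_log_incr(counter, rng, log_factor=10, init_val=5):
    """Increment an 8-bit counter with a probability that shrinks as it grows."""
    if counter >= 255:
        return 255                      # saturate at the 8-bit maximum
    base = max(counter - init_val, 0)
    probability = 1.0 / (base * log_factor + 1)
    return counter + 1 if rng.random() < probability else counter


rng = random.Random(42)
clock = {"session:a": 100, "session:b": 50, "session:c": 200}
print(evict_sampled_lru(clock, sample_size=3, rng=rng))

counter = 5
for _ in range(1000):                   # 1000 accesses move the counter only slowly
    counter = lfu_log_incr(counter, rng)
print(counter)
```

A larger sample size makes the eviction choice closer to true LRU at the cost of more work per eviction, while the logarithmic counter lets 8 bits represent a very wide range of access counts.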

High Availability and Clustering

Redis provides three HA mechanisms:

Master‑Slave Replication – asynchronous replication, read‑only slaves.

Sentinel – monitors masters, performs automatic failover.

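Redis Cluster – sharding across 16384 hash slots; a key maps to slot CRC16(key) mod 16384, and each node holds a subset of the slots.

Cluster slot assignment can be automatic (an even distribution, for example when the cluster is created with redis-cli --cluster create) or manual via CLUSTER ADDSLOTS.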
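The key-to-slot mapping and an even slot split can be sketched as follows. This is illustrative Python: it relies on binascii.crc_hqx with an initial value of 0, which computes the CRC-16/XMODEM variant, it ignores hash tags, and even_slot_ranges is a hypothetical helper showing one possible even split rather than the exact ranges any tool produces.

```python
import binascii

SLOTS = 16384


def key_slot(key: bytes) -> int:
    """Map a key to a hash slot: CRC16(key) mod 16384 (hash tags not handled)."""
    return binascii.crc_hqx(key, 0) % SLOTS


def even_slot_ranges(node_count: int):
    """Split the slot space into contiguous, nearly equal ranges."""
    base, extra = divmod(SLOTS, node_count)
    ranges, start = [], 0
    for i in range(node_count):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def owner(key: bytes, ranges) -> int:
    """Index of the node whose range contains the key's slot."""
    slot = key_slot(key)
    return next(i for i, (lo, hi) in enumerate(ranges) if lo <= slot <= hi)


ranges = even_slot_ranges(3)
print(ranges)
print(key_slot(b"foo"), owner(b"foo", ranges))
```

Because the slot depends only on the key, any client can compute where a key lives without asking a coordinator.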

Cache Design Patterns

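Cache Avalanche, Breakdown, Penetration

• Avalanche: many keys expire at the same time, so requests hit the database all at once. Mitigate by adding random jitter to TTLs or by keeping hot data warm via background refresh.

• Breakdown (hot-key miss): a single hot key expires and concurrent requests for it all fall through to the database. Protect hot keys with a mutex (e.g., SETNX) so that only one request rebuilds the entry, or keep them non-expiring and refresh proactively.

• Penetration: requests for data that exists in neither the cache nor the database. Block malicious queries, cache empty results, or use a Bloom filter to pre-filter.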
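Two of the mitigations above, TTL jitter and a Bloom filter, can be sketched without a Redis server. The Python below is a minimal illustration: jittered_ttl and BloomFilter are hypothetical names, and the filter size, hash count and SHA-256 based hashing are assumptions chosen for brevity, not tuned values.

```python
import hashlib
import random


def jittered_ttl(base_seconds: int, max_jitter_seconds: int, rng: random.Random) -> int:
    """Spread expirations out so keys written together do not expire together."""
    return base_seconds + rng.randint(0, max_jitter_seconds)


class BloomFilter:
    """Tiny Bloom filter: no false negatives, a small chance of false positives."""

    def __init__(self, size_bits: int = 1024, hash_count: int = 3) -> None:
        self.size = size_bits
        self.hash_count = hash_count
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str):
        for seed in range(self.hash_count):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


rng = random.Random(7)
print([jittered_ttl(3600, 300, rng) for _ in range(3)])

known_ids = BloomFilter()
known_ids.add("user:1")
print(known_ids.might_contain("user:1"))
```

A request whose ID fails the might_contain check can be rejected before it reaches the cache or the database; a positive answer still has to be verified, since false positives are possible.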

Cache Update Strategies

Cache-Aside (lazy loading) – the application manages both stores: it reads from the cache and falls back to the DB on a miss, then populates the cache; on writes it updates the DB first and then deletes the cached entry.

Read-Through / Write-Through – the cache sits between the app and the DB, automatically loading missing data and persisting writes synchronously.

Write-Back – writes go only to the cache and are flushed to the DB asynchronously; rarely used with Redis.
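The cache-aside flow can be sketched with two dictionaries standing in for Redis and MySQL. This is an illustrative Python model, with the class name CacheAside and the db_reads counter introduced only for the demo; TTLs and concurrency are deliberately left out.

```python
class CacheAside:
    """Cache-aside with plain dicts standing in for Redis and MySQL."""

    def __init__(self, db: dict) -> None:
        self.db = db            # stand-in for the database
        self.cache = {}         # stand-in for Redis
        self.db_reads = 0       # counts how often the database is hit

    def read(self, key):
        if key in self.cache:               # cache hit
            return self.cache[key]
        self.db_reads += 1                  # cache miss: fall back to the DB
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value         # populate the cache for next time
        return value

    def write(self, key, value) -> None:
        self.db[key] = value                # 1. update the database
        self.cache.pop(key, None)           # 2. invalidate the cached entry


store = CacheAside({"user:1": "alice"})
store.read("user:1")
store.read("user:1")
print(store.db_reads)
store.write("user:1", "alicia")
print(store.read("user:1"), store.db_reads)
```

Deleting the cached entry on write, rather than updating it, means the next read reloads the current database value instead of trusting whatever a concurrent writer last put into the cache.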

Distributed Lock

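Acquire the lock with the command below. NX sets the key only if it does not already exist, and PX 10000 expires it after 10,000 ms so that a crashed client cannot hold the lock forever.

SET lock_key unique_value NX PX 10000

Release it safely with a Lua script that deletes the key only if the stored value matches: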

if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
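Why the release must compare the stored value can be seen in a small simulation. The Python below is a sketch with an in-memory stand-in, not a Redis client: FakeRedis, set_nx_px and compare_and_delete are hypothetical names that mimic SET ... NX PX and the Lua script, and time is passed in explicitly to keep the demo deterministic.

```python
import uuid


class FakeRedis:
    """In-memory stand-in for the two operations the lock relies on."""

    def __init__(self) -> None:
        self.store = {}  # key -> (value, expires_at_ms)

    def set_nx_px(self, key, value, ttl_ms, now_ms) -> bool:
        current = self.store.get(key)
        if current is not None and current[1] > now_ms:
            return False                         # key exists and has not expired
        self.store[key] = (value, now_ms + ttl_ms)
        return True

    def compare_and_delete(self, key, value, now_ms) -> int:
        current = self.store.get(key)
        if current is not None and current[1] > now_ms and current[0] == value:
            del self.store[key]                  # only the owner may delete
            return 1
        return 0


r = FakeRedis()
token_a, token_b = str(uuid.uuid4()), str(uuid.uuid4())

got_a = r.set_nx_px("lock_key", token_a, ttl_ms=10000, now_ms=0)
got_b_early = r.set_nx_px("lock_key", token_b, ttl_ms=10000, now_ms=1000)
# Client A stalls past its TTL; the lock expires and client B acquires it.
got_b_late = r.set_nx_px("lock_key", token_b, ttl_ms=10000, now_ms=11000)
# A's late release carries A's token, so it must not remove B's lock.
released_by_a = r.compare_and_delete("lock_key", token_a, now_ms=12000)
released_by_b = r.compare_and_delete("lock_key", token_b, now_ms=13000)
print(got_a, got_b_early, got_b_late, released_by_a, released_by_b)
```

With a plain DEL instead of the compare-and-delete, client A's late release would have removed client B's lock.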

Redlock (Multi‑Node Distributed Lock)

A client attempts to acquire the same lock on N independent Redis nodes. If it succeeds on a majority (at least N/2+1, using integer division) and the total acquisition time is less than the lock TTL, the lock is considered held. Otherwise the client releases the lock on every node. Unlocking is performed on all nodes using the same Lua script.
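The acceptance rule can be written as a short predicate. This Python sketch follows the description above; the function names are hypothetical, and it omits the clock-drift allowance that the published algorithm also subtracts from the validity time.

```python
def quorum(total_nodes: int) -> int:
    """Smallest majority of N nodes: N // 2 + 1."""
    return total_nodes // 2 + 1


def redlock_held(acquired_nodes: int, total_nodes: int,
                 elapsed_ms: int, ttl_ms: int) -> bool:
    """True if a majority granted the lock before the TTL ran out."""
    return acquired_nodes >= quorum(total_nodes) and elapsed_ms < ttl_ms


def remaining_validity_ms(elapsed_ms: int, ttl_ms: int) -> int:
    """Time left to do protected work (clock-drift allowance not modelled)."""
    return max(ttl_ms - elapsed_ms, 0)


print(quorum(5))
print(redlock_held(3, 5, elapsed_ms=200, ttl_ms=10000))
print(redlock_held(2, 5, elapsed_ms=200, ttl_ms=10000))
print(remaining_validity_ms(200, 10000))
```

The remaining validity matters as much as the majority: a lock acquired on 3 of 5 nodes after most of the TTL has elapsed leaves little time to do the protected work.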

Handling Big Keys

As a rule of thumb, big keys are strings larger than 10 KB or collections with more than 5,000 elements. They can block the server, generate large network traffic, and cause memory imbalance between nodes.

Finding big keys:

Run redis-cli --bigkeys (on a replica, to avoid blocking the master).

Iterate with SCAN and use MEMORY USAGE (Redis 4.0+).

Analyze RDB files with third‑party tools such as RdbTools.

Deleting safely:

Delete in batches (e.g., HSCAN + HDEL, LTRIM, SSCAN + SREM, ZREMRANGEBYRANK).

Use asynchronous deletion with UNLINK (Redis 4.0+).

Enable lazy-free options (lazyfree-lazy-eviction, lazyfree-lazy-expire, lazyfree-lazy-server-del) to avoid blocking the main thread.
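The batch-deletion idea can be sketched with a dict standing in for a big hash. This is an illustrative Python model, not client code: each loop iteration plays the role of one HSCAN page followed by one HDEL call, and the batch size of 100 is an assumption.

```python
import itertools


def delete_hash_in_batches(big_hash: dict, batch_size: int = 100) -> int:
    """Remove all fields a batch at a time; returns the number of batches."""
    batches = 0
    while big_hash:
        # Take one page of field names (the HSCAN step).
        fields = list(itertools.islice(big_hash, batch_size))
        # Delete just those fields (the HDEL step).
        for field in fields:
            del big_hash[field]
        batches += 1
    return batches


big = {f"field:{i}": i for i in range(5000)}
print(delete_hash_in_batches(big), len(big))
```

Each batch is a short operation, so other clients' commands can run between batches instead of waiting behind one long DEL.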
