Redis Deep Dive: Pipelines, Pub/Sub, Persistence, Locks & Cluster
This comprehensive guide explores Redis fundamentals and advanced features, covering pipelines for reduced RTT, publish/subscribe messaging, key expiration strategies, transaction behavior, persistence mechanisms (RDB, AOF, hybrid), distributed locking techniques, sentinel high‑availability, and cluster sharding, with practical code examples and diagrams.
Redis Basics
If you are not familiar with Redis, you can first read this article https://www.cnblogs.com/wugongzi/p/12841273.html which introduces what Redis is and how to use it.
Redis Pipeline
Normally a client sends a command, the command is queued, Redis executes it, and the result is returned. The time this request/response cycle takes is the round-trip time (RTT). Executing N commands this way costs N RTTs and N network I/O exchanges, which is inefficient.
Therefore Redis pipeline was introduced. With a pipeline, the server can process a new request even if the previous request has not been responded to yet. This allows multiple commands to be sent to the server without waiting for replies, and the replies are read in a single step, reducing RTT and improving efficiency.
Important note: While a client sends commands through a pipeline, the server has to queue the replies in memory. If you need to send a large number of commands, send them in reasonably sized batches (for example, send 10K commands, read the replies, then send the next 10K). The speed is almost the same, but the extra memory used is bounded by what is needed to queue the replies for those 10K commands.
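The batching advice can be sketched without a live server. The following is a minimal illustration, assuming a hypothetical send_batch callable that ships one batch in a single round trip and returns its replies; FakeServer is a stand-in invented for this example, not a Redis client.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(commands: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size commands."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(commands)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_in_batches(
    commands: Iterable[str],
    send_batch: Callable[[List[str]], List[str]],
    batch_size: int = 10_000,
) -> List[str]:
    """Send commands one batch per round trip and collect every reply."""
    replies: List[str] = []
    for batch in batched(commands, batch_size):
        # Replies for this batch are read before the next batch is sent,
        # so at most batch_size replies are pending at any time.
        replies.extend(send_batch(batch))
    return replies


class FakeServer:
    """Hypothetical stand-in that counts round trips and answers 'OK'."""

    def __init__(self) -> None:
        self.round_trips = 0

    def send_batch(self, batch: List[str]) -> List[str]:
        self.round_trips += 1
        return ["OK"] * len(batch)
```

With a real client library, send_batch would typically wrap that library's pipeline object; the chunking logic stays the same.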
Redis Publish/Subscribe
Publish/Subscribe is a messaging pattern where the sender (publisher) sends messages and the subscriber receives messages.
As shown in the figure, Pub/Sub is based on channels. A channel can have multiple subscribers and multiple publishers. Any publisher can publish a message to the channel, and all subscribers will receive the message.
Pub/Sub Demo
I started 4 Redis clients on the server: 2 subscribers and 2 publishers. The subscribers subscribe to channel channel01.
Step 1: Publisher 1 sends message "wugongzi" to channel01.
Subscriber 1 receives the message:
Subscriber 2 receives the message:
Step 2: Publisher 2 sends message "hello-redis" to the channel.
Subscriber 1 receives the message:
Subscriber 2 receives the message:
A client can also subscribe to several channels at once (for example, SUBSCRIBE channel01 channel02).
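The fan-out behaviour of the demo can be modelled in a few lines. This is only an in-memory sketch of the channel semantics (ToyBroker and its methods are hypothetical names, not a Redis API): every current subscriber of a channel gets the message, and a message published to a channel nobody listens to is simply dropped.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class ToyBroker:
    """In-memory model of channel fan-out; not a Redis client."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[str]] = defaultdict(list)
        self.inboxes: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def subscribe(self, client: str, *channels: str) -> None:
        """Register a client on one or more channels."""
        for channel in channels:
            if client not in self._subscribers[channel]:
                self._subscribers[channel].append(client)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to current subscribers and return how many received it."""
        receivers = self._subscribers.get(channel, [])
        for client in receivers:
            self.inboxes[client].append((channel, message))
        return len(receivers)
```

As with the real PUBLISH command, the return value is the number of clients that received the message.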
Redis Expiration Strategy
Expiration Time Usage
Redis can set an expiration time for each key; when the expiration time is reached, Redis automatically deletes the key.
In production, keys used as cache entries should generally be given an expiration time. If none is set, memory will eventually fill up with cold data.
Command to set expiration: EXPIRE key seconds
Note: overwriting a key with SET (without KEEPTTL) discards its expiration time, so the TTL must be set again or the key becomes permanent. Commands that only modify the existing value, such as INCR or LPUSH, leave the TTL untouched. Example usage:
Expiration Deletion Strategies
Redis keys can expire in two ways: passive and active.
Passive: when a client tries to access an expired key, Redis discovers it is expired and deletes it.
Active: Redis periodically checks and deletes expired keys because some keys may never be accessed.
Redis performs the following steps ten times per second:
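Test 20 random keys from the set of keys that have an expiration set.
Delete all sampled keys that have expired.
If more than 25% of the sampled keys were expired, repeat step 1.
This probabilistic algorithm assumes the sample is representative of the whole key space: it keeps expiring until the share of expired keys in the sample falls to 25% or below, so at any moment the expired keys still occupying memory should be roughly a quarter of the keys with a TTL at most.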
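The sampling loop can be simulated as follows. This is a simplified sketch over a plain dictionary of expiry timestamps, using the sample size and threshold quoted above; the function name and data layout are assumptions for illustration, and the real server additionally limits how long each cycle may run.

```python
import random
from typing import Dict, Optional

SAMPLE_SIZE = 20          # keys tested per round, as described above
REPEAT_THRESHOLD = 0.25   # repeat while more than 25% of the sample expired


def active_expire_cycle(
    expires_at: Dict[str, float],
    now: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Run one cycle over a {key: expiry timestamp} map; return deletions."""
    rng = rng or random.Random()
    deleted = 0
    while expires_at:
        sample = rng.sample(list(expires_at), min(SAMPLE_SIZE, len(expires_at)))
        expired = [key for key in sample if expires_at[key] <= now]
        for key in expired:
            del expires_at[key]
        deleted += len(expired)
        # Stop once at most a quarter of the sampled keys were expired.
        if len(expired) <= REPEAT_THRESHOLD * len(sample):
            break
    return deleted
```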
Redis Transactions
Basic Transaction Usage
Redis transactions can execute multiple commands atomically. Characteristics:
All commands are serialized and executed in order without interruption.
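Once EXEC is called, every queued command is run; if the transaction is discarded or aborted before EXEC, none are run. A command that fails during execution does not undo the others (see Transaction Errors).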
Transactions are implemented with the commands MULTI, EXEC, DISCARD, and WATCH. MULTI starts a transaction, EXEC commits it, DISCARD aborts it, and WATCH provides check‑and‑set behavior.
Transaction Errors
Redis transaction errors fall into two categories.
1) Error before transaction commit (error occurs while queuing commands). Example:
The incorrect INCR command was not queued, and the transaction failed; k1 and k2 have no values.
2) Error after transaction commit (error occurs while executing commands). Example:
In this case the transaction still commits, demonstrating that Redis transactions do not support rollback.
Why Redis Does Not Support Rollback
Unlike relational databases, Redis continues executing remaining commands after a failure. This design has advantages:
Failures are usually due to programming errors that should be caught during development, not in production.
Not needing rollback keeps Redis simple and fast.
Rollback cannot fix logical errors such as incrementing by the wrong amount or using the wrong key type.
Discard Transaction
When DISCARD is executed, the transaction is aborted, the command queue is cleared, and the client exits the transaction state.
WATCH Command Usage
WATCH makes EXEC conditional: the transaction only executes if none of the watched keys have been modified. If any watched key changes, the transaction is not executed.
In the demo, WATCH detected that k1 was modified before EXEC, so the transaction was not committed.
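The check-and-set behaviour of WATCH can be modelled with per-key version counters. The sketch below is an in-memory illustration only; ToyStore and its method names are hypothetical, and a real server tracks watched keys per client connection rather than through an explicit dictionary.

```python
from typing import Callable, Dict, List, Optional


class ToyStore:
    """In-memory model of WATCH/MULTI/EXEC using per-key version counters."""

    def __init__(self) -> None:
        self.data: Dict[str, int] = {}
        self._versions: Dict[str, int] = {}

    def set(self, key: str, value: int) -> None:
        self.data[key] = value
        self._versions[key] = self._versions.get(key, 0) + 1

    def watch(self, *keys: str) -> Dict[str, int]:
        """Remember the current version of each watched key."""
        return {key: self._versions.get(key, 0) for key in keys}

    def execute(
        self,
        watched: Dict[str, int],
        queued: List[Callable[["ToyStore"], None]],
    ) -> Optional[List[None]]:
        """Run the queue only if no watched key changed; None means aborted."""
        for key, version in watched.items():
            if self._versions.get(key, 0) != version:
                return None
        return [command(self) for command in queued]
```

A typical caller retries the whole watch, read, queue, execute sequence when execute reports an abort.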
Redis Scripts and Transactions
From a definition standpoint, a Redis script is itself a transaction; anything that can be done in a transaction can be done in a script, often more simply and faster.
Scripts were introduced in Redis 2.6, while transactions existed earlier, so both mechanisms coexist.
Redis Persistence
Why Persistence Is Needed
Redis is an in‑memory database designed for high performance. Unlike MySQL, which stores its data on disk, Redis keeps data in memory, so without persistence the data is lost when the server crashes or restarts. Persistence mechanisms write the data to disk so it can be recovered.
Persistence Overview
Redis provides two persistence mechanisms: RDB (snapshot) and AOF (append‑only file), suitable for different scenarios.
RDB creates point‑in‑time snapshots at configured intervals.
AOF logs every write operation; on restart, Redis replays the log. AOF can be rewritten in the background to keep file size manageable.
RDB
RDB persistence works by taking snapshots at specified intervals (e.g., at 8 am). Trigger methods include:
Executing SAVE (blocks Redis; not recommended for normal operation).
Executing BGSAVE (forks a child process; Redis continues serving requests).
Configuring save rules in the configuration file (e.g., save 900 1, save 300 10, save 60 10000).
During RDB, Redis forks a child process to write the snapshot, allowing the parent to keep serving requests.
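Interview question example: If an RDB snapshot starts at 8 am and takes 2 minutes, and 100 keys are modified during the snapshot, does the RDB file contain the state at 8 am or the modified state? The answer is that the snapshot reflects the state at the moment the child process was forked (8 am). This works because the fork gives the child a copy-on-write view of memory: pages the parent modifies afterwards are copied, so the child keeps seeing the original data.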
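The interview answer can be reproduced with a plain fork on a POSIX system. This is a small sketch, not how Redis serializes data: the function name is hypothetical, JSON stands in for the RDB format, and a pipe is used to make sure the parent's writes happen before the child serializes its view.

```python
import json
import os
from typing import Callable, Dict


def snapshot_during_writes(
    data: Dict[str, int], mutate: Callable[[Dict[str, int]], None]
) -> Dict[str, int]:
    """Fork, mutate data in the parent, then return what the child saw."""
    result_r, result_w = os.pipe()   # child -> parent: serialized snapshot
    ready_r, ready_w = os.pipe()     # parent -> child: "writes are done"
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a copy of the parent's at fork time.
        try:
            os.close(result_r)
            os.close(ready_w)
            os.read(ready_r, 1)      # wait until the parent has mutated
            os.write(result_w, json.dumps(data).encode())
        finally:
            os._exit(0)
    # Parent: keep "serving writes" while the child holds its snapshot.
    os.close(result_w)
    os.close(ready_r)
    mutate(data)
    os.write(ready_w, b"x")
    os.close(ready_w)
    chunks = []
    while True:
        chunk = os.read(result_r, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(result_r)
    os.waitpid(pid, 0)
    return json.loads(b"".join(chunks))
```

The returned dictionary holds the values as they were at fork time, while the parent's dictionary contains the later writes.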
RDB Advantages
Compact file; suitable for backups.
Forked child handles persistence, maximizing Redis performance.
Faster recovery for large datasets compared to AOF.
RDB Disadvantages
If the backup interval is long, data loss can be significant (e.g., a crash half an hour after the last snapshot).
AOF
AOF persistence logs every write operation. On restart, Redis replays the log to reconstruct the dataset. AOF can be rewritten to reduce file size.
AOF Configuration
# Whether to enable AOF
appendonly no
# AOF file name
appendfilename "appendonly.aof"
# AOF sync policy
#appendfsync always # write to disk after every command (slow)
appendfsync everysec # default: sync every second
#appendfsync no # let OS handle sync (may lose up to 30 seconds)
# Whether to sync during rewrite
no-appendfsync-on-rewrite no
# Rewrite trigger policy
auto-aof-rewrite-percentage 100 # percentage to trigger rewrite (0 disables auto rewrite)
auto-aof-rewrite-min-size 64mb # minimum file size to trigger rewrite
# How to handle truncated AOF on load
aof-load-truncated yes
# Use RDB preamble in AOF
#aof-use-rdb-preamble yes
AOF File Writing
AOF writes commands to a buffer; when the buffer is flushed, the data is written to disk.
appendfsync always # write to disk after every command (slow)
appendfsync everysec # default: sync every second
appendfsync no # OS handles sync; up to 30 seconds loss
AOF Rewrite
Over time, the AOF file grows. Redis rewrites it to a compact form, keeping only the latest state for each key.
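The outcome of a rewrite can be modelled as keeping only the last write per key. The sketch below assumes a log made solely of SET commands, like the example in this section; compact_set_log is a hypothetical helper. Redis itself builds the new file from the in-memory dataset rather than by reading the old log, so this illustrates the result, not the mechanism.

```python
from typing import Dict, Iterable, List


def compact_set_log(log: Iterable[str]) -> List[str]:
    """Keep only the last SET per key, ordered by each key's final write."""
    latest: Dict[str, str] = {}
    for line in log:
        command, key, value = line.split()
        if command.lower() != "set":
            raise ValueError(f"unsupported command in this sketch: {command}")
        latest.pop(key, None)   # re-insert so the key moves to the end
        latest[key] = value
    return [f"set {key} {value}" for key, value in latest.items()]
```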
-- before rewrite
set k1 20
set k2 40
set k1 35
set k3 34
set k2 19
-- after rewrite (final values only)
set k1 35
set k3 34
set k2 19
Hybrid Persistence
Since Redis 4.0, hybrid persistence combines AOF and RDB. The aof-use-rdb-preamble option (default no in Redis 4.0, yes since Redis 5.0) controls whether the AOF file starts with an RDB snapshot followed by incremental AOF commands. no: pure AOF format. yes: RDB preamble + AOF tail.
Hybrid persistence benefits:
Significantly reduces AOF file size.
Speeds up recovery because the RDB part loads quickly.
AOF Recovery
Two modes:
Pure AOF: replay all commands.
RDB+AOF: load RDB snapshot first, then replay remaining AOF commands.
AOF Advantages
Better real‑time durability; less data loss.
Hybrid persistence keeps file size under control and improves load speed.
AOF Disadvantages
AOF files are usually larger than RDB files for the same data.
Depending on sync policy, AOF can be slightly slower than RDB.
Recovery speed is slower than RDB.
Redis Distributed Lock
Introduction
In Java, synchronized or Lock solves concurrency on a single node, but fails in a distributed environment. Distributed locks (e.g., for red‑packet or flash‑sale scenarios) are needed.
Lock Characteristics
Mutual exclusion: only one client can hold the lock at a time.
Safety: only the client that acquired the lock can release it.
High availability and performance: lock acquisition and release consume little time.
Lock timeout: if a client crashes, the lock should auto‑expire.
Re‑entrancy: a client can reacquire the lock while holding it.
Solution 1: SETNX
Use SETNX key value. Returns 1 on success, 0 on failure.
// Assumes a connected Jedis client named jedis.
if (jedis.setnx("k1", "v1") == 1) {
    try {
        // business logic
    } finally {
        jedis.del("k1"); // release the lock
    }
}
This approach fails if the client crashes after acquiring the lock, because the lock is then never released.
Solution 2: SETNX + EXPIRE
Immediately set an expiration after acquiring the lock.
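// Assumes a connected Jedis client named jedis.
if (jedis.setnx("k1", "v1") == 1) {
    jedis.expire("k1", 10); // a crash before this line leaves the lock without a TTL
    try {
        // business logic
    } finally {
        jedis.del("k1");
    }
}
The SETNX and EXPIRE commands are two separate calls and therefore not atomic; a crash between them can leave the lock without an expiration.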
Solution 3: Lua Script for Atomic SETNX+EXPIRE
if redis.call('setnx', KEYS[1], ARGV[1]) == 1 then
redis.call('expire', KEYS[1], ARGV[2])
return 1
else
return 0
end
Even with an atomic script, race conditions can still occur when the lock expires before the business logic finishes.
Solution 4: SET EX PX NX with Unique Value
Use SET key value NX EX seconds (or PX for milliseconds) to set the lock atomically with a unique token.
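// Jedis 2.x signature shown; SET returns "OK" on success and null otherwise.
if ("OK".equals(jedis.set(resource_name, random_value, "NX", "EX", 100))) {
    try {
        // business logic
    } finally {
        // Not atomic: the lock can expire between GET and DEL.
        if (random_value.equals(jedis.get(resource_name))) {
            jedis.del(resource_name);
        }
    }
}
Deletion must verify the token; otherwise a client may delete a lock that now belongs to another client. Because the GET and DEL above are two separate commands, the check and the delete should be done atomically with a Lua script: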
if redis.call('get', KEYS[1]) == ARGV[1] then
return redis.call('del', KEYS[1])
else
return 0
end
However, the lock may still expire while the business logic is running.
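Why the token matters is easiest to see with a fake clock. The following in-memory model (ToyLockServer is a hypothetical name, and the clock is advanced by hand) mimics acquiring with NX plus a TTL and releasing with a compare-and-delete: a client whose lock has already expired cannot remove the lock that a second client acquired in the meantime.

```python
from typing import Dict, Optional, Tuple


class ToyLockServer:
    """In-memory model of SET NX EX plus compare-and-delete release."""

    def __init__(self) -> None:
        self.now = 0.0  # fake clock, advanced manually by the caller
        self._locks: Dict[str, Tuple[str, float]] = {}  # key -> (token, expiry)

    def _holder(self, key: str) -> Optional[str]:
        entry = self._locks.get(key)
        if entry is None or entry[1] <= self.now:
            self._locks.pop(key, None)  # expired locks disappear
            return None
        return entry[0]

    def acquire(self, key: str, token: str, ttl: float) -> bool:
        """Succeed only if the key is currently free (NX semantics)."""
        if self._holder(key) is not None:
            return False
        self._locks[key] = (token, self.now + ttl)
        return True

    def release(self, key: str, token: str) -> bool:
        """Delete the lock only if the caller's token still owns it."""
        if self._holder(key) != token:
            return False
        del self._locks[key]
        return True
```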
Redisson Framework
Redisson continuously extends the lock expiration while the protected method is executing, preventing premature expiration.
Redlock Algorithm
For deployments with multiple independent Redis masters, Redis author antirez proposed the Redlock algorithm. The process:
Get current Unix time in milliseconds.
Try to acquire the lock on all N Redis instances in sequence, using the same key and random value, with a client timeout that is short compared to the lock TTL (for example, 5‑50 ms).
If a majority (N/2+1) of instances grant the lock and the total acquisition time is less than the lock TTL, the lock is considered acquired.
The effective lock time is reduced by the acquisition duration.
If acquisition fails, release the lock on all instances.
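Redlock aims to keep the lock safe even if a minority of instances fail, although its guarantees under clock drift and long process pauses have been debated.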
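The acquisition decision in the steps above boils down to a small calculation. This is a sketch under the stated rules only (the function name and parameters are assumptions); the published algorithm also subtracts a small allowance for clock drift, which is left out here.

```python
from typing import Optional


def redlock_validity_ms(
    grants: int, instances: int, ttl_ms: int, elapsed_ms: int
) -> Optional[int]:
    """Return remaining validity in ms if the lock counts as acquired, else None."""
    quorum = instances // 2 + 1
    if grants < quorum or elapsed_ms >= ttl_ms:
        return None  # the caller should now release the lock on all instances
    return ttl_ms - elapsed_ms
```

For example, with 5 instances, 3 grants, a 10,000 ms TTL and 40 ms spent acquiring, the lock is held with 9,960 ms of validity left.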
Redis Cluster
Three Cluster Modes
In production, Redis is usually deployed in cluster mode because a single node has limited stability and storage.
Master‑Slave replication
Sentinel mode
Cluster mode
Master‑Slave Replication
One master handles writes; multiple slaves handle reads, achieving read‑write separation.
Replication process:
Slave connects to master and sends SYNC (PSYNC since Redis 2.8, which also allows partial resynchronization).
Master runs BGSAVE to create an RDB snapshot and buffers subsequent writes.
Master sends the snapshot to slaves while continuing to record writes.
Slaves load the snapshot, discard old data, and then apply buffered writes.
After initialization, each write on master is propagated to slaves.
Advantages
Supports read‑write separation.
Slaves can also serve other slaves, reducing master load.
Master continues serving clients during replication.
Disadvantages
No automatic failover; if master fails, writes are unavailable.
Manual recovery required.
Storage capacity limited to a single node's memory.
Sentinel Mode
Sentinel monitors master and slaves, performs automatic failover, and provides clients with the current master address.
Functions:
Monitoring: constantly checks health of master and slaves.
Automatic failover: promotes a slave to master when the current master fails.
Configuration provider: clients obtain master address from sentinel.
Notification: sentinel notifies clients of failover results.
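Sentinel determines down status via subjective and objective checks. A sentinel marks a master subjectively down when it gets no valid reply within the configured timeout; the master becomes objectively down once the number of sentinels that agree reaches the quorum configured for that master.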
During failover, a sentinel is elected as leader using a Raft‑like algorithm (majority votes). The leader selects a new master based on health, priority, and replication offset.
Sentinel Advantages
Provides high availability through monitoring and automatic failover.
Retains all benefits of master‑slave replication.
Sentinel Disadvantages
Storage capacity still limited to a single node's memory.
Cluster Mode
Cluster mode distributes data across multiple nodes using 16384 hash slots. Each key is mapped to a slot with CRC16(key) mod 16384; each node owns a subset of slots.
Example with three nodes:
Node A: slots 0‑5500
Node B: slots 5501‑11000
Node C: slots 11001‑16383
Adding or removing nodes involves moving slots without downtime.
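The slot calculation can be reproduced with the standard library. This sketch assumes the three-node layout from the example (slots are numbered 0 to 16383) and uses hypothetical helper names; it also handles hash tags, a Redis Cluster feature not covered above, where only the text inside the first {...} is hashed so that related keys land in the same slot.

```python
import binascii

SLOT_COUNT = 16384
# Slot ranges from the three-node example, with inclusive bounds.
NODE_RANGES = {"A": (0, 5500), "B": (5501, 11000), "C": (11001, 16383)}


def key_hash_slot(key: str) -> int:
    """Compute CRC16(key) mod 16384, honouring {hash tags}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end > start + 1:          # non-empty tag: hash only the tag
            raw = raw[start + 1:end]
    # crc_hqx with initial value 0 is the CRC16-CCITT (XMODEM) variant.
    return binascii.crc_hqx(raw, 0) % SLOT_COUNT


def node_for_key(key: str) -> str:
    """Look up which example node owns the key's slot."""
    slot = key_hash_slot(key)
    for node, (low, high) in NODE_RANGES.items():
        if low <= slot <= high:
            return node
    raise AssertionError("slot ranges must cover 0..16383")
```

On a running cluster, the result can be compared against the CLUSTER KEYSLOT command.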
Redis Cluster Practical Demo
Environment: VMware VM, CentOS 7, Redis 6.0.6.
Master‑Slave Demo
Configuration files for master (6379) and slaves (6380, 6381) include replicaof 127.0.0.1 6379. Start servers, connect with redis-cli -p 6379, etc. Data written to master is replicated to slaves, as shown in screenshots.
Sentinel Demo
Three sentinel instances (26379‑26381) monitor the master. After shutting down the master, sentinel promotes slave 6380 to master. Restarting the original master makes it a slave of the new master.
Cluster Demo
Six nodes (7001‑7006) are configured with cluster-enabled yes and appropriate ports. After starting all nodes, the cluster is created with:
redis-cli --cluster create 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006 --cluster-replicas 1
The resulting cluster has three masters and three slaves, with each master owning a range of hash slots.
Using redis-cli -p 7001 -c, keys are automatically routed to the appropriate node based on hash slots.
Redis Caching Issues
Redis is often used as a cache to alleviate database bottlenecks, but several problems can arise.
Cache penetration
Cache breakdown (stampede)
Cache avalanche
Cache pollution
Cache Penetration
Definition: Queries that miss both cache and database (e.g., malicious requests for non‑existent users). Solutions:
Gateway validation, authentication, blacklists, rate limiting.
Cache empty results with a short TTL (e.g., 60 s).
Use Bloom filters to filter out obviously invalid requests.
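A Bloom filter for this purpose can be sketched in pure Python. This is a toy version with assumed sizes and a SHA-256 based hashing scheme chosen for brevity; production setups usually rely on a tuned library or a Redis module. The useful property is that a negative answer is definite, so requests for IDs that were never added can be rejected before touching the cache or database.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 8192, hash_count: int = 4) -> None:
        self.size_bits = size_bits
        self.hash_count = hash_count
        self._bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive hash_count bit positions by salting the item with a seed.
        for seed in range(self.hash_count):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```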
Cache Breakdown
Definition: When a hot key expires, many concurrent requests hit the database simultaneously. Solutions:
Rate limiting and circuit breaking.
Locking: the first request loads the data and populates the cache; others wait for the lock.
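The locking approach can be illustrated inside a single process. The sketch below uses a thread lock and a hypothetical loader callable standing in for the database query; across several application servers the same pattern would need a distributed lock instead.

```python
import threading
from typing import Callable, Dict


class SingleFlightCache:
    """Cache whose misses are loaded by one thread while the others wait."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader          # hypothetical database lookup
        self._cache: Dict[str, str] = {}
        self._lock = threading.Lock()  # one global lock keeps the sketch short

    def get(self, key: str) -> str:
        value = self._cache.get(key)
        if value is not None:
            return value
        with self._lock:
            # Re-check: another thread may have filled the cache meanwhile.
            value = self._cache.get(key)
            if value is None:
                value = self._loader(key)
                self._cache[key] = value
            return value
```

The second lookup inside the lock is what prevents every waiting thread from repeating the database query once it gets its turn.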
Cache Avalanche
Definition: A large number of keys expire at the same time, overwhelming the database. Solutions:
Set varied expiration times for keys.
Distribute hot data across multiple cache nodes.
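Varying expiration times usually means adding a random offset to a base TTL. A minimal sketch, with a hypothetical helper name and illustrative numbers:

```python
import random
from typing import Optional


def ttl_with_jitter(
    base_seconds: int,
    max_jitter_seconds: int,
    rng: Optional[random.Random] = None,
) -> int:
    """Return the base TTL plus a random offset in [0, max_jitter_seconds]."""
    rng = rng or random.Random()
    return base_seconds + rng.randint(0, max_jitter_seconds)
```

For instance, ttl_with_jitter(3600, 300) spreads expirations over a five-minute window instead of letting every key written in the same batch expire at the same second.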
Cache Pollution
Definition: Stale keys without expiration accumulate, consuming memory. Solutions:
Set appropriate TTLs for cached entries.
Use LRU (least recently used) eviction policy to remove old data.
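The idea behind LRU eviction can be shown with an ordered dictionary. This is an exact LRU written for illustration (the class name and capacity parameter are assumptions); Redis approximates LRU by sampling keys under policies such as allkeys-lru rather than keeping a full ordering.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)      # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```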
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
