Hidden Redis Pitfalls: Why Keys Lose Expiration, Commands Block, and Replication Fails
This article reveals the most common Redis pitfalls—including unexpected key expiration loss, O(1) commands that block or cause OOM, master‑slave synchronization quirks, and persistence bugs—while providing practical guidance to avoid each issue and keep your cache performant and reliable.
Hello, I'm San You. In this article I will discuss the "pitfalls" you may encounter when using Redis and how to avoid them.
Typical symptoms include keys that should expire but never do, SETBIT causing OOM, RANDOMKEY blocking, and master‑slave queries returning different results.
Common Command Pitfalls
Below are the most surprising behaviours of everyday Redis commands.
1) Expiration unexpectedly disappears
The SET command can erase a key's TTL if you omit the expiration parameter when updating the value.
127.0.0.1:6379> SET testkey val1 EX 60
OK
127.0.0.1:6379> TTL testkey
(integer) 59
Updating the key without an expiration removes the TTL:
127.0.0.1:6379> SET testkey val2
OK
127.0.0.1:6379> TTL testkey
(integer) -1
A TTL of -1 means the key never expires. As a result the key becomes permanent and memory usage grows.
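One way to keep the TTL on update is to read the remaining time and pass it back with the write. The following is a minimal sketch, assuming a redis-py style client that exposes pttl() and set(..., px=...); the helper name set_preserving_ttl is hypothetical.

```python
def set_preserving_ttl(client, key, value):
    """Overwrite `key` while keeping whatever TTL it currently has.

    Assumption: `client` offers redis-py style pttl() and set().
    Not atomic: the key may expire between the two calls.
    """
    remaining_ms = client.pttl(key)  # -2: key missing, -1: no TTL, else ms left
    if remaining_ms > 0:
        # Re-apply the remaining lifetime together with the new value.
        return client.set(key, value, px=remaining_ms)
    # No TTL to preserve: a plain SET is enough.
    return client.set(key, value)
```

On Redis 6.0 or later, SET with the KEEPTTL option achieves the same result in a single atomic command, so this two-step helper is only worth considering on older servers.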
2) DEL can block Redis
Deleting a key is not always O(1). For non‑String types (List, Hash, Set, ZSet) the time complexity is O(M), where M is the number of elements.
Redis must free each element's memory, which can block the server when the key is large.
String key: O(1)
List/Hash/Set/ZSet key: O(M)
Best practice: check the element count with LLEN/HLEN/SCARD/ZCARD, then delete in batches using LRANGE/HSCAN/SSCAN/ZSCAN combined with LPOP/RPOP/HDEL/SREM/ZREM.
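The batch deletion described above can be sketched for a Hash as follows. This assumes a redis-py style client with hscan(), hdel() and delete(); the function name delete_big_hash and the batch size are illustrative only.

```python
def delete_big_hash(client, key, batch_size=100):
    """Remove a large Hash a few fields at a time instead of one big DEL.

    Assumption: `client` offers redis-py style hscan(), hdel() and delete().
    Returns the number of fields removed.
    """
    cursor = 0
    removed = 0
    while True:
        # HSCAN returns the next cursor plus a dict of field -> value.
        cursor, fields = client.hscan(key, cursor=cursor, count=batch_size)
        if fields:
            # Iterating a dict yields its keys, i.e. the field names.
            removed += client.hdel(key, *fields)
        if cursor == 0:  # a cursor of 0 means the scan is complete
            break
    # The key is now empty or already gone, so this final DEL is cheap.
    client.delete(key)
    return removed
```

The same loop shape should work for Sets and ZSets with SSCAN/SREM and ZSCAN/ZREM. Each round trip touches only a bounded number of elements, so other clients get served between batches.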
3) RANDOMKEY may block Redis
RANDOMKEY picks a key at random and then checks whether it has expired; if it has, the command picks again. If many keys are expired but not yet cleaned up, the command may loop for a long time, especially on a slave, which does not delete expired keys itself.
On a slave this can turn into an infinite loop that blocks the whole instance.
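To make the retry loop concrete, here is a simplified Python model of the behaviour described above. It is not the Redis source: the dictionary layout, the function name random_key and the max_tries safety cap are all assumptions made for illustration.

```python
import random


def random_key(db, now, is_replica, max_tries=100, rng=random):
    """Simplified model of a RANDOMKEY-style retry loop.

    `db` maps key -> expire-at timestamp, or None for keys without a TTL.
    Returns (key, attempts); key is None when the database ends up empty.
    """
    attempts = 0
    while db:
        attempts += 1
        key = rng.choice(list(db))
        expire_at = db[key]
        expired = expire_at is not None and expire_at <= now
        if not expired:
            return key, attempts
        if is_replica:
            # A replica keeps expired keys until the master removes them,
            # so without a cap this loop never ends when all keys are expired.
            if attempts >= max_tries:
                return key, attempts
        else:
            # The master deletes the expired key, so the loop always makes progress.
            del db[key]
    return None, attempts
```

With every key expired, the master branch empties the database and returns, while the replica branch keeps drawing the same expired keys and only stops because of the hypothetical cap.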
4) SETBIT O(1) can cause OOM
When setting a very large offset on a non‑existent or small key, Redis must allocate a much larger bitmap, which may exhaust memory.
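A quick calculation shows how fast the allocation grows with the offset. The sketch below assumes the bitmap is stored as a plain byte string that must be long enough to contain the addressed bit; the function name bitmap_bytes is hypothetical.

```python
def bitmap_bytes(offset):
    """Bytes a bitmap needs so that bit number `offset` exists (8 bits per byte)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return offset // 8 + 1


# A small offset costs almost nothing...
small = bitmap_bytes(10)            # 2 bytes
# ...but the largest offset SETBIT accepts (2**32 - 1) needs 512 MiB.
largest = bitmap_bytes(2**32 - 1)   # 536870912 bytes
```

A single SETBIT with a user-supplied offset can therefore allocate hundreds of megabytes at once, so it is worth validating offsets before they reach Redis.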
127.0.0.1:6379> SETBIT testkey 10 1
(integer) 1
127.0.0.1:6379> GETBIT testkey 10
(integer) 1
5) MONITOR can trigger OOM under high QPS
MONITOR writes every executed command to the monitoring client's output buffer. With high QPS the buffer can grow faster than the client drains it, and if memory is tight it keeps growing until Redis runs out of memory.
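A back-of-the-envelope estimate helps judge how quickly that buffer can grow. This is a rough model under stated assumptions, not a measurement: the function name and all four parameters are hypothetical, and real buffer growth also depends on client and network behaviour.

```python
def monitor_backlog_bytes(qps, avg_line_bytes, drain_bytes_per_sec, seconds):
    """Estimate the output-buffer backlog of a MONITOR client.

    qps                 -- commands executed per second
    avg_line_bytes      -- assumed average size of one MONITOR line
    drain_bytes_per_sec -- how fast the client actually reads
    seconds             -- how long MONITOR stays attached
    """
    produced_per_sec = qps * avg_line_bytes
    # The buffer only grows when output is produced faster than it is read.
    growth_per_sec = max(0, produced_per_sec - drain_bytes_per_sec)
    return growth_per_sec * seconds


# 100k QPS at ~100 bytes per line, client reading 2 MB/s, attached for one minute.
backlog = monitor_backlog_bytes(100_000, 100, 2_000_000, 60)
```

Under these example numbers the backlog reaches about 480 MB within a minute, which is why MONITOR is best kept to short debugging sessions on busy instances.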
Data Persistence Pitfalls
Redis offers RDB snapshots and AOF logs. Misconfiguration can cause data loss.
1) Master crash leads to total data loss
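If the master has persistence disabled and a process supervisor restarts it right after a crash, it comes back with an empty dataset. The slave then resynchronizes from the empty master and discards its own data to stay consistent, resulting in complete data loss and a cache avalanche. To avoid this: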
Do not let the process manager auto‑restart the master.
Use Sentinel to promote the slave after a master failure.
Restart the original master as a slave after promotion.
2) AOF everysec does not guarantee 1‑second safety
With appendfsync everysec, fsync runs in a background thread. When that fsync is stalled by heavy disk I/O, the main thread postpones its next AOF write for up to 2 seconds before writing anyway, so a crash can lose up to 2 seconds of data.
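The postponement rule can be sketched as a small decision function. This is a simplified model of the behaviour described above, not the actual Redis code; the function name and the way state is passed around are assumptions.

```python
POSTPONE_LIMIT_SECONDS = 2


def should_write_now(now, fsync_in_progress, postponed_since):
    """Decide whether the main thread writes to the AOF on this cycle.

    Returns (write_now, new_postponed_since).
    `postponed_since` is None when no write is currently being postponed.
    """
    if not fsync_in_progress:
        # Disk is keeping up: write immediately and clear any postponement.
        return True, None
    if postponed_since is None:
        # First time we hit a busy fsync: start the postponement clock.
        return False, now
    if now - postponed_since < POSTPONE_LIMIT_SECONDS:
        # Still within the grace period: keep the data in memory only.
        return False, postponed_since
    # Waited the full limit: write anyway, even though this may block.
    return True, None
```

While the function keeps returning False, accepted commands exist only in memory, which is where the window of up to 2 seconds comes from.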
3) RDB/AOF rewrite may cause OOM
During snapshot or AOF rewrite Redis forks a child process. Writes continue in the parent using copy‑on‑write, which can double memory usage. On machines with limited RAM this can trigger OOM.
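A rough sizing rule follows from how copy-on-write works: only memory that the parent modifies while the child is alive gets duplicated. The sketch below is an approximation under that assumption; the function name and the modified_fraction parameter are hypothetical, and real usage is counted in whole memory pages.

```python
def peak_memory_during_fork(dataset_bytes, modified_fraction):
    """Approximate peak memory while a fork-based snapshot or rewrite runs.

    modified_fraction -- share of the dataset (0.0 to 1.0) that the parent
                         rewrites before the child process finishes
    """
    if not 0.0 <= modified_fraction <= 1.0:
        raise ValueError("modified_fraction must be between 0.0 and 1.0")
    # Untouched memory stays shared; modified memory is copied once.
    return dataset_bytes + int(dataset_bytes * modified_fraction)


GIB = 2**30
read_mostly = peak_memory_during_fork(8 * GIB, 0.25)  # 10 GiB
worst_case = peak_memory_during_fork(8 * GIB, 1.0)    # 16 GiB
```

A write-heavy instance should therefore be provisioned with headroom close to the worst case, while a read-mostly cache needs much less.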
Replication Pitfalls
Redis replication is asynchronous, which introduces several consistency issues.
1) Data loss on master failure
If the master crashes before recent writes have been replicated, those writes never reach the slave and are lost after failover. This is acceptable for cache‑only use cases but not when Redis is used as a data store or for distributed locks.
2) Inconsistent query results between master and slave
Before Redis 3.2, slaves returned values for expired keys. From Redis 3.2 until 4.0.11, slaves returned NULL for expired keys on data queries, but the EXISTS command was missed and still reported the key as existing. Redis 4.0.11 finally fixed this.
3) Clock skew between master and slave
Expiration is evaluated using each server's local clock. If the slave's clock runs faster, it may consider keys expired earlier, causing NULL responses while the master still returns values.
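The effect of clock skew can be illustrated with a tiny model. It assumes the expiration is stored as an absolute timestamp that each server compares with its own clock; the function name and the millisecond values are made up for the example.

```python
def is_expired(expire_at_ms, local_clock_ms):
    """A key counts as expired once the local clock has passed its deadline."""
    return local_clock_ms > expire_at_ms


expire_at_ms = 10_000         # absolute deadline shared by master and slave
master_clock_ms = 9_500       # master: 500 ms of lifetime left
slave_skew_ms = 800           # assumed: the slave's clock runs 800 ms ahead
slave_clock_ms = master_clock_ms + slave_skew_ms

on_master = is_expired(expire_at_ms, master_clock_ms)  # False: value returned
on_slave = is_expired(expire_at_ms, slave_clock_ms)    # True: NULL returned
```

The same key is alive on the master and gone on the slave for as long as the skew exceeds the remaining lifetime, so keeping clocks synchronized (for example with NTP) narrows the window.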
4) Maxmemory mismatch leads to data divergence
If master and slave have different maxmemory settings, the slave may start evicting keys earlier, causing inconsistency. Adjust maxmemory on the slave first when increasing, and on the master first when decreasing.
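The ordering rule above keeps the slave's limit at or above the master's at every step. It can be written down as a small helper; the function name and the role labels are hypothetical.

```python
def maxmemory_change_order(current_bytes, target_bytes):
    """Return the order in which to apply a new maxmemory value.

    The goal is that the slave's limit is never lower than the master's,
    so the slave never starts evicting keys before the master does.
    """
    if target_bytes > current_bytes:
        # Increasing: raise the slave first, then the master.
        return ["slave", "master"]
    if target_bytes < current_bytes:
        # Decreasing: lower the master first, then the slave.
        return ["master", "slave"]
    return []  # nothing to change


GIB = 2**30
grow = maxmemory_change_order(4 * GIB, 8 * GIB)
shrink = maxmemory_change_order(8 * GIB, 4 * GIB)
```

Encoding the order in tooling rather than relying on memory makes it harder to apply the change in the wrong sequence during an incident.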
5) Slave memory leak in writable replicas (Redis < 4.0)
Writable slaves that store keys with expiration may retain those keys after they expire, leading to hidden memory consumption and no way to query them. This bug was fixed in Redis 4.0.
6) Replication storm during full‑sync
When a large RDB is transferred, the master’s replication buffer can overflow if the slave loads the file slowly, causing the master to drop the connection and the slave to retry, resulting in a replication storm.
Mitigation: keep instances small, raise the client‑output‑buffer‑limit for the slave class, and ensure sufficient network and disk performance.
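Whether a full sync is at risk can be estimated from the write rate and the time the slave needs to receive and load the RDB. The sketch below is a rough model that only considers a hard limit on the master's buffer for the slave; the function name and the example numbers are assumptions.

```python
MB = 1024 * 1024


def full_sync_overflows(write_bytes_per_sec, sync_seconds, hard_limit_bytes):
    """Estimate whether writes accumulated during a full sync exceed the limit.

    write_bytes_per_sec -- rate at which new write commands are buffered
    sync_seconds        -- time to transfer the RDB and load it on the slave
    hard_limit_bytes    -- hard limit of the master's output buffer for the slave
    """
    buffered = write_bytes_per_sec * sync_seconds
    return buffered > hard_limit_bytes


# 5 MB/s of writes during a 60 s sync against a 256 MB limit: overflow.
busy = full_sync_overflows(5 * MB, 60, 256 * MB)
# 1 MB/s of writes over the same 60 s fits comfortably.
quiet = full_sync_overflows(1 * MB, 60, 256 * MB)
```

If the estimate comes out close to the limit, either the limit should be raised or the instance made smaller so that the sync finishes sooner.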
Summary
This article covered three major areas where Redis can trip you up: command‑level pitfalls, persistence pitfalls, and replication pitfalls. Understanding these issues helps you configure Redis safely, avoid unexpected blocking, data loss, or memory exhaustion, and keep your services stable.
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!
