Hidden Redis Pitfalls That Can Crash Your System
This article enumerates common Redis pitfalls—including unexpected key expiration loss, command‑induced blocking, memory‑intensive bitmap operations, AOF and RDB persistence issues, and master‑slave replication quirks—explaining their causes, real‑world impact, and practical mitigation steps.
Common Command Pitfalls
When using SET with an expiration, updating the key without re‑specifying the expiration removes the TTL, causing the key to become permanent. Example:
127.0.0.1:6379> SET testkey val1 EX 60
OK
127.0.0.1:6379> TTL testkey
(integer) 59
127.0.0.1:6379> SET testkey val2
OK
127.0.0.1:6379> TTL testkey // key never expires
(integer) -1
Large keys that grow over time can exhaust memory if their TTL is unintentionally cleared.
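One way to keep the TTL on update is the KEEPTTL option that Redis 6.0 added to SET. The sketch below assumes a redis-py style client, whose `set` method exposes this option as the `keepttl` argument; the helper name `update_value` is hypothetical.

```python
def update_value(client, key, value):
    """Overwrite `key` while preserving its remaining TTL.

    Assumes `client` follows redis-py's `set(name, value, keepttl=...)`
    signature and that the server is Redis 6.0 or later (SET ... KEEPTTL).
    """
    return client.set(key, value, keepttl=True)
```

On older servers the alternative is to re-specify the expiration (for example `EX 60`) on every SET.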
DEL Blocking
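The DEL command is O(1) for String keys but O(M) for List, Hash, Set, and ZSet keys, where M is the number of elements. Even for a String, freeing a very large value (for example 500 MB) takes time, so deleting a big collection or a huge string can block the Redis main thread. Enabling the lazy‑free mechanism does not change this by itself: DEL still frees the key in the main thread unless lazyfree-lazy-user-del is set to yes (Redis 6.0 and later), whereas UNLINK (Redis 4.0 and later) can hand large values to a background thread.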
Check element count with LLEN, HLEN, SCARD, ZCARD.
If the count is small, delete directly; otherwise delete in batches using LRANGE / LPOP, HSCAN / HDEL, SSCAN / SREM, ZSCAN / ZREM.
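The batch approach above can be sketched for a Hash as follows. This is a minimal illustration, assuming a redis-py style client where `hscan(name, cursor, count=...)` returns `(next_cursor, {field: value})` and `hdel(name, *fields)` returns the number of fields removed; the function name and batch size are hypothetical.

```python
def delete_hash_in_batches(client, key, batch_size=100):
    """Remove a large Hash a few fields at a time, then drop the key.

    Each HSCAN/HDEL round touches roughly `batch_size` fields, so no
    single command holds the Redis main thread for long.
    """
    cursor = 0
    removed = 0
    while True:
        cursor, fields = client.hscan(key, cursor, count=batch_size)
        if fields:
            removed += client.hdel(key, *fields.keys())
        if cursor == 0:  # a cursor of 0 means the scan is complete
            break
    client.delete(key)  # harmless if the hash is already gone
    return removed
```

The same loop shape applies to Sets (SSCAN / SREM) and ZSets (ZSCAN / ZREM); Lists can be trimmed with repeated LPOP calls instead.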
RANDOMKEY Blocking
RANDOMKEY first checks whether the selected key is expired; if so, it lazily deletes it and picks another. When many keys are expired, this loop can run for a long time. It is worse on a slave, which does not delete expired keys itself: if every key is already expired, the loop never terminates and the whole instance stalls. Redis 5.0 introduced a retry limit (at most 100 attempts) to avoid this.
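The loop and its retry cap can be modelled roughly as below. This is a simplified Python model for illustration, not Redis source code; the function name, arguments, and in-memory data structures are all assumptions.

```python
import random


def random_key(keys, expires, now, is_replica, max_tries=100, rng=random):
    """Simplified model of RANDOMKEY's key selection.

    keys: iterable of key names; expires: dict of key -> absolute expiry time.
    A master lazily deletes an expired key and retries. A replica cannot
    delete, so without a retry cap it could loop forever when every key
    is already expired.
    """
    keys = list(keys)
    all_volatile = len(keys) > 0 and all(k in expires for k in keys)
    tries = max_tries
    while keys:
        key = rng.choice(keys)
        expired = key in expires and expires[key] <= now
        if not expired:
            return key
        if is_replica:
            if all_volatile:
                tries -= 1
                if tries == 0:
                    return key  # give up: may be an already expired key
            continue  # replica keeps the key and picks again
        keys.remove(key)  # master: lazy deletion, then retry
    return None
```

With the cap in place, a replica whose keys have all expired returns after a bounded number of picks instead of spinning.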
SETBIT OOM
Using SETBIT on a non‑existent or tiny key with a very large offset forces Redis to allocate a huge bitmap, which can be slow and memory‑intensive. Example:
127.0.0.1:6379> SETBIT testkey 10 1
(integer) 1
127.0.0.1:6379> GETBIT testkey 10
(integer) 1
Large offsets should be avoided.
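To see why the offset matters, note that a bitmap is stored as a string with 8 bits per byte, so the string must be long enough to contain the requested bit. The helper below is a small arithmetic sketch (the function name is hypothetical) using the documented default offset range of 0 to 2^32 - 1.

```python
def bitmap_bytes_for_offset(offset):
    """Bytes a string must hold so that bit `offset` exists (8 bits per byte)."""
    if not 0 <= offset < 2**32:
        raise ValueError("SETBIT offsets must be in the range 0 .. 2**32 - 1")
    return offset // 8 + 1


print(bitmap_bytes_for_offset(10))         # 2 bytes
print(bitmap_bytes_for_offset(2**32 - 1))  # 536870912 bytes, i.e. 512 MB
```

A single SETBIT at the maximum offset therefore allocates the full 512 MB at once, even though only one bit is set.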
MONITOR OOM
Running MONITOR in a high‑QPS environment streams every executed command into that client's output buffer on the server. If the client cannot consume the stream fast enough, the buffer keeps growing and can push the instance into an out‑of‑memory (OOM) condition.
Data Persistence Pitfalls
Redis supports two persistence methods: RDB snapshots and AOF logs. Both can cause OOM or performance problems.
Master Crash Data Loss
In a master‑slave + Sentinel setup where the master has persistence disabled, a crash followed by an automatic restart leaves the master empty. If the restart completes before Sentinel triggers a failover, the slave resynchronizes with the empty master and clears its own data, resulting in total data loss and a cache avalanche.
Do not rely on automatic process managers to restart a master without persistence.
Ensure Sentinel promotes a healthy slave before restarting the original master.
AOF everysec Blocking
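Even with appendfsync everysec, the main thread is not fully isolated from disk latency. If the background thread’s fsync is still in progress because of heavy disk I/O, the main thread postpones writing the AOF buffer to the file for up to 2 seconds. After that it writes anyway, and that write can block behind the stalled fsync, which stalls client requests. As a result, a crash can lose up to 2 seconds of data rather than 1.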
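The postponement decision can be modelled as a small state function. This is a simplified Python model of the behaviour described above, not the actual Redis implementation; the function name and the `postponed_since` bookkeeping are assumptions.

```python
def should_write_aof(fsync_in_progress, now, postponed_since, max_postpone=2):
    """Simplified model of the main thread's flush decision under everysec.

    Returns (write_now, postponed_since). `postponed_since` is None when
    no flush is currently being postponed; times are in seconds.
    """
    if not fsync_in_progress:
        return True, None              # disk is keeping up: write immediately
    if postponed_since is None:
        return False, now              # first postponement: remember when it began
    if now - postponed_since < max_postpone:
        return False, postponed_since  # still within the grace period
    return True, None                  # waited long enough: write anyway (may block)
```

For example, with an fsync that stays busy from second 100 onward, the write is postponed at seconds 100 and 101 and forced at second 102.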
RDB/AOF Rewrite OOM
During RDB snapshots or AOF rewrite, Redis forks a child process. The parent continues to accept writes, triggering copy‑on‑write (COW). High write rates combined with large data volumes cause massive memory duplication, which can exhaust RAM and trigger OOM.
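A rough upper bound on the duplicated memory is the number of distinct pages the parent modifies while the child is running, multiplied by the page size. The sketch below is only an estimate; the function name is hypothetical and the 4096-byte page size is an assumed common default, not a Redis setting.

```python
def cow_extra_memory(pages_written, dataset_bytes, page_size=4096):
    """Upper-bound estimate of memory duplicated by copy-on-write after fork.

    pages_written: distinct memory pages the parent modifies while the
    child process is still running. The copy can never exceed the
    dataset itself, hence the min().
    """
    return min(pages_written * page_size, dataset_bytes)
```

In the worst case, where every page is written during the snapshot, the instance needs about twice its dataset size in RAM.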
Master‑Slave Replication Pitfalls
Redis replication is asynchronous, so a master crash can leave some writes unsynced to the slave, leading to data loss for use‑cases that treat Redis as a primary data store or distributed lock.
Expired‑Key Inconsistency
Before Redis 3.2, slaves returned values for expired keys because they never checked TTL. From 3.2 to 4.0.11, data‑retrieval commands returned NULL for expired keys, but EXISTS still reported the key as present. Redis 4.0.11 fixed this fully.
Machine‑Clock Skew
Expiration is evaluated against each node’s local clock. If a slave’s clock runs ahead of the master’s, the slave treats keys as expired earlier than the master does, so the two nodes return different results for the same query. If that slave is later promoted to master, it may delete a large number of keys at once, causing a cache avalanche.
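The effect of skew can be illustrated with a few lines of arithmetic. This is a toy model, assuming the key stores an absolute expiry timestamp and each node compares it with its own clock; the function name and parameters are hypothetical.

```python
def key_visible(expire_at, true_time, clock_offset=0):
    """Return True if a node whose clock is `clock_offset` seconds ahead
    of true time still considers the key alive."""
    local_now = true_time + clock_offset
    return local_now < expire_at


expire_at = 1020  # key should live until t = 1020
print(key_visible(expire_at, true_time=1000))                   # True on the master
print(key_visible(expire_at, true_time=1000, clock_offset=30))  # False on a slave 30 s ahead
```

Keeping all nodes synchronized with the same time source removes this class of mismatch.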
maxmemory Inconsistency
If the slave’s maxmemory is lower than the master’s, the slave evicts keys earlier, breaking consistency. Adjusting limits in the wrong order creates the same gap temporarily: when raising the limit, raise it on the slave first; when lowering it, lower it on the master first. Redis 5.0 added replica-ignore-maxmemory (default yes) to prevent slaves from evicting independently.
Slave Memory Leak (pre‑4.0)
Writable slaves (slave-read-only no) that received keys with an expiration written directly by clients could retain that expired data indefinitely, leaking memory. This bug was fixed in Redis 4.0.
Replication Storm
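When a large RDB is transferred during a full sync, writes that arrive in the meantime accumulate in the slave’s output buffer on the master. If the client-output-buffer-limit for the slave class is too small, the buffer overflows and the master drops the slave connection, forcing repeated full syncs and a “replication storm.” Mitigation: keep instances small, increase buffer limits, and ensure sufficient network bandwidth.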
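Whether a given limit is large enough can be estimated from the write rate and the expected sync duration. The sketch below assumes a constant write rate and uses the default limits for the slave class (256 MB hard, 64 MB soft for 60 seconds); the function name and the constant-rate model are simplifying assumptions.

```python
MB = 2**20


def replica_buffer_overflows(write_rate, sync_seconds,
                             hard_limit=256 * MB,
                             soft_limit=64 * MB,
                             soft_seconds=60):
    """Estimate whether the master would drop the slave during a full sync.

    write_rate: bytes of new writes per second while the RDB is in flight.
    sync_seconds: time needed to generate, transfer, and load the RDB.
    """
    if write_rate <= 0:
        return False
    buffered = write_rate * sync_seconds
    if buffered > hard_limit:
        return True  # hard limit exceeded outright
    # Time spent above the soft limit before the sync finishes.
    time_to_soft = soft_limit / write_rate
    return sync_seconds - time_to_soft >= soft_seconds
```

If the estimate returns True for realistic traffic, raising the limit or shrinking the instance is worth considering before the first full sync fails.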
Conclusion
The article covered three major areas where Redis users commonly encounter pitfalls: command usage, data persistence, and master‑slave replication. Understanding these edge cases, checking version‑specific bugs, and applying the recommended configuration tweaks can prevent performance degradation, data loss, and service outages.
dbaplus Community
