
Hidden Redis Pitfalls: Why Keys Lose Expiration, Commands Block, and Replication Fails

This article reveals the most common Redis pitfalls—including unexpected key expiration loss, O(1) commands that block or cause OOM, master‑slave synchronization quirks, and persistence bugs—while providing practical guidance to avoid each issue and keep your cache performant and reliable.

Sanyou's Java Diary

Hello, I'm San You. In this article I will discuss the "pitfalls" you may encounter when using Redis and how to avoid them.

Typical symptoms include keys that should expire but never do, SETBIT causing OOM, RANDOMKEY blocking, and master‑slave queries returning different results.

Common Command Pitfalls

Below are the most surprising behaviours of everyday Redis commands.

1) Expiration unexpectedly disappears

The SET command can erase a key's TTL if you omit the expiration parameter when updating the value.

127.0.0.1:6379> SET testkey val1 EX 60
OK
127.0.0.1:6379> TTL testkey
(integer) 59

Updating the key without an expiration removes the TTL (TTL returns -1, which means the key exists but has no expiration):

127.0.0.1:6379> SET testkey val2
OK
127.0.0.1:6379> TTL testkey
(integer) -1

As a result the key becomes permanent and memory usage grows.
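One way to guard against this is the KEEPTTL option of SET, available since Redis 6.0. The sketch below assumes the redis-py client, which the article itself does not use; the helper name update_keeping_ttl is hypothetical.

```python
def update_keeping_ttl(client, key, value):
    """Overwrite `key` but keep its remaining TTL (needs Redis >= 6.0).

    `client` is assumed to be a redis-py `redis.Redis` instance.
    """
    # Sends: SET key value KEEPTTL
    return client.set(key, value, keepttl=True)
```

On older servers a common alternative is to always pass the expiration again (for example with EX) on every update.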

2) DEL can block Redis

Deleting a key is not always O(1). For non‑String types (List, Hash, Set, ZSet) the time complexity is O(M), where M is the number of elements.

Redis must free each element's memory, which can block the server when the key is large.

String key: O(1)

List/Hash/Set/ZSet key: O(M)

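Best practice: check the element count first with LLEN/HLEN/SCARD/ZCARD. If the key is large, delete it in batches: shrink a List with repeated LPOP/RPOP, and walk a Hash/Set/ZSet with HSCAN/SSCAN/ZSCAN, removing each batch with HDEL/SREM/ZREM. On Redis 4.0 and later, UNLINK is another option: it removes the key from the keyspace immediately and frees its memory in a background thread.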
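The batch approach for a Hash could look roughly like this. It is a minimal sketch that assumes a redis-py client; the function name delete_big_hash and the batch size of 100 are illustrative choices, not values from the article.

```python
def delete_big_hash(client, key, batch_size=100):
    """Delete a large hash in small batches so no single command runs long.

    `client` is assumed to be a redis-py `redis.Redis` instance.
    Returns the number of fields removed.
    """
    removed = 0
    cursor = 0
    while True:
        # HSCAN returns the next cursor plus a batch of field/value pairs.
        # COUNT is only a hint, so a batch may be larger or smaller.
        cursor, fields = client.hscan(key, cursor=cursor, count=batch_size)
        if fields:
            removed += client.hdel(key, *fields.keys())
        if cursor == 0:  # a zero cursor means the scan is complete
            break
    # Redis drops a hash once it is empty; this DEL is a cheap safety net
    # in case fields were added while the scan was running.
    client.delete(key)
    return removed
```

The same loop shape works for Sets and ZSets by swapping in SSCAN/SREM or ZSCAN/ZREM.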

3) RANDOMKEY may block Redis

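RANDOMKEY picks a random key and then checks whether it has expired. On a master, an expired key is deleted and another key is picked, so the command may loop for a long time when many keys are expired but not yet cleaned up. A slave does not delete expired keys itself; it waits for the master to send the DEL.

If all the keys it picks are expired, the slave can therefore get stuck in an infinite loop and block the whole instance.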

4) SETBIT O(1) can cause OOM

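SETBIT itself is O(1), but when the offset lies beyond the current end of the value, or the key does not exist, Redis must first allocate a string long enough to hold that bit. A very large offset therefore allocates a very large bitmap in a single step, which may block the server and exhaust memory.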

127.0.0.1:6379> SETBIT testkey 10 1
(integer) 1
127.0.0.1:6379> GETBIT testkey 10
(integer) 1
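The small offset above is harmless. To see how quickly the allocation grows, here is the arithmetic as a short sketch; the function name is hypothetical, and the calculation only covers the bitmap payload, not Redis's per-object overhead.

```python
def bitmap_bytes_for_offset(offset):
    """Bytes a Redis string needs so that bit `offset` exists."""
    if not 0 <= offset < 2**32:
        raise ValueError("SETBIT offsets must be in the range 0 .. 2**32 - 1")
    # Bits are packed 8 per byte, so the string must reach byte offset // 8.
    return offset // 8 + 1


print(bitmap_bytes_for_offset(10))         # 2
print(bitmap_bytes_for_offset(2**32 - 1))  # 536870912, i.e. 512 MiB
```

A single SETBIT at the maximum offset on a new key therefore asks for 512 MiB at once.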

5) MONITOR can trigger OOM under high QPS

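MONITOR streams every command the server executes into the monitoring client's output buffer. Under high QPS the buffer can grow faster than the client drains it, and if memory is limited it keeps growing until Redis runs out of memory.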

Data Persistence Pitfalls

Redis offers RDB snapshots and AOF logs. Misconfiguration can cause data loss.

1) Master crash leads to total data loss

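If the master runs without persistence and a supervisor restarts it automatically after a crash, it comes back with an empty dataset. The slave then resynchronizes with the empty master and clears its own data to stay consistent, resulting in complete data loss and a cache avalanche.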

Do not let the process manager auto‑restart the master.

Use Sentinel to promote the slave after a master failure.

Restart the original master as a slave after promotion.

2) AOF everysec does not guarantee 1‑second safety

When the background fsync is blocked by heavy disk I/O, the main thread postpones writing the AOF buffer to the file for up to 2 seconds, so a crash can lose up to 2 seconds of data rather than 1.
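The postponement can be pictured as a small decision function. This is a simplified model of the behaviour described above, not Redis's actual source; the function name and the tuple it returns are assumptions made for illustration.

```python
def should_write_aof_now(now, fsync_in_progress, postponed_since):
    """Simplified model of the `everysec` flush decision.

    Returns (write_now, postponed_since). `postponed_since` is None when
    no flush is currently being postponed. Times are in whole seconds.
    """
    if not fsync_in_progress:
        return True, None              # disk is keeping up: write right away
    if postponed_since is None:
        return False, now              # first postponement: remember when
    if now - postponed_since < 2:
        return False, postponed_since  # still inside the 2-second window
    # Waited 2 seconds already: write anyway, even though the write may
    # now block the main thread behind the slow fsync.
    return True, None
```

Everything that accumulates in the buffer while the function keeps returning False is what a crash would lose.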

3) RDB/AOF rewrite may cause OOM

During an RDB snapshot or AOF rewrite, Redis forks a child process. The parent keeps serving writes, and copy-on-write duplicates every memory page it modifies, so in the worst case memory usage can approach double. On machines with limited RAM this can trigger OOM.

Replication Pitfalls

Redis replication is asynchronous, which introduces several consistency issues.

1) Data loss on master failure

If the master crashes before syncing recent writes, those writes are lost on the slave. This is acceptable for cache‑only use cases but not for persistence or distributed locks.

2) Inconsistent query results between master and slave

Before Redis 3.2, a slave still returned the value of a key that had already expired. Versions from 3.2 up to (but not including) 4.0.11 fixed data queries but missed the EXISTS command, which still reported an expired key as existing on the slave. Redis 4.0.11 finally fixed this.

3) Clock skew between master and slave

Expiration is evaluated using each server's local clock. If the slave's clock runs ahead of the master's, the slave treats keys as expired earlier and returns nil while the master still returns values.

4) Maxmemory mismatch leads to data divergence

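If the slave's maxmemory is lower than the master's, the slave starts evicting keys on its own before the master does, and the two datasets diverge. Adjust maxmemory on the slave first when increasing it, and on the master first when decreasing it. Since Redis 5.0, replicas ignore maxmemory by default (replica-ignore-maxmemory yes), which avoids this problem.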
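The ordering rule can be wrapped in a small helper. This is a sketch assuming two redis-py clients; the name resize_maxmemory is hypothetical, and it treats a maxmemory of 0 as "no limit", which is how Redis interprets that value.

```python
def resize_maxmemory(master, replica, new_bytes):
    """Change maxmemory on both nodes so the replica's limit is never
    the lower of the two.

    `master` and `replica` are assumed to be redis-py `redis.Redis` clients.
    """
    def effective(limit):
        # maxmemory 0 means "unlimited" in Redis
        return float("inf") if limit == 0 else limit

    current = int(master.config_get("maxmemory")["maxmemory"])
    if effective(new_bytes) >= effective(current):
        order = [replica, master]   # growing: raise the replica first
    else:
        order = [master, replica]   # shrinking: lower the master first
    for node in order:
        node.config_set("maxmemory", new_bytes)
```

CONFIG SET changes only the running instance, so the configuration file still needs the same value to survive a restart.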

5) Slave memory leak in writable replicas (Redis < 4.0)

Writable slaves that store keys with expiration may retain those keys after they expire, leading to hidden memory consumption and no way to query them. This bug was fixed in Redis 4.0.

6) Replication storm during full‑sync

During a full sync the master buffers all new writes in the slave's client output buffer while the RDB is generated, transferred, and loaded. If the RDB is large and the slave loads it slowly, that buffer can exceed its limit. The master then drops the connection, the slave reconnects and requests another full sync, and the cycle repeats as a replication storm.

Mitigation: keep instances small, raise the slave class of client-output-buffer-limit, and ensure sufficient network and disk performance.
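A rough estimate shows whether a given limit is likely to hold. The sketch below assumes a constant write rate and that nothing is drained from the buffer until the sync finishes, both simplifications; the function name is hypothetical. The limits in the usage lines are Redis's defaults for the slave class (256mb hard, 64mb soft, 60 seconds).

```python
def seconds_until_disconnect(write_rate, sync_seconds,
                             hard_limit, soft_limit, soft_seconds):
    """Estimate when the master would drop the slave during a full sync.

    write_rate is in bytes per second; limits are in bytes, 0 disables one.
    Returns the number of seconds into the sync at which a limit trips,
    or None if the sync finishes first.
    """
    if write_rate <= 0:
        return None
    candidates = []
    if hard_limit:
        # Hard limit: disconnect as soon as the buffer reaches it.
        candidates.append(hard_limit / write_rate)
    if soft_limit:
        # Soft limit: disconnect after staying above it for soft_seconds.
        candidates.append(soft_limit / write_rate + soft_seconds)
    hits = [t for t in candidates if t <= sync_seconds]
    return min(hits) if hits else None


MB = 1024 * 1024
print(seconds_until_disconnect(2 * MB, 120, 256 * MB, 64 * MB, 60))  # 92.0
print(seconds_until_disconnect(1 * MB, 100, 256 * MB, 64 * MB, 60))  # None
```

In the first case the soft limit trips 92 seconds into a 120-second sync, so either the limit has to go up or the sync has to get faster.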

Summary

This article covered three major areas where Redis can trip you up: command‑level pitfalls, persistence pitfalls, and replication pitfalls. Understanding these issues helps you configure Redis safely, avoid unexpected blocking, data loss, or memory exhaustion, and keep your services stable.
