Avoid Redis Blocking: Dangerous Commands and How to Prevent Outages
Redis executes commands on a single thread, so commands with O(n) or higher complexity (such as KEYS, HGETALL, LRANGE, SMEMBERS, and DEL on large keys) can block the server, especially during traffic spikes. Incremental scans, key splitting, and asynchronous deletion greatly reduce that risk and keep services responsive.
Why Certain Redis Commands Block the Server
Redis processes all commands on a single thread, so the execution time of each command directly determines response latency. When a command’s time complexity is O(n) or higher and it operates on a large data set, the thread is occupied for a long period, causing the whole service to become unresponsive.
Commands That Can Cause Blocking
The following commands have O(n) or higher complexity and can block Redis when the underlying collection is large.
1. General Commands
KEYS *: Scans the entire key space; O(n) where n is the total number of keys.
DEL on large keys: Deletes a key containing millions of elements; O(n) where n is the element count.
2. Hash Commands
HGETALL: Returns all fields and values; O(n) where n is the number of fields.
HKEYS and HVALS: Return all fields or all values; also O(n).
3. List Commands
LRANGE key 0 -1: Returns the entire list; O(n) where n is the list length.
4. Set Commands
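SMEMBERS, SUNION, SINTER, SDIFF: Iterate over all members; O(n) where n is the total number of elements across the sets involved. SINTER is O(n*m) in the worst case, where n is the size of the smallest set and m is the number of sets.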
5. Sorted Set Commands
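ZRANGE and ZREVRANGE with the range 0 -1: Return all members; O(log(n)+m) where m is the number of elements returned, which is effectively O(n) when the whole sorted set is requested.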
Real‑World Scenarios and Demo Scripts
Hash Example (Product Stock)
An e‑commerce platform stores product inventory in a hash product:stock. When the hash grows to 500 000 fields and an operator runs HGETALL product:stock, Redis blocks for about 3 seconds, causing order‑placement timeouts.
# Simulate a hash with 100 000 fields
redis-cli EVAL "for i=1,100000 do redis.call('HSET', 'product:stock', 'prod_'..i, math.random(1,1000)) end" 0
# Execute HGETALL and observe the block
redis-cli HGETALL product:stock
List Example (Order Queue)
A message queue queue:order accumulates 300 000 order JSON strings. Executing LRANGE queue:order 0 -1 during a consumer outage blocks Redis for about 5 seconds, making all order‑related APIs fail.
# Simulate 300 000 order messages
redis-cli EVAL "for i=1,300000 do redis.call('LPUSH', 'queue:order', '{\"order_id\":\"'..i..'\",\"amount\":'..math.random(100,10000)..'}') end" 0
# Retrieve all elements and observe the block
redis-cli LRANGE queue:order 0 -1
Set Example (Course Registrations)
When a course has 150 000 registered users, SMEMBERS course:users:101 blocks Redis for about 2 seconds, causing registration and progress‑saving APIs to time out.
# Create a set with 150 000 members
redis-cli EVAL "for i=1,150000 do redis.call('SADD', 'course:users:101', 'user_'..i) end" 0
# Execute SMEMBERS and observe the block
redis-cli SMEMBERS course:users:101
Mitigation Strategies
Incremental Scans: Replace full‑collection commands with SCAN, HSCAN, SSCAN, or ZSCAN, fetching a limited number of elements per call (e.g., COUNT 1000).
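A minimal sketch of the incremental-scan idea, applied to a hash such as product:stock. It assumes redis-cli is run non-interactively, in which case it prints the next cursor on the first line and the fields and values on the lines that follow. The function name hscan_all and the REDIS_CLI override variable are illustrative and not part of Redis.

```shell
# Hypothetical helper: walk a hash in small HSCAN batches instead of one HGETALL.
# REDIS_CLI is an assumed override hook; it defaults to plain redis-cli.
hscan_all() {
  key=$1
  count=${2:-1000}   # COUNT is only a hint; Redis may return more or fewer items
  cursor=0
  while :; do
    reply=$(${REDIS_CLI:-redis-cli} HSCAN "$key" "$cursor" COUNT "$count") || return 1
    # First line of the reply is the cursor for the next call.
    cursor=$(printf '%s\n' "$reply" | head -n 1)
    # Stop on anything that is not a numeric cursor (for example an error reply).
    case $cursor in ''|*[!0-9]*) return 1 ;; esac
    # Remaining lines alternate field, value, field, value, ...
    printf '%s\n' "$reply" | tail -n +2
    # A cursor of 0 means the iteration is complete.
    [ "$cursor" = "0" ] && break
  done
}
```

Calling hscan_all product:stock 1000 prints the same fields and values as HGETALL, but each round trip only holds the Redis thread for one batch. A field can occasionally be returned more than once if the hash changes during the scan, so consumers should tolerate duplicates.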
Key Splitting: Divide large hashes, lists, or sets into multiple smaller keys (e.g., user:info:1001:0, user:info:1001:1) and query each part separately.
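One possible way to pick the sub-key for a field is to hash the field name and take it modulo a fixed shard count. The sketch below assumes the POSIX cksum utility as the hash and a default of 16 shards. The helper name shard_key and both of those choices are illustrative rather than anything Redis prescribes.

```shell
# Hypothetical helper: map a field name to one of N smaller keys.
# Prints e.g. "product:stock:<shard>" where <shard> is in the range 0..N-1.
shard_key() {
  base=$1
  field=$2
  shards=${3:-16}   # assumed shard count; changing it later remaps existing fields
  # cksum prints "<crc> <byte count>" for stdin; keep only the CRC.
  crc=$(printf '%s' "$field" | cksum | cut -d ' ' -f 1)
  printf '%s:%s\n' "$base" "$((crc % shards))"
}
```

A write then targets a single shard, for example redis-cli HSET "$(shard_key product:stock prod_42)" prod_42 17, and a read of the same field uses the same computed key. Reading the whole logical hash means visiting every shard, but each call is bounded by the shard size rather than the full collection.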
Batch Retrieval: For lists, retrieve data in chunks using LLEN to get the length and then multiple LRANGE calls with a limited range (e.g., 0‑999, 1000‑1999).
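The chunked read can be sketched as a small loop around LLEN and LRANGE. The function name lrange_chunked, the default chunk size of 1000, and the REDIS_CLI override variable are assumptions made for illustration.

```shell
# Hypothetical helper: print a list in fixed-size LRANGE chunks.
# REDIS_CLI is an assumed override hook; it defaults to plain redis-cli.
lrange_chunked() {
  key=$1
  chunk=${2:-1000}
  # LLEN is O(1), so asking for the length first is cheap.
  len=$(${REDIS_CLI:-redis-cli} LLEN "$key") || return 1
  case $len in ''|*[!0-9]*) return 1 ;; esac
  start=0
  while [ "$start" -lt "$len" ]; do
    stop=$((start + chunk - 1))   # LRANGE end index is inclusive
    ${REDIS_CLI:-redis-cli} LRANGE "$key" "$start" "$stop" || return 1
    start=$((stop + 1))
  done
}
```

For example, lrange_chunked queue:order 1000 issues calls for 0 to 999, 1000 to 1999, and so on. The length is read once, so if producers or consumers modify the list while the loop runs, items can be skipped or repeated. The sketch fits best when the list is quiescent, such as during the consumer outage described above.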
Async Deletion: Use UNLINK (Redis 4.0+) to delete large keys asynchronously, or delete large lists incrementally with LTRIM until empty.
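Where UNLINK is available, redis-cli UNLINK queue:order is the simpler option. For the LTRIM fallback, a rough sketch is to drop a fixed number of elements from the head of the list until nothing remains. The function name delete_list_in_batches, the batch size, and the REDIS_CLI override variable are illustrative assumptions.

```shell
# Hypothetical helper: empty a large list a batch at a time with LTRIM.
# REDIS_CLI is an assumed override hook; it defaults to plain redis-cli.
delete_list_in_batches() {
  key=$1
  batch=${2:-1000}
  while :; do
    len=$(${REDIS_CLI:-redis-cli} LLEN "$key") || return 1
    case $len in ''|*[!0-9]*) return 1 ;; esac
    # LLEN reports 0 for a missing key, which is also the "done" state.
    [ "$len" -eq 0 ] && break
    # Keep elements from index $batch to the end, i.e. remove the first $batch.
    # Once fewer than $batch elements remain, the list becomes empty and the key is removed.
    ${REDIS_CLI:-redis-cli} LTRIM "$key" "$batch" -1 >/dev/null || return 1
  done
}
```

Each LTRIM only pays for the elements it removes, so delete_list_in_batches queue:order 1000 spreads the cost of the deletion over many short commands instead of one long DEL.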
Slow‑Query Monitoring: Set slowlog-log-slower-than 10000 (the value is in microseconds) to capture commands taking ≥10 ms, then review them with SLOWLOG GET and optimize the offenders.
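As a small sketch, the threshold can be applied at runtime with CONFIG SET and the most recent entries read back with SLOWLOG GET. The function name show_slow_commands and the REDIS_CLI override variable are assumptions for illustration.

```shell
# Hypothetical helper: set the slow log threshold and print recent entries.
# REDIS_CLI is an assumed override hook; it defaults to plain redis-cli.
show_slow_commands() {
  threshold_us=${1:-10000}   # microseconds; 10000 corresponds to 10 ms
  entries=${2:-10}           # how many recent slow log entries to print
  ${REDIS_CLI:-redis-cli} CONFIG SET slowlog-log-slower-than "$threshold_us" >/dev/null || return 1
  ${REDIS_CLI:-redis-cli} SLOWLOG GET "$entries"
}
```

A value set with CONFIG SET lasts only until the server restarts unless it is also written to redis.conf. The slow log records command execution time only, so network and client-side delays do not appear in it.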
Command Renaming: In redis.conf, rename or disable high‑risk commands (e.g., rename-command KEYS "", rename-command HGETALL "HGETALL_RESTRICTED") to prevent accidental use.
Best‑Practice Checklist
Audit data structures regularly; avoid storing millions of elements in a single key.
Prefer incremental scans over full scans for analytics or maintenance tasks.
Split large collections into sharded keys based on logical ranges.
Use UNLINK for bulk deletions; fall back to batch LTRIM if UNLINK is unavailable.
Monitor SLOWLOG and set alerts for commands exceeding the latency threshold.
Restrict or rename dangerous commands in production configurations.
By applying these techniques, teams can significantly reduce the probability of Redis‑induced outages and maintain high availability for latency‑sensitive services.
Tech Freedom Circle
