Why Is My Redis Slowing Down? A Complete Guide to Diagnose and Fix Latency Issues
This comprehensive article walks you through the entire process of identifying why Redis latency spikes, from confirming the slowdown and measuring baseline performance to analyzing slow logs, big keys, expiration patterns, memory limits, fork overhead, AOF settings, CPU binding, swap usage, memory fragmentation, network bandwidth, and finally applying practical optimization techniques.
Redis is an in‑memory database with extremely high OPS, but latency spikes can still occur, breaking performance expectations.
Confirming the slowdown
First verify whether Redis is truly slower by measuring baseline latency on the Redis server itself. Use the intrinsic latency test:
$ redis-cli -h 127.0.0.1 -p 6379 --intrinsic-latency 60
The output shows the maximum latency observed in the 60‑second window (e.g., 72 µs).
You can also view min/avg/max latency over time:
$ redis-cli -h 127.0.0.1 -p 6379 --latency-history -i 1
Baseline performance comparison
Run the same test on a healthy instance with identical hardware. If the problematic instance’s latency is more than twice the baseline, it is considered slow.
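The twice-the-baseline rule can be written as a small check. This is a minimal sketch: the function name `is_slow` and the sample readings are illustrative assumptions, not part of any Redis tooling.

```python
def is_slow(observed_us: float, baseline_us: float, factor: float = 2.0) -> bool:
    """Return True when observed latency exceeds `factor` times the baseline."""
    if baseline_us <= 0:
        raise ValueError("baseline latency must be positive")
    return observed_us > factor * baseline_us


# Hypothetical readings in microseconds: healthy baseline vs. suspect instance.
print(is_slow(observed_us=180, baseline_us=72))  # 180 > 2 * 72
print(is_slow(observed_us=120, baseline_us=72))  # 120 <= 2 * 72
```

Feed it the maximum (or average) latency from both instances, measured with the same command over the same window.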
Analyzing slow logs (high‑complexity commands)
Enable slow‑log with a 5 ms threshold and keep the latest 500 entries:
CONFIG SET slowlog-log-slower-than 5000
CONFIG SET slowlog-max-len 500
Retrieve recent entries:
127.0.0.1:6379> SLOWLOG get 5
Typical culprits are O(N) or higher commands such as SORT, SUNION, ZUNIONSTORE, especially when N is large. High CPU usage on the Redis process often confirms this.
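With hundreds of slow-log entries, grouping them by command name shows which command costs the most in total. The sketch below assumes each entry is a dict with `duration` (microseconds) and `command` keys, which is the shape the redis-py client is understood to return from `slowlog_get()`. The sample entries are made up for illustration.

```python
from collections import defaultdict


def summarize_slowlog(entries):
    """Group slow-log entries by command name, largest total duration first."""
    totals = defaultdict(lambda: {"count": 0, "total_us": 0, "max_us": 0})
    for entry in entries:
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        parts = command.split()
        name = parts[0].upper() if parts else "(EMPTY)"
        stats = totals[name]
        stats["count"] += 1
        stats["total_us"] += entry["duration"]
        stats["max_us"] = max(stats["max_us"], entry["duration"])
    return sorted(totals.items(), key=lambda kv: kv[1]["total_us"], reverse=True)


# Hypothetical entries (assumed shape, see the note above).
sample = [
    {"id": 3, "duration": 12000, "command": b"SORT big:list"},
    {"id": 2, "duration": 8000, "command": b"SUNION set:a set:b"},
    {"id": 1, "duration": 30000, "command": b"sort other:list"},
]
for name, stats in summarize_slowlog(sample):
    print(name, stats)
```

Sorting by total duration rather than by the single slowest call surfaces commands that are only moderately slow but run very often.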
Big‑key investigation
If simple commands (SET/DEL) appear in the slow log, suspect big keys. Scan for them:
$ redis-cli -h 127.0.0.1 -p 6379 --bigkeys -i 0.01
The summary lists the largest key per data type and the overall memory distribution.
Mitigation: avoid storing oversized values, keep N ≤ 300 for O(N) commands, or move aggregation to the client.
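One way to keep N small is to read a large collection in windows instead of in a single call. The sketch below only computes the windows. `range_windows` is a hypothetical helper, and the default of 300 simply mirrors the guideline above. Each pair uses inclusive indexes, the convention LRANGE follows, so a pair could be passed straight to an LRANGE call.

```python
def range_windows(total: int, batch: int = 300):
    """Yield inclusive (start, stop) index pairs covering `total` elements."""
    if batch <= 0:
        raise ValueError("batch must be positive")
    for start in range(0, total, batch):
        yield start, min(start + batch, total) - 1


# A 700-element list is read in three calls instead of one.
print(list(range_windows(700)))
```

Other commands run between the smaller calls, so no single request holds the main thread for long.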
Concentrated expiration
Mass expiration at fixed timestamps (e.g., using EXPIREAT) can cause a burst of deletion work that blocks the main thread. Distribute expirations by adding a random offset:
redis.expireat(key, expire_time + random.randint(0, 300))
Alternatively enable lazy‑free for deletions on Redis 4.0+:
lazyfree-lazy-expire yes
Memory limit (maxmemory) pressure
When maxmemory is reached, each write that needs more memory must first evict old keys according to the configured policy (e.g., allkeys‑lru or volatile‑lru), which adds latency to that write. Evicting big keys is especially costly.
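To see how close an instance is to its limit, compare the `used_memory` and `maxmemory` fields reported by INFO. This is a minimal sketch: the input dict stands in for parsed INFO output, and the 90 % warning threshold is an arbitrary assumption.

```python
from typing import Optional


def maxmemory_usage(info: dict) -> Optional[float]:
    """Return used_memory / maxmemory, or None when no limit is configured."""
    limit = int(info.get("maxmemory", 0))
    if limit == 0:  # maxmemory 0 means "no limit"
        return None
    return int(info["used_memory"]) / limit


# Hypothetical values: about 3.9 GB used against a 4 GiB limit.
sample = {"used_memory": 3_900_000_000, "maxmemory": 4_294_967_296}
usage = maxmemory_usage(sample)
if usage is not None and usage >= 0.9:  # 0.9 is an assumed alert threshold
    print(f"warning: {usage:.0%} of maxmemory in use, eviction is likely soon")
```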
Fork overhead during persistence
Background RDB/AOF rewrites fork a child process. Fork copies the page table, which is expensive for large instances and blocks the main thread. Measure the duration of the most recent fork, in microseconds, with:
$ redis-cli -h 127.0.0.1 -p 6379 INFO | grep latest_fork_usec
If the value is high, consider reducing instance size (< 10 GB), disabling unnecessary persistence, or avoiding virtual machines.
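Fork time is easier to compare across instances when it is normalized by memory size. The sketch below parses raw INFO text and divides `latest_fork_usec` by `used_memory_rss`. The helper names and the sample values are illustrative assumptions.

```python
def parse_info(text: str) -> dict:
    """Parse the `key:value` lines of INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def fork_ms_per_gb(info: dict) -> float:
    """Latest fork duration in milliseconds per GiB of resident memory."""
    fork_ms = int(info["latest_fork_usec"]) / 1000
    rss_gb = int(info["used_memory_rss"]) / 1024 ** 3
    return fork_ms / rss_gb


# Hypothetical INFO excerpt: a 5 GiB instance whose last fork took ~59 ms.
sample = """# Stats
latest_fork_usec:59477
# Memory
used_memory_rss:5368709120
"""
print(round(fork_ms_per_gb(parse_info(sample)), 2), "ms per GiB")
```

A per-GiB figure that is much higher on one host than on comparable hardware points at the environment (for example, virtualization) rather than at the data size.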
Transparent huge pages
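With transparent huge pages enabled, the kernel allocates memory in 2 MB pages instead of 4 KB pages. This increases allocation latency and makes copy‑on‑write after a fork more expensive, because a small write forces a whole 2 MB page to be copied. Disable them if they are set to always: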
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
AOF configuration impact
Three appendfsync modes affect performance:
always: safest but highest latency.
no: fastest but risk of data loss.
everysec: balanced, yet a busy disk can still block writes.
During AOF rewrite you can temporarily disable fsync to avoid blocking:
no-appendfsync-on-rewrite yes
CPU binding
Binding Redis to a single logical core can cause contention with background processes. If binding is required, bind the server and its I/O threads to a set of cores on the same physical CPU, and optionally bind child processes (RDB, AOF rewrite) to separate cores using Redis 6.0 settings:
server_cpulist 0-7:2
bio_cpulist 1,3
aof_rewrite_cpulist 8-11
bgsave_cpulist 1,10-11
Swap usage
Check whether the Redis process is swapping:
# Find PID
ps -aux | grep redis-server
# Inspect swap
cat /proc/$PID/smaps | egrep '^(Swap|Size)'
If large memory regions are swapped, increase RAM or free memory, then restart the instance.
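The smaps output has one Swap line per memory region, so a total is easier to judge than the raw listing. This is a minimal sketch that sums those lines. The sample text is invented, and in practice the input would be the contents of /proc/<pid>/smaps for the Redis process.

```python
def total_swap_kb(smaps_text: str) -> int:
    """Sum every `Swap:` entry (in kB) from /proc/<pid>/smaps content."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("Swap:"):  # excludes other fields such as SwapPss:
            total += int(line.split()[1])
    return total


# Hypothetical excerpt covering two memory regions.
sample = """Size:               1256 kB
Swap:                  0 kB
Size:             132096 kB
Swap:              46080 kB
SwapPss:           46080 kB
"""
print(total_swap_kb(sample), "kB swapped")
```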
Memory fragmentation
A mem_fragmentation_ratio above 1.5 (reported by INFO memory) means the process holds more than 50 % extra memory beyond what the data itself needs. Enable active defragmentation on Redis 4.0+:
activedefrag yes
active-defrag-ignore-bytes 100mb
active-defrag-threshold-lower 10
active-defrag-threshold-upper 100
active-defrag-cycle-min 1
active-defrag-cycle-max 25
active-defrag-max-scan-fields 1000
Be aware that defragmentation runs in the main thread and can affect latency.
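The fragmentation ratio is `used_memory_rss` divided by `used_memory`, both reported by INFO memory. The sketch below reproduces that arithmetic with made-up values to show how a ratio of 1.5 corresponds to 50 % overhead. The function names are illustrative.

```python
def fragmentation_ratio(used_memory_rss: int, used_memory: int) -> float:
    """Memory held by the process divided by memory the data actually needs."""
    if used_memory <= 0:
        raise ValueError("used_memory must be positive")
    return used_memory_rss / used_memory


def overhead_percent(ratio: float) -> float:
    """Extra memory beyond the data itself, as a percentage."""
    return (ratio - 1.0) * 100.0


# Hypothetical instance: 1.5 GiB resident for 1 GiB of data.
ratio = fragmentation_ratio(used_memory_rss=1_610_612_736, used_memory=1_073_741_824)
print(f"ratio={ratio:.2f}, overhead={overhead_percent(ratio):.0f}%")
```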
Network bandwidth saturation
When a Redis instance consumes the whole NIC bandwidth, packet loss and increased RTT appear. Monitor traffic and scale out or migrate instances if needed.
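A rough utilization figure can be derived from the `instantaneous_input_kbps` and `instantaneous_output_kbps` fields of INFO stats. The sketch below assumes those fields are in KB per second with 1 KB = 1024 bytes, and that the NIC capacity is given in Mbit/s per direction. The function name and the sample numbers are hypothetical.

```python
def nic_utilization(info: dict, nic_mbit: float) -> dict:
    """Fraction of NIC capacity used in each direction (1.0 means saturated)."""
    capacity_bits = nic_mbit * 1_000_000
    # Assumption: the INFO fields are KB/s, so convert to bits per second.
    in_bits = float(info["instantaneous_input_kbps"]) * 1024 * 8
    out_bits = float(info["instantaneous_output_kbps"]) * 1024 * 8
    return {"input": in_bits / capacity_bits, "output": out_bits / capacity_bits}


# Hypothetical reading on a 1 Gbit/s NIC.
sample = {"instantaneous_input_kbps": "1500.00", "instantaneous_output_kbps": "98000.00"}
for direction, used in nic_utilization(sample, nic_mbit=1000).items():
    print(f"{direction}: {used:.1%}")
```

Sustained output utilization near 100 % suggests the instance is bound by the network rather than by Redis itself.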
Other practical tips
Prefer long‑lived (persistent) connections over frequent short ones to avoid repeated TCP handshake overhead.
Use dedicated servers for Redis; avoid co‑locating CPU‑intensive or I/O‑heavy workloads.
Implement comprehensive monitoring (INFO metrics, expired_keys spikes, latency histograms) and set alerts for early detection.
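As an example of alerting on expired_keys spikes, the sketch below turns periodic samples of the cumulative `expired_keys` counter into per-second rates and flags intervals above a threshold. The sampling interval, the threshold, and the sample data are all assumptions.

```python
def expiry_spikes(samples, threshold_per_sec):
    """Return (time, rate) for intervals whose expiry rate exceeds the threshold.

    `samples` is a time-ordered list of (unix_time, expired_keys) pairs,
    where expired_keys is the cumulative counter reported by INFO.
    """
    spikes = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0 or c1 < c0:  # skip clock glitches and counter resets
            continue
        rate = (c1 - c0) / elapsed
        if rate > threshold_per_sec:
            spikes.append((t1, rate))
    return spikes


# Hypothetical samples taken every 10 seconds.
samples = [(0, 100), (10, 150), (20, 5150), (30, 5200)]
print(expiry_spikes(samples, threshold_per_sec=100))
```

Spikes that recur at the same time of day usually indicate keys written with identical expiration timestamps.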
Conclusion
The article provides a step‑by‑step checklist covering application‑level misuse, configuration pitfalls, OS‑level mechanisms, and hardware constraints, enabling developers and DBAs to pinpoint and resolve Redis latency problems efficiently.
dbaplus Community
