Why Is My Redis Slowing Down? A Complete Diagnosis and Optimization Guide
This article explains how to determine whether Redis is truly experiencing latency spikes, outlines a step‑by‑step benchmarking process, identifies common causes (high‑complexity commands, big keys, concentrated expiration, memory limits, fork overhead, AOF settings, CPU binding, and swap usage), and provides concrete configuration and code examples to resolve each issue.
Redis is a high‑performance in‑memory database, but latency can increase unexpectedly. This guide walks you through a systematic approach to verify whether Redis is really slow, benchmark its baseline performance, and pinpoint the root causes.
Verify Redis Slowdown
First, confirm the problem by tracing the request path with distributed tracing and checking whether the delay originates from the service‑to‑Redis link. If the network is fine, focus on Redis itself.
Benchmark Redis Performance
Run intrinsic latency tests on the Redis server to obtain the maximum latency over a period, e.g.:
<code>$ redis-cli --intrinsic-latency 60</code>
Note that the intrinsic‑latency test runs locally on the Redis server host and does not connect to an instance. To measure round‑trip latency from a client, use <code>redis-cli -h 127.0.0.1 -p 6379 --latency-history</code>, which reports min, max, and average latency per sampling interval.
Compare the observed latency with the baseline for your hardware; only when the latency exceeds the baseline by a significant factor (e.g., >2×) should you consider Redis to be slow.
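The rule of thumb above can be captured in a small helper; a minimal sketch (the function name and the default 2× factor are illustrative, not part of Redis):

```python
def looks_slow(observed_latency_us, baseline_latency_us, factor=2.0):
    """Flag an instance as slow only when observed latency exceeds
    the hardware baseline by a meaningful factor (2x by default)."""
    return observed_latency_us > baseline_latency_us * factor
```

For example, an observed 250 µs against a 100 µs baseline would be flagged, while 150 µs would not.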
Common Causes Inside Redis
Network problems between the application server and Redis (rare).
Redis‑internal issues, which are the focus of this guide.
High‑Complexity Commands
Check the slowlog after setting a low threshold (e.g., 5 ms) and a reasonable length:
<code># Record commands slower than 5 ms
CONFIG SET slowlog-log-slower-than 5000
# Keep the latest 500 entries
CONFIG SET slowlog-max-len 500</code>
Typical offenders are O(N) or worse commands such as SORT, SUNION, ZUNIONSTORE, or O(N) commands run over a very large N. Because Redis executes commands on a single main thread, these drive up CPU usage and block all subsequent requests until they finish.
Optimization: avoid such commands in hot paths, keep N ≤ 300, and move aggregation logic to the client.
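As a sketch of moving aggregation to the client: assume the member lists have already been fetched key by key (e.g., via SMEMBERS or chunked SSCAN calls, names here stand in for those results), and compute the union client‑side instead of asking the server to run SUNION over large sets:

```python
def client_side_union(fetched_sets):
    """Merge per-key member lists on the client.

    Several small per-key fetches are cheap for the server compared with
    one SUNION over all keys, which runs entirely in Redis's main thread
    and blocks every other client while it executes.
    """
    result = set()
    for members in fetched_sets:
        result |= set(members)
    return result
```

So `client_side_union([["a", "b"], ["b", "c"]])` yields `{"a", "b", "c"}`, with the server never seeing more than one key per command.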
Big Keys
Large values cause slow memory allocation and deallocation. Scan for big keys with:
<code>$ redis-cli -h 127.0.0.1 -p 6379 --bigkeys -i 0.01</code>
When big keys are found, reduce their size or split the data across multiple keys. On Redis 4.0+ use UNLINK instead of DEL, or enable lazy‑free (<code>lazyfree-lazy-user-del yes</code>, available since Redis 6.0) so memory is reclaimed in a background thread.
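One way to split a big hash is to route each field to one of many smaller hashes by hashing the field name; a minimal sketch (the key naming scheme and shard count are illustrative):

```python
import zlib

def shard_key(base_key, field, shard_count=16):
    """Return the name of the sub-hash that should hold `field`.

    Splitting one huge hash into `shard_count` small hashes keeps every
    value small, so allocation, serialization, and deletion stay cheap.
    """
    shard = zlib.crc32(field.encode("utf-8")) % shard_count
    return f"{base_key}:{shard}"
```

The same field always maps to the same shard, so HSET/HGET callers only need to compute `shard_key(...)` before each access.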
Concentrated Expiration
If many keys expire at the same timestamp, the active expiration task (running in the main thread) can block client requests. Identify such patterns by searching for EXPIREAT / PEXPIREAT in the code and add a random offset to each key’s TTL.
<code># Randomly expire within 5 minutes after the intended time
redis.expireat(key, expire_time + random.randint(0, 300))</code>
Alternatively, enable lazy‑free for expiration in Redis 4.0+ (<code>lazyfree-lazy-expire yes</code>).
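A runnable version of the jitter idea above, using only the Python standard library (the 300‑second window matches the comment in the snippet; the function name is illustrative):

```python
import random

def jittered_expire_at(expire_time, max_jitter_seconds=300):
    """Spread expirations across a window instead of one exact instant,
    so the active-expiration cycle never reclaims them all at once."""
    return expire_time + random.randint(0, max_jitter_seconds)
```

Pass the result to EXPIREAT (e.g., `r.expireat(key, jittered_expire_at(ts))` with redis‑py) instead of the raw timestamp.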
Memory Limit (maxmemory)
When the instance reaches maxmemory , Redis evicts keys according to the configured policy (e.g., allkeys-lru , volatile-lru ). Eviction adds latency, especially for big keys. Choose a policy that fits your workload and consider reducing the memory pressure.
Fork Overhead (RDB/AOF Rewrite)
Background persistence creates a child process via fork . Large instances incur long fork times, blocking the main thread. Monitor latest_fork_usec via INFO . Mitigations include keeping the instance size < 10 GB, performing RDB on slaves, and avoiding fork during peak traffic.
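To watch latest_fork_usec over time, the INFO text can be parsed with a few lines of Python; a sketch assuming the raw dump is already in hand (e.g., captured via <code>redis-cli INFO stats</code>):

```python
def parse_info(info_text):
    """Parse Redis INFO output (key:value lines, '#' section headers) into a dict."""
    stats = {}
    for line in info_text.splitlines():
        if ":" in line and not line.startswith("#"):
            key, _, value = line.partition(":")
            stats[key] = value.strip()
    return stats

def fork_latency_ms(info_text):
    """Duration of the most recent fork, in milliseconds."""
    return int(parse_info(info_text).get("latest_fork_usec", 0)) / 1000
```

A value that grows with instance size is the signal to shrink the instance or shift RDB saves to a replica.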
Transparent Huge Pages (THP)
THP can increase memory‑allocation latency during write‑heavy periods. Disable it with:
<code># Check status
cat /sys/kernel/mm/transparent_hugepage/enabled
# Disable
echo never > /sys/kernel/mm/transparent_hugepage/enabled</code>
AOF Configuration
Three fsync policies exist:
always : safest but highest latency.
no : fastest but risk of data loss.
everysec : balanced, but can still block if disk I/O is saturated.
During AOF rewrite, enable <code>no-appendfsync-on-rewrite yes</code> to temporarily skip fsync and reduce disk contention; the trade‑off is that writes buffered during the rewrite can be lost if Redis crashes.
CPU Binding
Binding Redis to a single logical core forces the main thread to contend with forked children (RDB saves, AOF rewrites) and background threads. If you must bind, use a set of cores on the same physical CPU, or use the Redis 6.0+ affinity settings (<code>server_cpulist</code>, <code>bio_cpulist</code>, <code>aof_rewrite_cpulist</code>, etc.).
Swap Usage
Swap dramatically slows Redis. Check swap usage via /proc/<pid>/smaps . If significant swap is observed, add RAM or restart the instance after freeing memory.
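Summing the <code>Swap:</code> entries in /proc/&lt;pid&gt;/smaps shows how much of the process has been swapped out; a minimal sketch that parses the file's text (read the file yourself and pass the contents in):

```python
def swapped_kb(smaps_text):
    """Total swapped-out memory (kB) across all mappings in an smaps dump.

    Each mapping in /proc/<pid>/smaps has a line like:
        Swap:                  4 kB
    """
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total
```

A large total confirms that Redis memory is being paged to disk, which turns in‑memory lookups into disk reads.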
Memory Fragmentation
A mem_fragmentation_ratio above 1.5 means more than 50 % of the memory Redis holds is fragmentation overhead. On Redis 4.0+ you can enable <code>activedefrag</code> with tuned CPU limits, or restart the instance to compact memory.
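The ratio is simply OS‑reported RSS over Redis's logical memory, both available from INFO memory; a sketch of the check (the 1.5 threshold mirrors the rule of thumb above):

```python
def fragmentation_ratio(used_memory, used_memory_rss):
    """mem_fragmentation_ratio: OS-reported RSS over Redis's logical bytes."""
    return used_memory_rss / used_memory

def needs_defrag(used_memory, used_memory_rss, threshold=1.5):
    """True when fragmentation overhead exceeds the threshold (50% here)."""
    return fragmentation_ratio(used_memory, used_memory_rss) > threshold
```

A ratio below 1 is a different problem: it usually means part of the instance has been swapped out, which is the swap scenario covered above.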
Network Saturation
High network traffic can cause packet loss and latency. Monitor bandwidth and consider scaling out or moving heavy‑traffic instances.
Other Operational Tips
Use persistent long connections instead of short connections.
Monitor Redis metrics (INFO) with appropriate intervals to avoid impacting performance.
Dedicate servers to Redis to avoid resource contention.
Conclusion
Redis performance issues span CPU, memory, disk, and network layers. Understanding command complexity, expiration policies, persistence mechanisms, OS features (THP, swap, lazy‑free), and proper monitoring enables you to quickly locate bottlenecks and apply targeted optimizations.
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!