Redis Performance Tuning: How to Identify and Resolve Latency Issues
This article presents a methodology for diagnosing Redis latency problems. It covers baseline performance testing, slow-command analysis, big-key handling, expiration patterns, memory limits, fork overhead, AOF configuration, CPU binding, swap usage, fragmentation, and network bandwidth, and offers concrete optimization techniques for each.
Has Redis Really Slowed Down?
Before troubleshooting, confirm whether Redis latency has actually increased. Use service-side tracing to isolate the problematic link between your application and Redis.
Baseline Performance Testing
Measure the maximum and average response latency of a healthy Redis instance in production. This baseline helps decide when latency is abnormal, as performance varies with hardware and configuration.
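$ redis-cli -h 127.0.0.1 -p 6379 --intrinsic-latency 60
The command reports the maximum latency observed within 60 seconds. Run it on the Redis server host itself: this mode measures the latency inherent to the machine (kernel and hypervisor scheduling) and does not connect to a Redis instance, so the -h and -p options have no effect on it.
$ redis-cli -h 127.0.0.1 -p 6379 --latency-history -i 1
This shows the minimum, maximum, and average round-trip latency for each one-second sampling window.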
Step‑by‑Step Diagnosis
Test a normal Redis instance on the same hardware.
Test the suspect instance.
If the suspect latency is more than twice the baseline, consider it truly slow.
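The comparison in the steps above can be sketched as a small helper. This is a minimal illustration, not part of the original article: the function name is_abnormally_slow is hypothetical, the sample values are made up, and comparing average latencies (rather than maximums) is an assumption.

```python
from statistics import mean


def is_abnormally_slow(baseline_ms, suspect_ms, factor=2.0):
    """Return True when the suspect's average latency exceeds
    `factor` times the baseline's average latency.

    Both arguments are lists of latency samples in milliseconds,
    for example collected from repeated latency measurements.
    """
    if not baseline_ms or not suspect_ms:
        raise ValueError("both sample lists must be non-empty")
    return mean(suspect_ms) > factor * mean(baseline_ms)


# Illustrative (made-up) samples in milliseconds.
baseline = [0.12, 0.15, 0.11, 0.14]
suspect = [0.35, 0.40, 0.31, 0.38]
print(is_abnormally_slow(baseline, suspect))
```

The factor of 2 follows the rule of thumb above; it is a parameter so that a stricter or looser threshold can be tried.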
High‑Complexity Commands
Check the slowlog to find commands that take >5 ms.
# Set slowlog threshold to 5 ms and keep 500 entries
CONFIG SET slowlog-log-slower-than 5000
CONFIG SET slowlog-max-len 500
Then retrieve recent entries:
127.0.0.1:6379> SLOWLOG get 5
Typical offenders are O(N) or higher commands such as SORT, SUNION, ZUNIONSTORE, or commands with a very large N. Reduce their usage or limit N (e.g., N <= 300).
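Once slowlog entries have been collected, it can help to group them by command name to see which commands dominate. The sketch below assumes each entry is a dictionary with a duration in microseconds and a command string (or bytes); that shape is an assumption for illustration, and the function name summarize_slowlog is hypothetical.

```python
from collections import Counter


def summarize_slowlog(entries, threshold_us=5000):
    """Count slow entries per command name, most frequent first.

    `entries` is assumed to be a list of dicts with a "duration"
    (microseconds) and a "command" (str or bytes) key.
    """
    counts = Counter()
    for entry in entries:
        if entry["duration"] < threshold_us:
            continue  # below the slowlog threshold of interest
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        # The first token is the command name, e.g. "SORT".
        name = command.split()[0].upper() if command.strip() else "(empty)"
        counts[name] += 1
    return counts.most_common()


# Illustrative (made-up) entries.
sample = [
    {"duration": 12000, "command": b"SORT mylist"},
    {"duration": 7000, "command": "sunion a b"},
    {"duration": 3000, "command": "GET k"},
    {"duration": 9000, "command": "SORT other"},
]
print(summarize_slowlog(sample))
```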
Big‑Key Issues
Even simple SET / DEL can be slow if the key value is large. Scan for big keys:
$ redis-cli -h 127.0.0.1 -p 6379 --bigkeys -i 0.01
The scan reports the largest key of each data type along with per-type summary statistics. Mitigations include avoiding big keys, deleting with UNLINK instead of DEL (Redis 4.0+), or enabling lazy‑free eviction (lazyfree-lazy-eviction yes).
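One possible way to avoid a single big key is to spread its fields over several smaller keys on the client side. The sketch below is only an illustration of that idea, not a method described in the original article: the function name bucket_key, the key naming scheme, and the default of 64 buckets are all assumptions.

```python
import zlib


def bucket_key(base_key, field, buckets=64):
    """Map a field of a would-be big hash to one of `buckets` smaller keys.

    The same field always maps to the same sub-key, so reads and
    writes for that field go to the same place.
    """
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    # crc32 is deterministic across processes, unlike Python's built-in hash().
    index = zlib.crc32(field.encode("utf-8")) % buckets
    return f"{base_key}:{index}"


# The application would then use the returned name as the key
# for its hash commands instead of the single large key.
print(bucket_key("user:profiles", "alice"))
```

The trade-off is that operations over the whole collection now have to visit every bucket.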
Concentrated Expiration
If many keys expire at the same moment, the active expiration task (which runs every 100 ms on the main thread) can block command processing while it deletes them. Detect this by watching for sudden jumps in expired_keys in INFO. Solutions: add random jitter to expiration timestamps, or enable lazy‑free expiration (lazyfree-lazy-expire yes, Redis 4.0+).
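Adding jitter can be as simple as extending each TTL by a random number of seconds before it is passed to the expiration command. The following is a minimal sketch; the function name ttl_with_jitter and the 300-second default spread are assumptions chosen for illustration.

```python
import random


def ttl_with_jitter(base_ttl_s, max_jitter_s=300, rng=random):
    """Return base_ttl_s plus a random offset in [0, max_jitter_s] seconds.

    Spreading TTLs this way keeps keys that were written together
    from all expiring in the same expiration cycle.
    """
    if base_ttl_s <= 0 or max_jitter_s < 0:
        raise ValueError("base_ttl_s must be > 0 and max_jitter_s >= 0")
    return base_ttl_s + rng.randint(0, max_jitter_s)


# Example: a one-hour TTL spread over an extra five minutes.
print(ttl_with_jitter(3600))
```

The resulting value would be used wherever the application currently passes a fixed TTL.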
Memory Limit (maxmemory) Effects
When maxmemory is reached, Redis evicts keys before writing new data, adding latency. Choose an appropriate eviction policy (e.g., allkeys-lru or volatile-lru) and consider random eviction or splitting data across instances.
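To see how close an instance is to its limit, the used_memory and maxmemory fields from INFO output can be compared. The sketch below parses INFO-style text; the helper names parse_info and memory_usage_ratio are hypothetical, and the sample text is made up.

```python
def parse_info(text):
    """Parse INFO-style "key:value" lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def memory_usage_ratio(info_text):
    """Return used_memory / maxmemory, or None when no limit is set."""
    fields = parse_info(info_text)
    maxmemory = int(fields.get("maxmemory", "0"))
    if maxmemory == 0:
        return None  # maxmemory 0 means no limit is configured
    return int(fields["used_memory"]) / maxmemory


# Illustrative (made-up) INFO excerpt.
sample = "# Memory\r\nused_memory:900\r\nmaxmemory:1000\r\n"
print(memory_usage_ratio(sample))
```

A ratio that stays near 1.0 suggests that writes are regularly triggering eviction.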
Fork Overhead (RDB/AOF Rewrite)
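Background persistence forks a child process; copying page tables for large instances can block the server for seconds. Monitor latest_fork_usec via INFO. Mitigate by keeping instance size below about 10 GB, running RDB snapshots on replicas, and avoiding AOF rewrites during peak load. Note that no-appendfsync-on-rewrite yes does not disable rewrites: it only suspends AOF fsync calls in the main process while a rewrite or snapshot is in progress, trading some durability for lower latency.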
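Because fork time grows with instance size, it can be useful to normalize latest_fork_usec by the resident memory size when comparing instances. This is a minimal sketch under the assumption that both numbers are read from INFO (latest_fork_usec and used_memory_rss); the function name fork_ms_per_gb is hypothetical.

```python
def fork_ms_per_gb(latest_fork_usec, used_memory_rss_bytes):
    """Return the last fork duration in milliseconds per GiB of resident memory."""
    if used_memory_rss_bytes <= 0:
        raise ValueError("used_memory_rss_bytes must be positive")
    fork_ms = latest_fork_usec / 1000          # microseconds -> milliseconds
    size_gib = used_memory_rss_bytes / (1024 ** 3)
    return fork_ms / size_gib


# Example: a 200 ms fork on an instance with 10 GiB resident.
print(fork_ms_per_gb(200_000, 10 * 1024 ** 3))
```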
Transparent Huge Pages (THP)
THP can increase memory‑allocation latency during copy‑on‑write. Disable it:
$ echo never > /sys/kernel/mm/transparent_hugepage/enabled
AOF Flush Policies
Three appendfsync modes affect performance and durability. always is slow; no is fast but unsafe; everysec is a trade‑off but can still block if disk I/O is saturated. Consider disabling AOF for pure cache workloads.
CPU Binding
Binding Redis to a single logical core can cause contention between the main thread and its background work (the forked child process and background threads such as lazy‑free). If binding is required, bind to multiple cores on the same physical CPU, or use Redis 6.0’s server_cpulist, bio_cpulist, aof_rewrite_cpulist, and bgsave_cpulist settings.
Swap Usage
When Redis memory is swapped, latency spikes dramatically. Check swap with:
$ cat /proc/<pid>/smaps | egrep '^(Size|Swap)'
Prevent by adding RAM or freeing memory, and avoid swapping by keeping Redis on dedicated machines.
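The per-mapping Swap values in smaps can be added up to get a single number for the process. The sketch below works on text in the smaps format; the function name total_swap_kb is hypothetical and the sample text is made up.

```python
def total_swap_kb(smaps_text):
    """Sum the "Swap:" values (in kB) across all mappings in smaps text."""
    total = 0
    for line in smaps_text.splitlines():
        # Match "Swap:" exactly; related fields such as "SwapPss:" are skipped.
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total


# Illustrative (made-up) excerpt covering two mappings.
sample = (
    "Size:               1024 kB\n"
    "Swap:                 12 kB\n"
    "SwapPss:              12 kB\n"
    "Size:               4096 kB\n"
    "Swap:                400 kB\n"
)
print(total_swap_kb(sample))
```

In practice the text would be read from the smaps file of the Redis process, which typically requires the same user or root privileges.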
Memory Fragmentation
High mem_fragmentation_ratio (>1.5) indicates fragmentation. Redis 4.0+ can auto‑defrag:
activedefrag yes
active-defrag-threshold-lower 10
active-defrag-threshold-upper 100
active-defrag-cycle-min 1
active-defrag-cycle-max 25
Test impact before enabling.
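The fragmentation ratio is the resident memory reported by the operating system divided by the memory Redis has allocated, so it can also be recomputed from the used_memory_rss and used_memory fields of INFO. The following is a minimal sketch; the function names are hypothetical and the 1.5 threshold is the rule of thumb from this section.

```python
def fragmentation_ratio(used_memory_rss, used_memory):
    """Return used_memory_rss / used_memory (both in bytes)."""
    if used_memory <= 0:
        raise ValueError("used_memory must be positive")
    return used_memory_rss / used_memory


def is_fragmented(used_memory_rss, used_memory, threshold=1.5):
    """Return True when the ratio exceeds the given threshold."""
    return fragmentation_ratio(used_memory_rss, used_memory) > threshold


# Example: 1800 bytes resident for 1000 bytes allocated.
print(fragmentation_ratio(1800, 1000), is_fragmented(1800, 1000))
```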
Network Bandwidth Saturation
Excessive traffic can cause packet loss and increased latency. Monitor network I/O and scale out or redistribute traffic when needed.
Other Factors
Prefer long connections over short ones to avoid TCP handshake overhead.
Ensure monitoring tools use long connections and reasonable polling intervals.
Run Redis on dedicated hardware to avoid resource contention.
Conclusion
The article outlines a comprehensive checklist for diagnosing Redis latency, spanning application‑level patterns, server configuration, OS mechanisms, and hardware resources, and provides concrete remediation steps for each identified cause.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
