Why Redis Becomes Slow and How to Optimize It

This article explains the common reasons why Redis performance degrades—such as network latency, high‑complexity commands, big keys, concentrated expirations, memory limits, fork overhead, huge pages, AOF settings, CPU binding, swap usage, and memory fragmentation—and provides detailed optimization and troubleshooting steps to restore low latency.

Architect's Guide

Aug 12, 2023

1. Why Redis Becomes Slow

Before judging slowness you must know the baseline latency of your Redis instance: a 2 ms delay on a low‑end machine may be normal, while 0.5 ms on a high‑end server could already be slow. Run redis-cli --intrinsic-latency 120 on the Redis server host itself to measure, over 120 seconds, the maximum latency introduced by the operating system and hardware. Then run redis-cli -h 127.0.0.1 -p 6379 --latency-history -i 1 to sample command round-trip latency in 1-second windows. If the observed latency is more than twice the baseline, consider the instance slow.
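
The "twice the baseline" rule can be written down as a small check. This is only a sketch: the function name is_slow, the use of the median as the "observed latency", and the sample numbers are assumptions made for illustration, not something the tools above prescribe.

```python
from statistics import median


def is_slow(baseline_ms, samples_ms, factor=2.0):
    """Return True when the typical observed latency exceeds factor x baseline.

    baseline_ms: latency measured for this environment (hypothetical input).
    samples_ms:  observed round-trip latencies in milliseconds.
    The median is an assumed summary statistic; a percentile may suit you better.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    return median(samples_ms) > factor * baseline_ms


if __name__ == "__main__":
    # Made-up samples against a 0.5 ms baseline.
    print(is_slow(0.5, [0.4, 0.6, 1.5]))
    print(is_slow(0.5, [1.2, 1.1, 1.3]))
```

Feeding it the per-window averages reported by --latency-history is one plausible way to use it.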

Network latency can also affect performance; tools like iperf can be used to test the network bandwidth between the client and Redis server.

2. Common Causes of Slowdown

High‑complexity commands : O(N) or higher commands such as SORT, SUNION, ZUNIONSTORE consume excessive CPU or generate large responses.

Big keys (bigkey) : Large strings, lists, sets, hashes or sorted sets increase memory copy time and network transmission.

Concentrated expirations : Many keys expiring at the same moment trigger passive and active expiration logic, blocking the main thread.

Memory limit reached : When maxmemory is hit, Redis must evict keys before it can accept new writes; the eviction work required by the configured policy (allkeys-lru, volatile-lru, etc.) adds latency to those writes.

Fork overhead : RDB snapshots and AOF rewrites fork a child process; copying page tables for large instances can block the server for seconds.

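Huge pages : With transparent huge pages enabled, copy-on-write after a fork copies a whole 2 MB page instead of a 4 KB page, even when a write changes only a few bytes. Writes served during an RDB snapshot or AOF rewrite therefore become slower.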

AOF configuration : appendfsync always forces an fsync on every write, and heavy disk I/O from an AOF rewrite can delay the background fsync under appendfsync everysec; either can stall the main thread.

CPU binding : Binding Redis to a single logical core makes the forked child compete for CPU, worsening latency.

Swap usage : When Redis starts swapping, memory access slows down dramatically.

Memory fragmentation : High mem_fragmentation_ratio (>1.5) indicates inefficient memory usage and can degrade performance.
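
As a rough illustration of the fragmentation check, the sketch below parses the text returned by INFO memory and recomputes the ratio as used_memory_rss divided by used_memory. The helper names and the sample values are made up for the example; only the INFO field names come from Redis.

```python
def parse_info(text):
    """Parse the 'key:value' lines of Redis INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def fragmentation_ratio(info):
    """Ratio of memory held by the process (RSS) to memory Redis allocated."""
    return int(info["used_memory_rss"]) / int(info["used_memory"])


def is_fragmented(info, threshold=1.5):
    """Apply the >1.5 rule of thumb from the text above."""
    return fragmentation_ratio(info) > threshold


# Hypothetical INFO memory excerpt, for illustration only.
SAMPLE = """# Memory
used_memory:1000000
used_memory_rss:1800000
mem_fragmentation_ratio:1.80
"""

if __name__ == "__main__":
    info = parse_info(SAMPLE)
    print(round(fragmentation_ratio(info), 2), is_fragmented(info))
```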

3. Optimization Strategies

Slow query optimization : Avoid O(N) commands, keep N ≤ 300, and perform aggregation on the client side.
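
One way to keep N small is to read a large collection in fixed-size slices rather than in a single call. The sketch below only computes the inclusive index pairs you would pass to a range command such as LRANGE. The function name batch_ranges is hypothetical, and the default of 300 simply mirrors the limit suggested above.

```python
def batch_ranges(total, batch_size=300):
    """Yield inclusive (start, stop) index pairs covering 0..total-1.

    Each pair spans at most batch_size elements, matching the inclusive
    start/stop convention used by Redis range commands.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total, batch_size):
        yield start, min(start + batch_size, total) - 1


if __name__ == "__main__":
    for start, stop in batch_ranges(700):
        print(start, stop)
```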

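Expiration randomization : Add a random offset to the timestamp passed to EXPIREAT (or to the TTL passed to EXPIRE) to spread deletions over time, or set lazyfree-lazy-expire yes (Redis 4.0+) so that memory for expired keys is freed in a background thread.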
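
A minimal sketch of the random offset, assuming expiry times are computed in application code before being sent to Redis. The function name and the 300-second window are illustrative choices, not values from Redis itself.

```python
import random


def expire_at_with_jitter(base_ts, max_jitter_s, rng=random):
    """Return a Unix timestamp in [base_ts, base_ts + max_jitter_s].

    base_ts:      the expiry time every key would otherwise share.
    max_jitter_s: width of the window (seconds) over which expirations spread.
    rng:          random source; pass random.Random(seed) for repeatable runs.
    """
    if max_jitter_s < 0:
        raise ValueError("max_jitter_s must be non-negative")
    return base_ts + rng.randint(0, max_jitter_s)


if __name__ == "__main__":
    base = 1_700_000_000  # hypothetical shared expiry timestamp
    print([expire_at_with_jitter(base, 300) for _ in range(5)])
```

The resulting value is what you would pass to EXPIREAT in place of the shared timestamp.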

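Bigkey handling : Find big keys with an incremental scan (for example redis-cli --bigkeys) and delete them asynchronously with UNLINK (Redis 4.0+) instead of DEL. Use FLUSHALL ASYNC only when you intend to clear the entire dataset, because it removes every key.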
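
The scan-then-UNLINK idea could look roughly like the sketch below. It assumes a redis-py style client object exposing scan_iter, memory_usage, and unlink; the function name and the byte threshold are hypothetical. It deletes data, so treat it as an outline to adapt rather than something to run unchanged against production.

```python
def unlink_big_keys(client, threshold_bytes, match="*", scan_count=100):
    """Scan the keyspace incrementally and UNLINK keys above threshold_bytes.

    client is assumed to behave like a redis-py client:
      scan_iter(match=..., count=...) iterates keys via SCAN,
      memory_usage(key) wraps MEMORY USAGE (None if the key is gone),
      unlink(key) frees the value in a background thread.
    Returns the list of keys that were unlinked.
    """
    removed = []
    for key in client.scan_iter(match=match, count=scan_count):
        size = client.memory_usage(key)
        if size is not None and size > threshold_bytes:
            client.unlink(key)
            removed.append(key)
    return removed
```

A dry-run variant that only collects the keys, without calling unlink, is a safer first step.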

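Eviction policy tuning : Choose a policy that matches your workload (e.g., allkeys-lru or volatile-lru), and consider a random policy (allkeys-random or volatile-random) when faster key removal matters more than keeping recently used keys.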

Fork mitigation : Keep instance size < 10 GB, perform RDB snapshots during off‑peak hours, disable AOF if durability is not required, and avoid running Redis inside VMs.

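CPU binding (Redis 6.0+) : Instead of pinning the whole process to one core, assign the server threads, background I/O threads, AOF rewrite child, and RDB child to separate CPU lists in redis.conf: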

server_cpulist 0-7:2
bio_cpulist 1,3
aof_rewrite_cpulist 8-11
bgsave_cpulist 1,10-11
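
The CPU list values use a compact syntax: comma-separated entries, ranges written as first-last, and an optional :stride suffix, so 0-7:2 selects CPUs 0, 2, 4, and 6. The helper below is an illustrative sketch that expands such a string, which can help when sanity-checking a configuration. It is not code from Redis.

```python
def expand_cpulist(spec):
    """Expand a CPU list such as '0-7:2' or '1,10-11' into explicit CPU ids."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        span, _, step = part.partition(":")    # optional ':stride' suffix
        stride = int(step) if step else 1
        first, sep, last = span.partition("-")  # single CPU or 'first-last'
        start = int(first)
        end = int(last) if sep else start
        cpus.extend(range(start, end + 1, stride))
    return cpus


if __name__ == "__main__":
    for value in ("0-7:2", "1,3", "8-11", "1,10-11"):
        print(value, expand_cpulist(value))
```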

Huge page control : Disable transparent huge pages by running echo never > /sys/kernel/mm/transparent_hugepage/enabled as root. The setting does not survive a reboot, so reapply it at boot.

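AOF rewrite optimization : Set no-appendfsync-on-rewrite yes to avoid fsync contention during a rewrite. The trade-off is that writes accepted during the rewrite are not fsynced and may be lost if the server crashes.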

Swap prevention : Increase physical memory or scale out with more Redis nodes; monitor swap usage via cat /proc/<pid>/smaps | egrep '^(Swap|Size)'.
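
The smaps output is long, so summing the Swap lines gives a quicker signal. The sketch below works on text in the /proc/<pid>/smaps format; the function name and the sample excerpt are made up for illustration.

```python
def total_swap_kb(smaps_text):
    """Sum the per-mapping 'Swap:' values (in kB) from /proc/<pid>/smaps text."""
    total = 0
    for line in smaps_text.splitlines():
        # Match 'Swap:' exactly so that 'SwapPss:' lines are not double counted.
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total


# Hypothetical excerpt in smaps format, for illustration only.
SAMPLE_SMAPS = """Size:               1256 kB
Swap:                  0 kB
Size:                132 kB
Swap:                 12 kB
SwapPss:              12 kB
"""

if __name__ == "__main__":
    print(total_swap_kb(SAMPLE_SMAPS), "kB swapped")
```

Any non-trivial total for the Redis process suggests that memory pressure, rather than Redis itself, is behind the latency.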

Fragmentation cleanup : Enable automatic defragmentation (activedefrag yes, available from Redis 4.0) and tune its thresholds; on versions before 4.0, restart the instance.

4. Troubleshooting Checklist

Establish baseline latency for the current environment.

Identify and replace slow commands.

Randomize expiration times to avoid spikes.

Detect and handle big keys with async deletion.

Review AOF durability level; consider disabling or tuning.

Check memory usage and swap; add memory or shard data.

Disable transparent huge pages.

Limit data size per master in replication setups.

Bind Redis to appropriate CPU cores or sockets on NUMA machines.

By following these steps you can pinpoint the root cause of latency spikes and apply the appropriate configuration or architectural changes to keep Redis responsive.
