How to Diagnose and Resolve Redis Performance Issues

This article explains how to identify Redis latency problems, measure baseline performance, monitor slow commands, and address common causes such as network RTT, forked RDB snapshots, transparent huge pages, swap usage, AOF configuration, key expiration bursts, and big keys, providing practical solutions and a checklist for remediation.

Sohu Tech Products

Dec 7, 2022

Redis Performance Issues?

Redis is a critical component in many systems for caching, session storage, leaderboards, etc. When request latency spikes, the whole business can suffer a "snowball" effect.

Baseline Latency Measurement

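Use redis-cli --intrinsic-latency to measure the latency that the host itself introduces, independent of Redis. The argument is the test duration in seconds, and the results are reported in microseconds. Example: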

redis-cli --intrinsic-latency 100
Max latency so far: 4 microseconds.
... (output truncated) ...
45026981 total runs (avg latency: 2.2209 microseconds / 2220.89 nanoseconds per run).
Worst run took 1386x longer than the average latency.

Run the test on the Redis server itself, not on a client machine, so that network latency does not affect the result. In this example the worst-case baseline was about 3 ms.
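
The ~3 ms baseline can be derived from the last two lines of the output: the worst run is the reported multiple of the average latency. A minimal Python sketch of that arithmetic follows; the function name worst_case_ms is illustrative, and the regular expressions assume the output format shown above.

```python
import re

def worst_case_ms(output: str) -> float:
    """Estimate the worst run in milliseconds from --intrinsic-latency output."""
    avg = re.search(r"avg latency: ([\d.]+) microseconds", output)
    ratio = re.search(r"Worst run took (\d+)x longer", output)
    if not avg or not ratio:
        raise ValueError("unrecognized output format")
    # worst run = average latency (in microseconds) * reported multiple
    return float(avg.group(1)) * int(ratio.group(1)) / 1000.0

sample = (
    "45026981 total runs (avg latency: 2.2209 microseconds / 2220.89 nanoseconds per run).\n"
    "Worst run took 1386x longer than the average latency."
)
print(round(worst_case_ms(sample), 2))  # 3.08
```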

Slow Command Monitoring

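Identify commands with high complexity (O(N) or worse) using the slowlog feature or the latency monitor. The slowlog records commands whose execution time exceeds a configurable threshold, slowlog-log-slower-than, which is expressed in microseconds (default 10000, i.e. 10 ms). The following command lowers it to 6 ms: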

redis-cli CONFIG SET slowlog-log-slower-than 6000

View recent slow commands:

127.0.0.1:6381> SLOWLOG get 2
1) 1) (integer) 6
   2) (integer) 1458734263
   3) (integer) 74372
   4) 1) "hgetall"
      2) "max.dsp.blacklist"
...
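
Each entry starts with four fields: a unique ID, the Unix timestamp at which the command was processed, the execution time in microseconds, and the command with its arguments. The Python sketch below makes the sample entry easier to read. The describe_entry helper is hypothetical, and the entry is passed in as a plain list rather than read from a live server.

```python
from datetime import datetime, timezone

def describe_entry(entry):
    """Convert a raw SLOWLOG entry into a dict with readable units."""
    entry_id, unix_ts, duration_us, args = entry[:4]
    return {
        "id": entry_id,
        "time_utc": datetime.fromtimestamp(unix_ts, tz=timezone.utc).isoformat(),
        "duration_ms": duration_us / 1000.0,  # microseconds -> milliseconds
        "command": " ".join(args),
    }

sample = [6, 1458734263, 74372, ["hgetall", "max.dsp.blacklist"]]
info = describe_entry(sample)
print(info["duration_ms"], info["command"])  # 74.372 hgetall max.dsp.blacklist
```

Here the HGETALL took about 74 ms, well above the 6 ms threshold set earlier.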

Latency Monitoring (Redis >= 2.8.13)

Set a latency threshold (e.g., 9 ms) to record events:

CONFIG SET latency-monitor-threshold 9

Check recent latency events with LATENCY LATEST.

Network‑Induced Latency

Each command goes through: send → queue → execute → reply. Pipelining or batch commands (MGET/MSET) reduce the number of round trips, so the network round-trip time (RTT) is paid once per batch rather than once per command.
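
As a rough illustration of why batching helps, the sketch below groups keys into fixed-size chunks so that N keys cost about N / batch_size round trips instead of N. It assumes a client object with an mget(keys) method (redis-py's client has one). FakeClient is a stand-in used only to count round trips, and the batch size of 100 is arbitrary.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def batched_get(client, keys, batch_size=100):
    """Fetch all keys using one MGET round trip per batch."""
    values = []
    for batch in chunked(keys, batch_size):
        values.extend(client.mget(batch))
    return values

class FakeClient:
    """Stand-in for a real Redis client; counts round trips."""
    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def mget(self, keys):
        self.round_trips += 1
        return [self.data.get(k) for k in keys]

client = FakeClient({f"k{i}": i for i in range(250)})
result = batched_get(client, [f"k{i}" for i in range(250)])
print(len(result), client.round_trips)  # 250 3
```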

Slow Commands

Move O(N) operations to replicas or the client, replace them with O(1) or O(log N) alternatives, and avoid the KEYS command in production.
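
One common replacement for KEYS is cursor-based iteration with SCAN, which returns a small batch per call and lets other commands run in between. The sketch below assumes a client whose scan(cursor, match, count) returns a (next_cursor, keys) pair, as redis-py's does. FakeClient only simulates that contract and ignores the pattern.

```python
def scan_keys(client, pattern="*", count=100):
    """Iterate over keys in small batches instead of one blocking KEYS call."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        yield from keys
        if cursor == 0:  # a zero cursor marks the end of the iteration
            break

class FakeClient:
    """Simulates SCAN over an in-memory key list (pattern is ignored)."""
    def __init__(self, keys):
        self.keys = keys

    def scan(self, cursor=0, match=None, count=10):
        end = cursor + count
        next_cursor = end if end < len(self.keys) else 0
        return next_cursor, self.keys[cursor:end]

found = list(scan_keys(FakeClient([f"user:{i}" for i in range(25)]), count=10))
print(len(found))  # 25
```

Against a real server, SCAN may return the same key more than once, so callers that need uniqueness should deduplicate.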

Fork‑Generated RDB Snapshots

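Generating an RDB snapshot forks a child process. The fork call itself runs on the main thread and blocks it while the page tables are copied, and afterwards parent and child share memory through copy‑on‑write (COW). The larger the instance, the longer the fork takes, so large instances can suffer noticeable pauses.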
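
To check how long the last fork actually took, the INFO command reports a latest_fork_usec field in its stats section. The sketch below extracts it from INFO text and converts it to milliseconds. The helper name fork_time_ms is hypothetical, and the sample excerpt uses a made-up value purely for illustration.

```python
def fork_time_ms(info_text: str) -> float:
    """Return the duration of the last fork in milliseconds from INFO output."""
    for line in info_text.splitlines():
        if line.startswith("latest_fork_usec:"):
            return int(line.split(":", 1)[1]) / 1000.0
    raise KeyError("latest_fork_usec not found")

# Made-up INFO excerpt for illustration only; real values come from the server.
sample = "# Stats\r\ntotal_connections_received:10\r\nlatest_fork_usec:59477\r\n"
print(fork_time_ms(sample))  # 59.477
```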

Transparent Huge Pages (THP)

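With THP enabled, the kernel backs memory with 2 MB pages instead of the usual 4 KB pages. During RDB generation, a write that touches only a few bytes forces copy‑on‑write of the whole 2 MB page, which increases both latency and memory use. Disable THP (as root) with: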

echo never > /sys/kernel/mm/transparent_hugepage/enabled
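
To confirm the setting took effect, read the same file back: the kernel lists all modes and marks the active one with square brackets. A small Python sketch of that check follows; the helper name active_thp_mode is illustrative, and the string passed in stands for the file's contents.

```python
def active_thp_mode(contents: str) -> str:
    """Return the bracketed (active) mode from the THP 'enabled' file."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active mode found")

# On a Linux host the contents would come from:
#   open("/sys/kernel/mm/transparent_hugepage/enabled").read()
print(active_thp_mode("always madvise [never]"))  # never
```

The echo command does not persist across reboots, so it usually needs to be repeated from a startup script.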

Swap (OS Paging)

When Redis memory exceeds physical RAM or other processes consume memory, the kernel swaps Redis pages to disk, causing severe latency. Check swap usage via /proc/<pid>/smaps:

$ cat smaps | egrep '^(Swap|Size)'
Size:        720896 kB
Swap:          12 kB
...

If swap values are large, increase RAM, isolate Redis on its own machine, or add more cluster nodes.
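
Because smaps lists one Swap line per memory mapping, it is often easier to add them up than to read them individually. The sketch below sums every Swap line in kB. The function name total_swap_kb is illustrative, the first mapping in the sample mirrors the output above, and the second mapping is invented for the example.

```python
def total_swap_kb(smaps_text: str) -> int:
    """Sum every 'Swap:' line (in kB) from /proc/<pid>/smaps contents."""
    total = 0
    for line in smaps_text.splitlines():
        # startswith("Swap:") deliberately skips related fields such as SwapPss
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total

sample = """Size:             720896 kB
Swap:                 12 kB
Size:                156 kB
Swap:                  0 kB
"""
print(total_swap_kb(sample))  # 12
```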

AOF and Disk I/O

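Configure appendfsync to balance durability and performance:

no: never call fsync explicitly and leave flushing to the OS; fastest.

everysec: fsync once per second; up to about 1 s of writes can be lost.

always: fsync on every write; safest but very slow.

Set no-appendfsync-on-rewrite yes to skip fsync calls while an AOF rewrite or RDB save is in progress, so the main thread does not compete with the child process for disk I/O. During that window, durability is equivalent to appendfsync no.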

Expires (Key Expiration) Overload

Redis deletes expired keys lazily or via a periodic active‑expire cycle. When many keys expire simultaneously, the active cycle can block the server.

Mitigation: add a small random jitter to EXPIREAT or EXPIRE timestamps.
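
A minimal sketch of the jitter idea in Python: instead of giving every key the same TTL, add a random offset so that expirations spread out over a window. The function name ttl_with_jitter and the values (a 1 hour base TTL with up to 5 minutes of jitter) are assumptions chosen for illustration.

```python
import random

def ttl_with_jitter(base_seconds: int, max_jitter_seconds: int, rng=random) -> int:
    """Return the base TTL plus a random offset in [0, max_jitter_seconds]."""
    return base_seconds + rng.randint(0, max_jitter_seconds)

ttls = [ttl_with_jitter(3600, 300) for _ in range(5)]
print(all(3600 <= t <= 3900 for t in ttls))  # True
```

The resulting value would then be passed to EXPIRE, or added to the current time for EXPIREAT.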

Big Keys

Keys with large values or many members (e.g., 5 MB strings, lists of 10 k items, hashes totaling 10 MB) can cause out-of-memory errors, uneven memory distribution across a cluster, and blocking deletions.

Detect big keys with tools like redis‑rdb‑tools, split them into smaller keys, or delete asynchronously using UNLINK (available since Redis 4.0).
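
One way to split a large hash, sketched here under stated assumptions, is to spread its fields over a fixed number of smaller hashes chosen by a checksum of the field name. The helper bucket_key, the bucket count of 16, and the key:index naming scheme are all illustrative choices, not something the original article prescribes.

```python
import zlib

def bucket_key(base_key: str, field: str, buckets: int = 16) -> str:
    """Map a hash field to one of `buckets` smaller keys, deterministically."""
    index = zlib.crc32(field.encode("utf-8")) % buckets
    return f"{base_key}:{index}"

# The same field always maps to the same sub-key, so reads and writes agree.
keys = {bucket_key("max.dsp.blacklist", f"id{i}") for i in range(1000)}
print(len(keys) <= 16)  # True
```

Reads and writes for a field then go to bucket_key(base, field) instead of the original key, so each sub-hash holds only a fraction of the data.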

Summary Checklist

Measure current Redis baseline latency.

Enable slow‑command monitoring.

Identify and rewrite slow commands (use SCAN, avoid KEYS).

Keep each instance at 2‑4 GB to avoid long fork pauses and RDB loads.

Disable transparent huge pages.

Ensure swap usage is minimal; increase RAM if needed.

Set appendfsync to no or everysec and enable no-appendfsync-on-rewrite.

Stagger key expirations to prevent active‑expire spikes.

Detect and split big keys; delete them with UNLINK.
