
Diagnosing and Solving Redis Performance Issues

This article explains how to detect Redis latency problems, measure baseline performance, monitor slow commands, and address common causes such as network round‑trip delays, fork‑generated RDB snapshots, transparent huge pages, swap usage, AOF settings, key expiration, and big‑key handling, providing practical troubleshooting steps and solutions.

IT Services Circle

Feb 24, 2022


Redis is often a crucial component in business systems, where it is used for caching, storing login sessions, leaderboards, and more. When Redis request latency increases, the slowdown can cascade through dependent services and cause a system "avalanche".

Is Redis Performing Poorly?

Latency is the time from when a client sends a command to when it receives the response. Under normal conditions Redis processes a command in microseconds. If latency reaches several milliseconds or even seconds (the exact threshold depends on hardware), Redis is considered slow.

How do we define that Redis is really slow?

We need to measure the Redis baseline performance in a low‑load, interference‑free environment. When the observed latency is more than twice the baseline, we can conclude that Redis performance has degraded.

Baseline Latency Measurement

The redis-cli command provides the --intrinsic-latency option, which reports the maximum latency observed during a test, in microseconds. Example:

redis-cli --intrinsic-latency 100

The argument 100 is the test duration in seconds, which is usually long enough to reveal latency spikes. In the example run, the maximum latency observed was 3079 µs (≈3 ms), which becomes the baseline.

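Run this test on the machine that hosts the Redis server, not on a client machine, so that network delay does not distort the baseline. In this mode redis-cli measures the host itself rather than a Redis connection. To see what a client experiences, network included, connect to the server with -h host -p port and use redis-cli --latency.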
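
As a rough illustration of the "twice the baseline" rule, the Python sketch below extracts the baseline from captured redis-cli output and compares an observed latency against it. The "Max latency so far: N microseconds." line format and the helper names are assumptions made for this example, not part of the original article.

```python
import re

# Assumed format of the progress lines printed by
# `redis-cli --intrinsic-latency`, e.g. "Max latency so far: 3079 microseconds."
_MAX_LATENCY = re.compile(r"Max latency so far: (\d+) microseconds")


def baseline_from_output(output: str) -> int:
    """Return the largest 'Max latency so far' value, in microseconds."""
    values = [int(v) for v in _MAX_LATENCY.findall(output)]
    if not values:
        raise ValueError("no latency samples found in output")
    return max(values)


def is_degraded(observed_us: float, baseline_us: float, factor: float = 2.0) -> bool:
    """Rule of thumb from the text: slow means more than `factor` x baseline."""
    return observed_us > factor * baseline_us


# Illustrative captured output; the last value matches the article's example.
sample = """Max latency so far: 1 microseconds.
Max latency so far: 16 microseconds.
Max latency so far: 3079 microseconds.
"""

baseline = baseline_from_output(sample)
print(baseline, is_degraded(7000, baseline), is_degraded(5000, baseline))
# prints: 3079 True False
```

With a 3079 µs baseline, the cut-off is 6158 µs, so a 7 ms observation counts as degraded while a 5 ms one does not.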

Slow‑Command Monitoring

How do we determine if a command is slow?

Check the command's time complexity; prefer O(1) or O(log N). Commands with O(N) complexity or worse (e.g., HGETALL, SMEMBERS, SORT, LREM, SUNION) become slow as the collections they touch grow.

Two ways to locate slow commands:

Use Redis slow‑log (records commands exceeding a configurable threshold, default 10 ms).

Use the latency‑monitor tool introduced in Redis 2.8.13.

Example of configuring slow‑log to record commands slower than 6 ms:

redis-cli CONFIG SET slowlog-log-slower-than 6000

Retrieve the last two slow commands:

127.0.0.1:6381> SLOWLOG get 2
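
To make the slow-log output easier to act on, the entries can be grouped by command name. The sketch below assumes each entry has already been fetched and decoded into a tuple of (id, Unix timestamp, duration in microseconds, argument list), which mirrors the first four fields of a SLOWLOG GET reply. The function name and the sample entries are hypothetical.

```python
from collections import defaultdict


def summarize_slowlog(entries):
    """Group slow-log entries by command name.

    Each entry is assumed to look like (id, unix_ts, duration_us, [args...]);
    any extra trailing fields are ignored.
    Returns {COMMAND: (count, max_duration_us)}.
    """
    stats = defaultdict(lambda: [0, 0])
    for entry in entries:
        duration_us, args = entry[2], entry[3]
        name = str(args[0]).upper() if args else "<unknown>"
        stats[name][0] += 1
        stats[name][1] = max(stats[name][1], duration_us)
    return {name: tuple(values) for name, values in stats.items()}


# Made-up entries for illustration only.
entries = [
    (14, 1645660000, 12500, ["HGETALL", "user:profiles"]),
    (13, 1645659990, 7200, ["smembers", "tags"]),
    (12, 1645659980, 9100, ["HGETALL", "user:sessions"]),
]

summary = summarize_slowlog(entries)
for name, (count, worst_us) in sorted(summary.items()):
    print(f"{name}: {count} slow call(s), worst {worst_us} us")
```

A command that shows up repeatedly with a high worst-case duration is the first candidate for rewriting or replacing.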

Latency Monitoring

The latency monitor is disabled by default (threshold 0). Enable it by setting a threshold in milliseconds, for example CONFIG SET latency-monitor-threshold 9. Events that take longer than the threshold are recorded and can be inspected with LATENCY LATEST.

Network Communication Delay

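Redis clients connect over TCP/IP or Unix domain sockets. A typical 1 Gbit/s network adds about 200 µs of latency. Each command goes through four steps: send, queue, execute, return. The time for this whole cycle is the round-trip time (RTT). A command sent on its own pays one full RTT, so pipelining a batch of commands into a single request cuts the number of round trips and the total delay.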
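
The effect of pipelining can be estimated with simple arithmetic. The sketch below uses a deliberately simplified model in which every batch pays one network round trip and every command pays its own server-side execution time. The 5 µs execution time is an assumed figure for illustration. The 200 µs RTT comes from the text above.

```python
def total_time_us(n_commands: int, rtt_us: float, exec_us: float, batch_size: int = 1) -> float:
    """Estimate wall-clock time for n_commands under a simplified model.

    Every batch pays one round trip; every command pays its execution time.
    batch_size=1 means no pipelining.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    round_trips = -(-n_commands // batch_size)  # ceiling division
    return round_trips * rtt_us + n_commands * exec_us


one_by_one = total_time_us(1000, rtt_us=200, exec_us=5)
pipelined = total_time_us(1000, rtt_us=200, exec_us=5, batch_size=100)
print(one_by_one, pipelined)
# prints: 205000 7000
```

Under these assumptions, 1000 commands take about 205 ms when sent one at a time and about 7 ms in pipelines of 100, because the network cost drops from 1000 round trips to 10.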

Delay Caused by Fork‑Generated RDB Snapshots

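When generating an RDB snapshot, Redis forks a background process. The fork call itself runs in the main thread and blocks it while the page table is copied, so the pause grows with the instance's memory size. After the fork, copy-on-write (COW) duplicates every memory page the parent modifies. With transparent huge pages enabled, each such copy is 2 MB instead of 4 KB, which adds latency and memory use. Disabling transparent huge pages mitigates this (run as root):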

echo never > /sys/kernel/mm/transparent_hugepage/enabled
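
To confirm the setting took effect, read the same file back: the kernel lists all modes and marks the active one with square brackets, for example "always madvise [never]". A minimal Python sketch of that check follows; the function names are made up for this example.

```python
THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_thp_mode(contents: str) -> str:
    """Return the bracketed (active) mode, e.g. 'never' for 'always madvise [never]'."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {contents!r}")


def read_thp_mode(path: str = THP_PATH) -> str:
    """Read the current THP mode from sysfs (Linux only)."""
    with open(path, encoding="ascii") as fh:
        return active_thp_mode(fh.read())


print(active_thp_mode("always madvise [never]\n"))  # prints: never
print(active_thp_mode("[always] madvise never\n"))  # prints: always
```

Note that the echo command does not survive a reboot, so the check is worth repeating after the host restarts.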

Swap (Operating‑System Paging)

If physical memory is insufficient, Redis pages may be swapped to disk, causing severe latency. To check swap usage, inspect /proc/<pid>/smaps for "Swap" fields. Non‑zero swap indicates memory pressure.
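
Because smaps contains one "Swap:" line per memory mapping, it helps to add them up. The sketch below sums those lines from the file's text; the function names and the sample values are illustrative, and reading another process's smaps usually requires the same user or root.

```python
def total_swap_kb(smaps_text: str) -> int:
    """Sum every 'Swap:' line (in kB) from the text of /proc/<pid>/smaps."""
    total = 0
    for line in smaps_text.splitlines():
        # Lines look like "Swap:                462 kB"; "SwapPss:" is skipped.
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total


def redis_swap_kb(pid: int) -> int:
    """Read and sum swap usage for a running process (Linux only)."""
    with open(f"/proc/{pid}/smaps", encoding="ascii") as fh:
        return total_swap_kb(fh.read())


# Illustrative excerpt, not real measurements.
sample = """Size:                584 kB
Rss:                   4 kB
Swap:                  0 kB
SwapPss:               0 kB
Size:               4096 kB
Swap:                462 kB
Size:             720896 kB
Swap:             462044 kB
"""

print(total_swap_kb(sample))  # prints: 462506
```

A total of a few hundred kB is usually harmless. Hundreds of MB, as in the sample, means a large part of the dataset is being served from disk.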

Typical mitigation steps:

Increase machine memory.

Run Redis on a dedicated machine.

Scale the cluster to reduce per‑instance data size.

AOF and Disk I/O Delay

Redis supports three appendfsync policies:

no : Never call fsync; writes go to the kernel buffer and the operating system decides when to flush them.

everysec : Call fsync once per second (default; may lose up to 1 s of data).

always : Call fsync on every write (safest, but highest latency).

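For cache use cases, no or everysec is recommended. Also consider setting no-appendfsync-on-rewrite yes, which skips fsync while an AOF rewrite is in progress so that the main thread does not compete with the rewrite for disk I/O. The trade-off is a larger data-loss window if the server crashes during a rewrite.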

Key Expiration

Redis removes expired keys in two ways: lazily, when an expired key is accessed, and actively, in a background cycle that runs every 100 ms. Each active cycle samples a set number of keys that have a TTL (ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP, default 20) and deletes the expired ones. Because this work runs in the main thread, a large number of keys expiring at the same moment can cause noticeable latency spikes.

Solution: add a small random offset to the expiration time when using EXPIREAT or EXPIRE.
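
A minimal sketch of the random offset, assuming a one-hour base TTL and up to five minutes of jitter. Both values and the function name are illustrative choices, not recommendations from the article.

```python
import random


def ttl_with_jitter(base_seconds: int, max_jitter_seconds: int, rng=random) -> int:
    """Return the base TTL plus a random offset in [0, max_jitter_seconds]."""
    if base_seconds <= 0 or max_jitter_seconds < 0:
        raise ValueError("base must be positive and jitter non-negative")
    return base_seconds + rng.randint(0, max_jitter_seconds)


# A seeded generator keeps the demo repeatable; production code can use the default.
rng = random.Random(42)
ttls = [ttl_with_jitter(3600, 300, rng) for _ in range(5)]
print(ttls)  # five values between 3600 and 3900 seconds
```

The resulting value would be passed as the timeout to EXPIRE (or added to the timestamp for EXPIREAT), so keys written in the same batch expire across a five-minute window instead of in the same active-expiration cycle.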

Big‑Key Issues

Keys with large values or many members (e.g., a 5 MB string, a list of 10 000 items, a hash with 10 MB total value) can cause OOM, uneven cluster memory distribution, bandwidth saturation, and blocking during deletion.

Detect big keys with tools such as redis-rdb-tools. Mitigations:

Split large keys into multiple smaller keys.

Delete large keys asynchronously using UNLINK (available since Redis 4.0).
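
One common way to split a large hash is to spread its fields over a fixed number of smaller hashes, picking the bucket from a hash of the field name. The sketch below shows only the key-naming step. The "base:index" naming scheme, the bucket count of 16, and the use of CRC32 are assumptions for this example.

```python
import zlib


def bucket_key(base_key: str, field: str, buckets: int) -> str:
    """Map a field of a large hash to one of `buckets` smaller hash keys.

    CRC32 is stable across processes and runs, so readers and writers
    always agree on where a field lives.
    """
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    index = zlib.crc32(field.encode("utf-8")) % buckets
    return f"{base_key}:{index}"


# Every field of the hypothetical big hash "user:profiles" gets a bucket.
fields = [f"user{i}" for i in range(1000)]
used = {bucket_key("user:profiles", f, 16) for f in fields}
print(len(used), "bucket keys in use")
```

A write then becomes HSET bucket_key(...) field value, and a read uses the same function to find the bucket. The bucket count should be fixed up front, because changing it later moves fields to different keys.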

Summary Checklist

Measure current Redis baseline performance.

Enable slow‑command monitoring.

Identify and optimise slow commands (for example, replace full-collection reads with SCAN-style incremental iteration).

Keep instance data size around 2‑4 GB to avoid long RDB load times.

Disable transparent huge pages to prevent unnecessary memory copying.

Check for swap usage; eliminate if present.

Configure AOF appropriately and disable fsync during rewrite.

Avoid big keys or delete them with UNLINK.

