Detecting and Resolving Redis Performance Bottlenecks

This guide explains how to identify when Redis is slow, measure baseline latency, monitor slow commands and latency, troubleshoot network, fork, huge pages, swap, AOF, expiration, and big keys, and provides a practical checklist of solutions.

dbaplus Community

Introduction

When Redis latency spikes, client requests can time out or exhaust the connection pool (e.g., "Could not get a resource from the pool"). Failed cache reads then fall through to MySQL, which can overload and crash the database and cause order failures. Detecting and fixing Redis performance issues promptly is essential.

1. Baseline Latency Measurement

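Baseline latency is the latency Redis shows under low load on your particular hardware and network. Two measurements are useful. To sample the client-to-server round-trip time, run:

redis-cli --latency -h HOST -p PORT

To measure the intrinsic latency of the host itself (scheduler and hypervisor stalls, independent of any Redis command), run the following on the Redis server machine:

redis-cli --intrinsic-latency 100

The argument is the test duration in seconds; run the test for at least 100 seconds to capture spikes. The maximum observed latency (e.g., 3079 µs ≈ 3 ms) becomes the baseline. As a rule of thumb, consider Redis slow when current latency exceeds twice this value.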
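
As a rough illustration of what the round-trip measurement does, the sketch below times inline PING commands over a plain socket and applies the twice-the-baseline rule from above. The names ping_latencies_us and is_slow are hypothetical, and the code assumes an instance without authentication or TLS; redis-cli remains the simpler tool in practice.

```python
import socket
import time


def ping_latencies_us(host, port, samples=100):
    """Send inline PING commands and return each round-trip time in microseconds."""
    results = []
    with socket.create_connection((host, port), timeout=1.0) as conn:
        for _ in range(samples):
            start = time.perf_counter()
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
            elapsed_us = (time.perf_counter() - start) * 1_000_000
            if not reply.startswith(b"+PONG"):
                raise RuntimeError(f"unexpected reply: {reply!r}")
            results.append(elapsed_us)
    return results


def is_slow(current_us, baseline_us, factor=2.0):
    """Apply the rule of thumb: slow means more than `factor` times the baseline."""
    return current_us > factor * baseline_us
```

With the example baseline of 3079 µs, is_slow(7000, 3079) is true because 7000 µs exceeds 6158 µs.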

2. Slow Command Monitoring

Enable the slowlog to capture commands exceeding a configurable threshold (default 10 ms). Set the threshold based on the baseline, e.g.:

redis-cli CONFIG SET slowlog-log-slower-than 6000

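to log commands that take longer than 6 ms to execute (the value is in microseconds, and only execution time is counted, not network I/O). Retrieve the most recent entries with:

redis-cli slowlog get 2

Each entry contains: ID, Unix timestamp, execution time (µs), and the command with arguments.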
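
When the slowlog holds many entries, grouping them by command name makes the worst offenders easier to spot. The sketch below assumes each entry is a dict with "command" and "duration" keys, which is the shape the redis-py client is understood to return from slowlog_get(); treat that shape, and the helper name slowest_commands, as assumptions.

```python
from collections import defaultdict


def slowest_commands(entries, top=3):
    """Group slowlog entries by command name.

    Returns (name, count, max_duration_us) tuples, slowest first.
    """
    stats = defaultdict(lambda: [0, 0])  # name -> [count, max duration in µs]
    for entry in entries:
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        name = command.split(" ", 1)[0].upper()
        stats[name][0] += 1
        stats[name][1] = max(stats[name][1], entry["duration"])
    ranked = sorted(stats.items(), key=lambda item: item[1][1], reverse=True)
    return [(name, count, max_us) for name, (count, max_us) in ranked[:top]]
```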

3. Latency Monitoring

Redis 2.8.13 introduced latency‑monitor. Configure a threshold in milliseconds (e.g., three times the baseline, 9 ms):

redis-cli CONFIG SET latency-monitor-threshold 9

View recent events with:

redis-cli latency latest

Each event shows its name, the Unix timestamp of the latest spike, the latest latency, and the all-time maximum latency (both in milliseconds).

4. Network Communication Delay

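Network round-trip time (RTT) adds latency to every request. A 1 Gbit/s network typically has ~200 µs RTT. A workload that issues many commands one at a time (e.g., many HGETALL calls) pays one RTT per command; pipelining sends the commands as a batch and reads the replies together, which reduces the number of round trips.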
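
A minimal sketch of pipelining follows. It assumes a client object with a redis-py style interface (pipeline(transaction=False), hgetall, execute); the helper name fetch_hashes is hypothetical.

```python
def fetch_hashes(client, keys):
    """Read many hashes in one batch instead of one round trip per HGETALL."""
    pipe = client.pipeline(transaction=False)  # plain pipelining, no MULTI/EXEC
    for key in keys:
        pipe.hgetall(key)       # queued on the client, nothing is sent yet
    replies = pipe.execute()    # commands sent together, replies come back in order
    return dict(zip(keys, replies))
```

With redis-py this could be called as fetch_hashes(redis.Redis(host=HOST, port=PORT), ["user:1", "user:2"]), where the key names are placeholders. Very large batches are usually split into chunks so that replies do not pile up in memory.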

Figure: Redis pipeline illustration

5. Fork‑Generated RDB Snapshots

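Creating an RDB snapshot requires forking the process. The fork call runs on the main thread and blocks it while the page tables are copied, and large instances have large page tables (e.g., a 24 GB instance with 4 KB pages needs ~48 MB). After the fork, parent and child share memory through copy-on-write (COW), so every page the parent modifies during bgsave must be copied, which can cause noticeable latency. In addition, a replica cannot serve requests while it loads the RDB during a full resynchronization.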
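
The 48 MB figure can be reproduced with simple arithmetic. The sketch below assumes 4 KB pages and 8-byte page table entries (typical for x86-64) and counts only the lowest-level page table; the function name is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def page_table_bytes(dataset_bytes, page_size=4096, entry_size=8):
    """Approximate leaf page table size: one entry per mapped page."""
    return dataset_bytes // page_size * entry_size


print(page_table_bytes(24 * GIB) / MIB)  # 48.0
```

The larger the dataset, the more page table entries fork has to copy, which is why fork time grows with instance size.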

Figure: RDB snapshot latency

6. Transparent Huge Pages (THP)

Linux THP backs memory with 2 MB pages instead of the usual 4 KB pages. When Redis modifies even a small amount of data during RDB generation, copy-on-write must copy the entire 2 MB page, which increases latency and memory use. Disable THP (as root) with:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
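
To confirm the setting took effect, the same file can be read back: the kernel lists all modes and marks the active one with square brackets, for example "always madvise [never]". The following sketch parses that line; the function names are hypothetical.

```python
def active_thp_mode(text):
    """Return the bracketed (active) mode from a line such as 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def read_thp_mode(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Read the current THP mode from sysfs (Linux only)."""
    with open(path) as handle:
        return active_thp_mode(handle.read())
```

A value written with echo does not survive a reboot, so the check is worth repeating after a restart.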

7. Swap (OS Paging)

If physical memory is insufficient, the kernel swaps pages out to disk, and any access to a swapped-out page stalls Redis until the page is read back. First identify the Redis process ID:

redis-cli info | grep process_id

Then inspect /proc/PROCESS_ID/smaps and look at the Size and Swap fields of each memory mapping. Non-zero swap indicates memory pressure that can degrade performance; large swap usage (hundreds of MB or GB) is a red flag.
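
Because smaps contains one Swap line per mapping, it helps to add them up. The sketch below sums the Swap fields, which the kernel reports in kB; the function names are hypothetical and the file exists only on Linux.

```python
def swapped_kb(smaps_text):
    """Sum the 'Swap:' fields (in kB) across all mappings in smaps output."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("Swap:"):  # does not match the separate 'SwapPss:' field
            total += int(line.split()[1])
    return total


def redis_swap_kb(pid):
    """Read /proc/<pid>/smaps and return the total swapped-out memory in kB."""
    with open(f"/proc/{pid}/smaps") as handle:
        return swapped_kb(handle.read())
```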

8. AOF and Disk I/O

AOF persistence can be tuned via the appendfsync setting:

no – Redis never calls fsync and leaves flushing to the operating system (fastest, risk of data loss)

everysec – fsync every second (default, acceptable for cache workloads)

always – fsync on every write (slow, high durability)

For cache use cases, set appendfsync to no or everysec. Reduce disk contention during AOF rewrite with:

redis-cli CONFIG SET no-appendfsync-on-rewrite yes

9. Expire Deletion

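Redis deletes expired keys lazily (when a key is accessed) or actively (a background cycle that runs 10 times per second by default, i.e., every 100 ms). Each active cycle samples 20 keys that have a TTL and deletes those that have expired; if more than 25 % of the sample had expired, it samples again, within a bounded time budget. When many keys expire at the same moment, this loop keeps repeating on the main thread and can delay client commands.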
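
The sampling loop can be modeled in a few lines. This is a toy simulation, not Redis source code: the dict of expiry timestamps, the function name, and max_rounds (a stand-in for the real time budget) are all assumptions made for illustration.

```python
import random


def active_expire_cycle(expires, now, sample_size=20, threshold=0.25,
                        max_rounds=16, rng=random):
    """Simulate one active expiration cycle.

    expires maps key -> expiry timestamp. Expired keys found by sampling are
    deleted in place; the number of deleted keys is returned.
    """
    deleted = 0
    for _ in range(max_rounds):
        if not expires:
            break
        sample = rng.sample(list(expires), min(sample_size, len(expires)))
        expired = [key for key in sample if expires[key] <= now]
        for key in expired:
            del expires[key]
        deleted += len(expired)
        # Stop once 25 % or less of the sample had expired.
        if len(expired) <= threshold * len(sample):
            break
    return deleted
```

If 100 keys all expire at the same second, the loop runs five full rounds in a single cycle, whereas the same keys spread over several minutes would mostly be handled in one round each.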

10. Big Key Issues

Big keys (large strings, long lists, massive hashes, or ZSETs) can cause OOM, replication imbalance, bandwidth saturation, and blocking deletions. Detect them with redis-cli --bigkeys or by analyzing an RDB file with tools like redis-rdb-tools. Mitigate by:

Splitting large hashes or lists into multiple smaller keys.

Using UNLINK (Redis 4.0 and later) for non-blocking deletion, so memory is reclaimed in a background thread.

Adding random jitter to expiration times to avoid mass expirations.
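
The jitter idea from the list above can be sketched as follows. The helper name ttl_with_jitter and the default spread of ±10 % are assumptions chosen for illustration, not values from the article.

```python
import random


def ttl_with_jitter(base_seconds, spread=0.1, rng=random):
    """Return base_seconds shifted by a random amount within +/- spread (a fraction)."""
    jitter = rng.uniform(-spread, spread) * base_seconds
    return max(1, round(base_seconds + jitter))
```

With a redis-py style client, the result could be passed as the expiry when writing a key, for example client.set(key, value, ex=ttl_with_jitter(3600)), so that keys written together do not all expire in the same second.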

Checklist

Measure current Redis baseline latency.

Enable slowlog and latency‑monitor to locate slow commands.

Use SCAN (or SSCAN, HSCAN, ZSCAN) instead of blocking commands such as KEYS, SMEMBERS, and HGETALL on large collections.

Keep each instance at roughly 2-4 GB to limit fork time and avoid long RDB loads.

Disable transparent huge pages.

Monitor swap usage and increase physical memory if needed.

Adjust AOF settings (appendfsync, no-appendfsync-on-rewrite) to reduce disk I/O.

Handle big keys by splitting or using UNLINK.
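
Two checklist items, SCAN instead of blocking commands and UNLINK for deletion, combine naturally when cleaning up keys by pattern. The sketch below assumes a client with redis-py style scan_iter(match=..., count=...) and unlink(*names) methods; the helper name unlink_matching and the batch size are hypothetical.

```python
def unlink_matching(client, pattern, batch_size=500):
    """Delete keys matching pattern without KEYS or DEL.

    Keys are discovered incrementally with SCAN and removed in batches with
    UNLINK. Returns the number of keys removed.
    """
    removed = 0
    batch = []
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            removed += client.unlink(*batch)
            batch = []
    if batch:  # flush the final partial batch
        removed += client.unlink(*batch)
    return removed
```

SCAN gives no snapshot guarantee, so keys created while the loop runs may or may not be visited.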
