Common Redis Latency Issues: Diagnosis, Analysis, and Best Practices

This article explains why a high‑performance Redis instance can become slow, covering typical latency causes such as complex commands, big keys, concentrated expirations, memory limits, fork overhead, CPU binding, AOF settings, swap usage, and network saturation, and provides practical diagnosis steps and optimization recommendations for both developers and operators.

Architecture Digest

Redis, an in‑memory database, can handle around 100k QPS per instance, but latency spikes frequently occur when the system is misused or poorly operated.

Redis Slowdown? Common Latency Issues and Analysis

Below are the typical scenarios that cause Redis latency and how to locate and analyze them.

High‑Complexity Commands

Check the slowlog to identify commands that exceed a latency threshold (e.g., 5 ms). Set the slowlog threshold and length:

# Log any command that takes longer than 5 ms (the value is in microseconds)
CONFIG SET slowlog-log-slower-than 5000

# Keep the most recent 1000 entries
CONFIG SET slowlog-max-len 1000

After enabling the slowlog, retrieve recent entries:

127.0.0.1:6379> SLOWLOG GET 5

If O(N) commands such as SORT, SUNION, or ZUNIONSTORE are frequently logged, avoid them or reduce the data volume they process.
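To see which commands dominate the slowlog, the entries can be grouped by command name. The following is a minimal Python sketch, assuming each entry is a dict with a `command` field (string or bytes) and a `duration` in microseconds, which is roughly the shape a client library such as redis-py returns for SLOWLOG GET. The function name `summarize_slowlog` and the sample data are illustrative, not part of any library.

```python
from collections import defaultdict


def summarize_slowlog(entries):
    """Group slowlog entries by command name.

    Returns a list of (command, count, total_microseconds) tuples,
    sorted by total time spent, largest first.
    """
    totals = defaultdict(lambda: [0, 0])  # name -> [count, total_us]
    for entry in entries:
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        parts = command.split()
        name = parts[0].upper() if parts else "(EMPTY)"
        totals[name][0] += 1
        totals[name][1] += int(entry["duration"])
    return sorted(
        ((name, count, total) for name, (count, total) in totals.items()),
        key=lambda row: row[2],
        reverse=True,
    )


# Hypothetical sample entries, for illustration only.
sample_entries = [
    {"id": 3, "duration": 12000, "command": b"SORT mylist"},
    {"id": 2, "duration": 8000, "command": "sunion a b"},
    {"id": 1, "duration": 9000, "command": b"SORT other"},
]

for name, count, total_us in summarize_slowlog(sample_entries):
    print(f"{name}: {count} entries, {total_us / 1000:.1f} ms total")
```

Sorting by total time rather than by count puts a rarely called but very expensive command ahead of a frequent, marginally slow one.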

Big Keys

Large keys increase memory allocation and release time. Scan for big keys with:

redis-cli -h $host -p $port --bigkeys -i 0.01

The -i option makes redis-cli sleep between batches of SCAN calls (0.01 s here), which throttles the scan and limits its impact on production QPS.

Concentrated Expiration

Mass expiration at a fixed time can block the main thread. Search code for EXPIREAT or PEXPIREAT and randomize the expiration time, e.g.:

import random

# Randomize expiration within 5 minutes after the target time
redis.expireat(key, expire_time + random.randint(0, 300))

Monitor expired_keys via INFO and alert on sudden spikes.
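Because expired_keys is a cumulative counter, a spike only shows up as the difference between two samples. The sketch below assumes the raw text of INFO has been captured twice, a known number of seconds apart. The helper names `parse_info` and `expired_rate` are hypothetical, and the alert threshold is left to the reader.

```python
def parse_info(text):
    """Parse the 'key:value' lines of INFO output into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '# Section' headers
        key, sep, value = line.partition(":")
        if sep:
            stats[key] = value
    return stats


def expired_rate(before, after, interval_s):
    """Keys expired per second between two parsed INFO samples."""
    delta = int(after["expired_keys"]) - int(before["expired_keys"])
    if delta < 0:
        # The counter went backwards (for example after a restart),
        # so this interval carries no usable information.
        return 0.0
    return delta / interval_s


# Hypothetical samples taken 60 seconds apart.
first = parse_info("# Stats\r\nexpired_keys:1000\r\nevicted_keys:0\r\n")
second = parse_info("# Stats\r\nexpired_keys:61000\r\nevicted_keys:0\r\n")
print(expired_rate(first, second, 60))  # keys expired per second
```

An alert would then compare this rate against a baseline that is typical for the instance.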

Memory Limit Reached

When maxmemory is hit, Redis evicts keys before writes, which adds latency. Choose an appropriate eviction policy (e.g., allkeys‑lru or volatile‑lru) or split data across multiple instances.
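To notice when an instance is approaching its limit before eviction starts, the ratio of used_memory to maxmemory can be watched. This is a small sketch that assumes the INFO fields are already available as a dict. The function name `memory_usage_ratio` is made up for this example.

```python
def memory_usage_ratio(info):
    """Return used_memory / maxmemory, or None when no limit is set."""
    limit = int(info.get("maxmemory", 0))
    if limit == 0:
        return None  # maxmemory 0 means no limit is configured
    return int(info["used_memory"]) / limit


# Hypothetical values: 3 GiB used out of a 4 GiB limit.
sample_info = {"used_memory": 3 * 1024**3, "maxmemory": 4 * 1024**3}
print(memory_usage_ratio(sample_info))
```

A ratio close to 1.0 together with a growing evicted_keys counter suggests that writes are already paying the eviction cost.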

Fork Overhead

RDB snapshots, AOF rewrites, and full synchronization to a replica each fork a child process. The larger the instance's memory footprint, the longer the fork takes, and the main thread is blocked while it runs. Check latest_fork_usec in INFO, then either schedule persistence during low-traffic periods or disable AOF if data loss is acceptable.
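The latest_fork_usec value is easier to judge when converted to milliseconds and related to the instance's resident memory. A rough sketch follows, assuming the INFO fields are available as a dict. The function name `fork_report` and the 100 ms warning threshold are arbitrary choices for illustration, not recommendations from the Redis project.

```python
def fork_report(info, warn_ms=100.0):
    """Summarize the cost of the most recent fork.

    Returns the fork duration in ms, the cost per GiB of resident
    memory (None if RSS is unknown), and whether it exceeds warn_ms.
    """
    fork_ms = int(info["latest_fork_usec"]) / 1000.0
    rss_gib = int(info.get("used_memory_rss", 0)) / 1024**3
    ms_per_gib = fork_ms / rss_gib if rss_gib > 0 else None
    return {
        "fork_ms": fork_ms,
        "ms_per_gib": ms_per_gib,
        "slow": fork_ms > warn_ms,
    }


# Hypothetical values: a 240 ms fork on an instance with 8 GiB RSS.
print(fork_report({"latest_fork_usec": 240000, "used_memory_rss": 8 * 1024**3}))
```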

CPU Binding

Binding Redis to specific CPUs can cause contention with forked child processes during persistence, worsening latency. Avoid CPU pinning when using RDB/AOF.

AOF Configuration

Three fsync policies exist: always (fsync on every write, high latency), everysec (fsync once per second, recommended), and no (flushing is left to the operating system, lowest safety). Use appendfsync everysec for a good balance.

Swap Usage

If the host swaps, Redis performance collapses. Detect swap usage, free memory, and restart the instance (preferably after a master‑slave switchover) to clear swap.
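On Linux, one way to detect swap usage for a single process is to add up the Swap: entries in /proc/<pid>/smaps. The sketch below does this on the file's text content. The function name `swapped_kb` and the sample excerpt are illustrative, and the approach assumes a Linux host.

```python
def swapped_kb(smaps_text):
    """Sum every 'Swap:' entry (reported in kB) in smaps content."""
    total = 0
    for line in smaps_text.splitlines():
        # 'SwapPss:' lines are skipped because they do not start with 'Swap:'.
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total


# Hypothetical excerpt of /proc/<pid>/smaps, for illustration only.
sample_smaps = """\
Size:               1024 kB
Rss:                 512 kB
Swap:                128 kB
SwapPss:             128 kB
Size:                 64 kB
Swap:                  4 kB
"""
print(swapped_kb(sample_smaps), "kB swapped")
```

In practice the text would be read from the smaps file of the Redis process, whose PID is reported as process_id in INFO.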

Network Saturation

High network load can cause packet loss and increased RTT, directly affecting Redis latency. Monitor NIC utilization and scale bandwidth or shard instances when needed.
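NIC utilization can be estimated from two readings of a byte counter, for example the interface's received or transmitted byte count. This is a minimal sketch. The function name `link_utilization` is hypothetical, and the link speed must be supplied by the caller.

```python
def link_utilization(bytes_before, bytes_after, interval_s, link_mbps):
    """Fraction of link capacity used between two byte-counter samples."""
    bits = (bytes_after - bytes_before) * 8
    bits_per_second = bits / interval_s
    return bits_per_second / (link_mbps * 1_000_000)


# Hypothetical reading: 750 MB transferred in 10 s on a 1 Gbit/s link.
usage = link_utilization(0, 750_000_000, 10, 1000)
print(f"{usage:.0%} of link capacity")
```

Receive and transmit directions should be measured separately, since either one can saturate on its own.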

Best Practices: Business and Operations Layers

Business Layer (Developers)

Avoid long keys and large values; enable lazy‑free for big values (Redis 4.0+).

Set appropriate TTLs and use the cache as intended.

Prefer low‑complexity commands; avoid SORT, SINTER, ZUNIONSTORE, etc.

Batch reads/writes with MGET/MSET or pipelines, and avoid KEYS (use SCAN instead).

Distribute expirations randomly to prevent spikes.

Choose suitable eviction policies; random eviction is often faster than LRU.

Use connection pools, stick to DB 0, and consider read‑write splitting or clustering for high traffic.
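The advice to replace KEYS with SCAN comes down to a cursor loop: each call returns one small batch plus the cursor for the next call, and iteration ends when the cursor comes back as 0. The sketch below shows that loop against any callable with the shape `scan(cursor, match, count) -> (next_cursor, keys)`. The name `scan_all` and the fake server are assumptions made for this example.

```python
def scan_all(scan, match="*", count=100):
    """Yield keys by following SCAN cursors until the server returns 0."""
    cursor = 0
    while True:
        cursor, keys = scan(cursor, match, count)
        yield from keys
        if cursor == 0:
            break


def fake_scan(cursor, match, count):
    """Stand-in for a real client call; returns two fixed pages."""
    pages = {0: (7, ["user:1", "user:2"]), 7: (0, ["user:3"])}
    return pages[cursor]


print(list(scan_all(fake_scan, match="user:*")))
```

Unlike KEYS, each SCAN call touches only a bounded number of entries, so other clients are served between calls. SCAN may return the same key more than once, so callers that need uniqueness should deduplicate.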

Operations Layer (DBAs)

Isolate business lines on separate instances and machines.

Provision sufficient CPU, memory, bandwidth, and disk; avoid swapping.

Deploy master‑slave clusters with Sentinel (≥3 nodes) for HA.

Plan capacity: keep instance memory ≤ ½ of host memory.

Monitor key metrics: expired_keys, evicted_keys, latest_fork_usec, slowlog, and network stats.

Set a sensible slowlog threshold (≈10 ms) and size the replication buffers appropriately.

Perform backups on slaves, use AOF with everysec, and limit max connections.

By understanding Redis internals, command complexities, expiration strategies, persistence mechanisms, and resource constraints, developers and operators can jointly keep Redis performant and stable.
