
Why Redis Became a Bottleneck: Diagnosing High CPU with Slowlog and Command Stats

A Monday-morning surge in user traffic exposed a Redis performance crisis: CPU spiked to 100% under a flood of `keys *` commands. The investigation, using Grafana, Redis `info`, `commandstats`, and `slowlog`, revealed the root cause and a temporary mitigation strategy.


Web Monitoring

Alibaba Cloud's Grafana dashboards showed normal CPU, memory, and network on the web servers, so the problem had to be Redis itself.

Our single-node 32M 16GB Alibaba Cloud Redis instance showed CPU spiking to 100%.

QPS rose from ~1k to 6k and connections from 0 to 3k, both still far below instance limits; the latency was caused by a massive queue of pending commands.

Temporary solution: provision a new Redis instance and switch the application configuration.

Server Command Monitoring

Running `info` and checking `slowlog` revealed that the top ten slow commands were all `keys *`, which blocks the service under the current traffic.

Further inspection of command statistics showed extremely high average latencies for commands such as `setnx` (6 s), `setex` (7.33 s), `del` (69 s), `hmset` (64 s), `hmget` (9 s), `hgetall` (205 s), and especially `keys` (3740 s).

These latencies correlate with the size of the values, so recent data growth or code changes that issue these commands should be investigated.

Command statistics can be viewed via `info commandstats`, which reports calls, usec, and usec_per_call for each command:

cmdstat_XXX: calls=XXX,usec=XXX,usec_per_call=XXX
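To rank commands by cost, the `commandstats` text can be parsed directly. A minimal sketch in Python; the sample values are illustrative, not taken from the incident:

```python
def parse_commandstats(text):
    """Parse `info commandstats` output into {command: {field: value}}."""
    stats = {}
    for line in text.strip().splitlines():
        if not line.startswith("cmdstat_"):
            continue
        name, fields = line.split(":", 1)
        cmd = name[len("cmdstat_"):]
        # fields look like "calls=1200,usec=...,usec_per_call=..."
        stats[cmd] = {k: float(v) for k, v in
                      (kv.split("=") for kv in fields.split(","))}
    return stats

# Illustrative sample, not real incident data.
sample = """\
cmdstat_keys:calls=1200,usec=3740000000,usec_per_call=3116666.67
cmdstat_hgetall:calls=50000,usec=205000000,usec_per_call=4100.00
cmdstat_get:calls=900000,usec=1800000,usec_per_call=2.00
"""

stats = parse_commandstats(sample)
# Sort by cumulative time (usec), not usec_per_call, to find commands
# that dominate total CPU time.
worst = sorted(stats, key=lambda c: stats[c]["usec"], reverse=True)
print(worst[0])  # "keys": the heaviest command by total time
```

Sorting by cumulative `usec` rather than `usec_per_call` also surfaces commands that are cheap individually but hot in aggregate.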

The `slowlog` records commands taking longer than 10 ms by default (the measured time excludes network I/O). Example output:

xxxxx> slowlog get 10
 3) 1) (integer) 411
    2) (integer) 1545386469
    3) (integer) 232663
    4) 1) "keys"
       2) "mecury:*"

The fields are the log entry ID, the Unix timestamp, the execution time in microseconds, and the command with its arguments.
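The slow-log threshold and retention are configurable; a minimal redis.conf fragment (the values shown are the Redis defaults):

```
# Log commands whose execution time exceeds 10,000 microseconds (10 ms).
# Set to 0 to log every command, or a negative value to disable logging.
slowlog-log-slower-than 10000

# Keep at most 128 entries in memory; SLOWLOG RESET clears the log.
slowlog-max-len 128
```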

Thus a sudden surge of `keys *` commands caused the CPU spike and the latency. Our application was never meant to issue this command.

After sharing the stats with the development team, we discovered that another application had mistakenly pointed at our Redis instance and was crawling data with massive numbers of `keys *` calls. Its configuration was corrected.
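`keys *` is O(N) over the whole keyspace and blocks Redis's single-threaded event loop until it finishes; the non-blocking alternative is `SCAN`, which walks the keyspace in small batches via a cursor. A minimal sketch of the cursor contract, shown against a plain Python dict to stay self-contained (real Redis uses a reverse-binary cursor over its hash table, not a sorted offset, but the loop a client writes is the same):

```python
import fnmatch

def scan_step(keyspace, cursor, match="*", count=10):
    """One SCAN step: return (next_cursor, batch); a cursor of 0 means done."""
    keys = sorted(keyspace)
    batch = [k for k in keys[cursor:cursor + count]
             if fnmatch.fnmatch(k, match)]
    next_cursor = cursor + count
    return (0 if next_cursor >= len(keys) else next_cursor), batch

# The caller loops until the cursor wraps back to 0, so no single call
# holds the server longer than one small batch -- unlike KEYS, which
# walks the entire keyspace in one blocking pass.
db = {f"mecury:{i}": i for i in range(25)}
db["other:1"] = 0
cursor, found = 0, []
while True:
    cursor, batch = scan_step(db, cursor, match="mecury:*")
    found.extend(batch)
    if cursor == 0:
        break
print(len(found))  # 25
```

With a real server, `redis-cli --scan --pattern 'mecury:*'` performs the same loop from the command line without blocking the instance.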

Summary

Check web monitoring dashboards first.

Inspect Redis command stats and slowlog to identify heavy commands.

Optimize Redis usage in code.

Consider scaling Redis if traffic continues to grow.

Source: https://www.sevenyuan.cn
Written by

Open Source Linux

Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
