Why Redis Became a Bottleneck: Diagnosing High CPU with Slowlog and Command Stats

A Monday-morning surge in user traffic exposed a Redis performance problem: CPU hit 100% because of a flood of keys * commands. This write-up follows the investigation through Grafana, Redis info, commandstats, and slowlog to the root cause and a temporary mitigation.

Open Source Linux

Sep 27, 2021

Web Monitoring

Alibaba Cloud's Grafana dashboards showed normal CPU, memory, and network usage on the web side, which pointed to Redis as the problem.

Our single‑node 32M 16GB Alibaba Cloud Redis showed CPU spiking to 100%.

QPS rose from about 1k to 6k and connections from 0 to 3k, both still far below the instance limits; the latency came from a long queue of commands waiting to execute.

Temporary solution: provision a new Redis instance and switch the application configuration.

Server Command Monitoring

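Running info and checking slowlog showed that the ten slowest commands were all keys *. Because keys walks the entire keyspace and Redis executes commands on a single thread, each call blocks every command queued behind it, which is untenable at the current traffic level.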

Command statistics also showed elevated average times per call for several commands. Read as usec_per_call values, which Redis reports in microseconds, they were: setnx (6), setex (7.33), del (69), hmset (64), hmget (9), hgetall (205), and above all keys (3740).

These latencies grow with the size of the values involved, so it is worth checking for recent data growth or code changes that issue these commands.

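Command statistics can be viewed with info commandstats. For each command it reports calls (number of invocations), usec (total execution time in microseconds), and usec_per_call (average microseconds per invocation).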

cmdstat_XXX:calls=XXX,usec=XXX,usec_per_call=XXX
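
To make the commandstats format concrete, here is a minimal Python sketch that parses the text returned by info commandstats and ranks commands by usec_per_call. The function names and the sample numbers are invented for illustration; they are not the figures from this incident.

```python
def parse_commandstats(info_text):
    """Parse `INFO commandstats` text into {command: {field: value}}."""
    stats = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip the "# Commandstats" header and blank lines
        name, _, fields = line.partition(":")
        entry = {}
        for pair in fields.split(","):
            key, _, value = pair.partition("=")
            try:
                entry[key] = float(value)
            except ValueError:
                continue  # ignore any field that is not numeric
        stats[name[len("cmdstat_"):]] = entry
    return stats


def slowest_commands(stats, top=5):
    """Return (command, usec_per_call) pairs, slowest average first."""
    ranked = sorted(
        stats.items(),
        key=lambda item: item[1].get("usec_per_call", 0.0),
        reverse=True,
    )
    return [(cmd, fields.get("usec_per_call", 0.0)) for cmd, fields in ranked[:top]]


# Made-up sample input, for illustration only.
SAMPLE = """# Commandstats
cmdstat_get:calls=1000,usec=5000,usec_per_call=5.00
cmdstat_keys:calls=10,usec=40000,usec_per_call=4000.00
cmdstat_hgetall:calls=200,usec=40000,usec_per_call=200.00
"""

if __name__ == "__main__":
    for cmd, avg in slowest_commands(parse_commandstats(SAMPLE)):
        print(f"{cmd:10s} {avg:10.2f} usec/call")
```

Sorting by usec_per_call surfaces commands that are individually expensive; sorting by usec instead would show where the total execution time goes.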

By default the slowlog records commands that take longer than 10 ms to execute (the slowlog-log-slower-than threshold); the measured time excludes network I/O. Example output:

xxxxx> slowlog get 10
 3) 1) (integer) 411
    2) (integer) 1545386469
    3) (integer) 232663
    4) 1) "keys"
       2) "mecury:*"

Each entry lists the log ID, the Unix timestamp, the execution time in microseconds, and the command with its arguments. The entry above took 232663 µs, roughly 233 ms.
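
As a small illustration of those four fields, the Python sketch below turns one raw slowlog entry into a readable record. It assumes the entry arrives as a plain sequence in the order shown above (ID, timestamp, microseconds, argument list); describe_slowlog_entry is a hypothetical helper name, not part of any Redis client.

```python
from datetime import datetime, timezone


def describe_slowlog_entry(entry):
    """Turn one raw slowlog entry into a readable dict.

    Assumes the first four items are: log ID, Unix timestamp,
    execution time in microseconds, and the list of command arguments.
    Any extra trailing fields are ignored.
    """
    log_id, timestamp, duration_us, args = entry[:4]
    return {
        "id": log_id,
        "time_utc": datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat(),
        "duration_ms": duration_us / 1000,
        "command": " ".join(args),
    }


# The entry from the example output above.
EXAMPLE_ENTRY = [411, 1545386469, 232663, ["keys", "mecury:*"]]

if __name__ == "__main__":
    print(describe_slowlog_entry(EXAMPLE_ENTRY))
```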

So a sudden surge of keys * commands caused the CPU spike and the latency. Our application was never supposed to issue this command.

After we shared the stats with the development team, it turned out that another application had been mistakenly pointed at our Redis instance and was crawling its data with a flood of keys * calls. That application's configuration was corrected.

Summary

Check web monitoring dashboards first.

Inspect Redis command stats and slowlog to identify heavy commands.

Optimize Redis usage in code.

Consider scaling Redis if traffic continues to grow.
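
For the "optimize Redis usage in code" step, one common option (not something the original incident write-up describes) is to replace keys with cursor-based SCAN, which returns matches in small batches so other commands can run in between. The Python sketch below assumes a client object whose scan(cursor=..., match=..., count=...) method returns (next_cursor, keys), which is the shape redis-py's Redis.scan uses. FakeClient is a hypothetical in-memory stand-in so the example runs without a server.

```python
import fnmatch


def scan_keys(client, pattern, batch_hint=100):
    """Yield keys matching `pattern` using SCAN-style paging instead of KEYS.

    A real SCAN may return the same key more than once, so callers that
    need uniqueness should de-duplicate the results.
    """
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch_hint)
        for key in keys:
            yield key
        if cursor == 0:  # a returned cursor of 0 means the iteration is complete
            break


class FakeClient:
    """Hypothetical stand-in that pages through an in-memory key list."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def scan(self, cursor=0, match=None, count=10):
        page = self._keys[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(self._keys):
            next_cursor = 0  # signal the end of the iteration
        matched = [k for k in page if fnmatch.fnmatchcase(k, match or "*")]
        return next_cursor, matched


if __name__ == "__main__":
    fake = FakeClient([f"mecury:{i}" for i in range(25)] + ["other:1"])
    print(len(list(scan_keys(fake, "mecury:*", batch_hint=10))))
```

With a real redis-py client the same function should work unchanged; redis-py also offers scan_iter, which wraps this loop.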

Source: https://www.sevenyuan.cn
