How to Identify and Mitigate Redis Big‑Key Issues with Python and Shell Scripts
This guide explains why recent Redis versions require Python, shows how to install necessary dependencies, demonstrates using redis-cli --bigkeys and custom shell scripts to locate large keys, and outlines the risks and practical solutions for handling big keys in production.
Redis users often encounter "big key" problems that can degrade performance and increase memory fragmentation. The author, a seasoned DBA, notes that newer Redis releases suggest installing Python because the test suite and modules like RedisJSON, RediSearch, and RedisTimeSeries rely on Python‑generated glue code and build scripts.
To install Redis with all required dependencies on a Fedora‑based system, run:
sudo dnf install -y --nobest --skip-broken \
pkg-config \
wget \
gcc-toolset-13-gcc \
gcc-toolset-13-gcc-c++ \
git \
make \
openssl openssl-devel \
python3.11 python3.11-pip python3.11-devel \
unzip rsync clang curl libtool automake autoconf jq systemd-devel
After installation, you can create a large key for testing:
head -c 10485760 /dev/zero | tr '\0' 'a' | redis-cli -x set bigstring
To discover big keys, use the built‑in redis-cli --bigkeys command, which walks the keyspace with SCAN and reports the largest key per data type (size in bytes for strings, element count for collections). Example output shows a 10 MB string named "bigstring".
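As a minimal sketch of that scan, the wrapper below runs redis-cli --bigkeys against a chosen instance. The function name scan_bigkeys, the REDIS_CLI override, and the 0.1-second throttle are illustrative assumptions, not part of the original article; the -i option makes redis-cli pause between batches of SCAN calls so the scan is gentler on a busy server.

```shell
#!/bin/bash
# Hypothetical wrapper around the built-in big-key scan.
# REDIS_CLI may be overridden (assumption) to point at a specific binary.
scan_bigkeys() {
    local host=${1:-127.0.0.1}
    local port=${2:-6379}
    # -i 0.1 throttles the scan: redis-cli sleeps 0.1 s per 100 SCAN calls
    "${REDIS_CLI:-redis-cli}" -h "$host" -p "$port" --bigkeys -i 0.1
}
```

Calling scan_bigkeys 127.0.0.1 6379 after creating the test key should report bigstring as the largest string; the exact wording of the summary varies between redis-cli versions.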
For more thorough scanning, a custom Bash script (find_bigkeys.sh) can be used. It connects to a Redis instance, iterates over the keyspace with SCAN using a COUNT hint of 1000, retrieves each key's memory usage with MEMORY USAGE, stores size‑key pairs in a temporary file, and finally sorts and displays the 20 biggest keys:
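#!/bin/bash
# Usage: ./find_bigkeys.sh [host] [port] [password]
REDIS_CLI="redis-cli"
HOST=${1:-127.0.0.1}
PORT=${2:-6379}
AUTH=$3
# Keep auth arguments in an array so the password is passed as one quoted word
AUTH_OPT=()
if [ -n "$AUTH" ]; then AUTH_OPT=(-a "$AUTH"); fi
echo "Scanning Redis ($HOST:$PORT) for big keys..."
cursor=0
tmpfile=$(mktemp)
trap 'rm -f "$tmpfile"' EXIT   # clean up even if the script is interrupted
while :; do
    # With --raw, the first line is the next cursor, followed by one key per line
    result=$("$REDIS_CLI" -h "$HOST" -p "$PORT" "${AUTH_OPT[@]}" --raw SCAN "$cursor" COUNT 1000)
    cursor=$(echo "$result" | head -1)
    # Stop instead of looping forever if SCAN failed (e.g. connection refused)
    case "$cursor" in ''|*[!0-9]*) echo "SCAN failed: $result" >&2; exit 1 ;; esac
    # Read keys line by line so names containing spaces are not split
    while IFS= read -r k; do
        [ -z "$k" ] && continue
        size=$("$REDIS_CLI" -h "$HOST" -p "$PORT" "${AUTH_OPT[@]}" MEMORY USAGE "$k" 2>/dev/null </dev/null)
        # Skip keys that expired or were deleted between SCAN and MEMORY USAGE
        case "$size" in ''|*[!0-9]*) continue ;; esac
        printf '%s\t%s\n' "$size" "$k" >> "$tmpfile"
    done < <(echo "$result" | tail -n +2)
    [ "$cursor" = "0" ] && break
done
echo "Top 20 big keys:"
echo "----------------------------------------------"
sort -nr "$tmpfile" | head -20 | awk -F'\t' '{printf "%-12s %s\n", $1, $2}'
echo "----------------------------------------------"
echo "Units: bytes (divide by 1048576 for MB)"
Big keys pose several risks: they block the single‑threaded Redis event loop during GET/HGETALL/DEL operations, cause memory fragmentation, enlarge RDB/AOF files, and increase network transfer load, potentially leading to client timeouts.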
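To act on these risks, it helps to separate genuinely large keys from the rest of the scan output. The sketch below assumes tab-separated size and key lines, the format the detection script writes to its temporary file; the function name flag_big_keys and the 1 MB default threshold are hypothetical choices, not recommendations from the article.

```shell
#!/bin/bash
# Hypothetical filter: read "size<TAB>key" lines on stdin and print only
# the keys at or above a byte threshold, with the size converted to MB.
flag_big_keys() {
    local threshold=${1:-1048576}   # default of 1 MB is an assumption
    LC_ALL=C awk -F'\t' -v t="$threshold" \
        '$1 + 0 >= t + 0 { printf "%.2f MB\t%s\n", $1 / 1048576, $2 }'
}
```

For example, piping printf '10485816\tbigstring\n72\tsmall\n' into flag_big_keys prints a single line for bigstring and drops the small key.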
Common causes include storing entire logs, aggregating per‑customer data of varying size, or caching whole configuration tables in a single key. Mitigation strategies are to shard data by time or category, avoid writing continuously to one key, regularly run the detection script, and delete large keys asynchronously using UNLINK instead of DEL.
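Two of these mitigations can be sketched in a few lines of shell. The helper names (shard_key, unlink_key), the prefix:YYYYMMDD naming scheme, and the REDIS_CLI, REDIS_HOST, and REDIS_PORT variables are assumptions for illustration; the article does not prescribe a particular key layout.

```shell
#!/bin/bash
# Hypothetical helper: derive a per-day key name, e.g. "applog:20240501",
# so writes are spread across many small keys instead of one growing key.
shard_key() {
    local prefix=$1 day=$2
    printf '%s:%s\n' "$prefix" "${day//-/}"
}

# Remove a key with UNLINK, which detaches it from the keyspace right away
# and reclaims the memory in a background thread (available since Redis 4.0).
unlink_key() {
    "${REDIS_CLI:-redis-cli}" -h "${REDIS_HOST:-127.0.0.1}" -p "${REDIS_PORT:-6379}" UNLINK "$1"
}
```

A cleanup job could then expire an old shard with unlink_key "$(shard_key applog 2024-04-01)" rather than issuing DEL against one very large key.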
dbaplus Community