Performance Analysis of Redis HSCAN vs HGETALL and Optimization Recommendations
This article examines why Redis HSCAN can cause high CPU usage on ziplist‑encoded hash objects, compares its performance with HGETALL through benchmark tests, and provides practical recommendations for avoiding costly full‑scan operations in production environments.
Author Introduction
Gao Wenjia joined the Qunar.com DBA team in September 2021 and, with several years of experience, is responsible for database operation and maintenance for the hotel and payment services.
1. Scenario Description
A business line stored data in Redis HASH objects and used the HSCAN command to iterate all elements. After a stable period, the Redis instance’s CPU usage suddenly rose to 100% despite low QPS (<1000), causing increased response latency and service anomalies.
2. Problem Analysis
Slow-log and command latency monitoring identified the culprit as the command HSCAN XXX 0 COUNT 100. According to the Redis documentation, when a HASH is small enough to be encoded as a ziplist (on this instance, no more than 2048 entries and no field or value longer than 3072 bytes), HSCAN ignores the COUNT argument and returns the entire collection in a single call, so every call pays for a full traversal.
Example inspection commands:
## Check the key's encoding type
redis 127.0.0.1:8662> DEBUG OBJECT "XXX_XXX_572761794"
Value at:0x7fd4aa9d73f0 refcount:1 encoding:ziplist serializedlength:27573 lru:3322719 lru_seconds_idle:103
## Check the number of elements in the key
redis 127.0.0.1:8662> HLEN "XXX_XXX_572761794"
(integer) 1196
## Check the ziplist-related configuration parameters
redis 127.0.0.1:8662> CONFIG GET '*ziplist*'
1) "hash-max-ziplist-entries"
2) "2048"
3) "hash-max-ziplist-value"
4) "3072"
Because the HASH in the example is ziplist-encoded (1196 entries, under the configured limit of 2048), every HSCAN call scans the whole object, consuming excessive CPU.
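The encoding rule can be written as a small predicate. The following is a minimal Python sketch, not Redis code: the function name `stays_ziplist` and its parameters are hypothetical, and it assumes that a hash keeps the ziplist encoding only while its entry count and the length of every field and value stay within the two configured limits.

```python
def stays_ziplist(fields, max_entries=2048, max_value=3072):
    """Return True if a hash with these field/value pairs would keep the
    compact ziplist encoding under the given limits (illustrative model).

    fields: dict mapping field names to values (str or bytes).
    """
    def size(item):
        # Lengths are measured in bytes, so encode str before measuring.
        return len(item.encode("utf-8") if isinstance(item, str) else item)

    if len(fields) > max_entries:
        return False
    return all(size(f) <= max_value and size(v) <= max_value
               for f, v in fields.items())


# A hash shaped like the one inspected above: 1196 small entries.
sample = {f"field:{i}": "v" * 16 for i in range(1196)}
print(stays_ziplist(sample))                   # True: within both limits
print(stays_ziplist(sample, max_entries=512))  # False: too many entries
```

With the limits configured on this instance, the sample hash stays ziplist-encoded, which is the case where HSCAN returns everything at once.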
3. Source Code Study
In Redis 5.0, the hscanCommand function forwards to scanGenericCommand:
void hscanCommand(client *c) {
    robj *o;
    unsigned long cursor;

    if (parseScanCursorOrReply(c,c->argv[2],&cursor) == C_ERR) return;
    if ((o = lookupKeyReadOrReply(c,c->argv[1],shared.emptyscan)) == NULL ||
        checkType(c,o,OBJ_HASH)) return;
    scanGenericCommand(c,o,cursor);
}
The generic scanner handles different internal encodings. For ziplist-encoded objects it sets the cursor to zero and iterates the entire collection, effectively performing a full scan.
void scanGenericCommand(client *c, robj *o, unsigned long cursor) {
    /* Step 1: Parse options. */

    /* Step 2: Iterate the collection.
     *
     * If the object is encoded with a ziplist, intset, or any other
     * non-hash-table representation, we simply return everything in a
     * single call, setting cursor to zero. */
    if (ht) {
        long maxiterations = count*10;
        // hash table iteration using dictScan
    } else if (o->type == OBJ_SET) {
        // intset iteration
    } else if (o->type == OBJ_HASH || o->type == OBJ_ZSET) {
        unsigned char *p = ziplistIndex(o->ptr,0);
        while(p) {
            // extract element and add to result
            p = ziplistNext(o->ptr,p);
        }
        cursor = 0;
    } else {
        serverPanic("Not handled encoding in SCAN.");
    }

    /* Step 3: Filter elements. */

    /* Step 4: Reply to the client. */
}
Similarly, genericHgetallCommand always returns the full hash:
void genericHgetallCommand(client *c, int flags) {
    robj *o;
    hashTypeIterator *hi;
    // ... retrieve hash and iterate all fields, replying to client
}
4. Performance Comparison
A test hash with 2000 elements was created, then 10 parallel processes repeatedly executed HGETALL and HSCAN (10000 iterations each). Monitoring showed:
HGETALL had significantly lower per‑request latency than HSCAN.
At 20 concurrent processes, both commands produced similar network bandwidth for the same QPS, but HGETALL consumed roughly half the CPU (33% vs 67%).
Graphs of CPU, QPS, and network traffic (omitted here) illustrated these findings.
5. Optimization Recommendations
For ziplist- or intset-encoded collections, the SCAN-family commands (HSCAN for hashes) ignore the COUNT argument and scan the whole object on every call, leading to high CPU usage. When full data retrieval is required, prefer HGETALL; when only some fields are needed, HSCAN with a MATCH pattern can reduce network traffic by filtering on the server side.
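This choice between the two commands can be sketched in client code. The helper below is a minimal illustration, assuming a redis-py style client that exposes `hgetall` and `hscan`; the name `read_hash` and its parameters are hypothetical and not part of the original system.

```python
def read_hash(client, key, pattern=None, count=100):
    """Read a hash in full, or only the fields matching a glob pattern.

    client: an object with redis-py style hgetall() and hscan() methods.
    """
    if pattern is None:
        # Full retrieval: a single HGETALL round trip.
        return client.hgetall(key)

    # Filtered retrieval: iterate HSCAN with MATCH until the cursor
    # returns to 0. On a ziplist-encoded hash the first call already
    # covers the whole object, so the loop runs once.
    result = {}
    cursor = 0
    while True:
        cursor, chunk = client.hscan(key, cursor=cursor,
                                     match=pattern, count=count)
        result.update(chunk)
        if cursor == 0:
            break
    return result
```

For example, `read_hash(client, key)` issues HGETALL, while `read_hash(client, key, pattern="order:*")` (a made-up pattern) falls back to HSCAN so that only matching fields cross the network.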
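Consider adjusting the Redis configuration so that large hashes are not ziplist-encoded (e.g., lower hash-max-ziplist-entries and hash-max-ziplist-value so that such hashes exceed the limits and are stored as hash tables), but be aware of the trade-off: the hash-table encoding consumes more memory than a ziplist.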
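Since a hash stays ziplist-encoded only while it is within both limits, moving large hashes to the hash-table encoding means setting the limits below the size of those hashes. A minimal sketch of doing so at runtime, assuming a redis-py style client with `config_set` and `config_get` and a Redis version that uses these parameter names; the helper name `set_ziplist_limits` is hypothetical, and the default arguments are the stock Redis defaults of 512 and 64.

```python
def set_ziplist_limits(client, max_entries=512, max_value=64):
    """Apply ziplist thresholds via CONFIG SET and return the new values.

    client: an object with redis-py style config_set() and config_get().
    """
    client.config_set("hash-max-ziplist-entries", max_entries)
    client.config_set("hash-max-ziplist-value", max_value)
    # Read the settings back so the caller can confirm what is active.
    return client.config_get("hash-max-ziplist-*")
```

The configuration change by itself does not rewrite hashes that are already stored, so re-check the encoding of the affected keys afterwards (for example with DEBUG OBJECT, as in the inspection commands earlier) and measure memory usage before rolling the change out.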
In general, avoid frequent full‑scan commands on large collections. Evaluate business logic, use caching, data compression, or redesign data models to mitigate load on Redis.