Why Redis Cluster Slows Down After Scaling and How to Fix It
In a large‑scale Redis cluster, adding nodes caused unexpected CPU spikes and higher latency for MGET operations. An investigation traced the issue to the CLUSTER SLOTS command, which clients issue when they handle MOVED errors. A code‑level optimization then reduced CPU usage by over 90% and cut the command's latency to roughly 8% of its original value.
Background
Redis clusters are expanded as traffic grows. After a recent expansion of a >100‑node cluster, latency for real‑time reads increased.
Environment
Redis version 3.x/4.x (tested on 6.2.2 after fix)
Clients: Hiredis‑vip (C++) and Jedis (Java)
Cluster size: 100+ master nodes, no replicas
Symptoms
CPU usage spiked on nodes during the issue window while bandwidth and OPS remained normal. Monitoring showed that MGET OPS and CPU load were out of phase, whereas execution of CLUSTER commands correlated with the CPU spikes.
Root‑cause analysis
Hotspot in Redis source
Running perf top on a high‑CPU node showed that clusterReplyMultiBulkSlots consumed ~52 % of CPU.
The original implementation iterates over every master node and over all 16384 slots, giving a time complexity of O(number_of_nodes × number_of_slots).
void clusterReplyMultiBulkSlots(client *c) {
    int num_masters = 0;
    void *slot_replylen = addDeferredMultiBulkLength(c);
    dictEntry *de;
    dictIterator *di = dictGetSafeIterator(server.cluster->nodes);
    /* Outer loop: every known node. */
    while((de = dictNext(di)) != NULL) {
        clusterNode *node = dictGetVal(de);
        /* Skip replicas and masters that own no slots. */
        if (!nodeIsMaster(node) || node->numslots == 0) continue;
        /* Inner loop: all 16384 slots are tested for this node. */
        for (int j = 0; j < CLUSTER_SLOTS; j++) {
            int bit = clusterNodeGetSlotBit(node,j);
            // build reply for each continuous slot range …
        }
    }
    dictReleaseIterator(di);
    setDeferredMultiBulkLength(c, slot_replylen, num_masters);
}
Each node stores its slots as one bit per slot in char slots[CLUSTER_SLOTS/8]. The helper clusterNodeGetSlotBit tests a slot through bitmapTestBit, which reads byte pos/8 and checks bit pos%8.
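The bitmap lookup and the per-node scan described above can be modeled in a few lines of C. This is a minimal standalone sketch, not Redis code: demo_node, demo_test_bit, demo_set_bit and demo_count_ranges are hypothetical names, and only the slot bitmap of a node is modeled.

```c
#include <assert.h>
#include <string.h>

#define CLUSTER_SLOTS 16384

/* Hypothetical stand-in for clusterNode: only the slot bitmap is modeled. */
typedef struct {
    unsigned char slots[CLUSTER_SLOTS / 8]; /* one bit per slot */
} demo_node;

/* Test bit `pos`: the byte index is pos/8, the bit inside it is pos%8. */
static int demo_test_bit(const unsigned char *bitmap, int pos) {
    assert(pos >= 0 && pos < CLUSTER_SLOTS);
    return (bitmap[pos >> 3] & (1 << (pos & 7))) != 0;
}

/* Set bit `pos`, marking the slot as owned by this node. */
static void demo_set_bit(unsigned char *bitmap, int pos) {
    assert(pos >= 0 && pos < CLUSTER_SLOTS);
    bitmap[pos >> 3] |= (unsigned char)(1 << (pos & 7));
}

/* Count the contiguous slot ranges owned by one node by scanning all
 * CLUSTER_SLOTS bits, as the per-node inner loop does. Every bit test
 * performed is added to *tests so the cost can be observed. */
static int demo_count_ranges(const demo_node *n, long *tests) {
    int ranges = 0, in_range = 0;
    for (int j = 0; j < CLUSTER_SLOTS; j++) {
        int bit = demo_test_bit(n->slots, j);
        (*tests)++;
        if (bit && !in_range) ranges++; /* a new range starts here */
        in_range = bit;
    }
    return ranges;
}
```

Calling demo_count_ranges once per master makes the cost visible: with 100 masters, a single CLUSTER SLOTS reply needs 100 × 16384 = 1,638,400 bit tests, regardless of how few ranges each node actually owns.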
Client‑side MOVED handling
Both Hiredis‑vip and Jedis cache the slot topology. When a MOVED error occurs they refresh the topology by issuing CLUSTER SLOTS. This causes many concurrent CLUSTER SLOTS executions during large migrations, amplifying CPU load.
Hiredis‑vip: cluster_update_route_by_addr triggers the request.
Jedis: renewSlotCache performs the same refresh.
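The refresh-on-MOVED pattern can be illustrated with a small C sketch. It does not reproduce Hiredis‑vip or Jedis internals: demo_slot_cache, demo_parse_moved and demo_on_error are hypothetical names, and the refresh is only counted rather than sent. The one fact taken from Redis itself is the error format, MOVED <slot> <host>:<port>.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CLUSTER_SLOTS 16384

/* Hypothetical client state: counts full topology refreshes. */
typedef struct {
    int refreshes;
} demo_slot_cache;

/* Parse an error string of the form "MOVED <slot> <host>:<port>".
 * Returns 1 and fills the outputs on success, 0 otherwise. */
static int demo_parse_moved(const char *err, int *slot, char *host,
                            size_t hostlen, int *port) {
    char buf[256];
    assert(hostlen > 0);
    if (sscanf(err, "MOVED %d %255s", slot, buf) != 2) return 0;
    char *colon = strrchr(buf, ':'); /* last ':' separates host and port */
    if (colon == NULL) return 0;
    *colon = '\0';
    if (sscanf(colon + 1, "%d", port) != 1) return 0;
    if (*slot < 0 || *slot >= CLUSTER_SLOTS) return 0;
    snprintf(host, hostlen, "%s", buf);
    return 1;
}

/* On a MOVED error, a client that refreshes its whole slot map would
 * issue CLUSTER SLOTS here. This sketch only counts the refresh. */
static int demo_on_error(demo_slot_cache *cache, const char *err) {
    int slot, port;
    char host[256];
    if (!demo_parse_moved(err, &slot, host, sizeof(host), &port)) return 0;
    cache->refreshes++; /* stands in for sending CLUSTER SLOTS */
    return 1;
}
```

Because every MOVED reply increments the refresh count, a migration that moves many slots while many client connections are active translates directly into many CLUSTER SLOTS calls, which is the amplification described above.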
Optimization
Idea
Instead of iterating over every node for every slot, traverse server.cluster->slots directly. The array already maps each slot to its owning node, reducing complexity to O(number_of_slots).
New implementation
void clusterReplyMultiBulkSlots(client *c) {
    int num_masters = 0, start = -1;
    void *slot_replylen = addReplyDeferredLen(c);
    clusterNode *n = NULL;
    /* Iterate one past the last slot so the final range is flushed. */
    for (int i = 0; i <= CLUSTER_SLOTS; i++) {
        /* No current range: look for the next slot to start one. */
        if (n == NULL) {
            if (i == CLUSTER_SLOTS) break;
            n = server.cluster->slots[i];
            start = i;
            continue;
        }
        /* The owner changed or the slot space ended: emit [start, i-1]. */
        if (i == CLUSTER_SLOTS || n != server.cluster->slots[i]) {
            addNodeReplyForClusterSlot(c, n, start, i-1);
            num_masters++;
            if (i == CLUSTER_SLOTS) break;
            n = server.cluster->slots[i];
            start = i;
        }
    }
    setDeferredArrayLen(c, slot_replylen, num_masters);
}
This groups consecutive slots that belong to the same node and emits a single reply entry per range.
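The grouping logic can be tried outside Redis with a standalone model. In this sketch, which is an illustration and not the Redis source, an int array of owner ids replaces server.cluster->slots, -1 plays the role of NULL (an unassigned slot), and demo_range and demo_group_slots are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One emitted reply entry: a node id and the inclusive slot range it owns. */
typedef struct {
    int owner;
    int start;
    int end;
} demo_range;

/* owners[i] is the id of the node owning slot i, or -1 if unassigned.
 * Writes up to max_out ranges into out and returns how many were found.
 * A single pass over the slots is enough: O(nslots). */
static int demo_group_slots(const int *owners, int nslots,
                            demo_range *out, int max_out) {
    int count = 0, start = -1;
    int n = -1; /* owner of the range currently being built */
    for (int i = 0; i <= nslots; i++) {
        if (n == -1) { /* no open range: try to start one at slot i */
            if (i == nslots) break;
            n = owners[i];
            start = i;
            continue;
        }
        if (i == nslots || n != owners[i]) { /* range [start, i-1] ends */
            assert(count < max_out);
            out[count].owner = n;
            out[count].start = start;
            out[count].end = i - 1;
            count++;
            if (i == nslots) break;
            n = owners[i];
            start = i;
        }
    }
    return count;
}
```

For the owner array {0,0,0,1,1,-1,-1,2,2,2} the function yields three ranges: node 0 for slots 0-2, node 1 for slots 3-4 and node 2 for slots 7-9, with the unassigned slots 5-6 skipped.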
Benchmark
Test environment: Manjaro 20.2, AMD Ryzen 7 4800H, 2 × 8 GB DDR4, 100 master nodes, no replicas. Benchmark used redis-benchmark to issue continuous CLUSTER SLOTS calls.
Before the change, clusterReplyMultiBulkSlots accounted for ~51 % of CPU. After the change its share dropped to <1 %, and CLUSTER SLOTS latency fell from 2061 µs to 168 µs (≈8 % of the original).
Conclusion
The latency increase after scaling a large Redis cluster was caused by the high cost of the CLUSTER SLOTS command, which is invoked frequently by clients handling MOVED errors. Refactoring clusterReplyMultiBulkSlots to iterate over server.cluster->slots reduces algorithmic complexity, cuts CPU usage by >90 % and lowers command latency dramatically. The fix was merged into Redis 6.2.2.
References
Hiredis‑vip: https://github.com
Jedis: https://github.com/redis/jedis
Redis source: https://github.com/redis/redis
perf tool: https://perf.wiki.kernel.org
dbaplus Community
