
Why Redis Cluster Slows Down After Scaling and How to Fix It

In a large‑scale Redis cluster, adding nodes caused unexpected CPU spikes and higher latency for MGET operations. The investigation traced the problem to the CLUSTER SLOTS command, which clients issue whenever they handle a MOVED error. A code‑level optimization of that command reduced its CPU usage by over 90% and cut command latency to about 8% of the original.

dbaplus Community

Background

Redis clusters are expanded as traffic grows. After a recent expansion of a >100‑node cluster, latency for real‑time reads increased.

Environment

Redis version: 3.x/4.x (the fix was tested on 6.2.2)

Clients: Hiredis‑vip (C++) and Jedis (Java)

Cluster size: 100+ master nodes, no replicas

Symptoms

CPU usage spiked on nodes during the issue window while bandwidth and OPS remained normal. Monitoring showed that MGET OPS and CPU load did not rise and fall together, whereas the number of Cluster commands executed tracked the CPU spikes closely.

Root‑cause analysis

Hotspot in Redis source

Running perf top on a high‑CPU node showed that clusterReplyMultiBulkSlots consumed ~52 % of CPU.

The original implementation iterates over every master node and over all 16384 slots, giving a time complexity of O(number_of_nodes × number_of_slots).

void clusterReplyMultiBulkSlots(client *c) {
    int num_masters = 0;
    void *slot_replylen = addDeferredMultiBulkLength(c);
    dictEntry *de;
    dictIterator *di = dictGetSafeIterator(server.cluster->nodes);
    while((de = dictNext(di)) != NULL) {
        clusterNode *node = dictGetVal(de);
        if (!nodeIsMaster(node) || node->numslots == 0) continue;
        for (int j = 0; j < CLUSTER_SLOTS; j++) {
            int bit = clusterNodeGetSlotBit(node,j);
            // build reply for each continuous slot range …
        }
    }
    dictReleaseIterator(di);
    setDeferredMultiBulkLength(c, slot_replylen, num_masters);
}

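Each node records the slots it owns as a bitmap, unsigned char slots[CLUSTER_SLOTS/8], with one bit per slot. The helper clusterNodeGetSlotBit calls bitmapTestBit, which finds the bit for slot pos in byte pos/8 at bit position pos%8. Every call is cheap, but the loop above makes 16384 of them for each master.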
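To make the per-node cost concrete, here is a minimal sketch of a one-bit-per-slot bitmap and a full scan over it. The names slot_bitmap, bitmap_test_bit, bitmap_set_bit and count_ranges are hypothetical and are not the Redis identifiers; the sketch only assumes the layout described above.

```c
#include <assert.h>

#define CLUSTER_SLOTS 16384

/* Hypothetical stand-in for a node's slot bitmap:
 * one bit per slot, 16384 / 8 = 2048 bytes. */
typedef struct {
    unsigned char slots[CLUSTER_SLOTS / 8];
} slot_bitmap;

/* Return 1 if the bit for slot `pos` is set, 0 otherwise. */
static int bitmap_test_bit(const slot_bitmap *b, int pos) {
    assert(pos >= 0 && pos < CLUSTER_SLOTS);
    int byte = pos / 8;   /* which byte holds the slot */
    int bit  = pos % 8;   /* which bit inside that byte */
    return (b->slots[byte] & (1 << bit)) != 0;
}

/* Mark slot `pos` as owned by this node. */
static void bitmap_set_bit(slot_bitmap *b, int pos) {
    assert(pos >= 0 && pos < CLUSTER_SLOTS);
    b->slots[pos / 8] |= (unsigned char)(1 << (pos % 8));
}

/* Scan all 16384 bits and count the contiguous ranges that are set.
 * A loop of this shape runs once per master in the original code. */
static int count_ranges(const slot_bitmap *b) {
    int ranges = 0, start = -1;
    for (int j = 0; j < CLUSTER_SLOTS; j++) {
        int bit = bitmap_test_bit(b, j);
        if (bit && start == -1) start = j;               /* a range opens */
        if (start != -1 && (!bit || j == CLUSTER_SLOTS - 1)) {
            ranges++;                                     /* the range closes */
            start = -1;
        }
    }
    return ranges;
}
```

Calling count_ranges on one bitmap always performs 16384 bit tests, no matter how few slots the node owns, which is why the total work grows with the number of masters.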

Client‑side MOVED handling

Both Hiredis‑vip and Jedis cache the slot topology. When a request returns a MOVED error, the client refreshes its cached topology by issuing CLUSTER SLOTS. During a large migration many clients hit MOVED at about the same time, so many CLUSTER SLOTS calls run concurrently and amplify the CPU load.

Hiredis‑vip: cluster_update_route_by_addr triggers the request.

Jedis: renewSlotCache performs the same refresh.
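For readers unfamiliar with the redirection itself, the sketch below parses the text of a MOVED error, which has the form "MOVED <slot> <host>:<port>". It is an illustration only: moved_target and parse_moved are hypothetical names, not part of Hiredis‑vip or Jedis, and the parser assumes an IPv4 address or hostname.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for a parsed redirection. */
typedef struct {
    int  slot;
    char host[64];
    int  port;
} moved_target;

/* Parse an error string of the form "MOVED <slot> <host>:<port>".
 * Returns 1 on success, 0 if the string is not a well-formed MOVED error.
 * Assumption: the host contains no ':' (IPv4 address or hostname). */
static int parse_moved(const char *err, moved_target *out) {
    assert(err != NULL && out != NULL);
    if (strncmp(err, "MOVED ", 6) != 0) return 0;
    /* %63[^:] reads the host up to the colon, bounded by the buffer size. */
    if (sscanf(err + 6, "%d %63[^:]:%d",
               &out->slot, out->host, &out->port) != 3)
        return 0;
    if (out->slot < 0 || out->slot >= 16384) return 0;
    if (out->port <= 0 || out->port > 65535) return 0;
    return 1;
}
```

A client that sees parse_moved succeed knows its cached topology is stale for that slot. The clients discussed here respond by re-fetching the whole map with CLUSTER SLOTS, which is what makes the server-side cost of that command matter.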

Optimization

Idea

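Instead of testing every slot against every node, traverse server.cluster->slots directly. This array already maps each slot to its owning node, so a single pass over its 16384 entries is enough and the complexity drops to O(number_of_slots). With 100 masters that means 16384 pointer comparisons per call instead of 100 × 16384 = 1,638,400 bit tests.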

New implementation

void clusterReplyMultiBulkSlots(client *c) {
    int num_masters = 0, start = -1;
    void *slot_replylen = addReplyDeferredLen(c);
    clusterNode *n = NULL;
    /* i runs one past the last slot so the final range is flushed. */
    for (int i = 0; i <= CLUSTER_SLOTS; i++) {
        /* No open range: start one at slot i (n stays NULL if unassigned). */
        if (n == NULL) {
            if (i == CLUSTER_SLOTS) break;
            n = server.cluster->slots[i];
            start = i;
            continue;
        }
        /* Owner changed or end of table: emit the range [start, i-1]. */
        if (i == CLUSTER_SLOTS || n != server.cluster->slots[i]) {
            addNodeReplyForClusterSlot(c, n, start, i-1);
            num_masters++;
            if (i == CLUSTER_SLOTS) break;
            n = server.cluster->slots[i];
            start = i;
        }
    }
    /* num_masters counts emitted ranges, not distinct nodes. */
    setDeferredArrayLen(c, slot_replylen, num_masters);
}

This groups consecutive slots belonging to the same node and emits a single reply per range.
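The grouping logic can be tried outside Redis with a plain array of owner ids. The following is a sketch under stated assumptions, not the Redis code: slot_range and group_slots are hypothetical names, an int id replaces the clusterNode pointer, and -1 stands in for NULL (an unassigned slot).

```c
#include <assert.h>
#include <stddef.h>

#define CLUSTER_SLOTS 16384

/* Hypothetical range record; the real function writes to the client reply. */
typedef struct {
    int owner;   /* node id */
    int start;   /* first slot of the range */
    int end;     /* last slot of the range, inclusive */
} slot_range;

/* Walk the slot->owner table once and emit one range per run of equal owners.
 * `owners` has CLUSTER_SLOTS entries; -1 marks an unassigned slot.
 * Returns the number of ranges written, never more than `cap`. */
static int group_slots(const int *owners, slot_range *out, int cap) {
    assert(owners != NULL && out != NULL);
    int n = -1, start = -1, count = 0;
    for (int i = 0; i <= CLUSTER_SLOTS; i++) {
        if (n == -1) {                        /* no open range yet */
            if (i == CLUSTER_SLOTS) break;
            n = owners[i];
            start = i;
            continue;
        }
        if (i == CLUSTER_SLOTS || n != owners[i]) {
            if (count < cap) {                /* drop ranges beyond capacity */
                out[count].owner = n;
                out[count].start = start;
                out[count].end = i - 1;
                count++;
            }
            if (i == CLUSTER_SLOTS) break;
            n = owners[i];
            start = i;
        }
    }
    return count;
}
```

For a three-node layout that splits the keyspace at slots 5461 and 10923, group_slots returns three ranges. Clearing a single slot in the middle of a range splits that range in two, and the unassigned slot is skipped rather than reported.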

Benchmark

Test environment: Manjaro 20.2, AMD Ryzen 7 4800H, 2 × 8 GB DDR4, 100 master nodes, no replicas. Benchmark used redis-benchmark to issue continuous CLUSTER SLOTS calls.

Before the change, clusterReplyMultiBulkSlots accounted for ~51 % of CPU. After the change its share dropped to <1 %, and command latency fell from 2061 µs to 168 µs, about 8 % of the original.

Conclusion

The latency increase after scaling a large Redis cluster was caused by the high cost of the CLUSTER SLOTS command, which is invoked frequently by clients handling MOVED errors. Refactoring clusterReplyMultiBulkSlots to iterate over server.cluster->slots reduces algorithmic complexity, cuts CPU usage by >90 % and lowers command latency dramatically. The fix was merged into Redis 6.2.2.

References

Hiredis‑vip: https://github.com

Jedis: https://github.com/redis/jedis

Redis source: https://github.com/redis/redis

perf tool: https://perf.wiki.kernel.org
