Deep Dive into Redis Cluster: Architecture, Sharding, Replication, and Failover

This article provides a comprehensive analysis of Redis Cluster, covering node and slot assignment, command execution, resharding, redirection, failover, gossip messaging, and communication overhead, while explaining why clustering is needed, how it works, and how to deploy and manage it effectively.

Sohu Tech Products

May 12, 2021

Why Use Redis Cluster

When a single Redis instance cannot handle large data volumes or high traffic, clustering solves storage bottlenecks, enables horizontal scaling, and provides automatic failover.

What Is a Redis Cluster

A Redis Cluster is a distributed database that shards data into 16,384 hash slots. Each slot is owned by exactly one master node, and that master's slaves hold copies of its data. Nodes exchange state via the Gossip protocol, so every node eventually learns the full slot‑to‑node mapping.

Cluster Installation

To create a working cluster, connect independent nodes using the CLUSTER MEET <ip> <port> command. The node that receives the command starts a handshake with the target node and, once it completes, treats the target as a member of its cluster.

CLUSTER MEET 192.168.1.10 6379
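As a rough sketch of how the handshake scales to several nodes, the Python snippet below builds the redis-cli invocations needed to join a list of nodes through a single seed node. The helper name `meet_commands` and the addresses are hypothetical. It assumes that membership spreads through gossip, so having one node meet each of the others is normally enough.

```python
def meet_commands(nodes):
    """Return redis-cli command lines that join all nodes via the first one.

    nodes is a list of (ip, port) tuples; the first entry acts as the seed.
    """
    seed_ip, seed_port = nodes[0]
    commands = []
    for ip, port in nodes[1:]:
        # Sent to the seed node, which then starts a handshake with (ip, port).
        commands.append(
            f"redis-cli -h {seed_ip} -p {seed_port} cluster meet {ip} {port}"
        )
    return commands


if __name__ == "__main__":
    example_nodes = [
        ("192.168.1.10", 6379),
        ("192.168.1.11", 6379),
        ("192.168.1.12", 6379),
    ]
    for line in meet_commands(example_nodes):
        print(line)
```

The function only generates commands; it does not contact any server.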

Implementation Principles

Data Sharding

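Each key is hashed with CRC16 (the XMODEM variant), and the 16‑bit result is taken modulo 16,384 to determine its slot. If a key contains a hash tag, that is, a non‑empty substring enclosed in {...}, only that substring is hashed, so keys that share a tag land in the same slot.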
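The slot calculation can be sketched in a few lines of Python. This is an illustrative reimplementation, not Redis source code. The function names `crc16_xmodem` and `key_slot` are made up for this example, and the hash tag rule follows the usual convention: the first `{`, then the first `}` after it, with a non-empty substring in between.

```python
SLOT_COUNT = 16384


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to a slot, hashing only the hash tag when one is present."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Use the tag only if a closing brace exists and the tag is non-empty.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % SLOT_COUNT


if __name__ == "__main__":
    for name in ("{user1000}.following", "{user1000}.followers", "mykey"):
        print(name, key_slot(name))
```

The two `{user1000}` keys map to the same slot because only `user1000` is hashed, which is what makes multi-key operations on them possible inside a cluster.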

Slot‑to‑Node Mapping

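When a cluster is created with redis-cli --cluster create, the 16,384 slots are distributed evenly across the master nodes automatically. Administrators can also assign slots manually with CLUSTER ADDSLOTS. That command takes individual slot numbers rather than ranges, so the example below relies on bash brace expansion to list them (Redis 7.0 and later also provide CLUSTER ADDSLOTSRANGE <start> <end>).

redis-cli -h 172.16.19.1 -p 6379 cluster addslots {0..5460}
redis-cli -h 172.16.19.2 -p 6379 cluster addslots {5461..10922}
redis-cli -h 172.16.19.3 -p 6379 cluster addslots {10923..16383}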
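To see where the three ranges in the example come from, here is a small Python sketch of one way to split the slot space evenly. The function `split_slots` is a hypothetical helper, and its rounding scheme was chosen because it reproduces the ranges shown above. It is not claimed to be the exact algorithm any particular tool uses.

```python
TOTAL_SLOTS = 16384


def split_slots(master_count):
    """Return a list of (first_slot, last_slot) ranges, one per master."""
    per_node = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(master_count):
        if i == master_count - 1:
            # The last master takes whatever remains.
            last = TOTAL_SLOTS - 1
        else:
            # Round half up to the nearest whole slot.
            last = int(cursor + per_node - 1 + 0.5)
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


if __name__ == "__main__":
    for first, last in split_slots(3):
        print(f"{first}-{last}")
```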

Replication and Failover

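Each master node can have one or more slaves (called replicas since Redis 5) that replicate its data. If a master fails, one of its slaves is promoted to master. By default (cluster-require-full-coverage yes), the cluster stops accepting queries as soon as any slot is left uncovered; setting the option to no lets the cluster keep serving the slots that are still covered.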

Failure Detection

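Nodes use the Gossip protocol to share their view of each peer's status. A node flags a peer as PFAIL (possible failure) when that peer has been unreachable for longer than cluster-node-timeout. Once a majority of masters report the peer as PFAIL or FAIL, it is marked FAIL, which allows its slaves to start a failover.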
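The majority rule can be illustrated with a short Python sketch. The function `should_mark_fail` and its data layout are assumptions made for this example. The sketch also assumes that a failure report is only counted while it is recent, here within twice the node timeout.

```python
FAIL_REPORT_VALIDITY_MULT = 2  # assumed: reports expire after 2 x node timeout


def should_mark_fail(reports, master_count, now, node_timeout):
    """Decide whether a peer should move from PFAIL to FAIL.

    reports maps the id of each reporting master to the time (in ms) of
    its latest failure report; the local node's own view counts as one.
    """
    max_age = node_timeout * FAIL_REPORT_VALIDITY_MULT
    fresh = [m for m, ts in reports.items() if now - ts <= max_age]
    needed = master_count // 2 + 1  # strict majority of all masters
    return len(fresh) >= needed


if __name__ == "__main__":
    reports = {"m1": 99_000, "m2": 98_500, "m3": 60_000}
    print(should_mark_fail(reports, master_count=5, now=100_000, node_timeout=15_000))
```

In the usage example only two of the three reports are recent enough, so a five-master cluster would not yet mark the peer as FAIL.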

Failover Process

A slave of the failed master is selected as the new master.

The new master claims the slots previously owned by the failed node.

It broadcasts a PONG message to inform the rest of the cluster.

Clients start sending commands to the new master.

Leader Election

The election follows a Raft‑like protocol: a slave of the failed master increments the cluster's current epoch and requests votes with CLUSTERMSG_TYPE_FAILOVER_AUTH_REQUEST. Each master votes at most once per epoch, and the candidate wins when it receives votes from a majority of masters.
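The voting rule can be modelled with a toy Python simulation. The `Master` class and `run_election` function are hypothetical names used only for illustration. The sketch captures just the "one vote per epoch, majority wins" logic and leaves out networking, timing, and replication-offset ranking.

```python
class Master:
    """A voting master that grants at most one vote per epoch."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.last_vote_epoch = 0

    def handle_auth_request(self, epoch):
        # Unreachable masters never answer; reachable ones refuse to
        # vote again in an epoch they already voted in (or an older one).
        if not self.reachable or epoch <= self.last_vote_epoch:
            return False
        self.last_vote_epoch = epoch
        return True


def run_election(masters, epoch):
    """Return True if the candidate gets votes from a majority of all masters."""
    votes = sum(1 for m in masters if m.handle_auth_request(epoch))
    return votes >= len(masters) // 2 + 1


if __name__ == "__main__":
    masters = [Master("A", reachable=False), Master("B"), Master("C")]
    print(run_election(masters, epoch=7))  # first candidate in epoch 7
    print(run_election(masters, epoch=7))  # second candidate, same epoch
```

Because each master remembers the last epoch it voted in, a second candidate asking in the same epoch collects no votes and must retry with a higher epoch.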

Client Slot Location

Clients compute the slot locally (CRC16 + modulo) and cache the slot‑to‑node map received from any node. When a request hits the wrong node, the server returns a redirection error.

MOVED Error

If the target slot belongs to another node, the server replies with MOVED, prompting the client to update its cache and retry the command on the correct node.

GET mykey
(error) MOVED 16330 172.17.18.2:6379
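A client-side handler for this reply might look like the following Python sketch. The names `parse_redirect` and `handle_moved` are assumptions for this example, as is the choice of a plain dictionary for the slot cache. Real client libraries do this internally.

```python
def parse_redirect(error: str):
    """Split 'MOVED 16330 172.17.18.2:6379' into its parts."""
    kind, slot, addr = error.split()
    host, _, port = addr.rpartition(":")
    return kind, int(slot), host, int(port)


def handle_moved(slot_cache, error):
    """Update the slot cache from a MOVED reply and return the new target."""
    kind, slot, host, port = parse_redirect(error)
    if kind != "MOVED":
        raise ValueError(f"not a MOVED redirection: {error}")
    # MOVED is permanent: remember the new owner for later requests.
    slot_cache[slot] = (host, port)
    return host, port


if __name__ == "__main__":
    cache = {}
    target = handle_moved(cache, "MOVED 16330 172.17.18.2:6379")
    print(target, cache)
```

After the cache update the client retries the original command against the returned address.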

ASK Error

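During a live slot migration, the source node may return ASK for a key that has already been moved. The client should send an ASKING command to the target node and then retry the original command there. The client cache is not updated, because the slot still belongs to the source node until the migration completes.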

GET mykey
(error) ASK 16330 172.17.18.2:6379
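The difference from MOVED can be sketched in Python as follows. The function `plan_ask_retry` is a hypothetical helper that only works out what the client should send next. It performs no network I/O and deliberately leaves the slot cache alone.

```python
def plan_ask_retry(error, command):
    """Return (target, commands) describing how to retry after an ASK reply.

    error is the raw reply text, e.g. 'ASK 16330 172.17.18.2:6379';
    command is the original command as a list of arguments.
    """
    kind, _slot, addr = error.split()
    if kind != "ASK":
        raise ValueError(f"not an ASK redirection: {error}")
    host, _, port = addr.rpartition(":")
    # ASKING must precede the retried command on the same connection.
    return (host, int(port)), [["ASKING"], list(command)]


if __name__ == "__main__":
    slot_cache = {16330: ("172.17.18.1", 6379)}  # unchanged by ASK
    target, commands = plan_ask_retry("ASK 16330 172.17.18.2:6379", ["GET", "mykey"])
    print(target, commands, slot_cache)
```

The redirect applies to this one request only, so the next command for the same slot still goes to the node recorded in the cache.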

Cluster Size Limits

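Officially, Redis Cluster supports up to 1,000 nodes. The main limitation is the communication overhead of the gossip protocol: every PING/PONG carries the sender's 2 KB slot bitmap (16,384 bits) plus gossip entries for roughly one tenth of the known nodes, which adds up to about 12 KB per message in a 1,000‑node cluster.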
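The ≈12 KB figure can be reproduced with back-of-the-envelope arithmetic. The Python sketch below assumes a 2,048-byte slot bitmap (16,384 bits), 104 bytes per gossip entry, and gossip entries for about one tenth of the nodes (at least 3, and never more than the number of other known nodes). It ignores the remaining header fields, so treat the result as an estimate. `estimate_ping_bytes` is a made-up name.

```python
SLOT_COUNT = 16384
GOSSIP_ENTRY_BYTES = 104


def estimate_ping_bytes(node_count):
    """Rough size of one PING/PONG: slot bitmap plus gossip entries."""
    bitmap_bytes = SLOT_COUNT // 8  # one bit per slot
    wanted = max(3, node_count // 10)
    # A node cannot gossip about itself or the receiver.
    entries = min(wanted, max(0, node_count - 2))
    return bitmap_bytes + entries * GOSSIP_ENTRY_BYTES


if __name__ == "__main__":
    for n in (6, 100, 1000):
        print(n, estimate_ping_bytes(n))
```

For 1,000 nodes this gives 2,048 + 100 × 104 = 12,448 bytes, in line with the figure quoted above.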

Gossip Message Structure

typedef struct {
    char nodename[CLUSTER_NAMELEN]; // 40 bytes
    uint32_t ping_sent;            // 4 bytes
    uint32_t pong_received;        // 4 bytes
    char ip[NET_IP_STR_LEN];       // 46 bytes
    uint16_t port;                 // 2 bytes
    uint16_t cport;                // 2 bytes
    uint16_t flags;                // 2 bytes
    uint32_t notused1;             // 4 bytes
} clusterMsgDataGossip;
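Adding up the field sizes in the comments gives 104 bytes per gossip entry. The Python sketch below checks that total with the standard `struct` module. The format string is derived from the field sizes listed above and assumes network byte order with no padding. `pack_gossip_entry` is a hypothetical helper, not part of Redis.

```python
import struct

# nodename, ping_sent, pong_received, ip, port, cport, flags, notused1
GOSSIP_FORMAT = "!40sII46sHHHI"


def pack_gossip_entry(nodename, ping_sent, pong_received, ip, port, cport, flags):
    """Pack one gossip entry; short strings are padded with NUL bytes."""
    return struct.pack(
        GOSSIP_FORMAT,
        nodename.encode(),
        ping_sent,
        pong_received,
        ip.encode(),
        port,
        cport,
        flags,
        0,  # notused1
    )


if __name__ == "__main__":
    entry = pack_gossip_entry("a" * 40, 0, 0, "172.17.18.2", 6379, 16379, 0)
    print(struct.calcsize(GOSSIP_FORMAT), len(entry))
```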

Instance Communication Frequency

Every second, each instance samples 5 peers at random and sends a PING to the one from which it has gone longest without receiving a PONG. In addition, a check that runs every 100 ms pings any peer whose last PONG is older than cluster-node-timeout/2. Increasing cluster-node-timeout reduces this traffic but delays fault detection.
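The two ping triggers can be sketched in Python as follows. Both function names and the dictionary layout are assumptions for this example. The sketch covers target selection only and does not send anything.

```python
import random


def pick_ping_target(peers, sample_size=5, rng=random):
    """Sample a few peers at random and return the one with the oldest PONG.

    peers maps a node id to the time (in ms) its last PONG was received.
    """
    candidates = rng.sample(list(peers), min(sample_size, len(peers)))
    return min(candidates, key=lambda node: peers[node])


def overdue_peers(peers, now, node_timeout):
    """Return peers whose last PONG is older than half the node timeout."""
    return sorted(n for n, pong in peers.items() if now - pong > node_timeout / 2)


if __name__ == "__main__":
    peers = {"a": 9_000, "b": 2_000, "c": 8_500}
    print(pick_ping_target(peers))
    print(overdue_peers(peers, now=10_000, node_timeout=15_000))
```

With a 15-second timeout, any peer that has been silent for more than 7.5 seconds is pinged right away instead of waiting for the random sample to select it.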

Overall, the article walks through the full lifecycle of a Redis Cluster, from motivation and architecture to deployment, slot management, replication, failover, and performance considerations.


