
Redis Clustering Techniques and Codis: Architecture, Performance Comparison, and Practical Tips

This article reviews common Redis clustering methods, compares Twemproxy and Codis, presents Codis’s architecture and performance test results, and offers migration, HA, pipeline, and operational guidance for using Codis as a Redis distributed middleware solution.

Art of Distributed System Architecture Design

1. Common Redis Clustering Techniques

Historically, Redis supported only single‑instance deployments, and a single instance is practical only up to a limited memory size (10‑20 GB). That cannot meet the demands of large‑scale online services, and it leaves servers with 100‑200 GB of memory underutilized.

To overcome single‑node limitations, many internet companies have built their own clustering solutions that shard data across multiple Redis instances, with each shard typically being a separate Redis instance.

Redis now also offers an official Redis Cluster. Overall, there are three main clustering mechanisms:

1.1 Client‑Side Sharding

This approach places the sharding logic in the application code, which routes requests to multiple Redis instances based on predefined routing rules. It gives developers full control and flexibility but requires manual adjustment when instances are added or removed, making operations harder and less suitable for small teams without strong DevOps support.
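The routing rule described above can be sketched in a few lines. This is a minimal illustration, not a production client: the instance addresses are hypothetical, and hashing with CRC32 modulo the instance count is an assumed rule chosen for simplicity. The second function shows why adding an instance needs manual care: many keys map to a different instance once the list changes.

```python
import zlib

# Hypothetical instance addresses, for illustration only.
INSTANCES = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def route(key: str, instances: list[str]) -> str:
    """Pick an instance by hashing the key modulo the instance count."""
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]


def moved_keys(keys, before, after):
    """Return the keys whose target instance differs between two instance lists."""
    return [k for k in keys if route(k, before) != route(k, after)]


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(1000)]
    grown = INSTANCES + ["redis-d:6379"]  # hypothetical fourth instance
    print("user:42 ->", route("user:42", INSTANCES))
    print(len(moved_keys(keys, INSTANCES, grown)), "of", len(keys),
          "keys change instance after adding one node")
```

With a plain modulo rule, the data for every moved key has to be copied by hand or treated as a cache miss, which is the operational cost this section refers to.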

1.2 Proxy Sharding

In this model, a dedicated proxy program handles sharding. The proxy receives client requests, applies routing rules, forwards them to the appropriate Redis instances, and returns the responses. This reduces the burden on application code and simplifies operations, though it introduces a performance overhead. Twemproxy is a widely used open‑source example of this approach.

1.3 Redis Cluster

Redis Cluster is a decentralized solution without a central proxy. It maps all keys to 16 384 slots and distributes those slots among the cluster nodes. If the node a client contacts does not hold the slot for the requested key, the node replies with a redirection and the client retries against the correct node. While robust, it is more complex, and at the time of writing it sees limited adoption in production.
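As a rough sketch of the key-to-slot mapping: the Redis Cluster specification defines the slot as CRC16 of the key modulo 16 384, using the XMODEM variant of CRC16, and hashes only the part between `{` and `}` when a key contains a non-empty hash tag. The function names below are illustrative and not part of any client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: bytes) -> int:
    """Map a key to one of 16384 slots, honoring a non-empty {hash tag}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]  # hash only the tag content
    return crc16_xmodem(key) % 16384


if __name__ == "__main__":
    print(hash_slot(b"user:1000"))
    # Keys sharing a hash tag land in the same slot.
    print(hash_slot(b"{user1000}.following") == hash_slot(b"{user1000}.followers"))
```

Because the slot depends only on the key, any client or node can compute it locally. Only the slot-to-node assignment has to be learned from the cluster.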

2. Twemproxy and Its Limitations

Twemproxy, an open‑source proxy sharding solution from Twitter, forwards client requests to backend Redis servers based on routing rules. It solves the single‑node capacity problem, but the proxy itself becomes a single point of failure, so it needs an external high‑availability mechanism such as Keepalived.

Key pain points of Twemproxy include difficulty with smooth scaling (adding or removing Redis nodes) and lack of an operational control panel, making it cumbersome for operators.

3. Codis Practice

Codis, open‑sourced by Wandou Labs in 2014, is a Go/C‑based Redis distributed middleware that addresses Twemproxy’s shortcomings and adds many useful features. Internal tests show that Codis’s stability meets high‑availability requirements and its performance has improved from being 20 % slower than Twemproxy to nearly 100 % faster under certain conditions.

3.1 Architecture

Codis introduces the concept of a Group, which consists of one Redis master and at least one slave. Because applications connect to the Codis proxy rather than to Redis directly, an operator can switch the master of a group from the dashboard without changing application configuration.

Codis modifies the Redis server source (Codis Server) to support hot data migration (auto‑rebalance). It uses a pre‑sharding scheme with 1 024 slots. The slot assignments are stored in ZooKeeper, which also maintains group information and provides distributed locks.
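The pre‑sharding idea can be illustrated with a small sketch. The slot rule used here, CRC32 of the key modulo 1 024, is what the Codis documentation describes to the best of my knowledge and should be treated as an assumption. The in-memory table and the even initial layout are hypothetical stand-ins for the slot-to-group data that Codis keeps in ZooKeeper.

```python
import zlib

NUM_SLOTS = 1024


def codis_slot(key: bytes) -> int:
    """Assumed Codis slot rule: crc32(key) % 1024."""
    return zlib.crc32(key) % NUM_SLOTS


def build_slot_table(group_ids):
    """Assign contiguous slot ranges evenly to groups (hypothetical initial layout)."""
    return {slot: group_ids[slot * len(group_ids) // NUM_SLOTS]
            for slot in range(NUM_SLOTS)}


def migrate_slot(table, slot, target_group):
    """Reassign a single slot; in Codis the data itself is moved by Codis Server."""
    table[slot] = target_group


if __name__ == "__main__":
    table = build_slot_table([1, 2])       # two groups with hypothetical ids
    slot = codis_slot(b"user:1000")
    print("slot", slot, "-> group", table[slot])
    migrate_slot(table, slot, 3)           # rebalance this slot to a new group 3
    print("slot", slot, "-> group", table[slot])
```

Since keys map to slots and only slots map to groups, scaling out means reassigning slots one at a time rather than rehashing every key.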

3.2 Performance Comparison Tests

Benchmarks run over three months with redis‑benchmark compared Codis and Twemproxy across value sizes from 16 B to 10 MB. Four physical servers were used, with the Codis and Twemproxy clusters deployed separately.

Results show that for Set operations with value length < 888 B, Codis outperforms Twemproxy, and for Get operations Codis consistently performs better. Graphs illustrate these findings.

3.3 Usage Tips and Precautions

Key practical tips include:

1) Seamless Migration from Twemproxy

Codis provides a Codis‑port tool to sync data from an existing Twemproxy setup to a Codis cluster, after which only the proxy address in the application configuration needs to be changed.

2) Java HA Support

Codis offers a Java client called Jodis that automatically detects and bypasses failed Codis proxies, ensuring high availability for Java applications.
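The behavior described here can be approximated by a small proxy pool. The sketch below is in Python for consistency with the other examples and is not the Jodis API: the class, its method names, and the proxy addresses are hypothetical, and `update` stands in for whatever notification mechanism reports the current set of live proxies.

```python
import itertools
import threading


class ProxyPool:
    """Round-robin over the proxies currently believed to be alive."""

    def __init__(self, proxies):
        self._lock = threading.Lock()
        self._proxies = list(proxies)
        self._counter = itertools.count()

    def update(self, proxies):
        """Replace the live list; stand-in for a proxy membership notification."""
        with self._lock:
            self._proxies = list(proxies)

    def pick(self):
        """Return the next live proxy, or fail if none is available."""
        with self._lock:
            if not self._proxies:
                raise RuntimeError("no live Codis proxy available")
            return self._proxies[next(self._counter) % len(self._proxies)]


if __name__ == "__main__":
    pool = ProxyPool(["proxy-1:19000", "proxy-2:19000"])  # hypothetical addresses
    print(pool.pick(), pool.pick())
    pool.update(["proxy-2:19000"])  # proxy-1 reported as failed
    print(pool.pick(), pool.pick())
```

The application asks the pool for a proxy on each request, so a failed proxy drops out of rotation as soon as the live list is refreshed.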

3) Pipeline Support

Pipeline allows batching multiple requests, significantly boosting Set performance for values < 888 B and also improving Get performance, as shown in the benchmark graphs.
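To make the batching concrete, the sketch below encodes several commands in the Redis wire format (RESP arrays of bulk strings) and joins them into a single buffer. The helper names are illustrative. A real client would write the buffer to the proxy in one send and then read one reply per command.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def encode_pipeline(commands) -> bytes:
    """Concatenate several encoded commands so they can go out in one write."""
    return b"".join(encode_command(*cmd) for cmd in commands)


if __name__ == "__main__":
    batch = [("SET", "k1", "v1"), ("SET", "k2", "v2"), ("GET", "k1")]
    payload = encode_pipeline(batch)
    print(len(batch), "commands in", len(payload), "bytes, one network write")
```

The gain comes from paying the network round trip once per batch instead of once per command, which matters most when values are small.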

4) Codis Does Not Handle Master‑Slave Sync

Codis only manages the list of Redis servers; data consistency between master and slave must be ensured by operators, which keeps Codis lightweight and suitable for production.

5) Future Expectations

Users hope Codis will remain lightweight and improve pipeline performance for larger values, as current tests show a slowdown compared to Twemproxy for large payloads.

For more details, see the Codis source repository and documentation.
