Architecture and Design of Pika Native Distributed Cluster
The article explains the background, architecture, data distribution, processing flow, replication mechanisms, and management features of Pika's native distributed cluster, detailing how Etcd, LVS, and RocksDB are used to achieve scalable, persistent Redis-compatible storage with table isolation and flexible slot replication.
Background
Pika is a persistent, large-capacity Redis storage service compatible with most Redis interfaces (string, hash, list, zset, set). It addresses Redis's memory limitation by persisting data to disk, allowing migration without code changes. To meet growing demand for distributed clusters, a native distributed Pika cluster (v3.4) was released, eliminating the need for an additional Codis proxy.
Architecture
1. Cluster Deployment Structure
The example shows a three‑node Pika cluster. Deployment steps include:
Deploy an Etcd cluster as metadata storage for the Pika manager.
Install the Pika manager on three physical machines, configure the Etcd endpoints, and let the managers compete to become the leader; only the elected leader writes cluster metadata to Etcd.
Deploy Pika nodes on three machines and register them with the manager.
Register Pika service ports with LVS for load balancing.
2. Data Distribution
Pika introduces the concept of tables to isolate business data. Keys are hashed to slots; each slot has multiple replicas forming a replication group. One replica acts as the leader providing read/write services, while followers replicate data. The manager can migrate slots for load balancing and horizontal scaling.
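The key-to-slot mapping above can be sketched as a simple hash-mod function. The CRC32 hash and the slot count of 1024 below are illustrative assumptions, not Pika's documented internals; in Pika the slot count is configurable per table.

```python
import binascii

SLOT_COUNT = 1024  # assumed default; each Pika table can configure its own

def key_to_slot(key: bytes, slot_count: int = SLOT_COUNT) -> int:
    """Hash a key and map it onto one of the table's slots."""
    return binascii.crc32(key) % slot_count
```

Because the mapping is deterministic, every node (and the manager) agrees on which slot, and therefore which replication group, owns a given key.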
3. Data Processing
When a Pika node receives a request, the parsing layer interprets the Redis protocol and passes the result to the router. The router determines the target slot based on the key hash. If the slot resides on another node, a task is created and forwarded to the peer; otherwise, the request is processed locally. Write requests are logged by the replication manager and asynchronously replicated to follower replicas; only the leader writes to the RocksDB instance.
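The router's decision can be modeled as below. The routing table, node names, and ports are purely hypothetical stand-ins for the metadata the manager distributes; this is a sketch of the control flow, not Pika's actual API.

```python
import binascii

# Hypothetical routing table (slot -> owning node); in a real cluster this
# metadata comes from the Pika manager. Names and ports are illustrative.
SLOT_COUNT = 3
LOCAL_NODE = "node-a:9221"
slot_owner = {0: "node-a:9221", 1: "node-b:9221", 2: "node-c:9221"}

def route(key: bytes):
    """Decide whether a parsed request is handled locally or forwarded."""
    slot = binascii.crc32(key) % SLOT_COUNT
    owner = slot_owner[slot]
    if owner == LOCAL_NODE:
        return ("local", slot)   # process on this node's replica of the slot
    return ("forward", owner)    # create a task and ship it to the peer node
```

A write that lands on the local leader replica would then be appended to the binlog by the replication manager before being applied to RocksDB.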
4. Log Replication
Non‑consistent Replication
In this mode, the processing thread locks, writes the binlog, and updates the DB directly, then returns the response. An auxiliary thread asynchronously sends BinlogSync to followers, which acknowledge with BinlogSyncAck.
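The steps above can be sketched with two small classes. The structure is a simplification I am assuming for illustration: the auxiliary replication thread is modeled as an explicit `binlog_sync()` call, and a Python dict stands in for RocksDB.

```python
class Replica:
    def __init__(self):
        self.binlog = []   # replicated operation log
        self.db = {}       # stands in for the RocksDB instance

class Leader(Replica):
    def __init__(self, followers):
        super().__init__()
        self.followers = followers
        self.sent = 0      # next binlog offset to ship to followers

    def write(self, key, value):
        # Non-consistent mode: log, apply, and reply to the client
        # without waiting for any follower.
        self.binlog.append((key, value))
        self.db[key] = value
        return "OK"

    def binlog_sync(self):
        # Normally done asynchronously by an auxiliary thread; each
        # follower applies the entry and replies BinlogSyncAck.
        for key, value in self.binlog[self.sent:]:
            for f in self.followers:
                f.binlog.append((key, value))
                f.db[key] = value
        self.sent = len(self.binlog)
```

The client observes low latency because the response is returned before replication completes, at the cost of a window where followers lag the leader.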
Consistent Replication (Raft)
The processing thread writes the request to the binlog, sends BinlogSync to followers, waits for acknowledgments from a majority, then writes to the DB and returns the response to the client.
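The majority-ack write path can be sketched as follows. This is a synchronous simplification under an assumed failure model (each follower either acks the BinlogSync or is unreachable); real Raft also involves terms, log matching, and retries, which are omitted here.

```python
def consistent_write(followers_alive, key, value, db, binlog):
    """Quorum write: append to the binlog, count BinlogSyncAck replies,
    and apply to the DB only once a majority has acknowledged."""
    binlog.append((key, value))
    # The leader counts toward the majority; each reachable follower acks.
    acks = 1 + sum(1 for alive in followers_alive if alive)
    majority = (1 + len(followers_alive)) // 2 + 1
    if acks >= majority:
        db[key] = value              # apply to RocksDB after quorum
        return "OK"
    return "ERR no quorum"           # entry stays in the log, client sees failure
```

Compared with the non-consistent mode, the client pays one replication round-trip per write but is guaranteed the data survives the loss of any minority of replicas.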
5. Cluster Metadata Management
The Pika manager (PM), built on the Codis dashboard, serves as the global control node, storing metadata and routing information. New features include multi‑table creation, configurable slot and replica counts, table‑level password isolation, slot migration, integrated sentinel for health checks, and Etcd‑backed metadata for high availability. The manager achieves HA by competing for a lock in Etcd to become the leader.
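The lock-based leader election can be illustrated with a compare-and-swap sketch. The in-memory `FakeEtcd` and the lock key `/pika/pm-leader` are hypothetical stand-ins; a real deployment would use an Etcd client with leases so the lock expires if the leader dies.

```python
class FakeEtcd:
    """In-memory stand-in for Etcd's create-if-absent semantics."""
    def __init__(self):
        self.store = {}

    def put_if_absent(self, key, value):
        # Mimics an Etcd transaction: succeed only if the key does not exist.
        if key in self.store:
            return False
        self.store[key] = value
        return True

def campaign(etcd, manager_id, lock_key="/pika/pm-leader"):
    """Each PM instance tries to create the lock key; the first wins."""
    return etcd.put_if_absent(lock_key, manager_id)
```

Standby managers that lose the campaign simply retry; when the leader's lease lapses in real Etcd, the key disappears and a standby takes over.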
Postscript
The native Pika cluster overcomes single-node disk capacity limits and supports horizontal scaling, though it currently lacks automatic Raft-based leader election, range-based data distribution, and monitoring dashboards. Future releases aim to address these gaps.
360 Tech Engineering
The official technology channel of 360, building a professional technology platform for the brand.