Inside Our High‑Performance Self‑Built Redis System: Architecture, Features & Ops
This article details the design and implementation of a self‑managed Redis KV cache system spanning tens of terabytes, covering its Proxy‑based architecture, ConfigServer high‑availability via Raft, Redis‑Proxy slot routing, async‑fork optimizations, data migration strategies, and a comprehensive automation platform for deployment, scaling, monitoring, and stability governance.
Overview
The self‑built Redis system, developed by the DBA team, manages dozens of terabytes of memory across hundreds of Redis clusters and tens of thousands of nodes, including several clusters that each exceed 1 TB of memory. It adopts a Proxy architecture composed of ConfigServer, Redis‑Proxy, and Redis‑Server, complemented by an automation platform for instance deployment, resource management, diagnostics, and analysis.
Core Components
ConfigServer
ConfigServer is a critical component deployed across multiple availability zones using the Raft consensus protocol to ensure high availability. Its responsibilities include:
Adding and removing proxies, groups, and Redis‑Server instances; handling manual master‑slave switchovers, horizontal scaling, and data migration.
Updating Redis‑Proxy with topology changes.
Detecting Redis‑Server failures and performing automatic failover.
Each Redis cluster runs an independent ConfigServer group with at least three nodes spread across three zones, guaranteeing both ConfigServer and Redis‑Server high availability. Unlike open‑source Redis Sentinel (single‑threaded C implementation), ConfigServer leverages Go's goroutines to run a separate routine per group, enabling concurrent health checks.
Failure Detection: ConfigServer periodically sends PING and INFO commands to each Redis‑Server. If a command times out, the node is marked subjectively offline and the state is propagated. When a majority of ConfigServers mark a node as subjectively offline, the leader marks it objectively offline and initiates failover.
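The step from subjectively to objectively offline amounts to a majority vote. The sketch below is only an illustration of that rule, not the ConfigServer code itself (which the article says is written in Go); the function name, the dictionary shape, and the ConfigServer ids are assumptions.

```python
def is_objectively_offline(votes: dict[str, bool]) -> bool:
    """Return True when a strict majority of ConfigServers see the node as down.

    `votes` maps a ConfigServer id (hypothetical) to True if that ConfigServer's
    PING/INFO probe timed out, i.e. it marked the node subjectively offline.
    """
    offline = sum(1 for down in votes.values() if down)
    return offline > len(votes) // 2


# Three ConfigServers, one per zone.
print(is_objectively_offline({"cs-a": True, "cs-b": True, "cs-c": False}))   # True
print(is_objectively_offline({"cs-a": True, "cs-b": False, "cs-c": False}))  # False
```

With three ConfigServers, a single node that loses connectivity to a Redis‑Server cannot trigger a failover on its own.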
Failover Process: the best slave is selected by applying the following criteria in order:
Excluding unhealthy slaves (subjective/objective offline).
Highest slave‑priority.
Largest replication offset.
Smallest runid.
The chosen slave is promoted with SLAVEOF NO ONE, other slaves are re‑attached, and the old master is monitored for possible reintegration.
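The selection rules above can be expressed as a single sort key. This is a minimal sketch under stated assumptions: the `Slave` record and `pick_slave` function are hypothetical, and "highest slave‑priority" is read here as the largest numeric value. (Open‑source Redis Sentinel, by contrast, prefers the lowest non‑zero priority number, so check which convention applies before reusing this.)

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slave:
    runid: str
    priority: int     # slave-priority
    repl_offset: int  # replication offset
    healthy: bool     # False if subjectively or objectively offline


def pick_slave(slaves: list[Slave]) -> Optional[Slave]:
    """Pick the promotion candidate, or None if no healthy slave exists."""
    candidates = [s for s in slaves if s.healthy]
    if not candidates:
        return None
    # ASSUMPTION: a larger priority number means "more preferred".
    # Ties fall through to the largest offset, then the smallest runid.
    return min(candidates, key=lambda s: (-s.priority, -s.repl_offset, s.runid))


slaves = [
    Slave("b2", priority=100, repl_offset=500, healthy=True),
    Slave("a1", priority=100, repl_offset=500, healthy=True),
    Slave("c3", priority=100, repl_offset=900, healthy=False),  # excluded
]
print(pick_slave(slaves).runid)  # a1
```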
Redis‑Proxy
Redis‑Proxy acts as a stateless gateway, exposing a single Redis endpoint while internally routing each command to the appropriate Redis‑Server based on the key's slot:
slot(key) = crc32(key) % 1024
For each request, the Proxy computes the slot, looks up the responsible Redis‑Server, forwards the command, and returns the result. Optimizations cut temporary object allocations roughly 20‑fold, which reduced GC pressure and improved QPS by ~10 % in short‑connection scenarios.
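The slot formula and the table lookup can be sketched in a few lines. This is an illustration only: the article does not say which CRC‑32 variant the Proxy uses, so `zlib.crc32` is an assumption, and the group names and the half‑and‑half slot table are made up for the example.

```python
import zlib

SLOT_COUNT = 1024


def slot_for_key(key: bytes) -> int:
    """slot(key) = crc32(key) % 1024, using zlib's CRC-32 (an assumption)."""
    return zlib.crc32(key) % SLOT_COUNT


def route(key: bytes, slot_table: list[str]) -> str:
    """Return the group that owns the key's slot."""
    return slot_table[slot_for_key(key)]


# Hypothetical topology: two groups, each owning half of the slots.
slot_table = ["group-1" if s < 512 else "group-2" for s in range(SLOT_COUNT)]
print(slot_for_key(b"user:42"), route(b"user:42", slot_table))
```

Because the mapping depends only on the key and the slot table, any Proxy instance routes a given key to the same group, which is what allows the Proxy layer to stay stateless.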
Same‑City Active‑Active enables read‑writes in multiple zones: writes are directed to the local master, reads prefer the nearest replica, falling back to remote zones when needed. Configuration can be driven by container ServiceName or cloud provider PrivateZone DNS.
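The nearest‑replica read rule can be sketched as a two‑step preference: a healthy replica in the caller's zone first, any healthy replica otherwise. The `Replica` record, the function name, and the addresses are hypothetical; how the Proxy actually discovers zones is not described in the article.

```python
from typing import NamedTuple, Optional


class Replica(NamedTuple):
    address: str
    zone: str
    healthy: bool


def pick_read_target(local_zone: str, replicas: list[Replica]) -> Optional[str]:
    """Prefer a healthy replica in the local zone, else fall back to a remote one."""
    healthy = [r for r in replicas if r.healthy]
    local = [r for r in healthy if r.zone == local_zone]
    chosen = local or healthy  # empty local list means cross-zone fallback
    return chosen[0].address if chosen else None


replicas = [
    Replica("10.0.1.10:6379", "zone-a", healthy=False),
    Replica("10.0.2.10:6379", "zone-b", healthy=True),
]
print(pick_read_target("zone-a", replicas))  # 10.0.2.10:6379 (remote fallback)
print(pick_read_target("zone-b", replicas))  # 10.0.2.10:6379 (local)
```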
Asynchronous Dual‑Write supports simultaneous writes to cloud Redis and the self‑built Redis during migration, ensuring real‑time consistency and enabling seamless rollbacks.
Redis‑Server
Built on open‑source Redis, Redis‑Server adds slot synchronization, async migration, and an async‑fork feature. The cluster follows a Share‑Nothing design: each Group contains one master and multiple slaves, with no inter‑Group communication. All 1024 slots are evenly distributed among Groups.
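An even distribution of 1024 slots can be computed with integer division, handing the remainder to the first Groups. The sketch assumes contiguous slot ranges, which the article does not confirm; `distribute_slots` is a hypothetical helper.

```python
def distribute_slots(group_count: int, slot_count: int = 1024) -> list[range]:
    """Split slots [0, slot_count) into contiguous, near-equal ranges per Group."""
    if group_count <= 0:
        raise ValueError("group_count must be positive")
    base, extra = divmod(slot_count, group_count)
    ranges = []
    start = 0
    for i in range(group_count):
        size = base + (1 if i < extra else 0)  # first `extra` Groups get one more slot
        ranges.append(range(start, start + size))
        start += size
    return ranges


print([len(r) for r in distribute_slots(3)])  # [342, 341, 341]
print([len(r) for r in distribute_slots(4)])  # [256, 256, 256, 256]
```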
Async‑Fork reduces fork latency to ~200 µs regardless of data size, a 98 % reduction compared with vanilla Redis. TP100 latency stays at 1‑2 ms, which eliminates the latency spikes otherwise seen during AOF rewrite, RDB snapshot, or full sync.
Data Migration
Horizontal scaling requires rebalancing slots. Two migration modes are provided:
Synchronous migration: the source MIGRATE blocks until the target loads data and acknowledges success, which can stall other operations due to Redis's single‑threaded nature.
Asynchronous migration: the source serializes data with DUMP, sends it asynchronously, returns success immediately, and the target restores with RESTORE. An ACK from the target triggers key deletion on the source, minimizing impact on live traffic.
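The ordering of the asynchronous steps (send, restore, ACK, delete) can be modelled with a toy in‑memory example. Nothing here talks to a real Redis: plain dicts stand in for the two servers, a queue stands in for the network, and the class and method names are hypothetical.

```python
from collections import deque
from typing import Optional


class AsyncMigration:
    """Toy model of asynchronous key migration between two in-memory stores."""

    def __init__(self, source: dict, target: dict):
        self.source = source
        self.target = target
        self.in_flight = deque()  # payloads sent but not yet acknowledged

    def send(self, key: str) -> bool:
        """Source side: serialize the value (the DUMP step), queue it, return at once."""
        if key not in self.source:
            return False
        self.in_flight.append((key, self.source[key]))
        return True  # success is reported before the target has restored anything

    def deliver_one(self) -> Optional[str]:
        """Target side: restore one payload (the RESTORE step), then ACK."""
        if not self.in_flight:
            return None
        key, payload = self.in_flight.popleft()
        self.target[key] = payload
        self.source.pop(key, None)  # the source deletes its copy only after the ACK
        return key


source, target = {"user:1": b"alice"}, {}
m = AsyncMigration(source, target)
m.send("user:1")
print("user:1" in source, "user:1" in target)  # True False
m.deliver_one()
print("user:1" in source, "user:1" in target)  # False True
```

The key point is that the source keeps its copy until the ACK arrives, so the key is never absent from both sides. The sketch ignores writes that arrive for a key while it is in flight, which a real implementation has to handle.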
Automation Operations Platform
The platform comprises Redis‑Admin, Kv‑Admin, Kv‑Agent, and monitoring components (APM/Prometheus):
Redis‑Admin provides a UI for all operational tasks such as instance deployment, scaling, and data migration.
Kv‑Admin handles request scheduling, machine recommendation, port allocation, SLB recommendation, instance listing, offline data analysis, and resource reporting.
Kv‑Agent runs on each ECS, executing deployment, start/stop actions, and exposing Redis‑Server metrics via an Exporter for Prometheus scraping.
Instance Deployment
Deployment automates package version selection, zone and spec configuration, and orchestrates the provisioning of ConfigServer, Redis‑Proxy, and Redis‑Server components, including machine recommendation, config generation, and service startup.
Key deployment rules ensure high availability:
ConfigServer spans three zones.
Redis‑Server and Proxy nodes of the same cluster are placed on different ECS instances.
At most 90 % of each ECS's memory may be allocated to Redis workloads.
Resource recommendation prefers ECS with the most free memory.
Slots are distributed evenly across the Groups.
SLB with the highest recent traffic is auto‑bound.
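Three of the placement rules above (the 90 % memory cap, the preference for the most free memory, and separate ECS instances for nodes of the same cluster) combine naturally into one filter‑and‑rank step. This is a minimal sketch, not Kv‑Admin's actual logic; the `Ecs` record, the `recommend` function, and the sample figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MAX_REDIS_FRACTION = 0.9  # at most 90 % of an ECS's memory goes to Redis


@dataclass
class Ecs:
    name: str
    total_gb: float
    used_gb: float  # memory already allocated to Redis nodes


def free_gb(m: Ecs) -> float:
    """Memory still available for Redis under the 90 % cap."""
    return m.total_gb * MAX_REDIS_FRACTION - m.used_gb


def recommend(machines: list[Ecs], need_gb: float,
              exclude: frozenset = frozenset()) -> Optional[Ecs]:
    """Pick the ECS with the most free memory that can fit `need_gb`.

    `exclude` holds ECS names that already host a node of the same cluster.
    """
    candidates = [m for m in machines
                  if m.name not in exclude and free_gb(m) >= need_gb]
    return max(candidates, key=free_gb, default=None)


machines = [Ecs("ecs-1", 64, 50), Ecs("ecs-2", 64, 20), Ecs("ecs-3", 128, 100)]
print(recommend(machines, need_gb=8).name)                               # ecs-2
print(recommend(machines, need_gb=8, exclude=frozenset({"ecs-2"})).name)  # ecs-3
```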
Scaling
Vertical scaling adjusts the maxmemory of a single node. It is preferred for nodes under 4 GB because it is simple and does not affect live traffic.
Horizontal scaling adds Groups (thus more master‑slave pairs), rebalances slots, and triggers automated data migration with visual progress tracking.
Resource Management
The platform tracks ECS memory usage, recommends machines with abundant free memory, and enforces per‑instance resource caps. It also provides utilization reports and supports tagging for isolation (e.g., high‑traffic, special‑requirement pools).
Monitoring & Alerting
Metrics collected include:
ECS: CPU, load, memory, network, packet loss, disk I/O.
Proxy: QPS, TP999/TP9999/TP100, connections, CPU, memory, GC count, goroutine count.
Server: CPU, memory, network, connections, QPS, key count, hit rate, response time.
Alert rules cover resource saturation, node failures, master‑slave inconsistencies, and SLB traffic spikes.
Stability Governance
To improve reliability, the system employs:
Resource Isolation: Tag‑based segregation of ECS pools to prevent noisy‑neighbor effects.
Automated Inspection: Periodic checks for missing slaves, zone mismatches, and configuration drift, with a dashboard displaying instance health scores based on weighted metrics (CPU, memory, RT, traffic, etc.).
Fault Drills: Simulated failures across components, networks, and disks. Measured recovery times are ~12 s for a Redis‑Server master failover and ≤5 s for a Proxy outage.
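A health score built from weighted metrics can be sketched as a weighted average of normalized utilization. The article does not give the actual formula or weights, so everything below (the function, the 0 to 100 scale, and the example weights) is a hypothetical illustration of the idea.

```python
def health_score(utilization: dict[str, float], weights: dict[str, float]) -> float:
    """Return a 0-100 score; 100 means every weighted metric is idle.

    `utilization` holds each metric normalized to [0, 1] (values outside the
    range are clamped); metrics missing from it count as 0.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    penalty = sum(
        w * min(max(utilization.get(name, 0.0), 0.0), 1.0)
        for name, w in weights.items()
    )
    return round(100.0 * (1.0 - penalty / total_weight), 1)


# Hypothetical weights and readings, for illustration only.
weights = {"cpu": 0.3, "memory": 0.3, "rt": 0.2, "traffic": 0.2}
reading = {"cpu": 0.5, "memory": 0.8, "rt": 0.1, "traffic": 0.4}
print(health_score(reading, weights))
```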
Conclusion & Future Work
This article presented the full architecture, key features, and operational tooling of the self‑built Redis system. Ongoing improvements include upgrading Redis‑Server to the latest community version (e.g., 7.0), adding hot‑key statistics with local caching, and rewriting the Proxy in Rust for high‑QPS workloads to remove GC overhead.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
