
Qunar Redis High‑Availability Architecture Design, Security Mechanisms, and Automated Operations

This article details the design principles, components, client implementation, data sharding, security mechanisms, high‑risk command blocking, configuration optimizations, and automated operational workflows of Qunar's Redis high‑availability cluster, including code modifications, deployment scripts, and platform‑based management for large‑scale production environments.

Qunar Tech Salon

Introduction

Leng Zhenglei joined Qunar's DBA team in February 2018 and is responsible for Redis and MySQL operations and for developing automation platforms.

Qunar Redis High‑Availability Architecture Design Principles

Overview

The architecture has five parts: Redis server nodes deployed as master‑slave pairs, a Zookeeper cluster that notifies clients of configuration changes, a Redis Sentinel cluster that performs failover, a MySQL‑based configuration center, and client applications that use Zookeeper to locate the configuration center and then fetch their connection details from it.

Client Implementation

Clients first obtain the configuration address from Zookeeper, then query the configuration center for connection details, and finally establish a Redis connection. Two threads are started: one monitors Zookeeper for changes, and the other polls the configuration center every 10 seconds to refresh connections.
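The refresh logic described above can be sketched as follows. This is a minimal illustration, not Qunar's client: `ConnectionRefresher`, `fetch_config`, and `connect` are hypothetical names, the two callables stand in for the configuration-center query and the Redis connection setup, and `on_zk_change` is assumed to be registered as a callback that a Zookeeper client library invokes from its own watcher thread.

```python
import threading
from typing import Any, Callable, Dict, Optional


class ConnectionRefresher:
    """Keeps one connection in sync with the configuration center (sketch)."""

    def __init__(
        self,
        fetch_config: Callable[[], Dict[str, Any]],  # hypothetical: queries the config center
        connect: Callable[[Dict[str, Any]], Any],    # hypothetical: opens a Redis connection
        poll_interval: float = 10.0,                 # the article states a 10-second poll
    ) -> None:
        self._fetch_config = fetch_config
        self._connect = connect
        self._poll_interval = poll_interval
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._config: Optional[Dict[str, Any]] = None
        self.connection: Any = None

    def refresh_once(self) -> bool:
        """Fetch the config and reconnect only if it changed. Returns True on reconnect."""
        with self._lock:  # serializes the watcher thread and the polling thread
            new_config = self._fetch_config()
            if new_config == self._config:
                return False
            self.connection = self._connect(new_config)
            self._config = new_config
            return True

    def on_zk_change(self, *_args: Any) -> None:
        """Callback for the Zookeeper watcher thread (assumed to be wired up by the caller)."""
        self.refresh_once()

    def _poll_loop(self) -> None:
        # Event.wait returns False on timeout, so the loop runs once per interval
        # and exits promptly after stop() is called.
        while not self._stop.wait(self._poll_interval):
            self.refresh_once()

    def start(self) -> threading.Thread:
        self.refresh_once()  # establish the initial connection
        thread = threading.Thread(target=self._poll_loop, daemon=True)
        thread.start()
        return thread

    def stop(self) -> None:
        self._stop.set()
```

Polling alongside the watcher gives the client a second path to pick up a change if a Zookeeper notification is missed, at the cost of one configuration-center query per interval.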

Data Sharding Method

Keys are hashed with MurmurHash2 into a 32‑bit hash space that is divided among N nodes, so keys are distributed evenly across them. The configuration center stores the shard information, including each shard's master instance and hash range.
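A hash-range lookup of this kind might look like the sketch below. The `murmurhash2` function follows the public 32-bit MurmurHash2 algorithm; the seed value, the equal-width split of the hash space, and the names `build_ranges` and `locate` are assumptions for illustration, since the article does not specify them.

```python
import bisect
from typing import List, Tuple

HASH_SPACE = 1 << 32  # 32-bit hash space
MASK = 0xFFFFFFFF


def murmurhash2(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash2. The seed is an assumption; the article does not give one."""
    m = 0x5BD1E995
    length = len(data)
    h = (seed ^ length) & MASK
    i = 0
    while length - i >= 4:  # mix four bytes at a time, little-endian
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & MASK
        k ^= k >> 24
        k = (k * m) & MASK
        h = ((h * m) & MASK) ^ k
        i += 4
    rem = length - i  # fold in the trailing 0 to 3 bytes
    if rem == 3:
        h ^= data[i + 2] << 16
    if rem >= 2:
        h ^= data[i + 1] << 8
    if rem >= 1:
        h ^= data[i]
        h = (h * m) & MASK
    h ^= h >> 13  # final avalanche
    h = (h * m) & MASK
    h ^= h >> 15
    return h


def build_ranges(masters: List[str]) -> List[Tuple[int, int, str]]:
    """Split the hash space into equal contiguous ranges, one per master (assumed layout)."""
    n = len(masters)
    step = HASH_SPACE // n
    ranges = []
    for i, master in enumerate(masters):
        start = i * step
        end = HASH_SPACE - 1 if i == n - 1 else (i + 1) * step - 1
        ranges.append((start, end, master))
    return ranges


def locate(key: str, ranges: List[Tuple[int, int, str]], seed: int = 0) -> str:
    """Return the master whose hash range contains the key's hash."""
    h = murmurhash2(key.encode("utf-8"), seed)
    starts = [r[0] for r in ranges]
    return ranges[bisect.bisect_right(starts, h) - 1][2]
```

Because a key's shard depends on the range boundaries, changing the number of nodes moves boundaries and forces keys to be re-hashed, which is the scaling limitation the article lists later.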

Architecture Features

Custom Redis client bypasses Sentinel; Sentinel handles only HA.

Centralized configuration via Zookeeper and the configuration center.

Port resources are reusable after node removal.

Reduced Sentinel coupling and count (only five Sentinel nodes).

Namespace‑based client access simplifies DBA management.

Architecture Limitations

Limited client support (Java and Python only).

No fast horizontal scaling; expanding memory or node count requires re‑hashing keys.

Many dependent components increase failure risk and operational complexity.

Some native Redis commands (transactions, Lua scripts) are unavailable.

Qunar Redis Security Mechanism

Redis is modified to include a trustedip whitelist and a clientcipher for authentication. Only clients whose IP is in the whitelist can execute high‑risk commands.

Client Uses clientcipher and IP Whitelist

Clients authenticate using namespace and a generated clientcipher.

IP whitelist can hold up to 32 entries and is dynamically configurable.
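The whitelist behaviour can be modelled in a few lines. The actual change is a C patch to Redis, so the Python below is only a sketch of the logic: the class name `TrustedIPList` and its methods are hypothetical, and only the 32-entry limit and the ability to change entries at runtime come from the article.

```python
import ipaddress

MAX_TRUSTED_IPS = 32  # limit stated in the article


class TrustedIPList:
    """In-memory model of a bounded, runtime-editable IP whitelist (sketch)."""

    def __init__(self) -> None:
        self._ips = set()

    def add(self, ip: str) -> None:
        addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
        if addr not in self._ips and len(self._ips) >= MAX_TRUSTED_IPS:
            raise ValueError("trusted IP list is full")
        self._ips.add(addr)

    def remove(self, ip: str) -> None:
        self._ips.discard(ipaddress.ip_address(ip))

    def is_trusted(self, ip: str) -> bool:
        try:
            return ipaddress.ip_address(ip) in self._ips
        except ValueError:
            return False  # a malformed address is never trusted

    def __len__(self) -> int:
        return len(self._ips)
```

In this model, a connection handler would call `is_trusted` once when the client connects and record the result, in the same spirit as the `is_super_client` flag described in the next section.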

Code Modifications

The patch adds a trustedIPArray struct in server.h and an isTrustedIP function in networking.c that checks client IPs. createClient sets is_super_client according to the client's whitelist status, processCommand enforces super‑client authentication, and checkCommandBeforeExec in db.c restricts dangerous commands.

Blocking High‑Risk Commands

High‑risk commands such as INFO, KEYS *, SHUTDOWN, FLUSHDB, SAVE, BGSAVE, CONFIG SET, and SLAVEOF are guarded by checkCommandBeforeExec, so that only authorized super clients can execute them.
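The gate can be illustrated with a small Python model. This is not the C code from the patch: the function name mirrors checkCommandBeforeExec for readability only, the command set contains just the examples named above, and treating every KEYS call as high-risk regardless of its pattern is an assumption.

```python
from typing import Sequence

# Commands named in the article. KEYS is blocked for any pattern here (assumption).
HIGH_RISK_COMMANDS = {"INFO", "KEYS", "SHUTDOWN", "FLUSHDB", "SAVE", "BGSAVE", "SLAVEOF"}
# Commands that are high-risk only with a specific subcommand.
HIGH_RISK_SUBCOMMANDS = {("CONFIG", "SET")}


def check_command_before_exec(argv: Sequence[str], is_super_client: bool) -> bool:
    """Return True if the command may run, False if it must be rejected."""
    if not argv:
        return False
    name = argv[0].upper()  # Redis command names are case-insensitive
    sub = argv[1].upper() if len(argv) > 1 else ""
    risky = name in HIGH_RISK_COMMANDS or (name, sub) in HIGH_RISK_SUBCOMMANDS
    return is_super_client or not risky
```

For example, `check_command_before_exec(["flushdb"], is_super_client=False)` is rejected, while an ordinary `GET` from the same client passes.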

Configuration Optimization

Master nodes disable BGSAVE and BGREWRITEAOF, while slave nodes enable AOF, run BGSAVE on a periodic schedule, and set slave-read-only to yes.
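One way to express this split is as per-role configuration overrides. The directives below (`save`, `appendonly`, `slave-read-only`) are standard redis.conf settings, but mapping them to the policy above is an assumption: the article does not show Qunar's actual configuration files, and the helper names are hypothetical.

```python
from typing import Dict


def role_overrides(role: str) -> Dict[str, str]:
    """Return redis.conf overrides for a node role (assumed mapping, not Qunar's files)."""
    if role == "master":
        return {
            "save": '""',        # no automatic RDB snapshots, so no background BGSAVE
            "appendonly": "no",  # AOF off, so no AOF rewrite can be triggered
        }
    if role == "slave":
        return {
            "appendonly": "yes",       # persistence is handled on the slave
            "slave-read-only": "yes",  # reject writes sent to the slave
        }
    raise ValueError(f"unknown role: {role}")


def render(overrides: Dict[str, str]) -> str:
    """Render overrides as redis.conf lines, one directive per line."""
    return "\n".join(f"{key} {value}" for key, value in overrides.items())
```

The periodic BGSAVE on slaves is not a configuration directive in this sketch; it would be issued by a scheduled job, such as the cron-managed BGSAVE script mentioned under Unified Management Tools.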

Qunar Redis Automated Operations

System Environment Initialization

System parameters (e.g., vm.overcommit_memory=1, vm.swappiness=0, file descriptor limits) are set via sed and echo commands in the RPM spec file.

Unified Management Tools

Scripts for AOF switching, auto‑upgrade, BGSAVE, memory checks, RDB backups, and monitoring are placed under /etc/cron.d and integrated with Collectd and NRPE for health checks.

Single‑Machine Multi‑Instance Deployment

The installation package supports multiple Redis versions (2.8.6, 3.0.7, 4.0.14) and provides a redis_install.sh script with parameters for port, version, password, and memory size.

Git‑Managed Sentinel

Sentinel configuration files are version‑controlled with Git and follow the naming convention {port}_redis_{namespace}.conf. A change to these files triggers updates to the configuration center and Zookeeper, and notifications are sent to the DBAs and the incident platform.
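Tooling that reacts to a changed file first has to recover the port and namespace from its name. A minimal sketch of that step, assuming the port is purely numeric and the namespace may itself contain underscores (the function names and the example namespace are hypothetical):

```python
import re
from typing import NamedTuple

# {port}_redis_{namespace}.conf, as described in the article.
_CONF_NAME = re.compile(r"^(?P<port>\d+)_redis_(?P<namespace>.+)\.conf$")


class SentinelConfName(NamedTuple):
    port: int
    namespace: str


def parse_conf_name(filename: str) -> SentinelConfName:
    """Split a Sentinel config filename into its port and namespace."""
    match = _CONF_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected Sentinel config name: {filename}")
    port = int(match.group("port"))
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return SentinelConfName(port, match.group("namespace"))


def format_conf_name(port: int, namespace: str) -> str:
    """Build the filename for a given port and namespace."""
    return f"{port}_redis_{namespace}.conf"
```

Rejecting names that do not match the convention keeps a stray file in the repository from being pushed to the configuration center by mistake.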

Platformized Operations

Cluster deployment and instance migration are automated through a web portal. Deployment selects idle servers, notifies stakeholders via Qtalk and email, and logs actions to the incident platform. Migration tasks (partial or full‑machine) are generated automatically and progress is tracked without manual intervention.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
