
Achieving High Availability for MySQL & Redis on MaShang Cloud with Distributed Sentinel

This article explains MaShang Cloud's RDS high‑availability design, detailing the distributed sentinel monitoring system, proxy layer, multi‑AZ disaster‑recovery strategies, and real‑world case studies that demonstrate how MySQL and Redis services maintain continuous, consistent access with minimal RTO and RPO.

Instant Consumer Technology Team

Introduction

MaShang Cloud provides relational database service (RDS) products such as MySQL, TiDB, and PostgreSQL, with transparent access through virtual IPs, elastic IPs (EIPs), or domain names. This article focuses on the MySQL‑based high‑availability design, its architecture, and how it compares with other cloud providers.

High‑Availability Architecture

Engineers designed a multi‑node distributed sentinel system that monitors both MySQL clusters and proxy instances, automatically detecting failures and performing failover without relying on simple ping checks.

Key components include:

DB Sentinel Cluster – monitors MySQL/RDS node health.

Proxy Sentinel Cluster – monitors proxy service health.

Synchronization Configuration Service – stores topology and role information in ZooKeeper, etcd, or Nacos.
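As a rough illustration of the kind of topology and role record such a configuration service might hold, the sketch below models a cluster as a small data structure and serializes it to JSON. The class names, field names, role strings, and key layout are all assumptions made for illustration; the article does not describe the actual schema.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Node:
    host: str
    port: int
    role: str  # "primary" or "replica" (hypothetical role names)


@dataclass
class ClusterTopology:
    cluster_id: str
    epoch: int = 0  # bumped on every role change so stale updates can be detected
    nodes: List[Node] = field(default_factory=list)

    def key(self) -> str:
        # Hypothetical key layout in the configuration store.
        return f"/rds/topology/{self.cluster_id}"

    def promote(self, host: str) -> None:
        """Make `host` the primary, demote every other node, and bump the epoch."""
        if host not in {n.host for n in self.nodes}:
            raise ValueError(f"unknown node: {host}")
        for n in self.nodes:
            n.role = "primary" if n.host == host else "replica"
        self.epoch += 1

    def to_json(self) -> str:
        # The serialized form is what would be written under `key()`.
        return json.dumps(asdict(self), sort_keys=True)
```

A monotonically increasing epoch is one common way to let proxies ignore an out-of-date topology after a failover; whether MaShang Cloud does this is not stated in the article.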

Proxy Layer

The stateless proxy cluster handles user requests, routes traffic based on read/write roles, enforces access control, maintains connection pools, and can filter or throttle SQL statements. Common proxy software includes ProxySQL, DBProxy, MySQL Proxy, ArkProxy, and ShardingSphere‑Proxy.
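Role-based routing can be sketched as follows. This is a deliberately simplified model, not how any of the proxies named above parse SQL: the `ReadWriteRouter` class is hypothetical, statements are classified by a naive prefix match, and backends are represented as plain strings.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread plain reads across replicas."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin iterator over replicas; None when there are no replicas.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, in_transaction=False):
        stmt = sql.lstrip().lower()
        is_read = stmt.startswith(self.READ_PREFIXES)
        # Writes, locking reads, and anything inside a transaction go to the primary.
        if in_transaction or not is_read or "for update" in stmt:
            return self.primary
        if self._replicas is None:
            return self.primary
        return next(self._replicas)
```

For example, `ReadWriteRouter("primary", ["r1", "r2"]).route("UPDATE t SET a = 1")` returns `"primary"`, while successive plain `SELECT` statements alternate between `"r1"` and `"r2"`.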

Sentinel Clusters

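Both the DB and Proxy sentinel clusters consist of 3 to 5 nodes distributed across fault domains. They use a custom Raft‑based consensus protocol to agree on node status: a node is subjectively down (SDOWN) when a single sentinel can no longer reach it, and objectively down (ODOWN) once enough sentinels agree. An ODOWN verdict triggers failover; planned role changes are handled as switchovers.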
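The step from SDOWN (one sentinel's local view that a node is down) to ODOWN (the cluster's agreed view) can be sketched as a vote count. The functions below assume a strict-majority quorum, which is how Raft-style protocols usually decide; the article does not state the actual quorum rule, and both function names are hypothetical.

```python
def is_odown(sdown_votes, sentinel_count):
    """Return True when a strict majority of sentinels report the node as SDOWN.

    sdown_votes: mapping of sentinel id -> bool (True means "I see it down").
    sentinel_count: total sentinels in the cluster, including ones that did not vote.
    """
    quorum = sentinel_count // 2 + 1
    return sum(1 for vote in sdown_votes.values() if vote) >= quorum


def tolerated_failures(sentinel_count):
    """Largest number of sentinels that can fail while a strict majority survives."""
    if sentinel_count < 1:
        raise ValueError("sentinel_count must be positive")
    return (sentinel_count - 1) // 2
```

Under this assumption a 3-node sentinel cluster keeps working with 1 node lost and a 5-node cluster with 2 lost, which is why odd cluster sizes are the usual choice.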

Each sentinel runs two internal coroutines:

Prober – performs health checks.

Failover – processes SDOWN/ODOWN events, elects a leader, and executes failover.

Additional coroutines manage endpoint refreshes and RPC handling.
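The division of labor between the two coroutines can be sketched with `asyncio`. This is a single-sentinel toy under stated assumptions: the `check` and `promote` callables, the event string, the probe interval, and the failure threshold are all hypothetical, and the cross-sentinel voting and leader election described above are left out.

```python
import asyncio


async def prober(check, events, interval=0.01, threshold=3):
    """Call `check()` periodically; emit "SDOWN" after `threshold` consecutive failures."""
    failures = 0
    while True:
        ok = await check()
        failures = 0 if ok else failures + 1
        if failures == threshold:
            await events.put("SDOWN")
            return
        await asyncio.sleep(interval)


async def failover(events, promote):
    """Wait for an SDOWN event, then call `promote()` (quorum and election omitted)."""
    event = await events.get()
    if event == "SDOWN":
        return await promote()


async def run_sentinel(check, promote):
    # The queue decouples detection from reaction, so each coroutine stays simple.
    events = asyncio.Queue()
    probe_task = asyncio.create_task(prober(check, events))
    result = await failover(events, promote)
    await probe_task
    return result
```

Requiring several consecutive failed probes before raising SDOWN is one simple way to avoid reacting to a single dropped packet.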

High‑Availability Features

Supports all MySQL architectures (master‑slave replication, MySQL Group Replication (MGR), Galera, etc.) on versions 5.7 and 8.0.

Custom monitoring and failover logic reduces false positives.

Accurate failure detection via distributed consensus.

Redundant sentinel deployment tolerates the failure of a minority of sentinel nodes (for example, 1 of 3 or 2 of 5), since consensus requires a surviving majority.

Network and data‑center partition tolerance.

Zero‑intrusion for tenant applications.

Performance Metrics

RDS achieves a recovery time objective (RTO) of about 30 seconds and a recovery point objective (RPO) close to zero, meeting stringent availability requirements.

Data Access Middleware

A proxy layer sits between applications and databases, abstracting underlying architectures, managing read/write routing, enforcing permissions, and providing connection pooling and traffic control.
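Traffic control in a proxy is often implemented with a token bucket. The sketch below shows that general technique, not the mechanism MaShang Cloud uses, which the article does not describe; the `TokenBucket` class and its parameters are illustrative, and the clock is injectable so the behavior can be checked deterministically.

```python
import time


class TokenBucket:
    """Allow about `rate` statements per second, with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Refill in proportion to elapsed time, capped at the bucket size.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

A proxy would call `allow()` before forwarding each statement and reject or queue the statement when it returns `False`.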

Use Cases

Examples include a lifestyle service platform using multi‑AZ read/write separation, and a bank employing proxy‑based read/write splitting for near‑zero‑loss failover.

Single‑AZ and Multi‑AZ Solutions

Single‑AZ designs combine RDS/Redis with distributed sentinel and dual‑node proxies. Multi‑AZ designs add cross‑AZ data replication via DTS, separate virtual IPs, and rapid failover to achieve 99.99% availability.
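To put the 99.99% figure in perspective, the arithmetic below converts an availability target into a yearly downtime budget and relates it to a failover time of roughly 30 seconds. It assumes a 365-day year and that every outage costs exactly one failover interval, which is a simplification; the function names are illustrative.

```python
def downtime_budget_seconds(availability, period_days=365):
    """Seconds of allowed downtime in the period for a given availability target."""
    return (1 - availability) * period_days * 24 * 3600


def failovers_within_budget(availability, rto_seconds, period_days=365):
    """How many outages of `rto_seconds` each fit inside the downtime budget."""
    return int(downtime_budget_seconds(availability, period_days) // rto_seconds)
```

With `availability=0.9999` the budget is about 3153.6 seconds (roughly 52.6 minutes) per year, so about 105 failovers of 30 seconds each would fit within it.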

Applicability Beyond Cloud

The same high‑availability patterns can be applied to on‑premise IDC environments, offering low‑cost, non‑intrusive solutions for small‑to‑medium enterprises.

Conclusion

MaShang Cloud's self‑developed RDS/Redis high‑availability solution satisfies tenant requirements for reliability and disaster recovery, while being adaptable to external IDC deployments, providing a robust technical foundation for financial and internet enterprises.
