Advanced SpringBoot Read‑Write Splitting: Master‑Slave Switching and Automatic Failover
In high‑concurrency internet architectures, a MySQL master‑slave setup with read‑write splitting is the baseline for high availability. Static routing, however, cannot cope with node failures or replication lag. This article explains how ShardingSphere's health checks, automatic failover, load balancing, and degradation fallback make read‑write splitting resilient.
Read‑Write Splitting Architecture
In high‑traffic internet systems, a MySQL master‑slave topology with read‑write splitting provides basic high availability. The master processes all write statements, while one or more slaves handle read traffic, which eases read‑write contention, connection exhaustion, and CPU saturation. Purely static routing, however, leaves several problems unsolved:
If a slave crashes, requests continue to be routed to the failed node, causing query timeouts, connection errors, circuit breaking, and widespread 500 responses.
If the master fails, the entire system cannot perform writes, resulting in service outage.
Replication lag can return stale data, breaking consistency.
Static routing lacks node health detection, automatic exclusion of faulty nodes, and automatic reintegration of recovered nodes.
Uneven load across multiple slaves can overload a single node and trigger a cascading failure.
Basic Static Read‑Write Splitting Defects
Write requests always go to the master.
Read requests are distributed by fixed round‑robin or random selection among slave1 and slave2.
No node liveness detection.
No automatic removal of faulty nodes.
No automatic reintegration of recovered nodes.
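The defects listed above can be seen in a minimal sketch of a static router. `StaticReadRouter` is a hypothetical class written for illustration, not ShardingSphere code; it rotates over a fixed list and has no notion of whether a node is alive.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical static router: fixed slave list, no liveness information.
final class StaticReadRouter {
    private final List<String> slaves;
    private final AtomicInteger counter = new AtomicInteger();

    StaticReadRouter(List<String> slaves) {
        this.slaves = List.copyOf(slaves);
    }

    // Write statements always go to the master.
    String routeWrite() {
        return "master";
    }

    // Reads rotate over every configured slave, including crashed ones.
    String routeRead() {
        int index = Math.floorMod(counter.getAndIncrement(), slaves.size());
        return slaves.get(index);
    }
}
```

With two slaves configured, every second read still lands on slave1 even after it has crashed, which is exactly the failure mode a health-aware router has to remove.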
Core Requirements for High‑Availability Read‑Write Splitting
Node health probing: periodic heartbeat checks for connectivity and availability.
Automatic slave failover: faulty slaves are excluded and traffic is smoothly shifted to healthy nodes.
Node auto‑recovery: after a failed node restarts and passes health checks, it is automatically re‑added to the cluster.
Global degradation fallback: if all slaves are down, reads are automatically routed to the master.
Master failover: supports manual emergency switch and can be combined with clustering for automatic master election.
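As a rough illustration of the exclusion, recovery, and fallback requirements, the sketch below keeps a set of slaves marked as down and skips them when routing. `HealthAwareReadRouter` and its method names are assumptions made for this example; how ShardingSphere tracks node state internally is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical router: skips slaves marked down and falls back to the master.
final class HealthAwareReadRouter {
    private final String master;
    private final List<String> slaves;
    private final Set<String> down = ConcurrentHashMap.newKeySet();
    private final AtomicInteger counter = new AtomicInteger();

    HealthAwareReadRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = List.copyOf(slaves);
    }

    // Failover: exclude the node from the read pool.
    void markDown(String slave) {
        down.add(slave);
    }

    // Recovery: put the node back into the read pool.
    void markUp(String slave) {
        down.remove(slave);
    }

    String routeRead() {
        List<String> healthy = new ArrayList<>();
        for (String slave : slaves) {
            if (!down.contains(slave)) {
                healthy.add(slave);
            }
        }
        if (healthy.isEmpty()) {
            return master; // degradation fallback: every slave is down
        }
        int index = Math.floorMod(counter.getAndIncrement(), healthy.size());
        return healthy.get(index);
    }
}
```

The health probe itself is deliberately left out; something else has to call `markDown` and `markUp` based on heartbeat results.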
Why Choose ShardingSphere?
Native support for health detection, failover, load balancing, and degradation fallback.
No changes to business code; all features are enabled through configuration.
Dynamic node management, multiple load‑balancing strategies, and compatibility with any SpringBoot project.
Can replace traditional dynamic data‑source frameworks, offering greater stability and a richer ecosystem.
Environment Setup (1 Master + 2 Slaves)
Topology
Master (3306): writes (INSERT, UPDATE, DELETE) and transactions.
Slave1 (3307): read traffic.
Slave2 (3308): read traffic.
Ensure that MySQL master‑slave replication is configured and that binlog_format is set to ROW.
Maven Dependencies
<!-- SpringBoot Web -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- ShardingSphere read‑write splitting core -->
<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>shardingsphere-jdbc-core-spring-boot-starter</artifactId>
    <version>5.2.1</version>
</dependency>
<!-- Druid connection pool -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>druid-spring-boot-starter</artifactId>
    <version>1.2.16</version>
</dependency>
<!-- MySQL driver -->
<dependency>
    <groupId>com.mysql</groupId>
    <artifactId>mysql-connector-j</artifactId>
    <scope>runtime</scope>
</dependency>
Health Check and Automatic Slave Failover
ShardingSphere 5.x provides a built‑in database heartbeat mechanism that performs node probing, fault exclusion, traffic shifting, and node recovery without custom detection threads.
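To make the probing step concrete, here is a conceptual sketch of what a single heartbeat check amounts to: run a cheap check and treat an error, a negative result, or a timeout as a failure. `HeartbeatProbe` is a hypothetical helper, not part of ShardingSphere; in a real setup the check passed in would typically execute `SELECT 1` over JDBC or call `Connection.isValid`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical probe runner: a check counts as failed if it throws,
// returns false, or does not finish within the timeout.
final class HeartbeatProbe {
    private HeartbeatProbe() {
    }

    static boolean isAlive(Callable<Boolean> check, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Boolean> result = executor.submit(check);
        try {
            return Boolean.TRUE.equals(result.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        } catch (ExecutionException | TimeoutException e) {
            result.cancel(true); // the check threw or ran past the timeout
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Creating an executor per probe keeps the sketch short; a long-running checker would more likely reuse one scheduled executor for all nodes.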
Read‑Write Splitting Configuration
spring:
  shardingsphere:
    datasource:
      names: master,slave1,slave2
      master:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://127.0.0.1:3306/test_db?useUnicode=true&serverTimezone=Asia/Shanghai&allowMultiQueries=true
        username: root
        password: root
      slave1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://127.0.0.1:3307/test_db?useUnicode=true&serverTimezone=Asia/Shanghai
        username: root
        password: root
      slave2:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://127.0.0.1:3308/test_db?useUnicode=true&serverTimezone=Asia/Shanghai
        username: root
        password: root
    rules:
      readwrite-splitting:
        data-sources:
          master-slave-group:
            write-data-source-name: master
            read-data-source-names: [slave1, slave2]
            load-balancer-name: round_robin
            read-data-source-query-enabled: true
        load-balancers:
          round_robin:
            type: ROUND_ROBIN
          random:
            type: RANDOM
          weight:
            type: WEIGHT
            props:
              slave1: 30
              slave2: 70
        health-check:
          enabled: true
          health-check-sql: SELECT 1
          interval: 10000 # 10 seconds
          timeout: 3000 # 3 seconds
          failure-threshold: 3 # mark node down after 3 consecutive failures
          recovery-threshold: 2 # bring node back after 2 consecutive successes
    props:
      sql-show: true
Core Parameters
Health‑check interval: SELECT 1 is executed every 10 seconds to verify connectivity.
Failure exclusion: after 3 consecutive failures, the node is marked unavailable and removed from the load‑balancing pool.
Automatic recovery: after 2 consecutive successful checks, the node is automatically re‑added.
Global degradation fallback: setting read-data-source-query-enabled to true routes reads to the master when all slaves are down.
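The threshold behaviour described by these parameters can be sketched as a small per-node state machine. `NodeHealthTracker` is a hypothetical class that only mirrors the semantics of failure-threshold and recovery-threshold as the article states them; it is not taken from ShardingSphere.

```java
// Hypothetical per-node state machine mirroring failure-threshold and recovery-threshold.
final class NodeHealthTracker {
    private final int failureThreshold;
    private final int recoveryThreshold;
    private int consecutiveFailures;
    private int consecutiveSuccesses;
    private boolean available = true;

    NodeHealthTracker(int failureThreshold, int recoveryThreshold) {
        this.failureThreshold = failureThreshold;
        this.recoveryThreshold = recoveryThreshold;
    }

    // Feeds in one heartbeat result; returns whether the node is in the pool afterwards.
    synchronized boolean record(boolean probeSucceeded) {
        if (probeSucceeded) {
            consecutiveFailures = 0;
            consecutiveSuccesses++;
            if (!available && consecutiveSuccesses >= recoveryThreshold) {
                available = true; // enough successes in a row: re-add the node
            }
        } else {
            consecutiveSuccesses = 0;
            consecutiveFailures++;
            if (available && consecutiveFailures >= failureThreshold) {
                available = false; // enough failures in a row: exclude the node
            }
        }
        return available;
    }

    synchronized boolean isAvailable() {
        return available;
    }
}
```

With `new NodeHealthTracker(3, 2)`, a node stays in the pool through two failed probes, leaves on the third, and returns only after two successful probes in a row. A single blip in either direction does not flip the state, which is the point of using consecutive counts.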
Complete Failover Process
Slave1 crashes or loses network connectivity.
Heartbeat checks fail consecutively; ShardingSphere marks Slave1 as offline.
Subsequent read traffic is routed only to Slave2, keeping the service uninterrupted.
Slave1 restarts and passes health checks.
After two successful probes, the node is automatically re‑added and read traffic is re‑balanced.
Multi‑Slave Load‑Balancing Strategies
ShardingSphere offers three built‑in policies:
ROUND_ROBIN (default)
Evenly distributes traffic across all slaves; suitable when hardware and performance are uniform.
RANDOM
Selects a slave at random; provides relatively even distribution for small clusters.
WEIGHT
Assigns weights based on server capacity, so higher‑spec machines handle more traffic. Example: with slave1: 30 and slave2: 70, roughly 70% of read requests go to slave2.
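One common way to implement weighted selection is to walk the cumulative weights and pick the first node whose range contains a random ticket. The sketch below assumes that approach; `WeightedSlavePicker` is a hypothetical class, and ShardingSphere's own WEIGHT algorithm may be implemented differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical weighted picker: maps a ticket in [0, totalWeight) to a slave.
final class WeightedSlavePicker {
    private final Map<String, Integer> weights;
    private final int totalWeight;

    WeightedSlavePicker(Map<String, Integer> weights) {
        this.weights = new LinkedHashMap<>(weights); // keep a stable iteration order
        int sum = 0;
        for (int weight : this.weights.values()) {
            sum += weight;
        }
        this.totalWeight = sum;
    }

    int totalWeight() {
        return totalWeight;
    }

    // In production the ticket would come from
    // ThreadLocalRandom.current().nextInt(totalWeight()).
    String pick(int ticket) {
        if (ticket < 0 || ticket >= totalWeight) {
            throw new IllegalArgumentException("ticket out of range: " + ticket);
        }
        int cumulative = 0;
        for (Map.Entry<String, Integer> entry : weights.entrySet()) {
            cumulative += entry.getValue();
            if (ticket < cumulative) {
                return entry.getKey();
            }
        }
        throw new IllegalStateException("unreachable for a valid ticket");
    }
}
```

For weights of 30 and 70, tickets 0 to 29 map to slave1 and tickets 30 to 99 map to slave2, so a uniformly random ticket sends about 70% of reads to slave2.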
Master‑Slave Replication Lag Solutions
The primary consistency risk is that the master commits a write while a slave has not yet synchronized, causing stale reads.
Solution Options
Force master for real‑time queries: define a custom annotation and AOP aspect to route critical reads to the master.
Transactional master routing: ShardingSphere defaults to using the master for all reads and writes within the same transaction, eliminating intra‑transaction lag.
Delay‑threshold routing: monitor slave lag; if it exceeds a threshold, automatically exclude the slave and route reads to the master.
Business‑level eventual consistency: tolerate short delays for non‑critical queries while forcing critical queries to the master, achieving a cost‑effective production solution.
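The delay-threshold option can be sketched as a filter over measured lag. `LagAwareSelector` is a hypothetical class, and the lag values are assumed to come from external monitoring, for example the Seconds_Behind_Master column of SHOW SLAVE STATUS, which is NULL when replication is not running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical lag filter: decides which nodes may serve reads right now.
final class LagAwareSelector {
    private final String master;
    private final long maxLagSeconds;

    LagAwareSelector(String master, long maxLagSeconds) {
        this.master = master;
        this.maxLagSeconds = maxLagSeconds;
    }

    // Returns the slaves fresh enough to serve reads, or only the master if none qualify.
    List<String> readCandidates(Map<String, Long> lagSecondsBySlave) {
        List<String> fresh = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lagSecondsBySlave.entrySet()) {
            Long lag = entry.getValue();
            // A null lag means replication is stopped or unknown: treat the slave as stale.
            if (lag != null && lag <= maxLagSeconds) {
                fresh.add(entry.getKey());
            }
        }
        return fresh.isEmpty() ? List.of(master) : fresh;
    }
}
```

The returned list would then be handed to whichever load-balancing strategy is in use, so lag filtering and load balancing stay separate concerns.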
Annotation Example for Master‑Only Queries
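import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
// Annotation to force master routing
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface MasterQuery {}
Combine this annotation with an AOP aspect and ShardingSphere's dynamic routing to enforce master reads.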
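The mechanism such an aspect relies on is usually a thread-local flag that is set before the annotated method runs and restored afterwards. The dependency-free sketch below shows only that mechanism; `MasterRouteContext` and its methods are hypothetical. In a ShardingSphere project the same role would likely be played by its HintManager (I believe the 5.x call is `setWriteRouteOnly()`, but the exact method name should be verified against the version in use).

```java
import java.util.function.Supplier;

// Hypothetical thread-local flag that an aspect around @MasterQuery could set.
final class MasterRouteContext {
    private static final ThreadLocal<Boolean> FORCE_MASTER =
            ThreadLocal.withInitial(() -> false);

    private MasterRouteContext() {
    }

    static boolean isMasterForced() {
        return FORCE_MASTER.get();
    }

    // Runs the query with master routing forced, then restores the previous state
    // so nested calls and pooled threads are not left with a stale flag.
    static <T> T callOnMaster(Supplier<T> query) {
        boolean previous = FORCE_MASTER.get();
        FORCE_MASTER.set(true);
        try {
            return query.get();
        } finally {
            FORCE_MASTER.set(previous);
        }
    }

    // What a routing layer would consult when choosing a data source for a read.
    static String routeRead(String slave) {
        return isMasterForced() ? "master" : slave;
    }
}
```

An around-advice for `@MasterQuery` would wrap the intercepted method call in `callOnMaster`, so every read issued inside that method resolves to the master.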
Master Failover Strategies
While slaves can fail over automatically, the master remains a single point of failure.
Manual Master‑Slave Switch
When the master crashes, operators promote a slave to master and restart services with updated configuration. This approach is simple and reliable but requires manual intervention.
Automatic Master Failover
Integrate MySQL Group Replication (MGR), Keepalived, or ZooKeeper to achieve:
Automatic master election upon failure.
Virtual IP (VIP) auto‑migration.
Seamless switchover with no application code changes.
Suitable for payment, transaction, and order systems that demand high availability.
Conclusion
Effective read‑write splitting goes beyond static routing; it requires fault‑tolerant self‑healing mechanisms such as health checks, automatic failover, node recovery, load balancing, and degradation fallback. ShardingSphere implements all of these mechanisms out‑of‑the‑box, providing a robust solution for high‑concurrency production environments.
Java Tech Workshop
