Databases 15 min read

Master MySQL Replication, Sharding, and Distributed Deployment in 10 Minutes

This article provides a concise, ten‑minute guide to MySQL master‑slave and master‑master replication, data sharding principles and implementations, and various database deployment architectures—including single‑instance, replication‑based scaling, and sharding‑based scaling—while highlighting practical considerations, advantages, and common pitfalls.

ITPUB

Jun 22, 2019

Master MySQL Replication, Sharding, and Distributed Deployment in 10 Minutes

1. MySQL Replication

MySQL master‑slave replication copies data from a primary server to one or more secondary servers, enabling read/write separation: write operations go to the primary, while read operations are served by the replicas, increasing read throughput.

The replication process works by logging updates to the primary's Binlog, which a replication thread reads and transmits to each replica. The replica writes the events to its Relay log and a SQL thread re‑executes them, keeping the data synchronized.

Master‑master (dual‑primary) replication extends this model: two servers act as primaries, each replicating its writes to the other, providing higher write availability.

Advantages of a one‑primary‑multiple‑replica setup :

Load distribution : read traffic is spread across several replicas.

Dedicated resources : specific replicas can be tuned for particular query types.

Cold backup : replicas can be taken offline for backup without affecting the primary.

High availability : failure of a single replica has minimal impact on the overall system.

Replication considerations :

Avoid writing to both primary and replica simultaneously to prevent conflicts.

Replication improves read concurrency but does not increase write capacity or total storage.

Schema changes generate large synchronization delays; they should be performed carefully, often by a DBA.

In a failover scenario, the application detects primary failure, redirects writes to the surviving primary, and later resynchronizes the recovered node as a replica.

2. Data Sharding

When a single database cannot handle write load or data size, sharding splits a table into smaller pieces stored on multiple servers, reducing storage pressure and improving write scalability.

Key goals : divide a large table into shards, each residing on a different server, and route queries based on a shard key.

Characteristics :

Servers are independent; failure of one does not affect others.

Routing is performed via the shard key, allowing the application to contact only the relevant server.

Hard‑coded sharding example : using user ID parity (odd IDs to server 2, even IDs to server 1). The application calculates the remainder and connects to the appropriate server, but this couples business logic with sharding rules, making maintenance harder.

External mapping table : store the shard‑to‑server mapping outside the application, allowing dynamic lookup of the target server based on the shard key.

Specialized middleware such as Mycat can manage sharding transparently. For example, with three shard servers (dn1, dn2, dn3) and a rule based on the prov field, a query like SELECT * FROM orders WHERE prov='wuhan' is automatically routed to the correct node.

When data volume grows, additional shard servers can be added, but the routing rules must be updated and existing data may need to be migrated—a complex operation often mitigated by logical database abstractions.

3. Database Deployment Schemes

Single service with a single database : multiple application servers share one database; suitable for low‑traffic early stages.

Master‑slave scaling : introduce one primary and multiple replicas to separate write and read workloads, achieving basic horizontal scaling.

Two web services with two databases (business splitting) : separate functional domains (e.g., product catalog vs. user data) into distinct databases, each possibly using master‑slave replication, reducing coupling and improving performance.

Comprehensive scheme : combine sharding and replication—critical tables are sharded across multiple servers, each shard employing master‑slave replication for high availability.

For workloads that do not require relational features, NoSQL databases can provide higher storage capacity and concurrency, though they introduce consistency challenges governed by the CAP theorem, which can be mitigated with techniques such as timestamp merging, client‑side conflict resolution, or voting protocols.

Source: 《阿里前辈的架构经》第04讲 – Distributed Data Storage, presented by Li Zhihui, former Alibaba technical expert.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Distributed Systems high availability mysql Replication Mycat

Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.