Why Data Replication Matters: Architectures, Formats, and Consistency Models
This article explains the principles of data replication. It compares shared‑memory, shared‑disk, and non‑shared storage architectures, then covers replication formats, consistency challenges, and replication methods (synchronous, asynchronous, semi‑synchronous, and majority‑based) so that engineers can choose the right trade‑offs.
Shared Storage vs. Non‑Shared Storage Architectures
When discussing replication, we first ask why database data should be distributed across multiple machines at all. A single‑machine shared‑memory architecture lets all processors access the same RAM and disks, but cost grows faster than linearly (a machine with twice the resources usually costs more than twice as much) and fault tolerance is limited because everything lives in one machine.
Shared‑disk architectures, common in data warehouses, let machines with independent CPUs and memory share a common set of disks over the network. They improve scalability and fault tolerance compared with shared memory, yet the storage subsystem remains a single point of failure, and lock contention between machines limits how far they scale.
Non‑shared (shared‑nothing) architectures distribute data across machines, each with its own CPU, memory, and disk, communicating only via the network. This removes shared hardware as a single point of failure and improves scalability, cost, and availability, at the price of coordinating the nodes in software.
What Is Data Replication?
Replication means keeping identical copies of the same data on multiple network‑connected machines. It lowers latency by placing data geographically close to users, provides fault tolerance when a machine fails, and scales read throughput by distributing queries across replicas.
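As a rough illustration of read scaling, the sketch below sends writes to one node and spreads reads over the replicas in round‑robin order. It assumes a single‑leader setup; the `ReadRouter` class and the node names are hypothetical and not taken from any particular database.

```python
import itertools


class ReadRouter:
    """Route writes to the leader and spread reads across replicas."""

    def __init__(self, leader, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.leader = leader
        # cycle() yields the replicas in order, forever (round-robin).
        self._next_replica = itertools.cycle(replicas)

    def route(self, is_write):
        """Return the node that should serve this request."""
        if is_write:
            return self.leader
        return next(self._next_replica)


router = ReadRouter("leader", ["replica-a", "replica-b"])
read_targets = [router.route(is_write=False) for _ in range(4)]
write_target = router.route(is_write=True)
print(read_targets, write_target)
```

Adding a replica increases read capacity without touching the write path, which is why this pattern suits read‑heavy workloads.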
Challenges of Data Replication
Handling data volume: replication assumes each node can hold a full copy of the dataset; once the data outgrows a single node, it must also be sharded (partitioned).
Capturing data changes and propagating them to replicas under a single‑leader, multi‑leader, or leaderless topology.
Choosing between synchronous and asynchronous communication, and handling network latency.
Recovering from replica failures while keeping the replicas consistent.
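When the dataset no longer fits on one node, sharding and replication are usually combined: each key maps to a shard, and each shard is copied to several nodes. The sketch below is one simple way to do that, assuming hash‑based sharding and placement on consecutive nodes; the function names and the node list are hypothetical.

```python
import hashlib
from typing import List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard using a stable hash (not Python's hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def replicas_for(shard: int, nodes: List[str], replication_factor: int) -> List[str]:
    """Place a shard on `replication_factor` consecutive nodes."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    return [nodes[(shard + i) % len(nodes)] for i in range(replication_factor)]


nodes = ["node-0", "node-1", "node-2", "node-3"]
shard = shard_for("user:42", num_shards=8)
owners = replicas_for(shard, nodes, replication_factor=3)
print(shard, owners)
```

Each node then stores only the shards assigned to it, so no single machine has to hold the full dataset, while every shard still survives the loss of a node.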
Replication Formats
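Command/statement‑based replication (simple, but nondeterministic functions such as NOW() or RAND() can produce different results on each replica).
Data‑based replication (copies the actual data values that were written; safe, but generates more traffic).
Log‑based replication: ships either the physical write‑ahead log (WAL), which is tied to the storage engine's on‑disk format, or a logical binary log, which is decoupled from the storage engine.
File‑based replication (copies data files; consistent, but complex and high in traffic).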
Application‑level replication (triggers, stored procedures, or custom ETL pipelines).
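The difference between statement‑based and data‑based replication can be seen in a toy model. In the sketch below, each node is a plain dictionary and `random.Random` stands in for a nondeterministic SQL function such as RAND(); the function names and seeds are illustrative assumptions only.

```python
import random


def apply_statement(store, key, rng):
    # Statement-based: every node re-executes the statement itself,
    # here a stand-in for "UPDATE ... SET token = RAND()".
    store[key] = rng.random()


def apply_row(store, key, value):
    # Data-based: the leader ships the value it already computed.
    store[key] = value


# Each node has its own random source, as separate machines would.
leader_rng = random.Random(1)
follower_rng = random.Random(2)

# Statement-based: both nodes run the same statement independently.
stmt_leader, stmt_follower = {}, {}
apply_statement(stmt_leader, "token", leader_rng)
apply_statement(stmt_follower, "token", follower_rng)
statement_diverged = stmt_leader["token"] != stmt_follower["token"]

# Data-based: the leader evaluates once and replicates the result.
row_leader, row_follower = {}, {}
value = leader_rng.random()
apply_row(row_leader, "token", value)
apply_row(row_follower, "token", value)
rows_match = row_leader == row_follower

print(statement_diverged, rows_match)
```

With these seeds the two statement‑based copies end up holding different values, while the data‑based copies match, at the cost of sending the value itself over the network.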
Replication Methods
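Synchronous replication – the leader waits for the replicas to confirm a write before acknowledging it; provides strong consistency, but adds write latency and reduces availability, because one unreachable replica can block writes.
Asynchronous replication – the leader acknowledges a write immediately and propagates it in the background; higher write performance and availability, but replicas may lag, so reads can return stale data and recently acknowledged writes can be lost if the leader fails.
Semi‑synchronous replication – a hybrid approach: the leader waits for at least one replica to confirm the write and replicates to the rest asynchronously.
Majority (quorum) replication – a write succeeds once a majority of nodes confirm it; ensures strong consistency and stays available while a minority of nodes is down, at the cost of write latency and complexity.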
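One way to compare the four methods is by how many nodes must confirm a write before the client is answered. The sketch below is a simplified model under stated assumptions: `n` counts all copies including the leader, semi‑synchronous means "the leader plus one replica", and the function names are hypothetical. Real systems differ in the details. It also includes the usual quorum overlap condition, w + r > n, under which every read quorum shares at least one node with every write quorum.

```python
def required_acks(method: str, n: int) -> int:
    """Confirmations (leader included) needed before a write is acknowledged."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if method == "synchronous":
        return n                # every copy must confirm
    if method == "asynchronous":
        return 1                # the leader alone is enough
    if method == "semi-synchronous":
        return min(2, n)        # the leader plus one replica
    if method == "majority":
        return n // 2 + 1       # more than half of the nodes
    raise ValueError("unknown method: " + method)


def is_committed(method: str, n: int, acks: int) -> bool:
    """True if `acks` confirmations are enough under the given method."""
    return acks >= required_acks(method, n)


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True if every read quorum intersects every write quorum."""
    return w + r > n


# Five copies, two of them unreachable: only three confirmations arrive.
for method in ("synchronous", "asynchronous", "semi-synchronous", "majority"):
    print(method, required_acks(method, 5), is_committed(method, 5, acks=3))
```

In this model, with two of five nodes unreachable, the synchronous write blocks while the other three methods still commit, which mirrors the availability trade‑off described in the list above.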
Summary
Data replication involves choosing an appropriate format and method based on safety, consistency, performance, and cost trade‑offs. Understanding the underlying storage architecture and consistency model is essential for designing reliable, scalable systems.
Xiaokun's Architecture Exploration Notes
10 years of backend architecture design | AI engineering infrastructure, storage architecture design, and performance optimization | Former senior developer at NetEase, Douyu, Inke, etc.