Overview of Five Common Data Replication Technologies
This article introduces the global data replication market, explains synchronous and asynchronous replication, and details five typical replication techniques—host‑based, application/middleware‑based, database‑based, storage‑gateway‑based, and storage‑media‑based—highlighting their principles, advantages, and trade‑offs for disaster‑recovery planning.
According to IDC data, the global data replication market exceeded $50 billion in 2018, with backup and recovery software representing a significant portion. Replication copies data from a source to one or more targets and can be classified as synchronous or asynchronous.
Synchronous replication acknowledges a write only after it has completed on both the source and the target. This keeps potential data loss close to zero, but every write pays the round‑trip latency to the target, so production performance suffers unless the two sites are geographically close. Asynchronous replication acknowledges the write at the source and sends it to the target afterwards; the target lags the source by some interval, but the impact on production systems is minimal.
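The difference between the two modes can be sketched in a few lines of Python. This is a minimal in‑memory illustration, not any vendor's implementation: the class names (`Replica`, `SyncPrimary`, `AsyncPrimary`) and the `ship` method are hypothetical, and a real system would send writes over a network rather than call a method.

```python
from collections import deque


class Replica:
    """Hypothetical in-memory target that stores replicated writes."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class SyncPrimary:
    """Acknowledges a write only after the target has applied it."""

    def __init__(self, replica):
        self.data = {}
        self.replica = replica

    def write(self, key, value):
        self.data[key] = value
        self.replica.apply(key, value)  # the caller waits for the target here
        return "ack"


class AsyncPrimary:
    """Acknowledges immediately and ships queued writes later."""

    def __init__(self, replica):
        self.data = {}
        self.replica = replica
        self.pending = deque()  # writes not yet present on the target

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "ack"

    def ship(self):
        """Drain the queue to the target and return how many writes were sent."""
        sent = 0
        while self.pending:
            self.replica.apply(*self.pending.popleft())
            sent += 1
        return sent


sync_target = Replica()
sync_primary = SyncPrimary(sync_target)
sync_primary.write("order:1", "paid")

async_target = Replica()
async_primary = AsyncPrimary(async_target)
async_primary.write("order:1", "paid")
# Writes still queued here would be lost if the source failed at this moment.
lag_before_ship = len(async_primary.pending)
async_primary.ship()
```

In the synchronous case the target already holds the write when `write` returns. In the asynchronous case the queue length before `ship` runs is the amount of data exposed to loss, which is the time lag described above.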
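The choice of replication technique directly influences the Recovery Time Objective (RTO), the maximum tolerable time to restore service, and the Recovery Point Objective (RPO), the maximum tolerable data loss measured in time. Five common approaches are discussed: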
1. Host‑based replication uses disk‑level mirroring or copying, operating at the host’s volume manager layer. It works with any storage hardware, provides reliable IP‑based data transfer, and enables rapid recovery, though it consumes CPU resources on the production host and may lack snapshot capabilities.
2. Application/Middleware‑based replication performs data writes at the application layer, allowing dual‑write or multi‑write to achieve replication across multiple copies. While offering customization and independence from underlying OS, database, or storage, it is complex to implement and maintain, increasing application risk and maintenance cost.
3. Database‑based replication includes logical and physical methods. Logical replication parses redo or archive logs into SQL changes and applies them asynchronously on the target, providing eventual consistency. Physical replication ships the redo or archive logs themselves and replays them on the standby, supporting synchronous or asynchronous persistence and read‑only standby nodes. Advanced log‑analysis techniques enable real‑time, multi‑site active‑active databases.
4. Storage‑gateway‑based replication places a gateway between servers and storage, often on a SAN, virtualizing storage resources. The gateway intercepts I/O streams to offer services such as remote replication, heterogeneous storage consolidation, snapshots, and continuous data protection. It suits heterogeneous environments and adds bandwidth optimization and fine‑grained recovery.
5. Storage‑media‑based replication leverages the firmware or operating system built into the storage system, using IP or Fibre Channel links to replicate data synchronously or asynchronously. It can operate in one‑to‑one, one‑to‑many, or many‑to‑one modes, and it often requires storage systems of the same model plus low‑latency, high‑bandwidth links.
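As a rough illustration of the log‑based database replication described in item 3, the sketch below has a primary append every change to an ordered log and a standby replay the records it has not yet applied. The names `Primary`, `Standby`, `catch_up`, and the `lsn` (log sequence number) field are assumptions made for this example; real databases use binary log formats and network transport.

```python
class Primary:
    """Source database that records every change in an ordered log."""

    def __init__(self):
        self.rows = {}
        self.log = []  # ordered change records, standing in for a redo log

    def execute(self, op, key, value=None):
        if op not in ("put", "delete"):
            raise ValueError(f"unknown op: {op}")
        lsn = len(self.log) + 1  # log sequence number of this change
        if op == "put":
            self.rows[key] = value
        else:
            self.rows.pop(key, None)
        self.log.append((lsn, op, key, value))
        return lsn


class Standby:
    """Target database that replays log records in order."""

    def __init__(self):
        self.rows = {}
        self.applied_lsn = 0

    def catch_up(self, log, limit=None):
        """Replay records newer than applied_lsn, at most `limit` of them."""
        todo = [record for record in log if record[0] > self.applied_lsn]
        if limit is not None:
            todo = todo[:limit]
        for lsn, op, key, value in todo:
            if op == "put":
                self.rows[key] = value
            else:
                self.rows.pop(key, None)
            self.applied_lsn = lsn
        return len(todo)


primary = Primary()
standby = Standby()
primary.execute("put", "a", 1)
primary.execute("put", "b", 2)
primary.execute("delete", "a")

standby.catch_up(primary.log, limit=2)             # standby is one record behind
lag = len(primary.log) - standby.applied_lsn
rows_while_lagging = dict(standby.rows)

standby.catch_up(primary.log)                      # standby is now fully caught up
```

While the standby is behind, it still shows the deleted row; once the remaining record is replayed, both sides hold the same data. That gap is the eventual consistency mentioned for asynchronous log apply.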
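The continuous data protection offered by the gateway approach in item 4 can be pictured as a write journal kept in the I/O path. The following is a simplified sketch under stated assumptions: `Gateway`, `write`, and `image_at` are hypothetical names, the volume starts empty, and blocks are held in memory rather than on real storage.

```python
class Gateway:
    """Sits in the I/O path, forwards writes, and journals each one."""

    def __init__(self):
        self.volume = {}   # block number -> bytes, standing in for backing storage
        self.journal = []  # (seq, block, data) for every intercepted write

    def write(self, block, data):
        seq = len(self.journal) + 1
        self.journal.append((seq, block, data))
        self.volume[block] = data
        return seq

    def image_at(self, seq):
        """Rebuild the volume as it looked right after write number `seq`."""
        image = {}
        for s, block, data in self.journal:
            if s > seq:
                break
            image[block] = data
        return image


gateway = Gateway()
gateway.write(0, b"v1")
last_good = gateway.write(1, b"ledger")
gateway.write(1, b"corrupt")            # a bad write lands on the live volume

restored = gateway.image_at(last_good)  # roll back to just before the bad write
```

Because every write is journaled with a sequence number, recovery can target any point between writes rather than only the latest snapshot, which is what fine‑grained recovery means here.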
In practice, no single technique is universally superior; organizations must select the method that best fits their specific business scenarios and constraints.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.