Disaster Recovery Explained: Definitions, Strategies, and Implementation
This article provides a comprehensive guide to disaster recovery, covering its definition, the distinction between backup and DR, various protection strategies, measurement metrics such as RPO and RTO, and practical implementation methods across storage, cloud, and network layers.
1. Definition of Disaster Recovery
1.1 What is Disaster Recovery?
Disaster recovery (DR) is the practice of using available technical means to put reliable emergency procedures in place ahead of time, so that an organization can cope with sudden incidents.
In the broad sense used here, DR covers both backup systems and disaster‑recovery systems.
1.2 Backup and Disaster‑Recovery Concepts
Backup
Backup ensures data safety. It is the process of copying all or part of a data set from production disks or arrays to other storage media.
Disaster Recovery
Disaster recovery ensures business continuity. It builds one or more remote IT systems with complete infrastructure (compute, network, storage, power, cooling), so that when the primary data center fails, the secondary site can quickly restore services.
Differences
Protection objects: Backup protects data; disaster recovery protects business continuity.
Implementation: Backup uses backup software; disaster recovery uses replication or mirroring software.
Time window: Replication or mirroring protects data continuously or at very short intervals, so its protection cycle is much shorter than that of scheduled backups.
2. Role of Disaster Recovery
2.1 Problems in Data Centers
Viruses, OS vulnerabilities
Human error
Terrorist attacks
Power failures
Hardware failures
Natural disasters (earthquake, flood, typhoon)
2.2 Consequences of No DR
Business interruption
Data loss
Customer complaints
Revenue decline
Compensation costs
Company bankruptcy
2.3 Backup Functions
Storage Layer
Backup consists of five parts: backup client, storage strategy (media, deduplication, retention, write I/O), backup content (what to back up, what to exclude), backup policy (deduplication, type, schedule), and performance optimization (client read streams).
Cloud Computing Layer
Cloud Server Backup Service (CSBS) provides whole‑machine backup using consistent snapshot technology and remote replication. Volume Backup Service (VBS) creates backups of cloud disks and can roll back data.
Replication Types
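Synchronous replication writes each update to the secondary copy before acknowledging it, so the two copies stay identical in real time; asynchronous replication acknowledges the write first and copies it afterwards, so the secondary copy may lag behind the primary.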
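The difference between the two replication types can be illustrated with a toy model. This is only a sketch: the ReplicatedVolume class and its methods are hypothetical names made up for illustration, and real replication products handle ordering, link failures, and acknowledgements in far more detail.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of a primary volume replicating writes to a secondary.

    Hypothetical class for illustration only; not a real product API.
    """

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = {}
        self.secondary = {}
        self.pending = deque()  # writes acknowledged but not yet replicated

    def write(self, key, value):
        """Apply a write on the primary and return once it is acknowledged."""
        self.primary[key] = value
        if self.synchronous:
            # Synchronous: the secondary is updated before the acknowledgement.
            self.secondary[key] = value
        else:
            # Asynchronous: acknowledge now, replicate later.
            self.pending.append((key, value))

    def drain(self, max_writes=None):
        """Ship queued writes to the secondary (relevant in asynchronous mode)."""
        count = len(self.pending)
        if max_writes is not None:
            count = min(count, max_writes)
        for _ in range(count):
            key, value = self.pending.popleft()
            self.secondary[key] = value

    def unreplicated_writes(self) -> int:
        """Number of writes that would be lost if the primary failed now."""
        return len(self.pending)


sync_vol = ReplicatedVolume(synchronous=True)
async_vol = ReplicatedVolume(synchronous=False)
for i in range(3):
    sync_vol.write(f"block{i}", i)
    async_vol.write(f"block{i}", i)

print(sync_vol.unreplicated_writes())   # 0
print(async_vol.unreplicated_writes())  # 3
async_vol.drain(max_writes=2)
print(async_vol.unreplicated_writes())  # 1
```

In this model the synchronous volume never has unreplicated writes, which corresponds to an RPO of zero, while the asynchronous volume's queue length represents the data that would be lost if the primary failed at that moment.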
2.4 Disaster‑Recovery Scenarios
Local high‑availability (HA)
Active‑standby (AS)
Active‑active (AA) data centers
Two‑site three‑center (3DC) solutions
Local HA
Uses real‑time mirroring and synchronous replication within the same campus; typically targets RPO≈0.
Active‑Standby
Provides near‑zero RPO at a low total cost of ownership (TCO); supports automated failover and recovery.
Active‑Active
Six‑layer active‑active architecture; aims for zero business interruption and zero data loss.
Two‑Site Three‑Center
Offers both cascade and parallel networking options, each with its own advantages and performance requirements.
3. Measuring Disaster Recovery
3.1 Backup Types
Full backup – complete copy of all data at a point in time.
Cumulative incremental backup – all changes made since the last full backup.
Differential incremental backup – only the changes made since the most recent backup of any type.
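The practical consequence of these definitions is the number of backup sets needed for a restore. The sketch below assumes a chronological list of backup labels using the terms of this section; the restore_chain function is a hypothetical helper written for illustration, not part of any backup product.

```python
def restore_chain(backups):
    """Return the indexes of the backups needed to restore the latest state.

    `backups` is a chronological list of labels: "full", "cumulative",
    or "differential", as defined in this section.
    """
    full_positions = [i for i, kind in enumerate(backups) if kind == "full"]
    if not full_positions:
        raise ValueError("at least one full backup is required")

    # Every restore starts from the most recent full backup.
    last_full = full_positions[-1]
    chain = [last_full]
    for i in range(last_full + 1, len(backups)):
        if backups[i] == "cumulative":
            # A cumulative backup holds everything since the full backup,
            # so it supersedes any incremental taken before it.
            chain = [last_full, i]
        elif backups[i] == "differential":
            # A differential backup only holds changes since the previous
            # backup, so it is applied on top of what came before.
            chain.append(i)
        else:
            raise ValueError(f"unknown backup type: {backups[i]!r}")
    return chain


print(restore_chain(["full", "differential", "differential", "differential"]))
# [0, 1, 2, 3]
print(restore_chain(["full", "cumulative", "cumulative", "cumulative"]))
# [0, 3]
```

With differential incrementals every set since the full backup must be applied in order, whereas with cumulative incrementals only the full backup and the latest cumulative set are needed.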
3.2 Backup Strategy Principles
Combine full backups with either cumulative or differential incremental backups, but avoid mixing both incremental methods in the same policy.
Choose the combination based on space and backup‑window constraints.
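The space versus restore-effort trade-off can be estimated with simple arithmetic. The following is a rough sketch under stated assumptions: one full backup per cycle, one incremental on every other day, and a distinct, fixed amount of data changing each day. The function name and parameters are hypothetical, and real figures depend on deduplication, compression, and overlapping changes.

```python
def backup_volume_per_cycle(full_gb, daily_change_gb, days_per_cycle=7):
    """Estimate data written per cycle for the two full+incremental schemes.

    Assumes one full backup per cycle, an incremental on each remaining day,
    and that each day changes a distinct `daily_change_gb` of data.
    """
    n = days_per_cycle - 1  # number of incremental backups in the cycle

    # Differential incremental: every backup holds one day of changes.
    differential_gb = full_gb + n * daily_change_gb
    # Cumulative incremental: the k-th backup holds k days of changes.
    cumulative_gb = full_gb + daily_change_gb * n * (n + 1) / 2

    return {
        "differential": {
            "written_gb": differential_gb,
            # Worst case: the full backup plus every incremental.
            "sets_to_restore": 1 + n,
        },
        "cumulative": {
            "written_gb": cumulative_gb,
            # Worst case: the full backup plus the latest cumulative set.
            "sets_to_restore": min(2, 1 + n),
        },
    }


plan = backup_volume_per_cycle(full_gb=1000, daily_change_gb=20)
print(plan["differential"])  # 1120 GB written, up to 7 sets to restore
print(plan["cumulative"])    # 1420.0 GB written, up to 2 sets to restore
```

Under these assumptions, differential incrementals write less data and finish faster each day, while cumulative incrementals use more space but need fewer sets at restore time.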
3.3 DR Metrics
Recovery Point Objective (RPO): Maximum tolerable data loss measured in time.
Recovery Time Objective (RTO): Maximum tolerable business downtime measured in time.
The Chinese national standard GB/T 20988‑2007 maps RPO and RTO targets to six recovery‑capability levels, from level 1 (recovery measured in days) to level 6 (recovery measured in minutes).
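As an illustration of how the two metrics are measured, the sketch below compares the actual data loss and downtime of an incident against the stated objectives. The function name, its parameters, and the sample timestamps are all hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta


def evaluate_recovery(recovery_points, failure_time, restored_time, rpo, rto):
    """Compare one incident's data loss and downtime with the RPO and RTO.

    `recovery_points` are the times of the available backups or replicas.
    """
    usable = [p for p in recovery_points if p <= failure_time]
    if not usable:
        raise ValueError("no recovery point exists before the failure")

    # Data loss: everything written after the latest usable recovery point.
    data_loss = failure_time - max(usable)
    # Downtime: from the failure until service is available again.
    downtime = restored_time - failure_time

    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


day = datetime(2024, 1, 1)
result = evaluate_recovery(
    recovery_points=[day, day + timedelta(hours=6), day + timedelta(hours=12)],
    failure_time=day + timedelta(hours=14, minutes=30),
    restored_time=day + timedelta(hours=15, minutes=15),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=1),
)
print(result["data_loss"], result["rpo_met"])  # 2:30:00 True
print(result["downtime"], result["rto_met"])   # 0:45:00 True
```

Here the last recovery point is 2.5 hours before the failure, within the 4-hour RPO, and service returns after 45 minutes, within the 1-hour RTO.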
4. Implementing Disaster Recovery
4.1 Backup Methods
LAN‑based – backup agents on production servers send data over the LAN to a backup server.
LAN‑Free – backup data bypasses the LAN, flowing directly from file servers through Fibre Channel (FC) switches to tape.
Server‑Free – data is copied directly from storage to tape without passing through the server’s CPU, memory, or bus.
NDMP – the Network Data Management Protocol lets storage devices (typically NAS) send data directly to backup targets.
4.2 Backup Media
Disk arrays
Tape libraries
Virtual tape libraries
Optical libraries
Cloud storage
Integrated appliances (e.g., Huawei HDP3500E)
4.3 Design Principles
Customer requirements (data type, volume, objects).
Backup policy (frequency, schedule).
Network planning (bandwidth, topology).
Storage planning (capacity, growth).
4.4 Disaster‑Recovery Techniques
Host‑level replication – software installed on servers replicates data.
Network‑level replication – replication is performed by devices in the storage network between servers and storage, rather than by software on the servers themselves.
Storage‑level replication – dedicated storage systems replicate data over fiber or WAN, minimizing impact on servers.
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.