Database High Availability: HADR, HACMP, Data Replication, Storage DR, and DPF Solutions

This article provides a comprehensive overview of database high‑availability techniques—including DB2 HADR, HACMP clustering, SQL and Q replication, storage‑layer disaster recovery, and DPF considerations—explaining their features, suitable scenarios, and how they can be combined to achieve robust end‑to‑end resilience.

Architects' Tech Alliance

Database high availability is a complex systems-engineering problem. This article introduces several fundamental technologies (HADR, HACMP, data replication, storage-layer disaster recovery, and DPF high availability) and discusses their applicable scenarios, their technical characteristics, and how they can be combined to achieve end-to-end resilience across the storage, network, operating-system, database, and application layers.

1. DB2 HADR – HADR (High Availability Disaster Recovery) originated from Informix HDR and is now a DB2 database-level mechanism that ships transaction logs from a primary database to a standby. Before version 9.7 the standby was not readable; later versions allow read-only access to the standby, which offloads query work from the primary. On high-bandwidth, low-latency links, synchronous transfer is recommended; over longer distances (e.g., Beijing-Shanghai), asynchronous or super-asynchronous mode is preferable, because commits on the primary then do not wait for the standby to acknowledge each log write. HADR is limited to single-partition databases and therefore cannot be used with DPF, but the author considers its switchover faster and more reliable than that of Oracle Data Guard. It also has no built-in compression or encryption and does not replicate to heterogeneous databases, so securing the log stream requires an external SSH tunnel or VPN.
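
The sync-mode guidance above can be sketched as a small helper. This is only an illustration under stated assumptions: the latency thresholds and the function names are invented for the example and are not IBM recommendations. The HADR_SYNCMODE parameter, its values (SYNC, NEARSYNC, ASYNC, SUPERASYNC), and the `db2 update db cfg` command form are standard DB2.

```python
# Hypothetical helper: map the measured round-trip latency between primary
# and standby to an HADR synchronization mode. The thresholds below are
# illustrative assumptions, not vendor guidance.

def pick_hadr_syncmode(rtt_ms: float) -> str:
    """Return a value for the DB2 HADR_SYNCMODE configuration parameter."""
    if rtt_ms < 0:
        raise ValueError("latency cannot be negative")
    if rtt_ms <= 1.0:        # same site or campus link
        return "SYNC"
    if rtt_ms <= 5.0:        # metro-distance link
        return "NEARSYNC"
    if rtt_ms <= 50.0:       # long-distance link
        return "ASYNC"
    return "SUPERASYNC"      # very slow or congested link


def syncmode_command(db_name: str, rtt_ms: float) -> str:
    """Build the DB2 command-line string that sets the chosen mode."""
    mode = pick_hadr_syncmode(rtt_ms)
    return f"db2 update db cfg for {db_name} using HADR_SYNCMODE {mode}"


if __name__ == "__main__":
    # SAMPLE is a placeholder database name.
    for rtt in (0.4, 3.0, 25.0, 120.0):
        print(rtt, "->", syncmode_command("SAMPLE", rtt))
```

In practice the right cut-over points depend on the transaction rate and on how much commit latency the application tolerates, so the thresholds should come from measurement rather than from this table.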

In remote disaster-recovery deployments on DB2 versions before 9.7, HADR was often combined with Q replication to obtain a readable copy of the data. Where data loss cannot be tolerated and switchover must be fast, HADR remains the most reliable choice.

2. SQL Replication and Q Replication – SQL replication is suited to environments within the same LAN, while Q replication performs better over unreliable networks because it uses WebSphere MQ to buffer the data in transit. Q replication is frequently paired with HADR for remote disaster recovery (e.g., China Tobacco's DR center), and its performance impact on the source is minimal because it captures changes by reading the transaction log. It supports table-level replication well for DB2; Oracle support is limited, and support for other databases is poorer still. InfoSphere CDC (formerly DataMirror) and Oracle GoldenGate are further options, though CDC can become complex to manage when the replicated tables depend on one another.
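
Why a queue helps on an unreliable link can be shown with a toy store-and-forward model. The sketch below does not use WebSphere MQ or any DB2 API; the class, its methods, and the boolean "link" flag are hypothetical stand-ins chosen only to show that captured changes wait in the queue during an outage and are delivered in order afterwards.

```python
from collections import deque


class StoreAndForwardChannel:
    """Toy model of a queued replication channel (not the MQ API)."""

    def __init__(self):
        self.send_queue = deque()  # captured changes waiting to be shipped
        self.applied = []          # changes applied on the target, in order
        self.link_up = True        # stand-in for network availability

    def capture(self, change: str) -> None:
        """Source side: enqueue a change read from the log; never blocks."""
        self.send_queue.append(change)

    def transmit(self) -> int:
        """Ship queued changes in order while the link is up."""
        shipped = 0
        while self.link_up and self.send_queue:
            self.applied.append(self.send_queue.popleft())
            shipped += 1
        return shipped


if __name__ == "__main__":
    ch = StoreAndForwardChannel()
    ch.capture("tx1")
    ch.transmit()
    ch.link_up = False           # simulated network outage
    ch.capture("tx2")
    ch.capture("tx3")
    ch.transmit()                # nothing is shipped; changes stay queued
    print("queued during outage:", len(ch.send_queue))
    ch.link_up = True            # link restored
    ch.transmit()
    print("applied on target:", ch.applied)
```

The source keeps committing during the outage because capture only appends to the queue; the cost is that the target lags until the backlog drains.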

3. HACMP – HACMP (High Availability Cluster Multi-Processing, IBM's clustering software for AIX) offers three clustering modes: Cascading (primary-backup with priority-based failover), Rotating (equal-priority nodes, with a resource group starting on the first available node), and Concurrent (no primary or backup; all nodes run the resource group simultaneously). Cascading saves cost when the backup hardware is less capable than the primary; Rotating suits high-availability telecom services; Concurrent is best for large-capacity sites and is often integrated with Oracle RAC or parallel servers. HACMP by itself does not provide a redundant copy of the database, so it is typically combined with HADR for full protection.
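
The difference between the three modes comes down to which node owns a resource group after nodes fail and rejoin. The following is a simplified sketch, not HACMP configuration or its API: the function names are hypothetical, and the fallback behavior shown for Cascading (return to the higher-priority node when it rejoins) is an assumption about the default policy.

```python
def cascading_owner(priority_order, up_nodes):
    """Cascading: the highest-priority node that is up owns the group."""
    for node in priority_order:
        if node in up_nodes:
            return node
    return None  # no node available


def rotating_owner(current_owner, node_order, up_nodes):
    """Rotating: the current owner keeps the group while it is up;
    otherwise the first available node takes over and keeps it."""
    if current_owner in up_nodes:
        return current_owner
    for node in node_order:
        if node in up_nodes:
            return node
    return None


def concurrent_owners(node_order, up_nodes):
    """Concurrent: every node that is up runs the resource group."""
    return [node for node in node_order if node in up_nodes]


if __name__ == "__main__":
    nodes = ["A", "B"]
    # Node A fails: both policies move the group to B.
    print(cascading_owner(nodes, {"B"}), rotating_owner("A", nodes, {"B"}))
    # Node A rejoins: cascading falls back to A, rotating stays on B.
    print(cascading_owner(nodes, {"A", "B"}),
          rotating_owner("B", nodes, {"A", "B"}))
    print(concurrent_owners(nodes, {"A", "B"}))
```

The rejoin case is the practical distinction: Cascading incurs a second service interruption when it falls back, whereas Rotating avoids it at the price of needing equally capable nodes.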

4. DPF High-Availability Scheme – DPF (Database Partitioning Feature) has no built-in HA solution of its own, but a multi-node configuration offers limited fault tolerance as long as the catalog node remains up. Whether a table can be accessed depends on the health of the partition nodes that host the table's tablespace. If a non-critical node fails, the database remains accessible; if a critical node, its operating system, or the network fails, recovery must rely on HACMP or manual intervention. Planning HA for the critical nodes, backing up tablespaces regularly, and monitoring are therefore essential.
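
The accessibility rule in this paragraph can be written down as a small check. This is a simplified model of the description above, not DB2 behavior in every detail: the function name, the table-to-partition mapping, and the choice of partition 0 as the catalog partition are assumptions made for the example.

```python
def table_accessible(table, table_partitions, up_partitions,
                     catalog_partition=0):
    """Return True if the catalog partition is up and every partition
    hosting the table's tablespace is up (simplified model)."""
    up = set(up_partitions)
    if catalog_partition not in up:
        return False                      # catalog node down: nothing works
    return set(table_partitions[table]) <= up


if __name__ == "__main__":
    # Hypothetical layout: four partitions, numbered 0 to 3.
    layout = {
        "SALES": {1, 2},   # fact table spread over partitions 1 and 2
        "REGION": {0},     # small table on the catalog partition
        "AUDIT": {3},      # table on partition 3 only
    }
    up = {0, 1, 2}         # partition 3 has failed
    for name in layout:
        print(name, table_accessible(name, layout, up))
```

A check of this kind also identifies the critical nodes: any partition whose loss makes a business-critical table inaccessible is a candidate for HACMP protection.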

5. Storage-Layer Disaster Recovery – Storage-level DR includes disk mirroring and third-party backup solutions. EMC's SRDF (Symmetrix Remote Data Facility) provides tiered backup and replication over long distances (up to several thousand kilometers) in synchronous, near-synchronous, and asynchronous modes; because it works at the storage-array level, it supports all major host platforms and databases. Short distances use direct fiber; longer spans extend the link with DWDM equipment from vendors such as Huawei or Cisco. SRDF is costly, so organizations may use tape copies for less critical data and reserve SRDF for high-transaction workloads. Veritas BMR (Bare Metal Restore) covers OS-level backup, and hot-backup agents are available for various databases.
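
Why synchronous mode is tied to short distances follows from propagation delay alone: a synchronous write cannot complete before the signal has travelled to the remote site and back. The sketch below estimates that lower bound, assuming light travels through optical fiber at roughly 200,000 km/s; it ignores switch, DWDM, and array processing time and any extra protocol round trips, and the function names are hypothetical.

```python
FIBER_SPEED_KM_PER_S = 200_000.0  # approximate speed of light in fiber


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over a fiber path of this length."""
    if distance_km < 0:
        raise ValueError("distance cannot be negative")
    return 2.0 * distance_km / FIBER_SPEED_KM_PER_S * 1000.0


def max_serial_sync_writes_per_s(distance_km: float) -> float:
    """Upper bound on strictly serialized synchronous writes per second,
    counting propagation delay only."""
    rtt_ms = min_round_trip_ms(distance_km)
    return float("inf") if rtt_ms == 0 else 1000.0 / rtt_ms


if __name__ == "__main__":
    # Illustrative path lengths, not measured routes.
    for km in (10, 100, 1000, 3000):
        print(f"{km:>5} km: >= {min_round_trip_ms(km):.2f} ms per write, "
              f"<= {max_serial_sync_writes_per_s(km):.0f} serial writes/s")
```

At about 1 ms of added latency per 100 km of fiber, a link of a thousand kilometers or more adds at least 10 ms to every synchronous write, which is why asynchronous modes are the usual choice at that range.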

6. Network, Power, and Institutional Aspects – High‑availability networks require redundant NICs, multiple subnets, and at least four switches and four storage hosts between production and DR sites, plus multi‑path fiber. Power redundancy includes UPS and generators. Robust institutional policies—permission control, error‑prevention procedures, emergency run‑books, and regular DR drills—are crucial, especially for large telecom operators and banks that follow strict regulatory guidelines.

The author, Sun Yang, a senior expert in telecom software and core network services, invites discussion and feedback to refine these insights.
