Unlocking Dual‑Active Storage: Inside HDS’s HAM and GAD High‑Availability Architecture
This article explains HDS’s High Availability Manager (HAM) and Global Active Device (GAD) technologies: how they virtualize mirrored LUNs, rely on TrueCopy synchronous replication and arbitration mechanisms, and support several network topologies to deliver active‑active storage, NAS dual‑active, and flexible clustering across data centers.
HAM: High Availability Manager Overview
HAM, released by HDS in 2009 as the first dual‑active feature for VSP storage, lets a host treat a pair of mirrored LUNs as a single LUN. It manages the master‑slave relationship and performs automatic role switching to keep the LUN continuously available to applications.
TrueCopy Synchronization and LUN Virtualization
TrueCopy synchronously replicates a primary LUN (Pvol) to its secondary LUN (Svol). HAM virtualizes the Pvol/Svol pair into a single LUN by presenting both volumes with the same LUN ID, WWN, and serial number, then maps this virtual LUN to the host.
For a host connected to both the primary and secondary arrays, the path to the primary array is the Owner Path (MCU) and is read‑write. The path to the secondary array is the Non‑Owner Path (RCU) and is read‑only.
Failover Process
Under normal operation, the host sends I/O through the Owner Path to the Pvol. If the primary array or its link fails, HAM detects the loss, promotes the Svol to writable status, and upgrades the Non‑Owner Path to Owner Path. The host’s multipathing software then redirects I/O to the new Owner Path, so service continues without interruption.
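The failover sequence above can be sketched as a small Python model. This is an illustrative simulation only, not HDS code: the names HamPair, fail_primary, and submit_write are hypothetical, and in a real deployment the detection and role switch happen inside the arrays and the host multipathing driver.

```python
from dataclasses import dataclass, field


@dataclass
class Path:
    array: str          # "primary" or "secondary"
    owner: bool         # Owner Path is read-write; Non-Owner Path is read-only
    alive: bool = True


@dataclass
class HamPair:
    """Toy model of a HAM-virtualized Pvol/Svol pair (hypothetical names)."""
    paths: dict = field(default_factory=lambda: {
        "primary": Path("primary", owner=True),
        "secondary": Path("secondary", owner=False),
    })

    def owner_path(self) -> Path:
        # Find the one live path that currently holds the Owner role.
        for p in self.paths.values():
            if p.owner and p.alive:
                return p
        raise RuntimeError("no live Owner Path")

    def fail_primary(self) -> None:
        # Primary array or link is lost: the Svol side is promoted and
        # its Non-Owner Path becomes the Owner Path.
        self.paths["primary"].alive = False
        self.paths["primary"].owner = False
        self.paths["secondary"].owner = True

    def submit_write(self) -> str:
        # Multipathing sends writes down whichever path is the Owner Path.
        return self.owner_path().array
```

Calling submit_write() on a fresh HamPair returns "primary"; after fail_primary() it returns "secondary", mirroring the redirect described above.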
Limitations of HAM and Introduction of GAD
HAM operates in Active‑Passive mode: it supports neither simultaneous writes on both arrays nor NAS dual‑active. To overcome these limits, HDS introduced Global Active Device (GAD), an Active‑Active solution that works with HNAS to provide NAS dual‑active capabilities.
GAD Architecture and Quorum Mechanism
GAD forms a cluster of two arrays, allowing both sites to read and write simultaneously. Data is kept consistent via VSP G1000’s TrueCopy synchronous replication. To avoid split‑brain, a quorum mechanism with up to 32 arbitration disks is used; the site that retains quorum continues serving I/O.
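A minimal sketch of the quorum idea follows, assuming a simplified rule set that is not taken from HDS documentation: when the inter‑site replication link is lost, only a site that can still reach the arbitration disk may continue, and if both can reach it, whichever claims it first wins. The function name and parameters are hypothetical.

```python
def sites_serving_io(replication_link_up: bool,
                     quorum_reachable: dict,
                     first_to_claim: str = "A") -> set:
    """Return the set of sites ("A", "B") allowed to keep serving I/O.

    Simplified assumptions (not vendor-documented behavior):
    - While the replication link is up, both sites serve I/O.
    - If the link is down, only sites that still reach the quorum disk
      are candidates; with two candidates, the first to claim it wins.
    - If neither site reaches the quorum disk, both stop.
    """
    sites = {"A", "B"}
    if first_to_claim not in sites:
        raise ValueError("first_to_claim must be 'A' or 'B'")
    if replication_link_up:
        return sites
    candidates = {s for s in sites if quorum_reachable.get(s, False)}
    if len(candidates) <= 1:
        return candidates
    return {first_to_claim}
```

The point of the sketch is that at most one site survives a link partition, which is what prevents split‑brain.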
Supported Platforms and Configuration
Initially available only on the high‑end VSP G1000, GAD was later extended to the G200, G400, G600, and G800, and to the G1500/F1500 models released in 2016. Later updates also allow a local server disk to be used for arbitration.
Active‑Active Features
Every write goes first to the primary LUN and then to the secondary LUN. The configuration uses HDS’s native HDLM multipathing software and supports local‑preferred read/write policies, inter‑site distances of up to 100 km, FC/IP replication links, eight physical paths, and cross‑zoned array‑host networking.
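The write ordering and local‑preferred reads described above can be illustrated with a toy model. It assumes an in‑memory dictionary per volume and a hypothetical class name; it only shows the ordering (primary first, secondary second, acknowledgement last), not real replication.

```python
class GadVolumePairSketch:
    """Toy model of a GAD pair: writes land on the primary LUN first."""

    def __init__(self):
        self.pvol = {}   # primary LUN contents, keyed by block address
        self.svol = {}   # secondary LUN contents
        self.log = []    # order in which the volumes were updated

    def write(self, via_site: str, lba: int, data: bytes) -> str:
        if via_site not in ("primary", "secondary"):
            raise ValueError("via_site must be 'primary' or 'secondary'")
        # Whichever site receives the host write, the primary LUN is
        # updated first and the secondary LUN second.
        self.pvol[lba] = data
        self.log.append(("pvol", lba))
        self.svol[lba] = data
        self.log.append(("svol", lba))
        # The host is acknowledged only after both copies are written.
        return "ack"

    def read(self, via_site: str, lba: int):
        # Local-preferred read: serve from the copy at the host's own site.
        vol = self.pvol if via_site == "primary" else self.svol
        return vol.get(lba)
```

A write issued through the secondary site still appears in the log as a pvol update followed by an svol update, and a subsequent read from either site returns the same data.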
Virtual Storage Machine (VSM) Scaling
Within a single physical storage system, users can define multiple VSMs, each with its own storage ID, serial number, and WWN, improving resource utilization and flexibility. GAD leverages VSM functionality to horizontally scale VSP G1000 and supports up to eight VSMs, enabling up to 63,231 dual‑active GAD volumes.
Virtual Controller (VDKC) and LUN Identification
GAD uses a virtual controller (VDKC) that presents a unified controller ID to hosts, regardless of underlying physical changes. Hosts identify LUNs via a shared virtual serial number (SN), and the primary and secondary LUNs retain identical LDEV numbers.
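To see why the shared virtual identity matters, consider how a host might group paths into devices. The sketch below is a hypothetical illustration, not HDLM's actual logic: it assumes the host keys each LUN on the (virtual serial number, LDEV number) pair it is shown, so two paths to different physical arrays collapse into one device. All serial numbers and path names are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class ReportedLun(NamedTuple):
    path: str             # host-side path name (made up)
    physical_serial: str  # real array serial, hidden behind the VDKC
    virtual_serial: str   # serial number the host is actually shown
    ldev: str             # LDEV number, identical on both sides


def group_paths(luns):
    """Group paths by the identity the host sees, not the physical array."""
    groups = defaultdict(list)
    for lun in luns:
        groups[(lun.virtual_serial, lun.ldev)].append(lun.path)
    return dict(groups)


luns = [
    ReportedLun("path-to-site-A", "PHYS-A", "VIRT-1", "00:10"),
    ReportedLun("path-to-site-B", "PHYS-B", "VIRT-1", "00:10"),
]
devices = group_paths(luns)
```

Here devices contains a single entry with both paths, even though the two LUNs live on arrays with different physical serial numbers.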
Network Topology Options
GAD supports several deployment models:
Single‑host dual‑array: provides storage‑level dual‑active but no application‑level failover.
Dual‑host dual‑array: requires clustering software on servers to achieve full business continuity across both storage and compute layers.
Cross‑network (cross‑zoned) topology: recommended approach where servers see all storage, and both clustering and multipathing software cooperate for graceful failover.
Distributed cluster: combines storage and compute redundancy, also relying on a quorum mechanism to prevent split‑brain.
Arbitration Mechanisms
Three common quorum implementations are described:
Priority‑site: selects one site as primary; simple but risky if the priority site fails.
Software arbitration: runs dedicated arbitration software at a third site (on a physical server, a VM, or a public cloud). Pure Storage’s ActiveCluster, for example, uses an OVF‑deployed software arbiter.
Array‑based arbitration disk: places an additional array in a third site to host arbitration disks; this is the method used by GAD.
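The trade‑off between the priority‑site approach and third‑site arbitration (software or arbitration disk) can be shown with a deliberately simplified sketch. The function name and the mode strings are hypothetical, and the rules are assumptions: a non‑priority site that loses contact with its peer cannot tell a site failure from a link failure, so it stops, whereas a third‑site arbiter lets the surviving site continue.

```python
def survivor_after_site_failure(mode: str, failed_site: str,
                                priority_site: str = "A"):
    """Return the site that keeps serving I/O after one site fails, or None.

    mode is "priority-site" or "third-site" (software arbiter or
    arbitration disk); both labels are illustrative, not product terms.
    """
    sites = {"A", "B"}
    if failed_site not in sites or priority_site not in sites:
        raise ValueError("sites must be 'A' or 'B'")
    other = (sites - {failed_site}).pop()
    if mode == "priority-site":
        # Only the preconfigured priority site may continue on its own.
        return other if other == priority_site else None
    if mode == "third-site":
        # The surviving site wins arbitration at the third site.
        return other
    raise ValueError("unknown mode")
```

With priority site A, a failure of site B leaves A running in both modes, but a failure of site A leaves no survivor in priority‑site mode while third‑site arbitration lets B take over.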
NAS Dual‑Active via HNAS
HDS pairs GAD with the HNAS gateway to deliver NAS dual‑active. HNAS forms a stretched two‑node cluster; write I/O first lands in the primary node’s NVRAM, is mirrored to the secondary node’s NVRAM, then acknowledged to the client. Periodically, NVRAM data is flushed to the GAD SAN dual‑active volume. HNAS uses separate 10 GbE links for data replication, heartbeat, and management, and employs distinct arbitration systems for SAN and NAS.
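The NAS write path described above can be sketched as follows. This is a toy model with hypothetical names, using Python lists in place of NVRAM and the GAD volume; it only illustrates the ordering of mirroring, acknowledgement, and periodic flush.

```python
class HnasClusterSketch:
    """Toy model of the stretched two-node HNAS write path."""

    def __init__(self):
        self.nvram_primary = []     # pending writes on the primary node
        self.nvram_secondary = []   # mirrored copy on the secondary node
        self.gad_volume = []        # data flushed to the GAD dual-active volume

    def client_write(self, data: bytes) -> str:
        # 1. The write lands in the primary node's NVRAM.
        self.nvram_primary.append(data)
        # 2. It is mirrored to the secondary node's NVRAM.
        self.nvram_secondary.append(data)
        # 3. Only then is the client acknowledged.
        return "ack"

    def flush(self) -> int:
        # Periodic flush: move pending NVRAM data to the GAD volume,
        # then release both NVRAM copies. Returns the number of writes flushed.
        flushed = len(self.nvram_primary)
        self.gad_volume.extend(self.nvram_primary)
        self.nvram_primary.clear()
        self.nvram_secondary.clear()
        return flushed
```

Between flushes, an acknowledged write exists in both nodes' NVRAM, which is what allows either node to survive the loss of the other without losing acknowledged data.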
Benefits and Drawbacks
When storage pools are underutilized, the VSP can spin down or sleep idle HDDs, reducing power consumption. GAD also supports integration of heterogeneous third‑party arrays. However, NAS dual‑active introduces separate replication, heartbeat, and management networks, increasing configuration complexity and fault‑handling difficulty.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
