Huawei VIS and HyperMetro Dual-Active Storage Solutions Overview
This article provides a detailed overview of Huawei's dual-active storage solutions, describing the VIS gateway architecture, stretched VCS clusters, HyperMetro active-active implementation, network and hardware considerations, replication mechanisms, distributed locking, and performance optimizations for long-distance data-center deployments.
Huawei's previous-generation dual-active solution is implemented through the VIS (Virtual Intelligent Storage) product, which functions similarly to IBM's SVC (SAN Volume Controller) gateway. VIS uses heterogeneous virtualization technology to integrate different IP SAN and FC SAN storage resources and achieve active-active operation across sites. It supports clusters of 2-16 nodes, with every node active, thereby mitigating the reliability issues caused by I/O groups. The dual-active architecture relies on stretched VCS clusters and volume mirroring.
The VIS software architecture (Storage Foundation) includes components such as VxVM and VxDMP, which are common Linux volume manager and multipathing software. The heterogeneous virtualization functions as plug‑in modules that support almost all mainstream enterprise storage devices.
VIS hardware is built on Huawei's self‑developed Pangu platform. The dual‑active link consists of heartbeat, IP‑based host links, and dual‑active FC data links. For nearby data centers, direct switch connections can be used.
When the two data centers are far apart, DWDM equipment is required for dispersion compensation and regeneration. The optical fibers for the IP host link and the dual‑active FC data link can be reused.
In practice, VIS supports up to 100 km distance for Oracle databases and up to 300 km for VMware virtual machines (the 100 km test shows latency under 1.3 ms). The achievable distance depends on network conditions and the timeout/sensitivity of the upper‑layer applications.
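The 1.3 ms figure at 100 km is plausible from first principles: propagation delay alone accounts for about 1 ms of it. A minimal sketch, assuming the common ~5 µs/km one-way figure for light in fiber (a rule of thumb, not a Huawei spec):

```python
# Back-of-the-envelope round-trip latency over dark fiber.
# Assumption: light travels through fiber at roughly 200,000 km/s
# (refractive index ~1.5), i.e. about 5 microseconds per km one way.

def fiber_rtt_ms(distance_km: float, us_per_km: float = 5.0) -> float:
    """Round-trip propagation delay in milliseconds (propagation only,
    ignoring DWDM equipment and protocol processing overhead)."""
    return 2 * distance_km * us_per_km / 1000.0

print(fiber_rtt_ms(100))  # 1.0 ms of pure propagation delay at 100 km
```

The remaining ~0.3 ms of the measured budget is then available for switching, regeneration, and array processing, which is why application timeout sensitivity, not the fiber itself, usually caps the distance.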
VCS clusters typically use three local coordinator disks for I/O fencing, but a stretched VCS dual-active deployment requires third-site storage for arbitration, connected via iSCSI or FC.
VIS volume mirroring works like SVC: LUNs from both sites are virtualized and presented to hosts as a single device. Write I/O is duplicated to the caches of both VIS nodes before acknowledging the host, ensuring data consistency. Read I/O can be configured in round‑robin or priority mode; round‑robin balances load for non‑sequential I/O, while priority mode reads from the designated preferred mirror.
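The mirroring behavior described above can be sketched as a toy model; `MirroredVolume`, its dict-backed legs, and the policy names are illustrative inventions, not Huawei APIs:

```python
import itertools

class MirroredVolume:
    """Toy model of VIS-style volume mirroring: writes go to both mirror
    legs before the host is acknowledged; reads follow a configurable
    policy (round-robin or preferred-mirror priority)."""

    def __init__(self, leg_a, leg_b, read_policy="round_robin", preferred=0):
        self.legs = [leg_a, leg_b]            # dicts standing in for LUN caches
        self.read_policy = read_policy
        self.preferred = preferred
        self._rr = itertools.cycle(range(2))  # alternator for round-robin

    def write(self, lba, data):
        # Duplicate the write to both legs; only acknowledge the host
        # once every leg has accepted the data.
        for leg in self.legs:
            leg[lba] = data
        return "ack"

    def read(self, lba):
        if self.read_policy == "priority":
            return self.legs[self.preferred][lba]
        # round_robin: alternate legs to spread non-sequential load
        return self.legs[next(self._rr)][lba]

vol = MirroredVolume({}, {})
vol.write(0, b"blk")
print(vol.read(0))  # b'blk' regardless of which leg serves the read
```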
Each mirrored volume in a data center has an associated differential bitmap. When a storage or link failure occurs, the bitmap records the data differences; after recovery, the system uses the bitmap to automatically synchronize the divergent data.
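A minimal sketch of such a differential bitmap, assuming a fixed grain size (the 64 KiB grain and all names here are illustrative, not VIS internals):

```python
class DiffBitmap:
    """Toy differential bitmap: one bit per fixed-size grain. While the
    peer is unreachable, written grains are marked dirty; after recovery
    only the dirty grains need to be copied to resynchronize."""

    def __init__(self, volume_size, grain_size=64 * 1024):
        self.grain_size = grain_size
        self.bits = bytearray((volume_size // grain_size + 7) // 8)

    def mark(self, offset):
        # Record that the grain covering this byte offset has diverged.
        grain = offset // self.grain_size
        self.bits[grain // 8] |= 1 << (grain % 8)

    def dirty_grains(self):
        # Grains that must be re-copied during automatic resync.
        return [i for i in range(len(self.bits) * 8)
                if self.bits[i // 8] & (1 << (i % 8))]

bm = DiffBitmap(volume_size=1 << 20)   # 1 MiB volume -> 16 grains
bm.mark(0)                              # write arrives while peer is down
bm.mark(130 * 1024)                     # falls in grain 2
print(bm.dirty_grains())                # [0, 2] -> only these grains resync
```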
Because all VIS nodes operate in active‑active mode, if a VIS node in data center A fails, hosts can continue writing to the VIS node in data center B without any failover procedure. If storage in data center A fails, its VIS node can write directly to the storage in data center B, eliminating the need for data forwarding or node switching.
VIS and SVC gateway dual‑active solutions can mask underlying storage devices, allowing legacy systems that lack native dual‑active capabilities to benefit from active‑active storage, but they also introduce additional hardware cost, more network failure points, increased network overhead, and latency.
Huawei's latest dual‑active offering, HyperMetro, moves from a gateway‑based approach to a self‑developed solution built on the V3/Dorado V3 storage arrays.
HyperMetro creates an active‑active cluster of two independent storage arrays. UltraPath multipathing aggregates the LUNs from both arrays into a single virtual dual‑active LUN, providing real‑time synchronization and simultaneous read/write capability.
When an application server issues I/O, read requests are served directly from the local cache, while write requests first acquire a distributed lock (using Paxos and CHT algorithms), then write concurrently to the local and remote LUN caches; the write is acknowledged only after both sides succeed.
The distributed lock module uses Paxos and Consistent Hash Table (CHT) algorithms to provide object‑level and range‑level locks, enabling active‑active concurrency control at the host I/O granularity and reducing unnecessary inter‑array data transfer.
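How consistent hashing can place lock ownership deterministically across the two arrays can be sketched as follows; the node names, virtual-node count, and hash choice are assumptions for illustration, and the real Paxos/CHT machinery is far more involved:

```python
import hashlib
from bisect import bisect

class ConsistentHashLocks:
    """Sketch of consistent-hash placement of lock ownership: each LBA
    range hashes to a point on a ring, and the nearest node token
    clockwise owns that lock. Because both arrays compute the same
    placement, no per-request negotiation traffic is needed."""

    def __init__(self, nodes, vnodes=64):
        self.ring = sorted(
            (self._hash(f"{n}#{v}"), n) for n in nodes for v in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def owner(self, lba_range):
        h = self._hash(repr(lba_range))
        i = bisect(self.keys, h) % len(self.ring)
        return self.ring[i][1]

cht = ConsistentHashLocks(["array_A", "array_B"])
print(cht.owner((0, 1024)))   # deterministic owner for this range
```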
If the replication link between sites fails during a write, the system logs the address differences; once the fault is cleared, the logged differences are used to resynchronize the data. The normal write flow proceeds in four steps:
1. Acquire write permission and log the write: The array in data center A receives a host write request, obtains the dual‑active pair's write lock, and records the write address in a power‑loss‑protected log.
2. Perform dual write: The request is duplicated and written to both the local LUN cache and the remote LUN cache.
3. Process dual‑write results: The system waits for acknowledgments from both LUNs.
4. Respond to host: The dual‑active pair returns a successful write completion to the host.
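The four steps above can be condensed into a toy function; dicts stand in for the LUN caches, a set for the write lock, and a list for the power-loss-protected log, all purely illustrative:

```python
def hypermetro_write(lba, data, local, remote, write_log, lock):
    """Toy version of the four-step dual-write flow described above."""
    # 1. Acquire the dual-active pair's write lock and log the address.
    if lba in lock:
        return "busy"
    lock.add(lba)
    write_log.append(lba)
    try:
        # 2. Dual write: duplicate the request to both LUN caches.
        local[lba] = data
        remote[lba] = data
        # 3. Both acknowledgments received, so the log entry can be
        #    cleared (a real array would keep a diff log if one side
        #    failed, for later resynchronization).
        write_log.remove(lba)
        # 4. Acknowledge the host only after both sides succeeded.
        return "ack"
    finally:
        lock.remove(lba)

local, remote, log, lock = {}, {}, [], set()
print(hypermetro_write(8, b"data", local, remote, log, lock))  # ack
```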
The typical HyperMetro network architecture connects the two storage arrays via FC or IP links, with FC being the recommended choice. The arrays connect to arbitration servers over standard IP links.
HyperMetro supports arbitration per dual‑active pair or per consistency group, offering static priority mode and an arbitration server mode. The static priority mode is generally discouraged; a dedicated physical or virtual server should be placed in a third site to avoid a single‑site disaster.
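The difference between the two arbitration modes can be sketched as a single decision function (the names and return convention are illustrative assumptions, not Huawei behavior verbatim):

```python
def arbitrate(link_up, quorum_reachable_by=None, static_winner="A"):
    """Toy arbitration for one dual-active pair after a fault.
    quorum_reachable_by lists arrays that can still reach the
    third-site arbitration server, in the order they got through;
    None means static priority mode (no arbitration server)."""
    if link_up:
        return {"A", "B"}            # both sides keep serving I/O
    if quorum_reachable_by is None:
        # Static priority: the preconfigured winner keeps serving, even
        # if that site is the one that actually failed -- the reason
        # this mode is generally discouraged.
        return {static_winner}
    # Arbitration-server mode: the first array to win quorum serves;
    # if neither reaches the third site, both stop to avoid split-brain.
    return {quorum_reachable_by[0]} if quorum_reachable_by else set()

print(arbitrate(link_up=False, quorum_reachable_by=["B"]))  # {'B'}
```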
Network optimizations include FastWrite, which merges the SCSI "write command" and "write data" phases into a single transmission by eliminating the intermediate "transfer ready" (XFER_RDY) round trip, as well as the SCSI First Burst Enabled feature, which halves the number of data-transfer interactions.
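The saving from FastWrite can be put in numbers. A minimal sketch, assuming the usual two-round-trip model for a standard SCSI write and the ~5 µs/km fiber rule of thumb (neither figure is stated in the article):

```python
def replication_write_latency_ms(distance_km, fast_write, us_per_km=5.0):
    """Estimate replication write latency from round trips alone.
    A conventional SCSI write costs two round trips (command -> transfer
    ready, then data -> status); FastWrite sends the data with the
    command, cutting this to one. Processing time is ignored."""
    rtt_ms = 2 * distance_km * us_per_km / 1000.0
    round_trips = 1 if fast_write else 2
    return round_trips * rtt_ms

print(replication_write_latency_ms(100, fast_write=False))  # 2.0
print(replication_write_latency_ms(100, fast_write=True))   # 1.0
```

At 100 km this halves the propagation component of every synchronous write, which is why such optimizations matter more as inter-site distance grows.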
Distance between sites impacts I/O performance. HyperMetro with UltraPath provides two I/O access strategies: a priority‑array mode for different‑site deployments and a load‑balancing mode for same‑site deployments, where I/O is striped across the two arrays.
Load‑balancing mode splits I/O into fragments (e.g., 128 MiB chunks) and distributes them between arrays A and B based on address ranges.
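Address-range distribution in load-balancing mode can be sketched as follows, reusing the 128 MiB slice size from the example above (the specific routing rule shown, alternating slices between the two arrays, is an illustrative assumption):

```python
def route_io(offset_bytes, slice_bytes=128 * 1024 * 1024):
    """Toy load-balancing router: the LUN address space is cut into
    fixed-size slices, and consecutive slices alternate between the
    two arrays so that large sequential streams are spread evenly."""
    return "array_A" if (offset_bytes // slice_bytes) % 2 == 0 else "array_B"

# Adjacent slices land on different arrays:
print(route_io(0))                    # array_A  (slice 0)
print(route_io(128 * 1024 * 1024))    # array_B  (slice 1)
print(route_io(300 * 1024 * 1024))    # array_A  (slice 2)
```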
In summary, HyperMetro offers a storage‑level active‑active solution for both SAN and NAS workloads, supporting heterogeneous virtualization, but it requires dedicated UltraPath multipathing software and is less flexible than gateway‑based solutions like SVC.
Storage dual‑active can be implemented as Active‑Passive or Active‑Active. Active‑Active primarily ensures that a single LUN can be read and written simultaneously from two sites, which also requires application‑level clustering. From the storage system perspective, when different workloads run across sites, each site may act as Active‑Passive for the other's data, yet the overall system still presents an "Active‑Active" appearance.