
Design and Implementation of a DB2 pureScale GDPC Dual‑Active Database Platform

The article analyzes the shortcomings of traditional disaster‑recovery methods, explains why DB2 pureScale GDPC was chosen for a dual‑active database solution, and provides detailed design guidelines covering site selection, arbitration node, network architecture, storage layout, resource sizing, client connectivity, and the solution’s advantages and limitations.

Architects' Tech Alliance

During the construction of a two‑site, three‑center architecture, three major issues were encountered with traditional disaster‑recovery techniques: long switchover time, high operational risk, and excessive cost. To address these, a dual‑active platform was proposed to reduce RTO, lower expenses, and minimize switch‑over risk.

The chosen solution is DB2 pureScale GDPC (Geographically Dispersed pureScale Cluster) because it offers high availability, scalability, and application transparency, and because IBM provides stronger vendor support for true peer‑to‑peer active‑active deployments than Oracle RAC does.

Key design principles for the dual‑active platform include:

Generality – built on the open DB2 LUW (Linux, UNIX, and Windows) platform and deployable on any vendor’s storage, servers, and OS.

Parity – both data‑center sites handle transactions equally without a primary/secondary distinction.

High availability – minimize intra‑city switchover time and ensure uninterrupted global service.

Maintainability – allow non‑disruptive configuration changes via rolling upgrades.

Migration friendliness – the platform is transparent to applications, requiring no code changes.

Stability – target five‑nines (99.999%) operational reliability.

The GDPC topology requires three sites: two active data‑center sites and one arbitration site. The active sites must be within 50 km (up to 70–80 km acceptable) and connected by reliable TCP/IP links with RDMA (RoCE or InfiniBand). Each site needs a dedicated SAN controller, mirrored LUNs, and GPFS for synchronous replication.

The arbitration site hosts a single non‑member host (often a VM) that provides arbitration disks (50–100 MB) for each shared file system; it does not require SAN access or RDMA.
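In GPFS terms, such an arbitration disk is typically defined as a descriptor‑only NSD placed in its own failure group. A minimal sketch of the stanza follows; the device path, NSD name, and hostname are placeholders, not values from the original deployment:

```shell
# NSD stanza for the arbitration (tiebreaker) disk at the third site.
# usage=descOnly means the disk holds only a copy of the file system
# descriptor -- no user data or metadata -- so 50-100 MB is sufficient.
cat > /tmp/arb.stanza <<'EOF'
%nsd: device=/dev/sdb
  nsd=nsd_arb
  servers=arb-host
  usage=descOnly
  failureGroup=3
EOF
mmcrnsd -F /tmp/arb.stanza
```

Placing the disk in a third failure group lets GPFS retain file‑system descriptor quorum when either data‑center site is lost.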

Network design recommendations include redundant DWDM links between sites, dual‑NIC Ethernet with active‑standby bonding, redundant RoCE switches, and a separate private VLAN for GPFS heartbeat and data traffic.
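The dual‑NIC active‑standby bonding and the private GPFS VLAN could be configured on Linux roughly as follows; interface names, the VLAN ID, and addressing here are illustrative assumptions, not values from the article:

```shell
# Create an active-backup bond over the redundant Ethernet pair
# (interface names eth0/eth1 are placeholders).
nmcli connection add type bond ifname bond0 con-name bond0 \
      bond.options "mode=active-backup,miimon=100,primary=eth0"
nmcli connection add type ethernet ifname eth0 master bond0
nmcli connection add type ethernet ifname eth1 master bond0

# Carry the private GPFS heartbeat and data traffic on its own VLAN
# (VLAN ID 100 and the subnet are assumptions) on top of the bond.
nmcli connection add type vlan ifname bond0.100 dev bond0 id 100 \
      ipv4.method manual ipv4.addresses 10.10.100.11/24
```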

Storage design uses GPFS to maintain two consistent copies of each file system across redundant storage groups, with the arbitration disks holding only the file‑system descriptor rather than data.
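The two‑copy layout can be sketched with GPFS NSD failure groups and replication settings; device paths, NSD names, and hostnames below are illustrative:

```shell
# One NSD per site, each in its own failure group.
cat > /tmp/db2fs.stanza <<'EOF'
%nsd: device=/dev/sdc
  nsd=nsd_site1
  servers=site1-host
  usage=dataAndMetadata
  failureGroup=1
%nsd: device=/dev/sdd
  nsd=nsd_site2
  servers=site2-host
  usage=dataAndMetadata
  failureGroup=2
EOF
mmcrnsd -F /tmp/db2fs.stanza

# Two replicas of data (-r/-R) and metadata (-m/-M): GPFS keeps one
# synchronous copy per failure group, i.e. one per site.
mmcrfs db2fs -F /tmp/db2fs.stanza -m 2 -M 2 -r 2 -R 2 -T /db2fs
```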

Resource allocation guidelines suggest that member CPUs should exceed single‑node equivalents, CF CPUs correlate with RDMA NICs (1 NIC per 6–8 cores), and memory sizing should roughly double that of a comparable standalone database to accommodate additional lock lists and GPFS buffers.
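These rules of thumb can be captured in a small sizing helper. The ratios below (6 cores per RDMA NIC, a 2x memory factor) are illustrative defaults drawn from the guidelines above, not IBM‑published formulas:

```python
def cf_cores_needed(rdma_nics: int, cores_per_nic: int = 6) -> int:
    """Estimate CF cores: roughly one RDMA NIC per 6-8 cores."""
    return rdma_nics * cores_per_nic

def cluster_memory_gb(standalone_memory_gb: float,
                      overhead_factor: float = 2.0) -> float:
    """Size pureScale memory at ~2x a comparable standalone database,
    covering the additional lock lists and GPFS buffers."""
    return standalone_memory_gb * overhead_factor

if __name__ == "__main__":
    print(cf_cores_needed(2))       # two RDMA NICs per CF
    print(cluster_memory_gb(128))   # vs. a 128 GB standalone database
```

Actual sizing should of course be validated against the workload; these helpers only encode the article’s rough ratios.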

Clients should use affinity‑based connections to preferred member nodes to avoid cross‑site latency, with automatic client‑side reconnection (ACR) handling node failures.
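Client affinity and ACR are configured in the driver‑side db2dsdriver.cfg. A hedged fragment is shown below; the database name, hostnames, ports, and list names are placeholders for the deployment’s actual values:

```xml
<configuration>
  <databases>
    <database name="GDPCDB" host="site1-member0" port="50000">
      <acr>
        <!-- Enable automatic client reroute on member failure. -->
        <parameter name="enableAcr" value="true"/>
        <parameter name="enableSeamlessAcr" value="true"/>
        <alternateserverlist>
          <server name="m0" hostname="site1-member0" port="50000"/>
          <server name="m1" hostname="site2-member1" port="50000"/>
        </alternateserverlist>
        <!-- Prefer the local-site member first to avoid cross-site latency. -->
        <affinitylist>
          <list name="site1first" serverorder="m0,m1"/>
        </affinitylist>
        <clientaffinitydefined>
          <client name="app1" hostname="appsrv1.example.com"
                  listname="site1first"/>
        </clientaffinitydefined>
      </acr>
    </database>
  </databases>
</configuration>
```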

After three years of production use, the DB2 pureScale dual‑active platform has demonstrated high availability and maintainability, though some performance penalties remain due to distance‑induced latency and I/O overhead.

Remaining challenges include hotspot mitigation, further reduction of inter‑node communication latency, and improving concurrency handling to fully exploit the available bandwidth.

Tags: network architecture, high availability, dual active, database design, DB2, GDPC, pureScale
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
