Databases 15 min read

JDHBase Multi‑Active Disaster Recovery: Replication, Auto‑Failover & Consistency

JDHBase, JD.com’s large‑scale KV store, powers billions of daily reads and writes across 7,000 nodes, and this article details its multi‑active, cross‑region architecture—including HBase replication fundamentals, Fox Manager routing, automatic failover policies, dynamic replication tuning, and serial replication to ensure strong consistency.

dbaplus Community
dbaplus Community
dbaplus Community
JDHBase Multi‑Active Disaster Recovery: Replication, Auto‑Failover & Consistency

Introduction

JDHBase is JD.com’s online key‑value storage system that handles massive online traffic, reaching trillion‑level read/write requests during major sales events such as 11.11 and 6.18. The cluster now exceeds 7,000 nodes with a storage capacity of 90 PB, serving more than 700 business scenarios including orders, recommendations, user profiling, finance, logistics, and monitoring.

To guarantee uninterrupted service, JDHBase implements a geographically distributed multi‑active system. The following sections describe the design and practice of this system.

HBase Replication Basics

HBase uses a Log‑Structured Merge‑Tree (LSM) architecture. Write requests are first stored in an in‑memory Memstore and appended to a sequential Write‑Ahead Log (WAL) on HDFS, ensuring durability even after node failures.

Replication in HBase is WAL‑based. Each RegionServer runs a ReplicationSource thread that reads WAL entries, applies configured filters, and sends them via replicateWALEntry RPC to the backup cluster. The backup’s ReplicationSink thread converts received entries into put/delete operations and writes them in batches.

This asynchronous replication typically introduces a latency of a few seconds before the standby cluster receives the latest writes.

JDHBase Multi‑Active Architecture

The system consists of three main components: the Client, the JDHBase cluster, and the Fox Manager.

1. Fox Manager Configuration Center

Fox Manager maintains user and cluster metadata, providing configuration services to clients and administrators.

Policy Server – a stateless distributed service that handles external requests; persistence can be MySQL or Zookeeper; optionally includes a Rule Engine for automatic configuration adjustments based on cluster state.

Service Center – UI for administrators to manage configurations.

VIP Load Balance – presents a unified access address for Policy Servers and provides load balancing.

2. JDHBase Cluster

The cluster offers high‑throughput OLTP capabilities. For critical workloads, an active‑standby multi‑active setup is deployed.

Active Cluster – normal business runs here; data is asynchronously replicated to the standby cluster with second‑level delay.

Standby Cluster – activated when the active cluster experiences failures; it also asynchronously replicates its data back to the active side.

Replication between the two clusters ensures eventual consistency. In production, a multi‑cluster replication topology is built so that each cluster can act as both primary and backup for different services, forming a mesh of replication relationships.

3. Client

When a client starts, it first contacts Fox Manager to report user information. After authentication, Fox Manager returns the appropriate cluster connection details. The client then creates an HConnection to interact with the designated cluster.

Clients also receive configuration parameters (e.g., retry counts, timeouts) from Fox Manager, which can be adjusted per business needs or in extreme scenarios.

Metrics are embedded in the client SDK to monitor service availability from the client’s perspective. Heartbeats report client status and metrics back to Fox Manager, enabling real‑time configuration updates and automatic client‑side failover when availability drops.

Cluster Switching

Data routing in HBase involves three steps: (1) the client obtains the Zookeeper address of the target cluster and retrieves the META table location; (2) it queries the META table to get region information and caches it; (3) it accesses the appropriate region server for data operations.

JDHBase adds an extra step where the client first contacts Fox Manager for user authentication, cluster information, and client parameters. This allows seamless switching between active and standby clusters without client‑side changes.

Automatic Failover

Manual failover can take minutes, which is unacceptable for some online services. JDHBase therefore implements policy‑driven automatic failover that can react within seconds.

HMaster runs a status‑checking plugin that collects availability metrics and reports them via heartbeat to the Policy Server. The Policy Server stores policies in MySQL and can be horizontally scaled.

A Rule Engine, built on Raft for high availability, evaluates these metrics across multiple time windows and triggers failover actions when thresholds are breached. The engine can also apply temporary parameters for extreme cases.

Dynamic Parameters & Automatic Throttling

Replication can become a bottleneck under heavy write loads or hotspot traffic, leading to backlog. JDHBase exposes dynamic replication parameters that can be updated at runtime without restarting RegionServers.

When backlog exceeds defined thresholds, the system automatically adjusts replication parameters (typically 1‑2× the baseline) to accelerate data transfer. Additionally, Fox Manager can schedule bandwidth‑aware throttling during peak periods to avoid saturating inter‑data‑center links.

Serial Replication

Standard asynchronous replication may produce out‑of‑order writes when RegionServers move or fail, potentially causing permanent inconsistencies (e.g., a Delete arriving before the corresponding Put).

Serial Replication guarantees that the order of mutations in the primary cluster matches the order applied in the standby cluster. It uses Barriers and lastPushedSequenceId stored in Zookeeper. When a Region opens, a new Barrier (max SequenceId + 1) is recorded. During replication, each batch updates lastPushedSequenceId. A RegionServer will only push data if its lastPushedSequenceId is greater than or equal to the preceding Barrier, ensuring ordered delivery.

Summary and Outlook

Through continuous adoption of industry disaster‑recovery practices and internal innovations, JDHBase now achieves a 99.98 % SLA, providing stable and reliable storage for JD.com’s massive online services. Future work will focus on synchronous replication, eliminating Zookeeper dependencies, client‑side automatic switching, and reducing data redundancy to further improve reliability and efficiency.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

high availabilityDatabase Architecturedisaster recoveryHBaseReplicationmulti-active
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.