Alibaba’s Secrets to Scaling HBase for PB‑Level Big Data

This article explains how Alibaba built, customized, and operated a massive HBase platform. It covers the architecture, high‑availability design, asynchronous and synchronous replication, multi‑link data flow, cost‑aware redundancy, cross‑cluster migration, performance optimizations, and future directions for the distributed NoSQL database.

Alibaba Cloud Developer

Mar 24, 2017

Overview

HBase is an open‑source, non‑relational distributed database (NoSQL) modeled after Google’s Bigtable. It offers high reliability, performance, and scalability on commodity servers. Originally a Hadoop sub‑project, it became an Apache top‑level project in 2010 and is now widely adopted by companies such as Facebook, Yahoo, and Alibaba.

HBase at Alibaba

Since 2011, Alibaba has used HBase as core storage for Taobao, Tmall, Ant Financial, Cainiao, Alibaba Cloud, and other services, handling hundreds of GB/s of read/write traffic during peak events such as Double‑11. The team built a one‑stop big‑data storage service covering software, solutions, stability, and development support.

High‑Availability Construction

Alibaba measures availability against an SLA. For example, 99.99% availability allows at most about 52.6 minutes of downtime per year. To reach that level, data is replicated across multiple data centers, which in turn requires consistent cross‑site copies and fault‑tolerant designs.
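
The downtime figure follows directly from the availability target. As a small worked sketch (the function name is illustrative, and a 365‑day year is assumed):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored

def allowed_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.2f} min/year")
```

Each additional "nine" cuts the yearly budget by a factor of ten, which is why the higher targets push toward multi‑site replication rather than faster single‑site recovery.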

Cluster Asynchronous Replication

From HBase 0.92 onward, replication asynchronously pushes incremental data from a primary cluster to a backup cluster, enabling disaster recovery. Alibaba improved sending efficiency on the source side and sink efficiency on the target side, and added hotspot assistance, online configuration, and multi‑link support.

Multi‑Link Data Flow

Multiple data links allow a table to replicate to one or more destinations. This enables flexible data routing, a visual topology, loop avoidance, and link isolation, so that a single congested link does not affect the others.
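
One common way to get loop avoidance in a multi‑link topology is to track which clusters already hold an edit and never ship it to them again. The sketch below illustrates that idea only; the article does not say how Alibaba implements it, and the `links` map and cluster names are hypothetical.

```python
from collections import deque

def propagate(links: dict, origin: str) -> list:
    """Return the (source, destination) hops one edit takes through the topology.

    links maps a cluster name to the clusters it replicates to (hypothetical layout).
    """
    hops = []
    visited = {origin}          # clusters that already hold the edit
    queue = deque([origin])
    while queue:
        src = queue.popleft()
        for dst in links.get(src, []):
            if dst in visited:  # loop avoidance: never ship an edit back
                continue
            visited.add(dst)
            hops.append((src, dst))
            queue.append(dst)
    return hops

# A and B replicate to each other; C replicates back to both.
links = {"A": ["B", "C"], "B": ["A"], "C": ["A", "B"]}
print(propagate(links, "A"))
```

Even though the topology contains cycles, an edit written on A reaches B and C once each and is never sent back to a cluster that already has it.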

Data Consistency

Most production systems use asynchronous replication, which gives eventual consistency. Alibaba also provides two strong‑consistency options: (1) a strongly consistent switch that pauses writes on the primary until all data has been replicated, and (2) synchronous replication, where a write succeeds only after both the primary and the backup have persisted the data.
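
The difference between the two acknowledgement rules can be shown with a toy model. This is a minimal sketch under stated assumptions, not Alibaba's implementation: `Cluster`, `async_write`, and `sync_write` are hypothetical names, and an in‑memory list stands in for a durable log.

```python
class Cluster:
    """Toy cluster; `log` stands in for a durable write-ahead log."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.log = []

    def persist(self, edit) -> bool:
        if not self.healthy:
            return False
        self.log.append(edit)
        return True

def async_write(primary: Cluster, pending: list, edit) -> bool:
    """Acknowledge once the primary has the edit; ship to the backup later."""
    ok = primary.persist(edit)
    if ok:
        pending.append(edit)  # replicated in the background (eventual consistency)
    return ok

def sync_write(primary: Cluster, backup: Cluster, edit) -> bool:
    """Acknowledge only if both clusters persisted the edit.

    A real system must also decide what to do with an edit that reached
    only one side; this sketch just reports the failure.
    """
    ok_primary = primary.persist(edit)
    ok_backup = backup.persist(edit)
    return ok_primary and ok_backup

primary, backup = Cluster("primary"), Cluster("backup", healthy=False)
print(async_write(primary, [], "put row1"))      # acknowledged by the primary alone
print(sync_write(primary, backup, "put row2"))   # rejected: the backup did not persist
```

The synchronous rule trades write availability and latency for the guarantee that an acknowledged write exists on both sides.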

Redundancy and Cost

Redundant cross‑cluster copies improve availability but double storage costs. Alibaba explores reducing replica counts (e.g., from three to two) and leveraging cross‑cluster partition replication to maintain resilience while lowering expense.
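
To make the cost trade‑off concrete, the sketch below counts physical copies. It assumes two clusters and that the "three to two" reduction applies to the replica count inside each cluster; the article does not spell out either detail.

```python
def physical_copies(clusters: int, replicas_per_cluster: int) -> int:
    """Total stored copies of each byte across all clusters."""
    return clusters * replicas_per_cluster

baseline = physical_copies(clusters=2, replicas_per_cluster=3)  # 6 copies
reduced = physical_copies(clusters=2, replicas_per_cluster=2)   # 4 copies
saving = 1 - reduced / baseline

print(f"{baseline} copies -> {reduced} copies, saving {saving:.0%} of raw storage")
```

Under these assumptions the data still exists in four places across two sites, while raw storage drops by a third.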

Cross‑Cluster Partition Replication

A job‑based system splits a table’s RowKey range into sub‑tasks dispatched by the master to region servers, enabling fast, fault‑tolerant, and resumable data migration between clusters.
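
The split‑and‑resume idea can be sketched as follows. Real RowKeys are byte strings; integers are used here only to keep the example short, and `split_range`, `run_job`, and the `done` set are hypothetical names rather than parts of Alibaba's system.

```python
def split_range(start: int, end: int, parts: int) -> list:
    """Split the key range [start, end) into `parts` contiguous sub-ranges."""
    step, extra = divmod(end - start, parts)
    ranges, lo = [], start
    for i in range(parts):
        hi = lo + step + (1 if i < extra else 0)  # spread the remainder evenly
        ranges.append((lo, hi))
        lo = hi
    return ranges

def run_job(tasks: list, done: set, copy_fn) -> None:
    """Copy every sub-range not yet recorded in `done`.

    `done` plays the role of persisted job state: after a crash, calling
    run_job again skips finished sub-tasks and resumes with the rest.
    """
    for task in tasks:
        if task in done:
            continue
        copy_fn(task)   # in a real system: a region server copies this key range
        done.add(task)

tasks = split_range(0, 10, 3)
done = {tasks[0]}           # pretend the first sub-task finished before a restart
run_job(tasks, done, lambda t: print("copying", t))
```

Because each sub‑task covers a disjoint key range, a failed sub‑task can be retried on its own without recopying the rest of the table.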

Multi‑Cluster Active‑Active Service

Beyond traditional active‑standby, Alibaba implements client‑side dual‑cluster access, cross‑deployment, and load‑balancing to fully utilize both clusters and reduce latency spikes.
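
A client that can reach both clusters can hide a latency spike on one of them by falling back to the other. The sketch below shows only that fallback path, with a hypothetical `dual_read` helper and a 50 ms timeout chosen for illustration; the article does not describe the actual client logic or its load‑balancing policy.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def dual_read(read_primary, read_secondary, key, timeout_s: float = 0.05):
    """Read from the primary; fall back to the secondary on timeout or error."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(read_primary, key)
    try:
        return future.result(timeout=timeout_s)
    except Exception:
        # Covers both a slow primary (timeout) and a failed one (raised error).
        return read_secondary(key)
    finally:
        pool.shutdown(wait=False)  # do not block on a stalled primary call

def slow_cluster(key):
    time.sleep(0.2)                # simulates a latency spike
    return f"{key}@slow"

def fast_cluster(key):
    return f"{key}@fast"

print(dual_read(slow_cluster, fast_cluster, "row1"))
```

Falling back is only safe for reads that tolerate the replication lag between clusters; with asynchronous replication the secondary may return slightly older data.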

Additional Performance Work

Key optimizations include an asynchronous API, prefix BloomFilter for Scan operations, HLog compression compatible with replication, and coprocessor‑based built‑in calculations (Count, Avg, Sum, etc.) that dramatically reduce I/O and improve throughput.
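
A prefix Bloom filter lets a prefix scan skip a file when the prefix is definitely absent from it. The following is a minimal sketch of the data structure, assuming a fixed prefix length and SHA‑256‑derived hash positions; the class name and parameters are illustrative and do not reflect HBase's actual implementation.

```python
import hashlib

class PrefixBloomFilter:
    """Bloom filter keyed on the first `prefix_len` bytes of each row key."""

    def __init__(self, prefix_len: int, num_bits: int = 1024, num_hashes: int = 3):
        self.prefix_len = prefix_len
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, prefix: bytes):
        # Derive num_hashes bit positions by salting the prefix with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + prefix).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, row_key: bytes) -> None:
        for pos in self._positions(row_key[: self.prefix_len]):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain_prefix(self, prefix: bytes) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(prefix[: self.prefix_len])
        )

bloom = PrefixBloomFilter(prefix_len=5)
bloom.add(b"user1|order-001")
bloom.add(b"user1|order-002")
print(bloom.might_contain_prefix(b"user1"))  # True: rows with this prefix were added
```

A scan for a given prefix would consult the filter for each file first and read only the files where the answer is "possibly present", which is where the I/O saving comes from.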

Future Development

Upcoming focus areas are GC pause reduction via custom memory maps, SQL‑style access with global secondary indexes, and containerized deployment (Docker) for agile operations.
