HBase at JD.com: Architecture, Use Cases, and Evolution
This article explains how JD.com leverages the open‑source HBase database for massive, low‑latency data storage across various business lines, detailing its architecture, multi‑tenant isolation, disaster‑recovery mechanisms, and integration with Phoenix SQL for OLTP workloads.
With the rapid growth of digital information, traditional single‑node relational databases struggle to meet the demands of massive, distributed data storage, prompting the adoption of wide‑column stores such as Google Bigtable and its open‑source counterpart HBase.
HBase, inspired by Bigtable, has become a core component of JD.com’s big‑data platform. It supports both online real‑time queries (e.g., merchant marketing, personalized recommendation, POP orders) and offline batch processing, handling millions of queries per second and tens of millions of writes per second.
JD.com’s HBase usage can be grouped into three typical scenarios: (1) ultra‑large‑scale millisecond‑level read workloads for analytics, (2) T+1 reporting and data storage during nightly batch windows, and (3) real‑time data ingestion and updates requiring millisecond‑level latency.
To ensure stability at this scale, JD’s HBase platform is divided into storage, kernel, middleware, user, and auxiliary layers, with separate HDFS and HBase deployments, container‑based scaling, hardware‑aware RegionServer tuning, and a suite of middleware services for disaster recovery, data governance, quota management, and multi‑language support.
For high availability, JD implements an intelligent primary/standby (master‑slave) cluster switchover driven by policy servers. Each client sends periodic heartbeats to the PolicyServer to obtain the current cluster and switch policies. The PolicyServer stores these policies in MySQL and can be scaled horizontally, and ServiceCenter provides a UI through which administrators manage them. If the primary cluster fails, automatic or manual failover to the standby keeps the service available and the data consistent.
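The heartbeat‑and‑policy flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not JD's actual client: the class FailoverClientSketch, the SwitchPolicy shape, and the use of a Supplier in place of a remote PolicyServer call are all hypothetical names introduced here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch of a client that follows PolicyServer switch policies. */
final class FailoverClientSketch {

    /** Assumed shape of the policy returned on each heartbeat. */
    static final class SwitchPolicy {
        final String activeCluster; // cluster the client should currently talk to
        SwitchPolicy(String activeCluster) { this.activeCluster = activeCluster; }
    }

    // Stands in for the remote PolicyServer call (an assumption for this sketch).
    private final Supplier<SwitchPolicy> policyServer;
    // Cluster that requests are currently routed to.
    private final AtomicReference<String> target;

    FailoverClientSketch(Supplier<SwitchPolicy> policyServer, String initialCluster) {
        this.policyServer = policyServer;
        this.target = new AtomicReference<>(initialCluster);
    }

    /** One heartbeat: fetch the policy and retarget if it names a cluster. */
    void heartbeat() {
        try {
            SwitchPolicy policy = policyServer.get();
            if (policy != null && policy.activeCluster != null) {
                target.set(policy.activeCluster);
            }
        } catch (RuntimeException e) {
            // PolicyServer unreachable: keep routing to the last known cluster.
        }
    }

    /** Runs heartbeat() periodically; the caller shuts the returned executor down. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::heartbeat, 0, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }

    String currentCluster() { return target.get(); }

    public static void main(String[] args) {
        // Simulated server-side policy that an administrator or automatic rule can flip.
        AtomicReference<String> serverSide = new AtomicReference<>("cluster-a");
        FailoverClientSketch client = new FailoverClientSketch(
                () -> new SwitchPolicy(serverSide.get()), "cluster-a");

        client.heartbeat();
        System.out.println(client.currentCluster()); // cluster-a

        serverSide.set("cluster-b"); // failover decision recorded in the policy
        client.heartbeat();
        System.out.println(client.currentCluster()); // cluster-b
    }
}
```

Keeping the last known target when a heartbeat fails is a design choice of this sketch: it means a PolicyServer outage alone does not interrupt traffic to a healthy cluster.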
Multi‑tenant resource isolation is achieved through physical grouping (HBase 2.0 region server groups) that partition clusters into dedicated resource pools, as well as quota and throttling mechanisms that limit request rates and storage usage at cluster, namespace, and table levels, preventing hotspot and overload issues.
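To make the layered request throttling concrete, here is a minimal token‑bucket sketch in which a request is admitted only if the cluster, namespace, and table limits all allow it. It illustrates the general idea only and is not HBase's quota implementation; ThrottleSketch, TokenBucket, and the method names are hypothetical, storage quotas are not modeled, and time is passed in explicitly so the behavior is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of request throttling at cluster, namespace, and table level. */
final class ThrottleSketch {

    /** Simple token bucket; capacity equals the per-second rate (an assumption). */
    static final class TokenBucket {
        private final double capacity;    // maximum burst size
        private final double refillPerMs; // tokens added per millisecond
        private double tokens;
        private long lastMs;

        TokenBucket(long requestsPerSecond, long nowMs) {
            this.capacity = requestsPerSecond;
            this.refillPerMs = requestsPerSecond / 1000.0;
            this.tokens = requestsPerSecond;
            this.lastMs = nowMs;
        }

        /** Refills for the elapsed time, then reports whether one request fits. */
        boolean hasToken(long nowMs) {
            tokens = Math.min(capacity, tokens + (nowMs - lastMs) * refillPerMs);
            lastMs = nowMs;
            return tokens >= 1.0;
        }

        void take() { tokens -= 1.0; }
    }

    private final TokenBucket cluster;
    private final Map<String, TokenBucket> namespaces = new HashMap<>();
    private final Map<String, TokenBucket> tables = new HashMap<>();

    ThrottleSketch(long clusterRps, long nowMs) {
        this.cluster = new TokenBucket(clusterRps, nowMs);
    }

    void limitNamespace(String ns, long rps, long nowMs) {
        namespaces.put(ns, new TokenBucket(rps, nowMs));
    }

    void limitTable(String ns, String table, long rps, long nowMs) {
        tables.put(ns + ":" + table, new TokenBucket(rps, nowMs));
    }

    /** Admits a request only if every configured level has a token, then charges all of them. */
    boolean tryAcquire(String ns, String table, long nowMs) {
        TokenBucket[] levels = { cluster, namespaces.get(ns), tables.get(ns + ":" + table) };
        for (TokenBucket b : levels) {
            if (b != null && !b.hasToken(nowMs)) return false; // rejected: nothing is charged
        }
        for (TokenBucket b : levels) {
            if (b != null) b.take();
        }
        return true;
    }

    public static void main(String[] args) {
        ThrottleSketch throttle = new ThrottleSketch(100, 0L);
        throttle.limitTable("zsc", "proxy", 2, 0L); // hot table capped at 2 requests/second

        System.out.println(throttle.tryAcquire("zsc", "proxy", 0L));    // true
        System.out.println(throttle.tryAcquire("zsc", "proxy", 0L));    // true
        System.out.println(throttle.tryAcquire("zsc", "proxy", 0L));    // false, table limit hit
        System.out.println(throttle.tryAcquire("zsc", "other", 0L));    // true, other tables unaffected
        System.out.println(throttle.tryAcquire("zsc", "proxy", 1000L)); // true, bucket refilled
    }
}
```

Checking every level before charging any of them keeps a request that is rejected at the table level from consuming cluster‑level budget, so one hot table cannot starve its neighbors.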
To broaden query capabilities, JD integrates the open‑source Phoenix layer, enabling SQL‑style access to HBase. The Phoenix service has been enhanced with authentication, multi‑tenant support, performance optimizations, and a load‑balanced QueryServer fronted by Nginx.
Example Java code for accessing Phoenix:
    import java.sql.*;

    public class Demo {
        public static void main(String[] args) throws Exception {
            // Thin-client JDBC URL pointing at the Phoenix QueryServer endpoint.
            String url = "jdbc:phoenix:thin:url=http://q.sql.jd.com:2001;serialization=PROTOBUF";
            // try-with-resources closes the result set, statement, and connection
            // even if the query throws.
            try (Connection conn = DriverManager.getConnection(url, "wuyiran", "jdpassword");
                 PreparedStatement stmt = conn.prepareStatement("select count(*) from zsc.proxy");
                 ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    // count(*) returns a single numeric column.
                    System.out.println(rs.getLong(1));
                }
            }
        }
    }

In summary, JD.com’s HBase platform has evolved from a bare‑metal deployment to a mature, feature‑rich service offering multi‑active disaster recovery, fine‑grained resource isolation, and SQL access, providing a practical reference for building ultra‑large‑scale HBase clusters.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
