
HBase Cluster Deployment Architecture, Configuration Optimization, and Application Layer Usage

This article details the evolution of HBase cluster deployment from mixed‑hardware/software setups to fully independent clusters, explains hardware and software considerations, presents memory and region planning, outlines key configuration parameters, and provides Spark integration examples for batch and real‑time queries and writes.


Cluster Deployment Architecture

HBase serves real‑time OLTP workloads that demand large‑scale data storage, high concurrency, and millisecond‑level response times.

Stage 1: Mixed Hardware + Mixed Software Cluster

Cluster size: 20 nodes

Deployed services: HBase, Spark, Hive, Impala, Kafka, Zookeeper, Flume, HDFS, Yarn, etc.

Hardware: heterogeneous memory, CPU, and disk configurations (high‑end mixed with low‑end)

Hardware mixing means machines have varied specs; software mixing means the whole CDH stack is installed. Such a cluster works for offline or cold‑data storage, but for a real‑time, high‑concurrency HBase service it quickly becomes unstable as usage grows.

Stage 2: New Hardware + Mixed Software Cluster

Cluster size: 30 nodes (later expanded to 40)

Deployed services: same as Stage 1

Hardware: all nodes have high‑end, uniform memory, CPU, and disk

Although the hardware is upgraded, the mixed‑software model still causes problems: heavy offline jobs generate disk I/O above 4 GB/s, network I/O above 8 GB/s, or HDFS I/O above 5 GB/s, leading to latency spikes or RegionServer crashes. Separating HBase onto its own cluster earlier would have avoided these issues.

Stage 3: Independent Hardware & Software HBase Cluster

Cluster size: 15 RegionServers + 5 Zookeeper nodes

Deployed services: HBase and HDFS, plus Zookeeper on 5 virtual machines

Hardware: high‑end physical machines (virtual machines only for Zookeeper)

This design isolates HBase from the impact of other services. Zookeeper is recommended to run on 5 separate nodes (virtual machines are acceptable) to avoid single points of failure. The network is 10 GbE, disks are large, memory is around 128 GB (deliberately not oversized), and the CPUs have many cores to handle compaction and compression workloads.

Redis Front‑End Cache Layer

A Redis cluster (8 nodes, 800 GB total memory) caches the hottest ~20 % of HBase data, providing a stable read path even when HBase experiences issues.
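As a rough illustration of the read path, the application can try Redis first and fall back to HBase, repopulating the cache on a miss. The sketch below assumes a Jedis client, a hypothetical table "mytable" with column family "columnFamily", and a one‑hour TTL; none of these come from the original setup:

import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.client.{Connection, Get}
import org.apache.hadoop.hbase.util.Bytes
import redis.clients.jedis.Jedis

// Cache-aside read: serve from Redis when possible, otherwise read HBase and refill the cache
def cachedGet(jedis: Jedis, hbase: Connection, rowKey: String): Option[String] = {
  Option(jedis.get(rowKey)).orElse {
    val table = hbase.getTable(TableName.valueOf("mytable"))
    try {
      val result = table.get(new Get(Bytes.toBytes(rowKey)))
      Option(result.getValue(Bytes.toBytes("columnFamily"), Bytes.toBytes("col1")))
        .map(bytes => Bytes.toString(bytes))
        .map { value =>
          jedis.setex(rowKey, 3600, value) // keep the hot row cached for an hour
          value
        }
    } finally table.close()
  }
}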

HBase Configuration Optimization

After hardware planning, the next step is to tune HBase configuration to fully exploit the resources.

Region Planning

Official recommendations: region size 10‑30 GB, 20‑200 regions per RegionServer. Large regions reduce RPC overhead but increase compaction cost; small regions improve load balancing but cause frequent flushes.

Maximum data served per RegionServer ≈ 6 TB (200 regions × 30 GB) under these settings, which corresponds to roughly 18 TB of raw disk per node once 3× HDFS replication is counted, as the quick check below shows.

Key parameter: hbase.hregion.max.filesize=30G (approximately 200 regions per node)
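
A quick check of the arithmetic behind those numbers (a sketch, assuming the 3× HDFS replication factor also used in the memory planning below):

val regionsPerServer = 200
val regionSizeGB     = 30
val dataPerServerTB  = regionsPerServer * regionSizeGB / 1024.0   // ≈ 5.9 TB of HBase data per RegionServer
val rawDiskPerNodeTB = dataPerServerTB * 3                        // ≈ 17.6 TB of disk once replicas are counted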

Memstore Flush Configuration

Memstore holds writes in memory until a flush to disk occurs. Flush can be triggered at four levels:

Memstore‑level: when a single MemStore reaches hbase.hregion.memstore.flush.size (default 128 MB).

Region‑level: when total MemStore size reaches hbase.hregion.memstore.block.multiplier × hbase.hregion.memstore.flush.size (default 2 × 128 MB).

RegionServer‑level: when all MemStores together exceed hbase.regionserver.global.memstore.upperLimit × JVM heap (default 0.4 × heap).

HLog count limit: exceeding hbase.regionserver.maxlogs forces a flush.

Important parameters:

hbase.hregion.memstore.flush.size=256M
hbase.hregion.memstore.block.multiplier=3
hbase.regionserver.global.memstore.upperLimit=0.6
hbase.regionserver.global.memstore.lowerLimit=0.55
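
With a 40 GB RegionServer heap (derived in the next section), these settings put the flush and blocking thresholds at roughly the following values; this is a sketch of the arithmetic, not output from HBase:

val flushSizeMB   = 256                      // hbase.hregion.memstore.flush.size
val multiplier    = 3                        // hbase.hregion.memstore.block.multiplier
val heapGB        = 40
val regionBlockMB = flushSizeMB * multiplier // 768 MB: writes to a region are blocked above this
val globalUpperGB = heapGB * 0.6             // 24 GB: updates are blocked and flushes are forced
val globalLowerGB = heapGB * 0.55            // 22 GB: flushing continues until usage drops below this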

Memory Planning

HBase offers two cache modes:

LRUBlockCache – suitable for write‑heavy, read‑light workloads.

BucketCache – suitable for read‑heavy, write‑light workloads.

We choose BucketCache to maximize off‑heap memory usage and reduce GC impact for real‑time services.

The "Disk/JavaHeap Ratio" concept helps balance memory and disk resources:

DiskSize / JavaHeap = RegionSize / MemstoreSize × ReplicationFactor × HeapFractionForMemstore × 2

With our hardware (18 TB usable disk) and desired RegionSize = 30 GB, MemstoreSize = 256 MB, ReplicationFactor = 3, we estimate a Java heap of 40 GB (HeapFraction ≈ 0.6) and allocate the remaining memory to off‑heap BucketCache.
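
Plugging those numbers into the ratio gives the heap estimate; a quick sanity check (the 18 TB figure is the per‑node usable disk mentioned above):

// DiskSize / JavaHeap = RegionSize / MemstoreSize * ReplicationFactor * HeapFractionForMemstore * 2
val regionSizeMB   = 30 * 1024                 // 30 GB regions
val memstoreSizeMB = 256                       // flush size per MemStore
val replication    = 3
val heapFraction   = 0.6                       // fraction of heap available to MemStores
val ratio = regionSizeMB.toDouble / memstoreSizeMB * replication * heapFraction * 2   // ≈ 432

val diskGB     = 18 * 1024                     // 18 TB usable disk per node
val javaHeapGB = diskGB / ratio                // ≈ 43 GB, rounded down to a 40 GB heap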

Read Cache Configuration

In BucketCache mode, RegionServer memory is split into:

CombinedBlockCache = LRUBlockCache (metadata) + BucketCache (data)

MemStore (write cache)

Other runtime memory

Key parameters: hbase.bucketcache.size=64 * 1024M (≈ 64 GB off‑heap)

hbase.bucketcache.ioengine=offheap
hbase.bucketcache.percentage.in.combinedcache=0.9
hfile.block.cache.size=0.15
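
Putting those values together for a 40 GB heap on a 128 GB node gives roughly the following split (a sketch based on the parameters above and the 64 GB BucketCache configured here):

val heapGB          = 40
val onHeapLruGB     = heapGB * 0.15   // ≈ 6 GB LRUBlockCache for index/bloom blocks (hfile.block.cache.size)
val memstoreGB      = heapGB * 0.6    // ≈ 24 GB ceiling for all MemStores (global upperLimit)
val offHeapBucketGB = 64              // hbase.bucketcache.size: data blocks cached off-heap
// 40 GB heap + 64 GB off-heap cache ≈ 104 GB, leaving ~24 GB for the OS and other processes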

Other HBase Server Configurations

Application‑layer response tuning:

hbase.master.handler.count=256
hbase.regionserver.handler.count=256
hbase.client.retries.number=3
hbase.rpc.timeout=5000
hbase.hstore.blockingStoreFiles=100

HDFS related tuning:

dfs.datanode.handler.count=64
dfs.datanode.max.transfer.threads=12288
dfs.namenode.handler.count=256
dfs.namenode.service.handler.count=256

Configuration Summary

RegionServer JavaHeap: 40 GB

hbase.hregion.max.filesize=30G
hbase.hregion.memstore.flush.size=256M
hbase.hregion.memstore.block.multiplier=3
hbase.regionserver.global.memstore.upperLimit=0.6
hbase.regionserver.global.memstore.lowerLimit=0.55
hbase.bucketcache.size=64 * 1024M
hbase.bucketcache.ioengine=offheap
hbase.bucketcache.percentage.in.combinedcache=0.9
hfile.block.cache.size=0.15
hbase.master.handler.count=256
hbase.regionserver.handler.count=256
hbase.client.retries.number=3
hbase.rpc.timeout=5000
hbase.hstore.blockingStoreFiles=100

Application Layer Usage Optimization

Query Scenarios

Batch Query – Spark DB Connector simplifies bulk reads from HBase:

val rdd = sc.fromHBase[(String, String, String)]("mytable")
      .select("col1", "col2")
      .inColumnFamily("columnFamily")
      .withStartRow("startRow")
      .withEndRow("endRow")

That's all it takes for a bulk read.

Real‑time Query – In Spark Streaming, reuse a single HBase connection per job using a lazy‑loaded singleton:

// Imports from the standard HBase client API
import org.apache.hadoop.hbase.{HBaseConfiguration, HConstants}
import org.apache.hadoop.hbase.client.ConnectionFactory

// Define a serializable sink with a lazily created connection
class HBaseSink(zkHost: String, confFile: String) extends Serializable {
  // Created once per executor JVM on first use, so nothing
  // non-serializable is shipped from the driver
  lazy val connection = {
    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(HConstants.ZOOKEEPER_QUORUM, zkHost)
    hbaseConf.addResource(confFile)
    val conn = ConnectionFactory.createConnection(hbaseConf)
    sys.addShutdownHook { conn.close() }  // close the connection when the JVM exits
    conn
  }
}

Instantiate the sink on the driver, broadcast it, and let each executor materialize the connection lazily, which avoids serialization issues.
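
A minimal usage sketch of that pattern; the StreamingContext (ssc), the stream, the table name, and the assumption that each record is a row key are illustrative, not from the original code:

import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.client.Get
import org.apache.hadoop.hbase.util.Bytes

// Build the sink once on the driver and broadcast it to the executors
val hbaseSink = ssc.sparkContext.broadcast(new HBaseSink("zk1,zk2,zk3", "hbase-site.xml"))

stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // Reuses the lazily created, per-executor connection
    val table = hbaseSink.value.connection.getTable(TableName.valueOf("mytable"))
    records.foreach { rowKey =>
      val result = table.get(new Get(Bytes.toBytes(rowKey)))
      // ... use result ...
    }
    table.close()
  }
}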

Write Scenarios

Batch Write – Use the same Spark DB Connector for bulk inserts:

rdd.toHBase("mytable")
      .insert("col1", "col2")
      .inColumnFamily("columnFamily")
      .save()

For massive write volumes, bulkload is recommended because it bypasses the normal write path, reducing GC pressure and avoiding RegionServer crashes.
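
A minimal bulkload sketch along those lines; the table name, column family/qualifier, input RDD (rows: RDD[(String, String)] of row key and value), and staging path are assumptions, and the exact helper classes vary slightly between HBase versions:

import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.{HBaseConfiguration, KeyValue, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.{HFileOutputFormat2, LoadIncrementalHFiles}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job

val hbaseConf     = HBaseConfiguration.create()
val conn          = ConnectionFactory.createConnection(hbaseConf)
val tableName     = TableName.valueOf("mytable")
val table         = conn.getTable(tableName)
val regionLocator = conn.getRegionLocator(tableName)

// Partition the HFiles so they line up with the table's current regions
val job = Job.getInstance(hbaseConf)
HFileOutputFormat2.configureIncrementalLoad(job, table, regionLocator)

// HFiles must be written in sorted row-key order
val hfileRdd = rows.sortByKey().map { case (rowKey, value) =>
  val kv = new KeyValue(Bytes.toBytes(rowKey), Bytes.toBytes("columnFamily"),
    Bytes.toBytes("col1"), Bytes.toBytes(value))
  (new ImmutableBytesWritable(Bytes.toBytes(rowKey)), kv)
}

// Write HFiles to a staging directory, then hand them over to the RegionServers
val stagingDir = "/tmp/hbase-bulkload"
hfileRdd.saveAsNewAPIHadoopFile(stagingDir, classOf[ImmutableBytesWritable],
  classOf[KeyValue], classOf[HFileOutputFormat2], job.getConfiguration)
new LoadIncrementalHFiles(hbaseConf).doBulkLoad(new Path(stagingDir), conn.getAdmin, table, regionLocator)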

Real‑time Write – Reuse the lazy HBaseSink connection as described for queries.

hbase‑env.sh Advanced Client Settings

-XX:+UseG1GC
-XX:InitiatingHeapOccupancyPercent=65
-XX:-ResizePLAB
-XX:MaxGCPauseMillis=90
-XX:+UnlockDiagnosticVMOptions
-XX:+G1SummarizeConcMark
-XX:+ParallelRefProcEnabled
-XX:G1HeapRegionSize=32m
-XX:G1HeapWastePercent=20
-XX:ConcGCThreads=4
-XX:ParallelGCThreads=16
-XX:MaxTenuringThreshold=1
-XX:G1MixedGCCountTarget=64
-XX:+UnlockExperimentalVMOptions
-XX:G1NewSizePercent=2
-XX:G1OldCSetRegionThresholdPercent=5

hbase‑site.xml RegionServer Advanced Settings (Safety Valve)

<property>
    <name>hbase.regionserver.wal.codec</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
    <name>hbase.region.server.rpc.scheduler.factory.class</name>
    <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
    <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
<property>
    <name>hbase.rpc.controllerfactory.class</name>
    <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
    <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
<property>
    <name>hbase.regionserver.thread.compaction.large</name>
    <value>5</value>
</property>
<property>
    <name>hbase.regionserver.region.split.policy</name>
    <value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
</property>