How HBase Locates Data and Manages Writes: Regions, Meta Table, and ZooKeeper

This article explains how HBase finds the correct regionserver for a given row key using the hbase:meta table, whose location is recorded in ZooKeeper, and describes the write path involving MemStore, HLog, StoreFile creation, and subsequent maintenance tasks.

Java High-Performance Architecture

Read Data

HBase tables are split into regions, each covering a contiguous range of row keys, and these regions are distributed across the regionservers.

To retrieve a user record with row key row0001, the client must first locate the region containing that row.

How does HBase pinpoint the exact region on a specific regionserver?

HBase maintains an internal hbase:meta table that records detailed information for every region of every table, such as the start key, end key, and the address of the server hosting the region.

The hbase:meta table acts like a directory, enabling fast location of the actual data.
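
The directory lookup can be illustrated with a sorted map keyed by region start key. The following is a minimal sketch, not the HBase client API: the class MetaDirectorySketch, its methods, and the server addresses are hypothetical, and row keys are compared as Java strings rather than as raw bytes.

```java
import java.util.Map;
import java.util.TreeMap;

// Simplified in-memory stand-in for hbase:meta; this is not the HBase client API.
class MetaDirectorySketch {

    // One directory entry: a region's key range and the server hosting it.
    static final class RegionEntry {
        final String startKey;  // inclusive; "" marks the table's first region
        final String endKey;    // exclusive; "" marks the table's last region
        final String server;

        RegionEntry(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }
    }

    // Entries sorted by start key, so a floor lookup finds the candidate region.
    private final TreeMap<String, RegionEntry> byStartKey = new TreeMap<>();

    void addRegion(String startKey, String endKey, String server) {
        byStartKey.put(startKey, new RegionEntry(startKey, endKey, server));
    }

    // Returns the region whose [startKey, endKey) range contains rowKey, or null.
    RegionEntry locate(String rowKey) {
        Map.Entry<String, RegionEntry> candidate = byStartKey.floorEntry(rowKey);
        if (candidate == null) {
            return null;
        }
        RegionEntry region = candidate.getValue();
        boolean inRange = region.endKey.isEmpty() || rowKey.compareTo(region.endKey) < 0;
        return inRange ? region : null;
    }

    public static void main(String[] args) {
        MetaDirectorySketch meta = new MetaDirectorySketch();
        meta.addRegion("", "row0500", "rs1.example.com:16020");
        meta.addRegion("row0500", "", "rs2.example.com:16020");
        System.out.println(meta.locate("row0001").server); // prints rs1.example.com:16020
    }
}
```

Keying the map by start key means one floor lookup is enough to find the only region that could contain the row; the end key check then confirms it.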

The location of the hbase:meta table is recorded in ZooKeeper, so a client first asks ZooKeeper which regionserver hosts hbase:meta, queries that table to find which regionserver and which region hold the target data, and then reads from that region.

Because this lookup path can be long, the client caches the retrieved location information for quicker subsequent reads.
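
The caching idea can be sketched as follows. This is an illustrative model only: RegionLocationCacheSketch and its slowLookup function (standing in for the round trip through ZooKeeper and hbase:meta) are hypothetical names, and the invalidate method is an assumption about how a stale entry might be dropped after a region moves.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical client-side cache of region locations; not the HBase client API.
class RegionLocationCacheSketch {

    static final class Location {
        final String startKey; // inclusive
        final String endKey;   // exclusive; "" means no upper bound
        final String server;

        Location(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        boolean contains(String rowKey) {
            return rowKey.compareTo(startKey) >= 0
                    && (endKey.isEmpty() || rowKey.compareTo(endKey) < 0);
        }
    }

    private final TreeMap<String, Location> cache = new TreeMap<>();
    // Stand-in for the slow path: ZooKeeper, then hbase:meta.
    private final Function<String, Location> slowLookup;
    int slowLookups = 0; // counts how often the slow path was taken

    RegionLocationCacheSketch(Function<String, Location> slowLookup) {
        this.slowLookup = slowLookup;
    }

    // Serves from the cache when a cached region covers rowKey, else looks it up.
    Location locate(String rowKey) {
        Map.Entry<String, Location> hit = cache.floorEntry(rowKey);
        if (hit != null && hit.getValue().contains(rowKey)) {
            return hit.getValue();
        }
        slowLookups++;
        Location fresh = slowLookup.apply(rowKey);
        cache.put(fresh.startKey, fresh);
        return fresh;
    }

    // Drops the cached region covering rowKey, e.g. after the region has moved.
    void invalidate(String rowKey) {
        Map.Entry<String, Location> hit = cache.floorEntry(rowKey);
        if (hit != null && hit.getValue().contains(rowKey)) {
            cache.remove(hit.getKey());
        }
    }
}
```

Because entries are cached per region rather than per row, a single lookup for row0001 also serves later reads of any other row in the same region.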

Write Data

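Write requests are routed to the regionserver that hosts the target region. First, recall the structure of a regionserver: it hosts multiple regions, each region keeps one Store per column family, each Store consists of one MemStore plus zero or more StoreFiles, and the regionserver maintains an HLog for the regions it hosts.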

From the client’s perspective, a write is straightforward: after the write request reaches the regionserver, the modification is recorded in the HLog and applied to the MemStore. Once both steps succeed, the client is notified of completion.

MemStore is an in-memory cache that holds recent updates. HLog is a write-ahead log file that records all update operations, so that updates still held only in memory can be recovered if the regionserver fails.
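
The two steps of a write can be modelled in a few lines. This is a minimal sketch under stated assumptions: WritePathSketch is a hypothetical class, the HLog is represented by an in-memory list rather than a file, and logging before caching follows the usual write-ahead convention rather than anything specified above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of a regionserver's write path; not HBase code.
class WritePathSketch {
    // Stand-in for the HLog file: an append-only list of update records.
    final List<String> hlog = new ArrayList<>();
    // Stand-in for the MemStore: recent updates, kept sorted by row key.
    final TreeMap<String, String> memStore = new TreeMap<>();

    // Handles one write; returns true once the update is both logged and cached.
    boolean put(String rowKey, String value) {
        hlog.add("PUT " + rowKey + " " + value); // record the update in the log
        memStore.put(rowKey, value);             // make it visible in memory
        return true;                             // acknowledgement to the client
    }

    public static void main(String[] args) {
        WritePathSketch server = new WritePathSketch();
        server.put("row0001", "alice");
        server.put("row0001", "bob");
        // The log keeps every update; the MemStore keeps the latest value per row.
        System.out.println(server.hlog.size() + " log entries, "
                + server.memStore.size() + " cached row");
    }
}
```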

The system then flushes MemStore contents to disk, either periodically or when the MemStore reaches its configured size, creating a new StoreFile, clearing the cache, and marking the corresponding entries in HLog as persisted.
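
A flush can be sketched as turning the current MemStore into an immutable sorted snapshot and advancing a marker in the log. The class MemStoreFlushSketch and its sequence-id fields are hypothetical names used for illustration, and StoreFiles are modelled as in-memory sorted maps rather than files on disk.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of a MemStore flush; not HBase code.
class MemStoreFlushSketch {
    TreeMap<String, String> memStore = new TreeMap<>();
    // Each flush appends one immutable, sorted "StoreFile".
    final List<SortedMap<String, String>> storeFiles = new ArrayList<>();
    long lastLoggedSeqId = 0;  // id of the newest HLog entry
    long lastFlushedSeqId = 0; // HLog entries up to this id are persisted

    void put(String rowKey, String value) {
        lastLoggedSeqId++;           // stands in for appending an HLog entry
        memStore.put(rowKey, value);
    }

    void flush() {
        if (memStore.isEmpty()) {
            return; // nothing to persist
        }
        storeFiles.add(Collections.unmodifiableSortedMap(memStore)); // new StoreFile
        memStore = new TreeMap<>();                                  // clear the cache
        lastFlushedSeqId = lastLoggedSeqId;                          // mark log entries persisted
    }
}
```

Any log entry with an id at or below lastFlushedSeqId is no longer needed for recovery in this model, which is what makes later log cleanup safe.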

This makes the data durable, but write operations introduce follow‑up issues such as growing HLog files, increasing numbers of StoreFiles, and expanding region sizes, prompting additional maintenance work:

The system regularly cleans HLog files, removing records that have already been flushed to StoreFiles.

When the number of StoreFiles exceeds a threshold, a compaction merges them into a larger file; if the merged file becomes too large, a split is triggered.

When a region reaches its size limit, it is split into two new regions, and the HMaster assigns them to suitable regionservers.

After region changes, the system updates the hbase:meta table accordingly.
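
Compaction and splitting from the maintenance tasks above can be sketched together. This is a simplified model, not HBase code: RegionMaintenanceSketch and the COMPACTION_THRESHOLD value are hypothetical, StoreFiles are in-memory sorted maps, the newest file simply wins on duplicate row keys (real HBase resolves versions by timestamp), and the split point is assumed to be the middle row key. Updating hbase:meta after the split is not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of StoreFile compaction and region splitting; not HBase code.
class RegionMaintenanceSketch {
    static final int COMPACTION_THRESHOLD = 3; // hypothetical value for illustration

    static boolean needsCompaction(List<SortedMap<String, String>> storeFiles) {
        return storeFiles.size() > COMPACTION_THRESHOLD;
    }

    // Merges StoreFiles, given oldest first, into one sorted file.
    static SortedMap<String, String> compact(List<SortedMap<String, String>> oldestFirst) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (SortedMap<String, String> file : oldestFirst) {
            merged.putAll(file); // values from newer files overwrite older ones
        }
        return merged;
    }

    // Splits a region's rows at the middle key into a lower and an upper daughter.
    static List<SortedMap<String, String>> split(SortedMap<String, String> regionRows) {
        if (regionRows.size() < 2) {
            throw new IllegalArgumentException("need at least two rows to split");
        }
        List<String> keys = new ArrayList<>(regionRows.keySet());
        String splitKey = keys.get(keys.size() / 2);
        SortedMap<String, String> lower = new TreeMap<>(regionRows.headMap(splitKey)); // keys < splitKey
        SortedMap<String, String> upper = new TreeMap<>(regionRows.tailMap(splitKey)); // keys >= splitKey
        List<SortedMap<String, String>> daughters = new ArrayList<>();
        daughters.add(lower);
        daughters.add(upper);
        return daughters;
    }
}
```

In this model the split key becomes the end key of the lower daughter and the start key of the upper one, which is the information a directory such as hbase:meta would need to record for the new regions.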
