
Analysis of HBase Write-Ahead Log (WAL) Mechanism and Source Code Call Chain

The article explains HBase’s write‑ahead‑log architecture, detailing how client put/delete requests travel through RPC to the RegionServer, are processed by MultiRowMutationService, written to the WAL via FSHLog.append and sync, and finally stored in MemStore, while describing durability options and the underlying source‑code call chain.


HBase is a highly reliable, high‑performance, column‑oriented, scalable distributed storage system. Using HBase technology, large‑scale structured storage clusters can be built on inexpensive PC servers.

This document explains the basic principle of HBase's WAL and, from a source‑code perspective, analyzes how a "write" request reaches the WAL and what the WAL does. Although the WAL implementation differs significantly across HBase versions, the underlying principle is much the same. The analysis is based on the version currently deployed (HBase 1.1.3).

Basic Principle

The basic principles of HBase WAL are described clearly and in detail in the *HBase Definitive Guide* and various online tutorials; here we only give a brief description.

HBase is built on an LSM‑tree storage model. It uses log files and in‑memory structures to convert random writes into sequential writes, thereby guaranteeing a stable data‑insertion rate. The log file referred to here is the WAL file, which is used to replay data that had not yet been persisted when a server crashed.

WAL (Write‑Ahead Log) is the log a RegionServer uses to record the details of data insertions and deletions. The general process is illustrated in the diagram below: the client starts an operation that modifies data; each modification is wrapped in a KeyValue object and sent via RPC to the HRegionServer that hosts the matching region. Once the KeyValue arrives, it is handed to the corresponding HRegion instance. The data is first written to the WAL and then placed into the MemStore that backs the actual storage files. If the MemStore becomes full, its contents are flushed to disk.
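As a mental model, this write path can be sketched in a few lines of plain Java. The MiniRegion class below is a toy invented for illustration, not HBase code: writes append to a log first, then land in a sorted in‑memory map, which is flushed to a "file" when it fills up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy sketch of the LSM write path: WAL first, then MemStore, flush when full. */
class MiniRegion {
    private final List<String> wal = new ArrayList<>();                     // stands in for the on-disk log
    private final NavigableMap<String, String> memStore = new TreeMap<>();  // sorted in-memory store
    private final List<NavigableMap<String, String>> flushedFiles = new ArrayList<>(); // "HFiles"
    private final int flushThreshold;

    MiniRegion(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String rowKey, String value) {
        wal.add(rowKey + "=" + value);   // 1. durability: sequential append to the log
        memStore.put(rowKey, value);     // 2. visibility: write into the sorted in-memory map
        if (memStore.size() >= flushThreshold) {
            flush();
        }
    }

    private void flush() {
        flushedFiles.add(new TreeMap<>(memStore)); // persist a sorted snapshot
        memStore.clear();
        wal.clear();                               // flushed entries no longer need the log
    }

    int walSize() { return wal.size(); }
    int fileCount() { return flushedFiles.size(); }
}
```

If the process crashes before a flush, the entries still in the log are what a recovery pass would replay; that is the role the WAL file plays in HBase.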

WAL Call Chain Source Code Analysis

This section analyzes the write process of HBase from the source‑code point of view, as briefly described above.

The basic call flow is shown in the following diagram:

From the sequence diagram we can roughly see:

1. The client first packages put/delete API calls into a List, then uses the protobuf protocol to send them via RPC to the appropriate HRegionServer. The server invokes execRegionServerService() to parse the protobuf binary packet, finds the corresponding service by serviceName, and calls callMethod to execute it.

2. The write operations use the MultiRowMutationService. Inside this service, the mutateRows() method processes the List. The actual implementation class is MultiRowMutationEndpoint, which provides row‑level transaction support. Its main responsibilities are shown in the diagram below.

The mutateRows() method locates the target Region and calls HRegion.mutateRowsWithLocks to perform the actual write.

3. Inside HRegion.mutateRowsWithLocks, the method checks whether a RowProcessor already exists; if not, it creates one and calls processRowsWithLocks(). This method is the core of the write operation: it handles WAL writing, WAL flushing, and MemStore writing. The process includes exception handling and consists of 14 steps.

The method signature is shown below:

The processor implementation class is MultiRowMutationProcessor.

Although processRowsWithLocks contains many steps, the most critical ones are highlighted in the following diagram:

At this point, HRegion uses a two‑phase locking scheme: it first acquires all row‑level locks involved in the write, and releases them only after the mutation completes.
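The idea behind two‑phase row locking can be illustrated with a stdlib‑only sketch. The class and method names here (RowLockManager and so on) are invented for illustration and merely echo HBase's mutateRowsWithLocks; this is not the real implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Toy sketch of batch row locking: acquire every row lock up front, mutate, release. */
class RowLockManager {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String row) {
        return locks.computeIfAbsent(row, r -> new ReentrantLock());
    }

    /**
     * Phase 1: acquire all involved row locks; only then run the mutation.
     * Phase 2: release everything. Iterating a SortedSet gives every caller
     * the same acquisition order, which avoids deadlock between concurrent batches.
     */
    void mutateRowsWithLocks(SortedSet<String> rows, Runnable mutation) {
        List<ReentrantLock> acquired = new ArrayList<>();
        try {
            for (String row : rows) {
                ReentrantLock lock = lockFor(row);
                lock.lock();
                acquired.add(lock);
            }
            mutation.run(); // every touched row is protected here
        } finally {
            for (ReentrantLock lock : acquired) {
                lock.unlock();
            }
        }
    }
}
```

Holding all the row locks for the duration of the mutation is what gives the batch its row‑level transactional behavior: no concurrent writer can observe a half‑applied batch.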

The List is placed into the internal structure, but the actual write to the MemStore occurs only after the sync() method flushes the WAL (the WALEdit) to disk. After the sync completes, the worker thread writes the data to the MemStore, completing one "write" operation.

In this step, HRegion appends the prepared WALEdit to the log via FSHLog.append . Because the log file is cached in memory, a subsequent sync call is required to persist the data to disk. The WALEdit is first placed into an LMAX Disruptor RingBuffer—a thread‑safe queue that coordinates multiple producers (the many concurrent append calls) with a single consumer that invokes sync to flush the buffer to disk, guaranteeing a globally ordered WAL.
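A drastically simplified stand‑in for this producer/consumer arrangement is sketched below, using a plain BlockingQueue in place of the Disruptor RingBuffer (which is lock‑free and far faster). All names here, such as WalPipeline, are hypothetical; the point is only the shape: many appending producers, one syncing consumer, one global order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Toy stand-in for the Disruptor-backed WAL pipeline: many producer threads
 * append edits; a single consumer drains and "syncs" them in arrival order.
 */
class WalPipeline {
    private static final String STOP = "__STOP__";          // shutdown sentinel
    private final BlockingQueue<String> ring = new ArrayBlockingQueue<>(1024);
    final List<String> synced = new ArrayList<>();          // what reached "disk", in order
    private final Thread syncer;

    WalPipeline() {
        syncer = new Thread(() -> {
            try {
                while (true) {
                    String edit = ring.take();              // single consumer => one global order
                    if (edit.equals(STOP)) {
                        return;
                    }
                    synced.add(edit);                       // models the sync flushing this edit
                }
            } catch (InterruptedException ignored) {
                // shutdown path in this toy example
            }
        });
        syncer.start();
    }

    /** Called concurrently by worker threads, like FSHLog.append. */
    void append(String edit) throws InterruptedException {
        ring.put(edit);
    }

    void shutdown() throws InterruptedException {
        ring.put(STOP);
        syncer.join();
    }
}
```

Because exactly one thread consumes the queue, edits are flushed in the order they entered it, which is how the real pipeline guarantees a globally ordered WAL even under heavy concurrent appends.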

For details on the LMAX Disruptor RingBuffer, see the article at https://github.com/LMAX-Exchange/disruptor/wiki/Introduction.

In this step, syncOrDefer is invoked. Except for the meta region, whose writes are always synced, syncOrDefer decides whether to call FSHLog.sync based on the durability level set by the client.

HBase allows the WAL durability level to control whether the WAL mechanism is enabled and how the HLog is persisted.

Clients can set the durability level, for example:

put.setDurability(Durability.SYNC_WAL);

In version 1.1.3, the WAL durability levels are:

- USE_DEFAULT: if the user does not specify a durability, HBase defaults to SYNC_WAL.
- SKIP_WAL: write only to the cache, not to the HLog. Improves performance but risks data loss.
- ASYNC_WAL: write data to the HLog asynchronously.
- SYNC_WAL: write data to the log file synchronously; the data may only reach the file system cache, not the physical disk.
- FSYNC_WAL: write data to the log file synchronously and force a flush to disk. This is the strictest level, guaranteeing no data loss at the cost of performance.
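The decision can be sketched as a simple dispatch. The enum values below mirror HBase's Durability constants, but the SyncPolicy class and its logic are a simplified illustration, not the actual syncOrDefer implementation.

```java
/** Mirrors the names of HBase's Durability enum values (illustration only). */
enum Durability {
    USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL
}

/** Toy sketch of the syncOrDefer decision: map a durability level to a sync action. */
class SyncPolicy {
    /** Resolves USE_DEFAULT to SYNC_WAL, then returns the effective action for a write. */
    static String actionFor(Durability d) {
        Durability effective = (d == Durability.USE_DEFAULT) ? Durability.SYNC_WAL : d;
        switch (effective) {
            case SKIP_WAL:
                return "no WAL write";
            case ASYNC_WAL:
                return "append, sync deferred";
            case SYNC_WAL:
                return "append + blocking sync";
            case FSYNC_WAL:
                return "append + blocking sync + force to disk";
            default:
                throw new AssertionError("unreachable");
        }
    }
}
```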

Both SYNC_WAL and FSYNC_WAL ultimately invoke FSHLog.sync(), a blocking call that waits until the data is truly flushed to disk before the worker thread returns and writes to MemStore.
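On a local file system, the difference between "reaching the file system cache" and "forced to disk" can be demonstrated with standard Java I/O. HBase itself writes through HDFS, where the analogous calls are hflush and hsync; the class below is only an illustration of the underlying distinction.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/**
 * Illustrates the SYNC_WAL vs FSYNC_WAL distinction on a local file:
 * flush() hands the bytes to the OS (they may sit in the page cache),
 * while getFD().sync() blocks until the kernel reports them on disk.
 */
class WalFileDemo {
    static void writeEdit(File walFile, String edit, boolean forceToDisk) throws IOException {
        try (FileOutputStream out = new FileOutputStream(walFile, true)) { // append mode
            out.write((edit + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();            // SYNC_WAL-like: data may only reach the OS cache
            if (forceToDisk) {
                out.getFD().sync(); // FSYNC_WAL-like: blocks until physically persisted
            }
        }
    }
}
```

The fsync variant is strictly slower, which is exactly the durability/performance trade‑off the levels above expose to the client.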

Conclusion

HBase is a highly reliable, high‑performance, column‑oriented, scalable distributed storage system. This document first introduced the basic write principle of HBase and then, from a source‑code perspective, provided a straightforward analysis of the write process within a RegionServer. The discussion lays a foundation for deeper study of HBase's write path.

Tags: Java · Big Data · HBase · distributed-storage · WAL · Write-Ahead Log
Written by Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.