Understanding HBase Compaction: Principles, Process, Throttling Strategies, and Optimization Cases
Understanding HBase compaction means knowing its Minor and Major compaction types, their trigger mechanisms, file‑selection policies such as RatioBased and Exploring, throughput throttling driven by the store file count, and the key parameters whose tuning avoids latency spikes, as illustrated by real‑world production cases.
This article provides a comprehensive overview of HBase Compaction, covering its principles, classification, significance, trigger mechanisms, execution flow, file‑selection policies, throttling strategies, and practical tuning cases observed in production.
1. Compaction Overview
HBase stores data using an LSM‑Tree architecture. Writes first go to the Write‑Ahead‑Log (WAL) and an in‑memory MemStore. When certain thresholds are met, a Flush writes the MemStore to disk as an HFile. Over time, many HFiles accumulate, increasing read I/O. Compaction merges smaller HFiles into larger ones to reduce file count and improve read latency.
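To make the read‑amplification point concrete, here is a minimal toy model (not HBase code; the class name `LsmSketch` and its methods are invented for illustration). Each flush turns the MemStore into one immutable sorted "HFile", a point read must probe every file newest‑first, and compaction collapses the files back to one:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy LSM store: read cost grows with the file count until compaction merges files.
public class LsmSketch {
    private final TreeMap<String, String> memstore = new TreeMap<>();
    private final List<TreeMap<String, String>> hfiles = new ArrayList<>();

    public void put(String key, String value) { memstore.put(key, value); }

    public void flush() {                      // MemStore -> new immutable "HFile"
        hfiles.add(new TreeMap<>(memstore));
        memstore.clear();
    }

    public int fileCount() { return hfiles.size(); }

    public int probesForGet(String key) {      // probe files newest-first
        int probes = 0;
        for (int i = hfiles.size() - 1; i >= 0; i--) {
            probes++;
            if (hfiles.get(i).containsKey(key)) return probes;
        }
        return probes;
    }

    public void compactAll() {                 // merge every file into one
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> f : hfiles) merged.putAll(f); // newest wins
        hfiles.clear();
        hfiles.add(merged);
    }
}
```

After three flushes, reading the oldest key probes all three files; after a compaction, one probe suffices.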
1.1 Types of Compaction
Minor Compaction: merges a subset of adjacent small HFiles into a larger HFile.
Major Compaction: merges all HFiles of a Store into a single HFile, also cleaning up TTL‑expired, deleted, and over‑versioned data.
Because Major Compactions are resource‑intensive, automatic triggering is often disabled in large‑scale workloads, and they are instead run manually during off‑peak periods.
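The cleanup a Major Compaction performs on one cell's versions can be sketched as follows (a simplified illustration, not HBase code; the class and method names are invented). Given the timestamps of all versions sorted newest‑first, it keeps at most `maxVersions` entries and drops anything past the TTL:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of Major Compaction cleanup for a single cell.
public class MajorCompactionCleanup {
    // timestampsNewestFirst: version timestamps, newest first; returns survivors
    public static List<Long> surviving(List<Long> timestampsNewestFirst,
                                       int maxVersions, long ttlMs, long nowMs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsNewestFirst) {
            if (kept.size() >= maxVersions) break;   // over-versioned: drop the rest
            if (nowMs - ts > ttlMs) break;           // TTL-expired (older ones are too)
            kept.add(ts);
        }
        return kept;
    }
}
```

With a TTL of 10 s and `maxVersions = 2`, a cell with versions at 1 s, 5 s, and 50 s of age keeps only the two newest.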
1.2 Significance of Compaction
Reduces the number of files, stabilizing random‑read latency.
Improves data locality by pulling remote blocks to the local DataNode.
Deletes expired and deleted data, decreasing storage consumption.
1.3 Trigger Timing
Compaction can be triggered in three main ways:
Periodic background thread: The CompactionChecker runs every hbase.server.thread.wakefrequency * hbase.server.compactchecker.interval.multiplier (with the defaults of 10 s and 1000, roughly every 2.8 hours) to evaluate whether conditions for a Minor or Major Compaction are met.
MemStore Flush : After a Flush creates new HFiles, HBase checks if the file count exceeds configured thresholds and may trigger a Minor or Major Compaction.
Manual : Administrators can invoke compact or major_compact via the HBase API, shell, or Master UI.
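The interval arithmetic for the periodic trigger is worth spelling out (a trivial sketch; the class name is invented). With the defaults quoted in this article, 10 s × 1000 = 10,000 s, i.e. roughly 2.8 hours between background checks:

```java
// Computes the periodic compaction-check interval from the two settings.
public class CompactionCheckInterval {
    // hbase.server.thread.wakefrequency * hbase.server.compactchecker.interval.multiplier
    public static long intervalMillis(long wakeFrequencyMs, int multiplier) {
        return wakeFrequencyMs * multiplier;
    }

    public static double intervalHours(long wakeFrequencyMs, int multiplier) {
        return intervalMillis(wakeFrequencyMs, multiplier) / 3_600_000.0;
    }
}
```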
2. Compaction Process
The overall flow includes:
RegionServer starts a Compaction check thread.
When triggered, a dedicated thread selects candidate HFiles based on size, count, and custom policies.
The selected files are read sequentially, their KeyValues are sorted, and written to a temporary file.
The temporary file is moved to the Store’s data directory, a WAL entry is written, and the original files are archived.
Each step is designed to be idempotent and fault‑tolerant.
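The merge step above deserves a closer look: because every selected HFile is already sorted, compaction can stream them through a k‑way merge and emit one sorted output without holding whole files in memory. A minimal sketch (files modeled as sorted string lists; not HBase's actual reader/writer code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// K-way merge of already-sorted "files" into one sorted output.
public class KWayMerge {
    public static List<String> merge(List<List<String>> sortedFiles) {
        // heap entries: {fileIndex, offsetInFile}, ordered by the current key
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (a, b) -> sortedFiles.get(a[0]).get(a[1])
                      .compareTo(sortedFiles.get(b[0]).get(b[1])));
        for (int i = 0; i < sortedFiles.size(); i++) {
            if (!sortedFiles.get(i).isEmpty()) heap.add(new int[]{i, 0});
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(sortedFiles.get(top[0]).get(top[1]));
            if (top[1] + 1 < sortedFiles.get(top[0]).size()) {
                heap.add(new int[]{top[0], top[1] + 1});  // advance within that file
            }
        }
        return out;
    }
}
```

Each key is pushed and popped once, so merging n total KeyValues from k files costs O(n log k).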
2.1 Starting the Compaction Scheduler
// Compaction thread
this.compactSplitThread = new CompactSplitThread(this);
// Background thread to check for compactions; needed if region has not gotten updates in a while.
this.compactionChecker = new CompactionChecker(this, this.threadWakeFrequency, this);
if (this.compactionChecker != null) choreService.scheduleChore(compactionChecker);

2.2 Triggering Compaction
Three trigger mechanisms are detailed with example configuration values (e.g., hbase.server.thread.wakefrequency = 10s, hbase.server.compactchecker.interval.multiplier = 1000).
2.3 File‑Selection Policies
HBase provides several policies to choose which HFiles to merge:
RatioBasedCompactionPolicy : Scans from the oldest file, skipping each file whose size exceeds ratio * sum(newerFiles); selection starts at the first file that satisfies the ratio, and the scan also stops once the remaining candidate count would drop below hbase.hstore.compaction.min .
ExploringCompactionPolicy : Enumerates all possible contiguous sub‑ranges, selecting the one with the most files (or smallest total size) that satisfies min/max file count and size constraints.
StripeCompactionPolicy : Divides files into key‑range stripes, similar to LevelDB’s level‑based compaction, allowing partial merges per stripe.
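Before the real code, the RatioBased rule can be tried on made‑up file sizes (a self‑contained sketch; minCompactSize and the maxFilesToCompact window are omitted for brevity, and the class name is invented):

```java
// Walks the RatioBased rule over file sizes listed oldest-first:
// a file is skipped while it is bigger than ratio * (sum of all newer files);
// returns the index where the compaction selection would start.
public class RatioRuleSketch {
    public static int selectionStart(long[] sizesOldestFirst, double ratio) {
        int start = 0;
        while (start < sizesOldestFirst.length - 1) {
            long newerSum = 0;
            for (int i = start + 1; i < sizesOldestFirst.length; i++) {
                newerSum += sizesOldestFirst[i];
            }
            if (sizesOldestFirst[start] <= (long) (newerSum * ratio)) break;
            start++;
        }
        return start;
    }
}
```

With the default ratio of 1.2 and sizes {1000, 100, 60, 30}, the 1000‑unit file is skipped (1000 > 1.2 × 190) and the selection starts at the 100‑unit file (100 ≤ 1.2 × 90), so only the three smaller files are merged.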
Example code for the RatioBased policy:
/**
* @param candidates pre‑filtrate
* @return filtered subset
*/
ArrayList<StoreFile> applyCompactionPolicy(ArrayList<StoreFile> candidates,
boolean mayUseOffPeak, boolean mayBeStuck) throws IOException {
if (candidates.isEmpty()) return candidates;
int start = 0;
double ratio = comConf.getCompactionRatio();
if (mayUseOffPeak) ratio = comConf.getCompactionRatioOffPeak();
int countOfFiles = candidates.size();
long[] fileSizes = new long[countOfFiles];
long[] sumSize = new long[countOfFiles];
for (int i = countOfFiles - 1; i >= 0; --i) {
StoreFile file = candidates.get(i);
fileSizes[i] = file.getReader().length();
int tooFar = i + comConf.getMaxFilesToCompact() - 1;
sumSize[i] = fileSizes[i] + ((i + 1 < countOfFiles) ? sumSize[i + 1] : 0)
- ((tooFar < countOfFiles) ? fileSizes[tooFar] : 0);
}
while (countOfFiles - start >= comConf.getMinFilesToCompact() &&
fileSizes[start] > Math.max(comConf.getMinCompactSize(),
(long) (sumSize[start + 1] * ratio))) {
++start;
}
candidates.subList(0, start).clear();
return candidates;
}

Exploring policy (simplified excerpt):
public List<StoreFile> applyCompactionPolicy(final List<StoreFile> candidates,
boolean mightBeStuck, boolean mayUseOffPeak, int minFiles, int maxFiles) {
List<StoreFile> bestSelection = new ArrayList<>(0);
long bestSize = 0;
for (int start = 0; start < candidates.size(); start++) {
for (int currentEnd = start + minFiles - 1; currentEnd < candidates.size(); currentEnd++) {
List<StoreFile> potential = candidates.subList(start, currentEnd + 1);
if (potential.size() < minFiles || potential.size() > maxFiles) continue;
long size = getTotalStoreSize(potential);
if (size > comConf.getMaxCompactSize(mayUseOffPeak)) continue;
if (size >= comConf.getMinCompactSize() && !filesInRatio(potential, currentRatio)) continue;
if (isBetterSelection(bestSelection, bestSize, potential, size, mightBeStuck)) {
bestSelection = potential;
bestSize = size;
}
}
}
return new ArrayList<>(bestSelection);
}

2.4 Throttling Compaction
To avoid overwhelming the cluster, HBase limits both compaction speed and bandwidth. The effective throughput is calculated as:
throughput = lowerBound + (upperBound - lowerBound) * ratio;
// lowerBound = hbase.hstore.compaction.throughput.lower.bound (default 10 MB/s)
// upperBound = hbase.hstore.compaction.throughput.higher.bound (default 20 MB/s)
// ratio ∈ [0,1] derived from the current number of HFiles.

If the number of HFiles exceeds blockingFileCount, writes are blocked and throttling is disabled.
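The formula can be exercised end to end with a small sketch. Note the pressure derivation below is an assumption modeled on how HBase computes compaction pressure from the store file count (0 at the compaction minimum, 1 at blockingFileCount); the class and method names are invented:

```java
// Models compaction-pressure-based throughput limiting.
public class ThroughputSketch {
    // pressure grows linearly from 0 (at minFilesToCompact files)
    // to 1 (at blockingFileCount files); it may exceed 1 past blocking.
    public static double pressure(int storefiles, int minFilesToCompact,
                                  int blockingFileCount) {
        double p = (double) (storefiles - minFilesToCompact)
                 / (blockingFileCount - minFilesToCompact);
        return Math.max(0.0, p);
    }

    // throughput = lowerBound + (upperBound - lowerBound) * pressure,
    // switched to unlimited once writes are already blocking.
    public static double throughput(double lowerBound, double upperBound,
                                    double pressure) {
        if (pressure > 1.0) return Double.MAX_VALUE;
        return lowerBound + (upperBound - lowerBound) * pressure;
    }
}
```

For example, halfway between the minimum and the blocking count, the limit sits exactly halfway between the two bounds (15 MB/s with the 10/20 MB/s defaults quoted above).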
Bandwidth limit logic:
private void tune(double compactionPressure) {
double maxThroughputToSet;
if (compactionPressure > 1.0) {
maxThroughputToSet = Double.MAX_VALUE; // unlimited when blocking
} else if (offPeakHours.isOffPeakHour()) {
maxThroughputToSet = maxThroughputOffpeak;
} else {
maxThroughputToSet = maxThroughputLowerBound +
(maxThroughputHigherBound - maxThroughputLowerBound) * compactionPressure;
}
this.maxThroughput = maxThroughputToSet;
}

3. Real‑World Issues and Tuning Cases
Case 3.1 – Unexpected Major Compaction Queue Growth
Even with automatic Major Compaction disabled, the long‑compaction thread pool filled up because large HFiles (>2.5 GB) caused Minor Compactions to be routed to the long‑compaction queue, increasing read/write latency. The fix was to lower hbase.hstore.compaction.max.size to 2 GB, causing oversized files to be excluded from Minor Compaction and merged later during off‑peak windows.
Case 3.2 – Prolonged Manual Major Compaction
A table with 578 TB of data experienced slow performance due to a single‑threaded Major Compaction pool. Increasing the pool size to 10 threads and adjusting off‑peak hours allowed the compaction to finish faster, reducing storage to 349 TB and restoring latency to acceptable levels.
4. Parameter Reference
The article lists key configuration items such as:
hbase.hstore.compaction.ratio / hbase.hstore.compaction.ratio.offpeak
hbase.hstore.compaction.max.size
hbase.hstore.compaction.throughput.lower.bound and higher.bound
hbase.server.compactchecker.interval.multiplier
hbase.hstore.blockingStoreFiles
Adjusting these parameters should be done gradually and monitored for impact on latency and resource usage.
5. Conclusion
Compaction is essential for HBase performance but involves complex decision‑making, policy selection, and resource throttling. Misconfiguration can lead to write amplification and degraded latency. Understanding the trigger mechanisms, selection algorithms, and throttling controls enables effective tuning, as demonstrated by the two production case studies.
vivo Internet Technology
Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.