How Ali‑HBase Cut Young GC Pauses from 120 ms to 5 ms: Inside CCSMap & BucketCache

This article explains how Alibaba engineers reduced young‑generation garbage‑collection pauses in large‑scale HBase deployments from over a hundred milliseconds to just a few milliseconds by redesigning memory management with CCSMap, BucketCache, and the tenant‑aware AliGC algorithm.

Alibaba Cloud Developer

Background

Garbage collection (GC) in the JVM hides memory‑management details, but for large‑scale storage systems like HBase the resulting pauses become a serious latency problem. Three factors aggravate GC in HBase: very large heaps (96 GB and up, with some nodes exceeding 160 GB), complex object lifecycles caused by large read and write caches, and frequent young‑generation collections.

Approach

1. Application‑level memory self‑management – By manually managing the 70 GB+ of memory used for write buffers and read caches, the effective heap pressure is reduced to that of a 10 GB heap, shrinking young‑GC pauses from 120 ms to 15 ms in production.

2. AliGC – The Alibaba JDK team created a tenant‑based GC algorithm (AliGC) that further cuts young‑GC time to 5 ms in laboratory stress tests.

Key Technologies

CCSMap: Eliminating a Billion Objects

HBase uses an LSM‑Tree model where writes are first stored in an in‑memory write cache (often a ConcurrentSkipListMap). The standard implementation creates many JVM objects, leading to high GC cost. CCSMap (CompactedConcurrentSkipListMap) replaces these objects with a contiguous memory layout stored in chunks, each chunk holding multiple nodes. New entries are appended to the end of used memory, and deletions are logical only.

The design yields:

No JVM objects for map entries, saving at least 16 bytes per object.

20‑30 % higher read/write throughput for 50‑byte key‑value pairs.

~40 % reduction in memory consumption.

Young‑GC pause reduced from 120 ms to 30 ms after deployment.
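The append-only, chunk-based layout described above can be illustrated with a minimal sketch. It is not the actual CCSMap code: the class name CompactChunkList, the node layout, and the use of String keys are all assumptions made for illustration, and the sketch is a single-threaded, single-level sorted list in one on-heap chunk, whereas CCSMap is a concurrent multi-level skip list spread over many chunks. What it does show is the core idea: nodes live as byte ranges inside one buffer, links are integer offsets rather than object references, new entries are appended at the end of used memory, and deletion only sets a flag.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical single-chunk, single-level illustration; NOT the real CCSMap. */
final class CompactChunkList {
    private static final int NIL = -1;
    // Assumed node layout: [next:int][deleted:byte][keyLen:int][valLen:int][key][value]
    private static final int HEADER = 4 + 1 + 4 + 4;

    private final ByteBuffer chunk;
    private int head = NIL; // offset of the node with the smallest key
    private int used = 0;   // end of used memory; the next node is appended here

    CompactChunkList(int capacityBytes) {
        this.chunk = ByteBuffer.allocate(capacityBytes);
    }

    /** Appends a node at the end of used memory and links it in key order. */
    boolean put(String key, String value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        int size = HEADER + k.length + v.length;
        if (used + size > chunk.capacity()) return false; // chunk is full
        int node = used;
        used += size;
        chunk.put(node + 4, (byte) 0);
        chunk.putInt(node + 5, k.length);
        chunk.putInt(node + 9, v.length);
        write(node + HEADER, k);
        write(node + HEADER + k.length, v);
        // Walk to the first node whose key is >= the new key. A newer version
        // of an existing key is therefore linked in front of the older one.
        int prev = NIL, cur = head;
        while (cur != NIL && compareKey(cur, k) < 0) {
            prev = cur;
            cur = chunk.getInt(cur);
        }
        chunk.putInt(node, cur);
        if (prev == NIL) head = node; else chunk.putInt(prev, node);
        return true;
    }

    /** Returns the newest value for the key, or null if absent or deleted. */
    String get(String key) {
        int node = find(key.getBytes(StandardCharsets.UTF_8));
        if (node == NIL || chunk.get(node + 4) != 0) return null;
        byte[] v = new byte[chunk.getInt(node + 9)];
        ByteBuffer view = chunk.duplicate();
        view.position(node + HEADER + chunk.getInt(node + 5));
        view.get(v);
        return new String(v, StandardCharsets.UTF_8);
    }

    /** Logical deletion: sets a flag, reclaims no memory. */
    boolean remove(String key) {
        int node = find(key.getBytes(StandardCharsets.UTF_8));
        if (node == NIL || chunk.get(node + 4) != 0) return false;
        chunk.put(node + 4, (byte) 1);
        return true;
    }

    int usedBytes() { return used; }

    private int find(byte[] k) {
        for (int cur = head; cur != NIL; cur = chunk.getInt(cur)) {
            int c = compareKey(cur, k);
            if (c == 0) return cur;  // first match is the newest version
            if (c > 0) return NIL;   // passed the position where the key would be
        }
        return NIL;
    }

    private int compareKey(int node, byte[] k) {
        int len = chunk.getInt(node + 5);
        int n = Math.min(len, k.length);
        for (int i = 0; i < n; i++) {
            int a = chunk.get(node + HEADER + i) & 0xff;
            int b = k[i] & 0xff;
            if (a != b) return a - b;
        }
        return len - k.length;
    }

    private void write(int offset, byte[] src) {
        ByteBuffer view = chunk.duplicate();
        view.position(offset);
        view.put(src);
    }
}
```

However many entries it holds, the garbage collector sees only one buffer object here, which is the property the article credits for the lower GC cost. The trade-off is that memory taken by deleted or overwritten entries is not reclaimed until the whole chunk is discarded.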

CCSMap memory structure

BucketCache: Never‑Promoted Cache

HBase’s block cache stores data blocks (16‑64 KB) in the JVM heap, causing frequent promotions to the old generation and increasing GC pressure. BucketCache allocates a fixed‑size off‑heap memory region, divides it into buckets, and copies loaded blocks into these buckets. When a block is no longer needed, its bucket is marked free for reuse, eliminating promotion overhead.
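As a rough sketch of the bucket idea, the following class reserves one off-heap region with ByteBuffer.allocateDirect, splits it into equal-sized buckets, copies blocks in, and returns a bucket to a free list on eviction. The class name SimpleBucketCache, its methods, and the single fixed bucket size are assumptions made for illustration; this is not HBase's BucketCache implementation, which the article describes only at a high level.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of a fixed-size off-heap bucket cache. */
final class SimpleBucketCache {
    private final ByteBuffer region;  // one off-heap allocation, made once
    private final int bucketSize;
    private final ArrayDeque<Integer> freeBuckets = new ArrayDeque<>();
    // Small on-heap index: block key -> {bucket number, block length}
    private final Map<String, int[]> index = new HashMap<>();

    SimpleBucketCache(int bucketCount, int bucketSize) {
        this.region = ByteBuffer.allocateDirect(bucketCount * bucketSize);
        this.bucketSize = bucketSize;
        for (int i = 0; i < bucketCount; i++) freeBuckets.add(i);
    }

    /** Copies the block into a free bucket; false if it cannot be cached. */
    boolean cache(String key, byte[] block) {
        if (block.length > bucketSize || index.containsKey(key)) return false;
        Integer bucket = freeBuckets.poll();
        if (bucket == null) return false; // no free bucket; caller must evict first
        ByteBuffer view = region.duplicate();
        view.position(bucket * bucketSize);
        view.put(block);
        index.put(key, new int[] {bucket, block.length});
        return true;
    }

    /** Copies the cached block back to a heap array, or returns null on a miss. */
    byte[] read(String key) {
        int[] slot = index.get(key);
        if (slot == null) return null;
        byte[] out = new byte[slot[1]];
        ByteBuffer view = region.duplicate();
        view.position(slot[0] * bucketSize);
        view.get(out);
        return out;
    }

    /** Marks the bucket free for reuse; no memory is released or allocated. */
    boolean evict(String key) {
        int[] slot = index.remove(key);
        if (slot == null) return false;
        freeBuckets.add(slot[0]);
        return true;
    }

    int freeBucketCount() { return freeBuckets.size(); }
}
```

Because the block bytes live outside the Java heap and the region is allocated only once, caching or evicting a block creates no large heap object that could later be promoted. Only the small index entries remain on the heap.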

BucketCache performance

AliGC: Multi‑Tenant GC for Large Heaps

AliGC extends the G1 collector with a three‑layer tenant model:

Objects are allocated in isolated tenant regions within the Java heap.

GC can be triggered at the tenant granularity rather than the whole application.

Applications can map business‑level tenants flexibly.

By moving medium‑lifetime objects directly into the old‑generation tenant and using ObjectTrace to identify the heaviest objects, the RSet scanning cost during young‑GC is dramatically reduced. In lab simulations, young‑GC time dropped to 5 ms, and production workloads now see young‑GC pauses around 15 ms.

AliGC performance

Cloud Deployment

Alibaba Cloud now offers a commercial HBase service that incorporates these optimizations, providing better operations, reliability, performance, stability, security, and cost efficiency.

Conclusion

For engineers interested in large‑scale data storage, distributed databases, or HBase, the authors invite collaboration ([email protected]) and encourage further exploration of these techniques.
