
Applying ZGC in 58.com HBase Cluster: Background, Implementation, Performance Evaluation and Tuning

This article details how 58.com's big‑data team migrated its HBase clusters from CMS/G1 to the low‑latency Z Garbage Collector. It explains ZGC's design and key techniques, presents performance test results against CMS, discusses tuning parameters, and shares practical lessons from the production deployment.

58 Tech

Introduction – The 58.com big‑data team evaluated ZGC on a newer JDK to address GC pause issues in their online HBase clusters, and successfully deployed ZGC on HBase with Tencent Kona JDK 11.

Background – Java GC algorithms have evolved to meet the needs of latency‑sensitive services; long stop‑the‑world (STW) pauses under CMS affected HBase RegionServers, prompting the investigation of ZGC.

Common GC Collectors Overview – CMS (concurrent mark‑sweep; low pauses but CPU‑sensitive), G1 (region‑based, with predictable pause targets at some memory overhead), Shenandoah (low pauses, higher CPU cost), and ZGC (concurrent, low‑pause, and single‑generation in JDK 11).
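For reference, each collector above is selected with a single JVM flag; the lines below are illustrative JDK 11‑era invocations (the application arguments are placeholders):

```shell
java -XX:+UseConcMarkSweepGC ...                        # CMS (deprecated since JDK 9)
java -XX:+UseG1GC ...                                   # G1 (the default since JDK 9)
java -XX:+UseShenandoahGC ...                           # Shenandoah (in builds that include it)
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC ...   # ZGC (experimental in JDK 11)
```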

ZGC Design and Phases – ZGC targets pauses of ≤10 ms regardless of heap size, using colored pointers and load barriers. A collection cycle runs through six phases: initial mark (STW), concurrent mark, remark (STW), concurrent preparation for relocation, initial relocate (STW), and concurrent relocate. Only three of these stop the world, and each pause is brief.

Key Technologies – Colored pointers embed mark bits in object references, enabling concurrent marking and relocation. Load barriers intercept heap reads to redirect moved objects, providing self‑healing pointers without full‑heap STW.

Implementation Details – ZGC uses 4 bits of each 64‑bit pointer as state flags (Marked0, Marked1, Remapped, Finalizable). Because the metadata bits require a full 64‑bit pointer, ZGC does not support 32‑bit platforms or compressed oops.
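The JDK 11 layout places the object address in the low 42 bits, with the four metadata bits directly above it. The sketch below models that layout in plain Java to show how a single mask recovers the address; the `ColoredPointer` class and its constants are illustrative, not a JVM API.

```java
// Sketch of ZGC's colored-pointer layout as in JDK 11: a 42-bit address
// plus metadata bits at positions 42-45. Illustrative only.
public class ColoredPointer {
    static final long ADDRESS_MASK = (1L << 42) - 1;  // low 42 bits hold the address
    static final long MARKED0      = 1L << 42;
    static final long MARKED1      = 1L << 43;
    static final long REMAPPED     = 1L << 44;
    static final long FINALIZABLE  = 1L << 45;

    static long address(long ptr)     { return ptr & ADDRESS_MASK; }
    static boolean isRemapped(long p) { return (p & REMAPPED) != 0; }
    static boolean isMarked(long p)   { return (p & (MARKED0 | MARKED1)) != 0; }

    public static void main(String[] args) {
        // A "good" pointer in the current cycle carries the Remapped color.
        long good = 0xDEAD_B000L | REMAPPED;
        System.out.println(Long.toHexString(address(good)));  // prints "deadb000"
        System.out.println(isRemapped(good));                 // prints "true"
    }
}
```

Because the color lives in otherwise-unused high bits, the load barrier can test a pointer's state with one mask-and-compare, which is what keeps the per-read overhead small.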

Performance Tuning – Tuning focuses on reducing STW time, avoiding allocation stalls, and adjusting concurrency threads. Parameters such as -XX:ZCollectionInterval, -XX:ZAllocationSpikeTolerance, and -XX:ConcGCThreads are tuned to balance latency and throughput.
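As a starting point, a RegionServer option set combining these flags might look like the sketch below; the `HBASE_REGIONSERVER_OPTS` variable name and all concrete values are assumptions for illustration, not the article's exact production configuration:

```shell
# Illustrative ZGC settings for an HBase RegionServer on JDK 11 (values are
# placeholders, not the article's production configuration):
#   -XX:ZCollectionInterval=30       force a GC cycle at least every 30 s
#   -XX:ZAllocationSpikeTolerance=5  assume larger allocation spikes, so GC starts earlier
#   -XX:ConcGCThreads=4              number of concurrent GC worker threads
export HBASE_REGIONSERVER_OPTS="-XX:+UnlockExperimentalVMOptions -XX:+UseZGC \
-Xms40g -Xmx40g \
-XX:ZCollectionInterval=30 \
-XX:ZAllocationSpikeTolerance=5 \
-XX:ConcGCThreads=4 \
-Xlog:gc*:file=gc.log:time,uptime"
```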

The snippet below shows where ZGC's load barrier applies — only on loads of object references from the heap:

```java
String n = person.name;   // Reference loaded from the heap — load barrier applied
String p = n;             // Register-to-register copy — no barrier needed
n.isEmpty();              // No heap reference load — no barrier needed
int age = person.age;     // Primitive field load — no barrier needed
```

Performance Comparison (ZGC vs CMS) – In YCSB tests on a 3‑node HBase cluster, ZGC kept application throughput at ≈99.98% (i.e., pauses consumed only ~0.02% of run time), with average pause times around 1.5 ms versus CMS's ~150 ms, yielding 5–24% throughput improvements across the workloads tested.

Online Deployment Summary – After upgrading to ZGC, overall GC pause time dropped to ~3.7 ms (≈6% of the CMS figure), throughput rose slightly, and latency peaks for high‑traffic HBase tables were reduced. Issues encountered included allocation stalls, higher CPU usage from the concurrent GC threads, and inflated RSS readings caused by ZGC's multi‑mapping of the heap.
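The inflated‑RSS symptom has a known explanation: ZGC maps the heap at multiple virtual addresses (one per pointer color), so tools that sum RSS can count the same physical heap pages several times. One way to get a fairer reading on Linux is to look at PSS, which divides each page's size by the number of mappings; `<jvm-pid>` below is a placeholder for the RegionServer's process ID:

```shell
# RSS may multiply-count ZGC's multi-mapped heap pages; Pss does not.
grep -E '^(Rss|Pss):' /proc/<jvm-pid>/smaps_rollup
```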

Problem Mitigation – Solutions involved increasing heap size, raising -XX:ZAllocationSpikeTolerance to trigger earlier GC, and increasing -XX:ConcGCThreads to speed up concurrent marking.

Conclusion – ZGC provides significant latency reductions for large‑heap, latency‑sensitive services like HBase, though it requires careful tuning and sufficient memory resources. The team plans to expand ZGC usage across suitable workloads.

Tags: Java, Performance, Garbage Collection, ZGC, HBase, Tuning
Written by 58 Tech

Official tech channel of 58, a platform for tech innovation, sharing, and communication.