Why Did Our Dubbo Gateway Leak Memory? A Deep Dive into Zookeeper Subscriptions

This article analyzes a persistent full‑GC and CPU spike issue in a Dubbo microservice gateway, tracing the root cause to repeated Zookeeper subscriptions caused by timestamp‑varying URLs and offering a simple fix by disabling reference checks.

Su San Talks Tech

Background

In a microservice architecture, each service has its own network address, but clients need a single entry point to call through. That entry point is the microservice gateway. The gateway sits between clients and services and handles unified authentication, interface lifecycle management, load balancing, circuit breaking, monitoring, and risk control.

This article focuses on an internally developed Dubbo gateway that converts HTTP requests to the Dubbo protocol. Its core mechanism is Dubbo generic invocation, which is used when the caller does not have the service's API interfaces and model classes; all POJOs are represented as maps.

Problem Description

After stable operation, the gateway began experiencing frequent full GC, rising CPU usage, and increasing error rates despite unchanged request volume. A machine restart and memory dump analysis yielded no immediate conclusions.

Investigation

The investigation started with the memory dump, opened in Eclipse MAT (Memory Analyzer Tool).

The dump showed over 7,000 RegistryDirectory objects, suggesting a possible memory leak. Further tracing identified many instances of CuratorWatcherImpl, a class from Dubbo's Curator-based Zookeeper client.

Source code inspection revealed that CuratorWatcherImpl is created only once per URL subscription. Two subscription points were found: one in ReferenceConfig via RegistryProtocol.doRefer, and another in FailbackRegistry which retries failed paths.

Testing showed that reconnection does not create new CuratorWatcherImpl objects because they are cached per URL.

An online search for similar issues turned up GitHub issues #376 and #4587, but neither matched the observed behavior.

The gateway caches references using interface+version+group+timeout as the key, so duplicate reference creation should not occur.

Borrowing an approach from an article on a Netty off‑heap memory leak, the team added a monitoring snippet that logs Zookeeper subscriptions by reading the zkListeners field of ZookeeperRegistry via reflection.
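The reflection technique can be sketched with plain Java. This is a minimal illustration, not the gateway's actual monitoring code: FakeRegistry is a hypothetical stand-in for ZookeeperRegistry, and only the field name zkListeners comes from the article.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for ZookeeperRegistry; it only mimics a private map of listeners.
class FakeRegistry {
    private final Map<String, Object> zkListeners = new ConcurrentHashMap<>();

    void subscribe(String url) {
        zkListeners.put(url, new Object());
    }
}

class SubscriptionMonitor {
    // Reads a private Map field by name and returns how many entries it holds.
    // getDeclaredField only searches the object's own class, not its superclasses.
    static int countListeners(Object registry, String fieldName) {
        try {
            Field field = registry.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            Map<?, ?> listeners = (Map<?, ?>) field.get(registry);
            return listeners.size();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        FakeRegistry registry = new FakeRegistry();
        registry.subscribe("consumer://10.0.0.1/com.example.DemoService?timestamp=1");
        registry.subscribe("consumer://10.0.0.1/com.example.DemoService?timestamp=2");
        System.out.println("subscribed URLs: " + countListeners(registry, "zkListeners"));
    }
}
```

Logging this count (or the keys themselves) on a schedule is one way to see which URLs keep accumulating.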

Logs revealed that a particular service without a provider was being subscribed repeatedly, with each subscription URL differing only by a timestamp. As a result, the number of RegistryDirectory objects grew quadratically with the number of attempts (1+2+3+…+n = n(n+1)/2).
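Why a timestamp defeats per-URL caching can be shown with a small sketch. The URL format, host, and service name below are illustrative assumptions, and stripTimestamp is a hypothetical helper, not a Dubbo API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SubscriptionKeys {
    // Removes a "timestamp=<digits>" query parameter so otherwise identical URLs compare equal.
    static String stripTimestamp(String url) {
        String stripped = url.replaceAll("([?&])timestamp=\\d+&?", "$1");
        // Drop a trailing '?' or '&' left behind when timestamp was the last parameter.
        return stripped.replaceAll("[?&]$", "");
    }

    // Counts how many distinct cache keys the given URLs would produce.
    static int distinctKeys(List<String> urls, boolean ignoreTimestamp) {
        Set<String> keys = new HashSet<>();
        for (String url : urls) {
            keys.add(ignoreTimestamp ? stripTimestamp(url) : url);
        }
        return keys.size();
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int attempt = 0; attempt < 5; attempt++) {
            urls.add("consumer://10.0.0.1/com.example.DemoService?check=true&timestamp=" + (1000 + attempt));
        }
        System.out.println("keys with timestamp: " + distinctKeys(urls, false));
        System.out.println("keys without timestamp: " + distinctKeys(urls, true));
    }
}
```

With the timestamp in the key, every attempt looks like a new subscription; without it, all attempts collapse to one.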

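The root cause was that when check=true and a provider is missing, createProxy throws an exception. Because the failed reference is never stored in the gateway's cache, each later request for that service tries to create it again, which leads to repeated Zookeeper subscriptions and memory buildup.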
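The interaction between the reference cache and the failing check can be simulated without Dubbo. The sketch below is an assumption-laden model: GatewaySimulation and its createProxy method are hypothetical stand-ins, and it assumes the subscription is registered before the availability check runs, as the findings above suggest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class GatewaySimulation {
    final Map<String, Object> referenceCache = new ConcurrentHashMap<>();
    final List<String> subscriptions = new ArrayList<>();
    private final boolean check;
    private final boolean providerAvailable;
    private long clock = 0;

    GatewaySimulation(boolean check, boolean providerAvailable) {
        this.check = check;
        this.providerAvailable = providerAvailable;
    }

    // Stand-in for reference creation: subscribe first, then run the availability check.
    private Object createProxy(String key) {
        subscriptions.add(key + "?timestamp=" + (clock++));
        if (check && !providerAvailable) {
            throw new IllegalStateException("No provider available for " + key);
        }
        return new Object();
    }

    // Returns true if the request obtained a reference.
    boolean handleRequest(String key) {
        try {
            // If createProxy throws, computeIfAbsent records no mapping, so the next call retries.
            referenceCache.computeIfAbsent(key, this::createProxy);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        GatewaySimulation strict = new GatewaySimulation(true, false);
        GatewaySimulation lenient = new GatewaySimulation(false, false);
        for (int i = 0; i < 5; i++) {
            strict.handleRequest("com.example.DemoService:1.0.0:default:3000");
            lenient.handleRequest("com.example.DemoService:1.0.0:default:3000");
        }
        System.out.println("check=true subscriptions: " + strict.subscriptions.size());
        System.out.println("check=false subscriptions: " + lenient.subscriptions.size());
    }
}
```

In this model, the strict gateway leaves one subscription behind per request while its cache stays empty, whereas the lenient one subscribes once and serves later requests from the cache.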

Solution

Changing check=true to check=false stops the repeated subscriptions and resolves the memory leak.
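For a gateway that builds generic references through Dubbo's Java API, the change might look like the following configuration sketch. It assumes the org.apache.dubbo package names of the 2.7.x line and needs the Dubbo library on the classpath; the application name, registry address, and the GenericReferenceFactory class are placeholders, not details from the original gateway.

```java
import org.apache.dubbo.config.ApplicationConfig;
import org.apache.dubbo.config.ReferenceConfig;
import org.apache.dubbo.config.RegistryConfig;
import org.apache.dubbo.rpc.service.GenericService;

class GenericReferenceFactory {
    // Builds a generic reference that does not fail when no provider is registered yet.
    static GenericService create(String interfaceName, String version, String group, int timeoutMs) {
        ReferenceConfig<GenericService> reference = new ReferenceConfig<>();
        reference.setApplication(new ApplicationConfig("demo-gateway"));           // placeholder name
        reference.setRegistry(new RegistryConfig("zookeeper://127.0.0.1:2181"));   // placeholder address
        reference.setInterface(interfaceName);
        reference.setVersion(version);
        reference.setGroup(group);
        reference.setTimeout(timeoutMs);
        reference.setGeneric("true");   // generic invocation: no API jar needed on the gateway
        reference.setCheck(false);      // do not throw when the provider is missing
        return reference.get();
    }
}
```

A caller would then invoke a method by name, for example `service.$invoke("sayHello", new String[] {"java.lang.String"}, new Object[] {"world"})`, where sayHello is a hypothetical method on the target interface.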

Summary

This was a Dubbo bug; version 2.7.5 removed the timestamp from subscription URLs, preventing duplicate subscriptions.

When using generic invocation, set reference check to false to avoid memory leaks; normal XML configuration is unaffected because check=true causes startup failure if no provider exists.

Issues can be ranked by troubleshooting difficulty: directly traceable via monitoring, code, and logs < reproducible < intermittently reproducible < non‑reproducible. This case was intermittently reproducible.

Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
