Dubbo Service Discovery OOM Case Study and Memory Leak Analysis
A misconfigured Dubbo consumer tried to create both Dubbo and REST invokers. The REST invoker creation failed thousands of times, and each failed attempt left a ResteasyClient object in a synchronized List, which exhausted the old generation heap and triggered an OOM. The fix replaced the List with a WeakHashMap‑based client map.
On an ordinary afternoon in July, a flood of alerts reported that Application A's old generation memory usage had exceeded 95%. After 3:30 PM it kept rising to nearly 100%, and young generation usage was climbing as well.
Incoming request volume was stable, so the root cause had to be found elsewhere. Before diving into the fault, here is a brief introduction to the Dubbo service discovery process.
Dubbo Service Discovery Process – Developers configure a Dubbo XML file or use @Reference to define consumer references. At startup, the configuration is transformed into a ReferenceBean and createProxy creates a proxy for the remote service interface. The consumer subscribes to the company's etcd registry, receives a notify callback, and converts the remote service URL into a local invoker. If the consumer's protocol is set to dubbo, a Dubbo invoker is created; if set to rest, a REST invoker is created; if no protocol is specified, an invoker is generated for each protocol offered by the provider.
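The protocol rule described above can be sketched as a simple filter. This is a minimal illustration, not Dubbo's actual code: the class ProtocolSelection, the method selectUrls, and the example URLs are all hypothetical, and provider URLs are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProtocolSelection {

    /** Returns the scheme of a URL string, e.g. "dubbo" for "dubbo://10.0.0.1:20880/...". */
    static String protocolOf(String url) {
        int idx = url.indexOf("://");
        return idx < 0 ? "" : url.substring(0, idx);
    }

    /**
     * Picks the provider URLs the consumer will build invokers for.
     * A null or empty consumerProtocol means "not configured", so every
     * protocol offered by the provider is accepted.
     */
    static List<String> selectUrls(String consumerProtocol, List<String> providerUrls) {
        boolean unspecified = consumerProtocol == null || consumerProtocol.isEmpty();
        List<String> selected = new ArrayList<>();
        for (String url : providerUrls) {
            if (unspecified || consumerProtocol.equals(protocolOf(url))) {
                selected.add(url);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical provider that exposes the same service over two protocols.
        List<String> providerUrls = Arrays.asList(
                "dubbo://10.0.0.1:20880/com.example.ItemLockService",
                "rest://10.0.0.1:8080/com.example.ItemLockService");
        System.out.println(selectUrls("dubbo", providerUrls)); // only the dubbo URL
        System.out.println(selectUrls(null, providerUrls));    // both URLs
    }
}
```

With the protocol left unset, a provider that exposes two protocols yields two invoker creation attempts per machine, which is the configuration this case study is about.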
After the application starts, any change in the service registry triggers a full update (not incremental) to the consumer. The consumer receives the entire list of provider URLs and attempts to update its local invokers. If an invoker for a URL already exists, it is not recreated.
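A rough sketch of this refresh behavior follows, assuming invokers are cached by URL string. The class InvokerRefresh and its methods are hypothetical stand-ins rather than Dubbo's real directory code. The detail worth noticing is that a URL whose invoker creation fails is never cached, so it is retried on every full update.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InvokerRefresh {
    // url -> invoker; the invoker is modeled as a plain string for brevity
    private final Map<String, String> invokers = new HashMap<>();
    int createAttempts = 0;

    /** Handles one full-update notification carrying the complete provider URL list. */
    void onNotify(List<String> urls) {
        Map<String, String> next = new HashMap<>();
        for (String url : urls) {
            String existing = invokers.get(url);
            if (existing != null) {
                next.put(url, existing); // already built: reuse, do not recreate
                continue;
            }
            createAttempts++;
            try {
                next.put(url, createInvoker(url));
            } catch (IllegalStateException e) {
                // creation failed: nothing is cached, so this URL is retried next time
            }
        }
        invokers.clear();
        invokers.putAll(next); // URLs missing from the new list are dropped
    }

    /** Assumption for the demo: every rest:// URL fails, every other URL succeeds. */
    String createInvoker(String url) {
        if (url.startsWith("rest://")) {
            throw new IllegalStateException("cannot build REST invoker for " + url);
        }
        return "invoker:" + url;
    }

    int invokerCount() {
        return invokers.size();
    }

    public static void main(String[] args) {
        InvokerRefresh refresh = new InvokerRefresh();
        List<String> urls = Arrays.asList("dubbo://10.0.0.1:20880/S", "rest://10.0.0.1:8080/S");
        for (int i = 0; i < 3; i++) {
            refresh.onNotify(urls);
        }
        System.out.println("attempts=" + refresh.createAttempts + ", invokers=" + refresh.invokerCount());
    }
}
```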
OOM Localization – Logging into the faulty machine and inspecting the JVM heap revealed a massive number of HashMap entries. A heap dump was generated with jmap and analyzed using MAT. The dominator tree showed that a large object (~2.4 GB) belonged to the RestProtocol class’s internal List.
The relevant code fragment is:
List<ResteasyClient> clients = Collections.synchronizedList(new LinkedList<ResteasyClient>());
This list stores the ResteasyClient instances that RestProtocol creates when building a remote service invoker. The list is only cleared when the RestProtocol object itself is destroyed, so until then the client objects keep accumulating.
Root Cause – The consumer of ItemLockService did not configure the protocol field, so it attempted to create both Dubbo and REST invokers. The REST invoker creation failed because the service’s annotation was placed on the implementation class rather than the interface, leaving httpMethod null.
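Why an annotation on the implementation class is invisible to a proxy built from the interface can be shown with plain reflection. The sketch below does not use RESTEasy: HttpPost is a hypothetical stand-in for a JAX-RS method annotation, and the lock method on ItemLockService is invented for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class AnnotationLookup {

    /** Hypothetical stand-in for a JAX-RS HTTP method annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface HttpPost {}

    /** The interface carries no annotation, as in the faulty service. */
    interface ItemLockService {
        String lock(String itemId);
    }

    /** The annotation sits on the implementation only. */
    static class ItemLockServiceImpl implements ItemLockService {
        @HttpPost
        @Override
        public String lock(String itemId) {
            return "locked:" + itemId;
        }
    }

    /** Returns "POST" if the method on the given type is annotated, otherwise null. */
    static String httpMethodOf(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName, String.class);
            return method.isAnnotationPresent(HttpPost.class) ? "POST" : null;
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // A client proxy is generated from the interface, so this is what it sees.
        System.out.println(httpMethodOf(ItemLockService.class, "lock"));
        System.out.println(httpMethodOf(ItemLockServiceImpl.class, "lock"));
    }
}
```

The interface lookup yields null while the implementation lookup yields "POST", which mirrors the null httpMethod described above.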
When Application B (the provider) was redeployed across 100 machines, the registry sent 100 full‑update notifications to the consumer. For each notification the consumer tried to create a REST invoker for every online URL (100 URLs), all of which failed. Consequently, 100 × 100 = 10,000 ResteasyClient objects were added to the list, exhausting memory and causing OOM. Restarting Application A temporarily cleared the list, but a subsequent redeploy of Application B reproduced the issue.
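The 100 × 100 arithmetic can be reproduced with a small simulation. It assumes, consistent with the description above, that the pre-fix code added each client to the list before attempting to create the proxy and never removed it on failure. FakeClient and LeakSimulation are hypothetical stand-ins, not RESTEasy or Dubbo classes.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class LeakSimulation {

    /** Stand-in for ResteasyClient; the real object holds far more memory. */
    static class FakeClient {}

    private final List<FakeClient> clients =
            Collections.synchronizedList(new LinkedList<FakeClient>());

    /** Models the assumed leaky order: register the client first, then try to build the proxy. */
    void refer(String url) {
        FakeClient client = new FakeClient();
        clients.add(client); // retained even if the next step fails
        // Proxy creation always fails in this demo, as it did for ItemLockService.
        throw new IllegalStateException("httpMethod is null for " + url);
    }

    /** Runs the given number of full updates, each carrying the given number of URLs. */
    static int simulate(int notifications, int urlsPerNotification) {
        LeakSimulation protocol = new LeakSimulation();
        for (int n = 0; n < notifications; n++) {
            for (int u = 0; u < urlsPerNotification; u++) {
                try {
                    protocol.refer("rest://10.0.0." + u + ":8080/ItemLockService");
                } catch (IllegalStateException e) {
                    // the consumer moves on, but the client stays in the list
                }
            }
        }
        return protocol.clients.size();
    }

    public static void main(String[] args) {
        System.out.println(simulate(100, 100)); // 100 notifications x 100 URLs
    }
}
```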
Recommendations – Developers must configure the protocol parameter correctly. The Dubbo framework should add validation and defensive programming to prevent such misconfigurations. The middleware team also added code to handle this situation.
Fix Implemented – The problematic List<ResteasyClient> was replaced with a Map<String,ResteasyClient> keyed by URL, ensuring a single client per URL. A WeakHashMap was introduced to allow unused clients to be garbage‑collected. The new implementation looks like:
Map<String,ResteasyClient> clientMap = new ConcurrentHashMap<>();

// register any JAX-RS extension classes configured on the URL
for (String clazz : Constants.COMMA_SPLIT_PATTERN.split(url.getParameter(Constants.EXTENSION_KEY, ""))) {
    if (!StringUtils.isEmpty(clazz)) {
        try {
            client.register(Thread.currentThread().getContextClassLoader().loadClass(clazz.trim()));
        } catch (ClassNotFoundException e) {
            throw new RpcException("Error loading JAX-RS extension class: " + clazz.trim(), e);
        }
    }
}
ResteasyWebTarget target = client.target("http://" + url.getHost() + ":" + url.getPort() + "/" + getContextPath(url));
try {
    T t = target.proxy(serviceType);
    // invoker created successfully, store client
    clientMap.put(url.toFullString(), client);
    return t;
} catch (Exception e) {
    logger.warn("fail to create proxy,serviceType:{}", serviceType, e);
    // invoker creation failed, close client
    client.close();
    throw e;
}
An issue was filed with Apache Dubbo, and a pull request (PR 4629) introduced the Map‑based solution with a WeakHashMap, fully resolving the OOM problem.
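To see why keying clients by URL bounds memory, here is a minimal simulation of the store-on-success, close-on-failure pattern. FakeClient and BoundedClients are hypothetical stand-ins. Closing a client that gets replaced under the same URL is a design choice of this sketch, not something the article states about the actual patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class BoundedClients {

    /** Stand-in for ResteasyClient. */
    static class FakeClient {
        boolean closed = false;
    }

    private final Map<String, FakeClient> clientMap = new ConcurrentHashMap<>();
    int openClients = 0;

    /** Creates a client for the URL; keeps it only if proxy creation succeeds. */
    void refer(String url, boolean proxySucceeds) {
        FakeClient client = new FakeClient();
        openClients++;
        if (!proxySucceeds) {
            close(client); // failure path: release the client instead of retaining it
            throw new IllegalStateException("fail to create proxy for " + url);
        }
        FakeClient previous = clientMap.put(url, client); // at most one client per URL
        if (previous != null) {
            close(previous); // sketch-level choice: release the replaced client
        }
    }

    private void close(FakeClient client) {
        client.closed = true;
        openClients--;
    }

    int trackedClients() {
        return clientMap.size();
    }

    /** Replays the incident shape: repeated full updates over the same set of URLs. */
    static BoundedClients simulate(int notifications, int urls, boolean proxySucceeds) {
        BoundedClients protocol = new BoundedClients();
        for (int n = 0; n < notifications; n++) {
            for (int u = 0; u < urls; u++) {
                try {
                    protocol.refer("rest://10.0.0." + u + ":8080/ItemLockService", proxySucceeds);
                } catch (IllegalStateException e) {
                    // creation failed; the client was already closed
                }
            }
        }
        return protocol;
    }

    public static void main(String[] args) {
        BoundedClients failing = simulate(100, 100, false);
        System.out.println("failing: tracked=" + failing.trackedClients() + ", open=" + failing.openClients);
        BoundedClients succeeding = simulate(100, 100, true);
        System.out.println("succeeding: tracked=" + succeeding.trackedClients() + ", open=" + succeeding.openClients);
    }
}
```

In the failing scenario from the incident nothing is retained, and even when every creation succeeds the map cannot grow beyond the number of distinct URLs.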
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.
