How a Dubbo 2.7.12 Bug Caused Memory Leaks and Service Outages – Diagnosis and Fix

After a late‑night incident in which a Dubbo 2.7.12 service crashed, the author traced the high memory usage, CPU consumption, and full‑GC spike to a HashedWheelTimer thread‑pool bug that caused request timeouts to go undetected, reproduced the leak, and confirmed the issue was fixed in Dubbo 2.7.13.

Xiao Lou's Tech Notes

Background

One night the author received a call that a Dubbo service had failed. The system consists of three services (A, B, and C) that communicate via Dubbo RPC. When the incident occurred, several instances of service B had crashed, and the surviving instances experienced a surge in request volume and latency.

Investigation

Monitoring showed that the affected machines were at roughly 80% memory usage with elevated CPU consumption. Full‑GC time also rose sharply, pointing to a memory leak.

The JVM full‑GC monitor confirmed the spike.

A WARN log from HashedWheelTimer showed a RejectedExecutionException:

[dubbo-future-timeout-thread-1] WARN org.apache.dubbo.common.timer.HashedWheelTimer$HashedWheelTimeout (HashedWheelTimer.java:651)
-  [DUBBO] An exception was thrown by TimerTask., dubbo version: 2.7.12, current host: xxx.xxx.xxx.xxx
java.util.concurrent.RejectedExecutionException:
Task org.apache.dubbo.remoting.exchange.support.DefaultFuture$TimeoutCheckTask$Lambda$674/1067077932@13762d5a
rejected from java.util.concurrent.ThreadPoolExecutor@7a9f0e84[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 21]

The service was using Dubbo version 2.7.12. The author searched related GitHub issues (e.g., #6820, #8172, #8188) but the exact symptom was not documented.

To reproduce, three services were set up and the provider was forced to block indefinitely:

Thread.sleep(Integer.MAX_VALUE);
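Fleshed out, the blocked provider might look like the following minimal sketch; the service interface and class names here are hypothetical, not Dubbo's actual demo code:

```java
// Hypothetical service interface assumed for the reproduction
interface DemoService {
    String sayHello(String name) throws InterruptedException;
}

// Provider implementation that blocks forever, so consumer requests never return
public class BlockingDemoServiceImpl implements DemoService {
    @Override
    public String sayHello(String name) throws InterruptedException {
        Thread.sleep(Integer.MAX_VALUE); // hang indefinitely
        return "unreachable";
    }
}
```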

When a provider was killed, the shared executor associated with its connection was closed, so subsequent timeout‑check tasks were rejected. The HashedWheelTimer relies on this thread pool to detect request timeouts; once the pool is closed, timeout detection stops, and pending requests are never cleaned up, eventually exhausting memory.
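The leak mechanism can be sketched with a plain ThreadPoolExecutor. The map and method names below are hypothetical stand-ins (in Dubbo, the real structure is the static futures map inside DefaultFuture), but the rejection behavior of a terminated pool is exactly what the WARN log shows:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class TimeoutLeakSketch {
    // Hypothetical stand-in for Dubbo's map of in-flight requests
    static final ConcurrentHashMap<Long, String> PENDING = new ConcurrentHashMap<>();

    // Try to run the timeout check that evicts a stale request.
    // Returns true if the check ran, false if the pool rejected it.
    static boolean runTimeoutCheck(ExecutorService executor, long id) {
        try {
            executor.submit(() -> PENDING.remove(id)).get();
            return true;
        } catch (RejectedExecutionException e) {
            return false; // pool already shut down: the entry is never evicted
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService shared = Executors.newSingleThreadExecutor();

        PENDING.put(1L, "request-1");
        System.out.println("check ran: " + runTimeoutCheck(shared, 1L)); // true
        System.out.println("pending:   " + PENDING.size());             // 0

        shared.shutdownNow(); // provider taken offline, shared pool closed

        PENDING.put(2L, "request-2");
        System.out.println("check ran: " + runTimeoutCheck(shared, 2L)); // false
        System.out.println("pending:   " + PENDING.size());             // 1, leaked
    }
}
```

Under sustained traffic, every request after the pool is closed adds an entry that nothing ever removes, which matches the steady memory growth seen in monitoring.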

The following code shows how the timer executes an expired task. Note that any Throwable thrown by the task, including the RejectedExecutionException above, is caught and merely logged as a WARN, which is why the failure surfaces only in the logs:

public void expire() {
    // Transition the timeout from INIT to EXPIRED exactly once
    if (!compareAndSetState(ST_INIT, ST_EXPIRED)) {
        return;
    }
    try {
        task.run(this);
    } catch (Throwable t) {
        // Any failure, including a RejectedExecutionException from the
        // closed shared executor, is swallowed here and only logged
        if (logger.isWarnEnabled()) {
            logger.warn("An exception was thrown by " + TimerTask.class.getSimpleName() + '.', t);
        }
    }
}

By continuously sending requests to the blocked provider, the author reproduced the memory blowout; when timeout detection worked correctly, memory stayed low.

Conclusion

The issue occurs only under asynchronous calls where the provider is abnormally taken offline and remains blocked, causing requests to never return. The bug was introduced in Dubbo 2.7.10 and fixed in 2.7.13.
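Since the fix landed in 2.7.13, the remedy is a version bump. A minimal Maven dependency change, assuming the project declares Dubbo directly:

```xml
<!-- Upgrade past the affected 2.7.10–2.7.12 range -->
<dependency>
    <groupId>org.apache.dubbo</groupId>
    <artifactId>dubbo</artifactId>
    <version>2.7.13</version>
</dependency>
```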

Post‑mortem

When performing damage control, preserve the incident scene (e.g., take a heap dump or capture traffic) before restarting services.

Observability is crucial: logs, request metrics, machine metrics (CPU, memory, network), and JVM metrics (thread pools, GC) should be comprehensive.

Open‑source projects often have searchable logs; leveraging community knowledge can dramatically reduce troubleshooting time.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Java · Dubbo · ThreadPool · Memory Leak · Backend Debugging · HashedWheelTimer
Written by Xiao Lou's Tech Notes

Backend technology sharing, architecture design, performance optimization, source code reading, troubleshooting, and pitfall practices
