Dubbo Cluster Fault Tolerance Explained with Source Code Walkthrough
This article walks through Dubbo's cluster fault-tolerance mechanism, detailing the roles of Directory, Router, and LoadBalance, illustrating each step with architecture diagrams and source-code snippets, and clarifying how fault-tolerance strategies such as failover and failsafe are implemented and where load balancing fits in.
Preface
The author intended to complete a full Dubbo source‑code analysis series, but after a friend’s interview questions revealed gaps in understanding, the plan was adjusted to publish regular, focused articles. This piece covers the crucial concept of cluster fault tolerance and assumes the reader has some Dubbo usage experience and has read the official documentation on fault‑tolerant strategies.
Initial Setup
An overview diagram from the official site is shown (the image illustrates the architecture of cluster fault tolerance). Three key terms appear repeatedly throughout the article: Directory, Router, and LoadBalance. A "map" with numbered steps is provided to guide readers through the source-code analysis.
Official architecture diagram
Environment Preparation
Two providers are started (one on a VM, one locally). The specific source version used is 2.5.4, available on GitHub.
Running the Example
The demo uses dubbo-demo-consumer. The injected object is a JDK dynamic proxy, as shown in the diagram.
Every call made on the proxy is dispatched to its invoke method, which is the entry point for everything that follows.
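To make the proxy step concrete, here is a minimal sketch of a JDK dynamic proxy using only java.lang.reflect. GreetingService, ProxySketch, and the echoed return value are hypothetical names chosen for illustration; they are not Dubbo classes, and the handler only marks the spot where a framework would hand the call to a cluster invoker.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical service interface standing in for the demo's remote service.
interface GreetingService {
    String sayHello(String name);
}

final class ProxySketch {

    // Creates a JDK dynamic proxy. The handler marks the spot where a
    // framework would pass the call on to a cluster invoker; here it only
    // echoes the call so the example stays self-contained.
    static GreetingService createProxy() {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                String name = method.getName();
                // toString, hashCode and equals are answered locally.
                if (method.getDeclaringClass() == Object.class) {
                    if ("toString".equals(name)) {
                        return "GreetingService proxy";
                    }
                    if ("hashCode".equals(name)) {
                        return System.identityHashCode(proxy);
                    }
                    return proxy == args[0]; // equals
                }
                // Stand-in for the remote invocation (assumption).
                return "remote:" + name + "(" + args[0] + ")";
            }
        };
        return (GreetingService) Proxy.newProxyInstance(
                GreetingService.class.getClassLoader(),
                new Class<?>[] {GreetingService.class},
                handler);
    }

    public static void main(String[] args) {
        GreetingService service = createProxy();
        System.out.println(service.sayHello("dubbo"));
    }
}
```

The caller only sees the GreetingService interface, so whatever happens inside invoke, including provider selection and retries, stays invisible to it.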
Following the “map”, the execution reaches MockClusterInvoker (step 1).
At this point the first keyword Directory appears.
The code reaches AbstractDirectory (step 3).
The invokers are retrieved from methodInvokerMap, a map keyed by method name.
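The lookup can be pictured with the simplified sketch below. It assumes a map from method name to a list of invokers with a wildcard fallback; invokers are represented as plain address strings, and DirectorySketch, ANY_METHOD, and the addresses are illustrative names, not the actual Dubbo types or lookup order.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DirectorySketch {

    // Assumed wildcard key for invokers that serve every method.
    static final String ANY_METHOD = "*";

    // Method name -> invokers able to serve it (addresses stand in for invokers).
    private final Map<String, List<String>> methodInvokerMap;

    DirectorySketch(Map<String, List<String>> methodInvokerMap) {
        this.methodInvokerMap = methodInvokerMap;
    }

    // Looks up invokers by method name, then falls back to the wildcard entry.
    List<String> list(String methodName) {
        List<String> invokers = methodInvokerMap.get(methodName);
        if (invokers == null) {
            invokers = methodInvokerMap.get(ANY_METHOD);
        }
        return invokers == null ? Collections.<String>emptyList() : invokers;
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new HashMap<>();
        map.put("sayHello", Arrays.asList("10.0.0.1:20880", "127.0.0.1:20880"));
        map.put(ANY_METHOD, Arrays.asList("127.0.0.1:20880"));
        DirectorySketch directory = new DirectorySketch(map);
        System.out.println(directory.list("sayHello"));
        System.out.println(directory.list("otherMethod"));
    }
}
```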
Next, the second keyword Router appears (step 6) via MockInvokersSelector, an implementation of the Router interface.
The method getNormalInvokers returns the normal invokers (step 7).
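A rough sketch of this filtering idea follows. It assumes that mock invokers are recognizable by a "mock" protocol on their URL, which is my reading of this Dubbo version rather than something the article states; Endpoint and RouterSketch are hypothetical stand-ins for Dubbo's invoker and router types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RouterSketch {

    // Hypothetical stand-in for an invoker and the URL it carries.
    static final class Endpoint {
        final String protocol;
        final String address;

        Endpoint(String protocol, String address) {
            this.protocol = protocol;
            this.address = address;
        }

        @Override
        public String toString() {
            return protocol + "://" + address;
        }
    }

    // Keeps only invokers that are not mock invokers (assumption: a mock
    // invoker is marked by the "mock" protocol).
    static List<Endpoint> normalInvokers(List<Endpoint> all) {
        List<Endpoint> normal = new ArrayList<>();
        for (Endpoint endpoint : all) {
            if (!"mock".equals(endpoint.protocol)) {
                normal.add(endpoint);
            }
        }
        return normal;
    }

    public static void main(String[] args) {
        List<Endpoint> all = Arrays.asList(
                new Endpoint("dubbo", "10.0.0.1:20880"),
                new Endpoint("mock", "10.0.0.1:20880"),
                new Endpoint("dubbo", "127.0.0.1:20880"));
        System.out.println(normalInvokers(all));
    }
}
```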
To recap, two actions have been performed so far:
1. Find all invokers in the Directory.
2. Filter them through the Router to keep only those that can execute normally.
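When multiple healthy invokers remain, the third keyword LoadBalance decides which one to invoke (step 11). The default strategy is random. When exactly two invokers are available and one of them has already been tried, the selection simply switches to the other one, which the source describes as degrading to round-robin.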
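As an illustration of random selection, the sketch below picks an index at random, optionally biased by per-invoker weights. It is a generic weighted-random routine written for this article, not a copy of Dubbo's implementation; RandomLoadBalanceSketch and the weights array are assumptions.

```java
import java.util.Random;

final class RandomLoadBalanceSketch {

    // Returns the index of the chosen invoker. weights must be non-empty and
    // non-negative; weights[i] is the relative share of traffic for invoker i.
    static int select(int[] weights, Random random) {
        int total = 0;
        boolean sameWeight = true;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i];
            if (i > 0 && weights[i] != weights[i - 1]) {
                sameWeight = false;
            }
        }
        if (total > 0 && !sameWeight) {
            // Draw a point in [0, total) and find the weight bucket it falls into.
            int offset = random.nextInt(total);
            for (int i = 0; i < weights.length; i++) {
                offset -= weights[i];
                if (offset < 0) {
                    return i;
                }
            }
        }
        // Equal (or all-zero) weights: plain uniform choice.
        return random.nextInt(weights.length);
    }

    public static void main(String[] args) {
        Random random = new Random();
        int[] hits = new int[2];
        for (int i = 0; i < 10000; i++) {
            hits[select(new int[] {3, 1}, random)]++;
        }
        System.out.println("invoker 0: " + hits[0] + ", invoker 1: " + hits[1]);
    }
}
```

With weights of 3 and 1, roughly three quarters of the calls should land on the first invoker, although the exact counts vary from run to run.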
In case of cluster call failure, Dubbo provides several fault‑tolerance strategies, defaulting to failover retry.
Depending on configuration, the flow may reach classes such as FailoverClusterInvoker, FailsafeClusterInvoker, FailbackClusterInvoker, ForkingClusterInvoker, or BroadcastClusterInvoker.
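The difference between two of these strategies can be sketched as follows, under simplifying assumptions: failover makes up to retries + 1 attempts and moves to the next invoker in list order after each failure, while failsafe swallows the failure and returns an empty result. ClusterSketch and its nested Invoker interface are hypothetical; the real classes also involve the LoadBalance when choosing the next invoker.

```java
import java.util.Arrays;
import java.util.List;

final class ClusterSketch {

    // Hypothetical stand-in for a remote call that may fail.
    interface Invoker {
        String invoke();
    }

    // Failover: up to retries + 1 attempts, cycling through the invoker list.
    static String failover(List<Invoker> invokers, int retries) {
        if (invokers.isEmpty()) {
            throw new IllegalStateException("no invokers available");
        }
        int attempts = Math.max(retries, 0) + 1;
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            Invoker candidate = invokers.get(i % invokers.size());
            try {
                return candidate.invoke();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next invoker
            }
        }
        throw new IllegalStateException("failed after " + attempts + " attempts", last);
    }

    // Failsafe: a failure is ignored and an empty result (null) is returned.
    static String failsafe(Invoker invoker) {
        try {
            return invoker.invoke();
        } catch (RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Invoker down = () -> { throw new RuntimeException("provider down"); };
        Invoker up = () -> "hello";
        System.out.println(failover(Arrays.asList(down, up), 2));
        System.out.println(failsafe(down));
    }
}
```

Failover suits idempotent reads where a retry is harmless, whereas failsafe fits calls whose failure can be ignored, such as writing an audit log.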
Finally, LoadBalance (step 13) selects the invoker based on the configured strategy, though the author notes a small bug in this step in version 2.5.4.
In summary, the three core steps of Dubbo's cluster fault tolerance are:
1. Locate all invokers in the Directory.
2. Filter them via the Router to keep only executable invokers.
3. Apply LoadBalance to choose the final invoker according to the configured strategy.
Conclusion
The article will continue with deeper analyses of the three keywords and other Dubbo internals. Readers are invited to follow the author’s future posts.
21CTO
