Mastering the Circuit Breaker Pattern: Design, Implementation, and Testing
This article explains the circuit breaker pattern for distributed systems, detailing its problem context, state machine solution, implementation in C#, key considerations, usage scenarios, and comprehensive unit tests, illustrating how to improve system resilience and prevent cascading failures.
Problem Origin
In large distributed systems, calls to remote services or resources may fail due to slow network connections, resource contention, or temporary unavailability. These failures can recover after a short time, but in some cases they persist, causing parts of the system to become unresponsive or even leading to total service outage.
When a server is heavily loaded, failures can trigger cascading failures: concurrent requests block while waiting for time‑outs, exhausting memory, threads, database connections, and other critical resources, which then affect unrelated parts of the system. Returning an error immediately instead of waiting for a timeout can be a better strategy, and retrying should only occur when the service is likely to succeed.
Solution
The Circuit Breaker pattern prevents an application from repeatedly invoking operations that are likely to fail. It acts as a proxy that records recent error counts and decides whether to allow the operation or return an error immediately.
The pattern has three states:
Closed: Calls go through. The proxy tracks the failure count; if the count exceeds a threshold within a time window, it switches to Open.
Open: Calls are rejected immediately, and a timer starts. When the timeout expires, the state changes to Half‑Open.
Half‑Open: A limited number of calls are allowed through. If enough of them succeed, the circuit returns to Closed; any failure sends it back to Open.
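The three states and the events that move between them can be sketched as a small pure function. This is only an illustration of the transitions listed above, not the article's implementation: the names BreakerState, BreakerEvent, and BreakerTransitions are assumptions introduced here.

```csharp
using System;

// Hypothetical names, used only to illustrate the transitions.
public enum BreakerState { Closed, Open, HalfOpen }

public enum BreakerEvent
{
    FailureThresholdReached,  // too many failures while Closed
    TimeoutExpired,           // the Open-state timer fired
    CallFailed,               // a trial call failed while Half-Open
    SuccessThresholdReached   // enough consecutive successes while Half-Open
}

public static class BreakerTransitions
{
    // Returns the next state. Events that do not apply to the
    // current state leave it unchanged.
    public static BreakerState Next(BreakerState current, BreakerEvent evt)
    {
        switch (current)
        {
            case BreakerState.Closed:
                return evt == BreakerEvent.FailureThresholdReached
                    ? BreakerState.Open
                    : current;
            case BreakerState.Open:
                return evt == BreakerEvent.TimeoutExpired
                    ? BreakerState.HalfOpen
                    : current;
            case BreakerState.HalfOpen:
                if (evt == BreakerEvent.CallFailed) return BreakerState.Open;
                if (evt == BreakerEvent.SuccessThresholdReached) return BreakerState.Closed;
                return current;
            default:
                throw new ArgumentOutOfRangeException(nameof(current));
        }
    }
}
```

Keeping the transition table free of timers and counters makes it easy to check in isolation; a real breaker would still need the counting and timing logic around it.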
[Figure: circuit breaker state transition diagram (Closed → Open → Half‑Open → Closed or Open)]
In the Closed state, the failure counter is time‑based: it is reset automatically at periodic intervals, so occasional glitches do not open the circuit. The Open state starts a timer and moves to Half‑Open after a configurable period. On entering Half‑Open, the consecutive‑success counter is reset; each successful call increments it, and reaching the success threshold closes the circuit, while any failure reopens it.
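One possible way to make the Closed‑state counter time‑based is a sliding window that forgets failures older than a given age. The sketch below is an assumption about how this could look, not code from the article; SlidingWindowFailureCounter and its members are hypothetical names. Timestamps are passed in by the caller (and are expected to be non‑decreasing) so the behavior is easy to test, and the class is not thread‑safe on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: counts only failures that happened within `window`.
public class SlidingWindowFailureCounter
{
    private readonly int threshold;
    private readonly TimeSpan window;
    private readonly Queue<DateTime> failures = new Queue<DateTime>();

    public SlidingWindowFailureCounter(int threshold, TimeSpan window)
    {
        this.threshold = threshold;
        this.window = window;
    }

    // Records a failure at time `now` and reports whether the number of
    // failures inside the window has reached the threshold.
    public bool RecordFailure(DateTime now)
    {
        failures.Enqueue(now);
        Prune(now);
        return failures.Count >= threshold;
    }

    // Number of failures still inside the window at time `now`.
    public int CountAt(DateTime now)
    {
        Prune(now);
        return failures.Count;
    }

    // Drops failures that are older than the window.
    private void Prune(DateTime now)
    {
        while (failures.Count > 0 && now - failures.Peek() > window)
        {
            failures.Dequeue();
        }
    }
}
```

With a threshold of 3 and a 10‑second window, three failures spread over 8 seconds would trip the breaker, while three failures spread over a minute would not.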
Factors to Consider
Exception handling: decide how to degrade functionality or fallback when a protected service is unavailable.
Exception types: differentiate between transient time‑outs and permanent failures to set appropriate thresholds.
Logging: record failed and successful requests for monitoring.
Health checks: optionally ping the remote service in Open state to detect recovery.
Manual reset: allow administrators to force the circuit into a desired state.
Concurrency: ensure the breaker does not become a bottleneck under high concurrent load.
Resource heterogeneity: treat failures from different shards or partitions separately.
Fast‑fail criteria: use error messages (e.g., HTTP 503) to trigger immediate opening.
Replaying failed requests: optionally record requests that were rejected or failed so they can be replayed once the service recovers.
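The "exception types" and "fast‑fail criteria" points above could be expressed as a small classification step that runs before the breaker updates its counters. The following is a sketch under stated assumptions: FailureKind, FailureClassifier, and ServiceUnavailableException are hypothetical names (the last one standing in for whatever your client raises on an HTTP 503), and the mapping of exception types is only an example policy.

```csharp
using System;

// Hypothetical outcome of classifying an exception.
public enum FailureKind
{
    Ignore,          // caller error; says nothing about service health
    Count,           // counts toward the failure threshold
    TripImmediately  // strong signal; open the circuit right away
}

// Hypothetical exception, e.g. mapped from an HTTP 503 response.
public class ServiceUnavailableException : Exception { }

public static class FailureClassifier
{
    public static FailureKind Classify(Exception e)
    {
        // The service itself reports that it is unavailable.
        if (e is ServiceUnavailableException) return FailureKind.TripImmediately;

        // Time-outs may be transient, so they only count toward the threshold.
        if (e is TimeoutException) return FailureKind.Count;

        // Bad arguments are the caller's fault, not the service's.
        if (e is ArgumentException) return FailureKind.Ignore;

        // Unknown failures are counted, to stay on the safe side.
        return FailureKind.Count;
    }
}
```

A breaker using such a classifier would skip its failure counter for Ignore, increment it for Count, and move straight to Open for TripImmediately.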
Use Cases
Protect calls to unreliable remote services or shared resources.
Not suitable for local in‑process resources (e.g., in‑memory data structures) or as a replacement for business‑logic exception handling.
Implementation
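The following C# code demonstrates a simple circuit breaker using a state‑machine approach, with one class per state. Unlike the time‑based counter described above, this simplified version keeps a plain failure count that is reset only when the circuit closes. OpenCircuitException is a custom exception type whose definition is not shown.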
using System;

public abstract class CircuitBreakerState
{
    protected readonly CircuitBreaker circuitBreaker;

    protected CircuitBreakerState(CircuitBreaker circuitBreaker)
    {
        this.circuitBreaker = circuitBreaker;
    }

    /// <summary>
    /// Called before the protected method is invoked.
    /// </summary>
    public virtual void ProtectedCodeIsAboutToBeCalled()
    {
        // If the circuit is open, reject immediately
        if (circuitBreaker.IsOpen)
        {
            throw new OpenCircuitException();
        }
    }

    /// <summary>
    /// Called after the protected method succeeds.
    /// </summary>
    public virtual void ProtectedCodeHasBeenCalled()
    {
        circuitBreaker.IncreaseSuccessCount();
    }

    /// <summary>
    /// Called when the protected method throws an exception.
    /// </summary>
    public virtual void ActUponException(Exception e)
    {
        // Increment the failure count and store the exception
        circuitBreaker.IncreaseFailureCount(e);
        // A failure breaks any run of consecutive successes
        circuitBreaker.ResetConsecutiveSuccessCount();
    }
}

Closed state implementation:
public class ClosedState : CircuitBreakerState
{
    public ClosedState(CircuitBreaker circuitBreaker)
        : base(circuitBreaker)
    {
        // Reset failure counter
        circuitBreaker.ResetFailureCount();
    }

    public override void ActUponException(Exception e)
    {
        base.ActUponException(e);
        // Switch to Open if threshold reached
        if (circuitBreaker.FailureThresholdReached())
        {
            circuitBreaker.MoveToOpenState();
        }
    }
}

Open state implementation:
public class OpenState : CircuitBreakerState
{
    // Fully qualified to avoid ambiguity with System.Threading.Timer
    private readonly System.Timers.Timer timer;

    public OpenState(CircuitBreaker circuitBreaker)
        : base(circuitBreaker)
    {
        timer = new System.Timers.Timer(circuitBreaker.Timeout.TotalMilliseconds);
        timer.Elapsed += TimeoutHasBeenReached;
        timer.AutoReset = false; // fire only once
        timer.Start();
    }

    // After the timeout, move to Half‑Open
    private void TimeoutHasBeenReached(object sender, System.Timers.ElapsedEventArgs e)
    {
        timer.Dispose(); // the one-shot timer is no longer needed
        circuitBreaker.MoveToHalfOpenState();
    }

    public override void ProtectedCodeIsAboutToBeCalled()
    {
        // While Open, every call is rejected without running the protected code
        throw new OpenCircuitException();
    }
}

Half‑Open state implementation:
public class HalfOpenState : CircuitBreakerState
{
    public HalfOpenState(CircuitBreaker circuitBreaker)
        : base(circuitBreaker)
    {
        // Reset success counter
        circuitBreaker.ResetConsecutiveSuccessCount();
    }

    public override void ActUponException(Exception e)
    {
        base.ActUponException(e);
        // Any failure returns to Open
        circuitBreaker.MoveToOpenState();
    }

    public override void ProtectedCodeHasBeenCalled()
    {
        base.ProtectedCodeHasBeenCalled();
        // If enough successes, close the circuit
        if (circuitBreaker.ConsecutiveSuccessThresholdReached())
        {
            circuitBreaker.MoveToClosedState();
        }
    }
}

The main CircuitBreaker class holds counters, thresholds, timeout, and the current state:
public class CircuitBreaker
{
    private readonly object monitor = new object();
    private CircuitBreakerState state;

    public int FailureCount { get; private set; }
    public int ConsecutiveSuccessCount { get; private set; }
    public int FailureThreshold { get; private set; }
    public int ConsecutiveSuccessThreshold { get; private set; }
    public TimeSpan Timeout { get; private set; }
    public Exception LastException { get; private set; }

    public bool IsClosed => state is ClosedState;
    public bool IsOpen => state is OpenState;
    public bool IsHalfOpen => state is HalfOpenState;

    internal void MoveToClosedState() => state = new ClosedState(this);
    internal void MoveToOpenState() => state = new OpenState(this);

    // Called from the Open-state timer thread, so it takes the lock itself.
    // The guard ignores a timer that fires after the circuit has already
    // left the Open state (for example, after a manual Close()).
    internal void MoveToHalfOpenState()
    {
        lock (monitor)
        {
            if (state is OpenState) state = new HalfOpenState(this);
        }
    }

    internal void IncreaseFailureCount(Exception ex)
    {
        LastException = ex;
        FailureCount++;
    }

    internal void ResetFailureCount() => FailureCount = 0;
    internal bool FailureThresholdReached() => FailureCount >= FailureThreshold;
    internal void IncreaseSuccessCount() => ConsecutiveSuccessCount++;
    internal void ResetConsecutiveSuccessCount() => ConsecutiveSuccessCount = 0;
    internal bool ConsecutiveSuccessThresholdReached() => ConsecutiveSuccessCount >= ConsecutiveSuccessThreshold;

    public CircuitBreaker(int failedThreshold, int consecutiveSuccessThreshold, TimeSpan timeout)
    {
        if (failedThreshold < 1)
            throw new ArgumentOutOfRangeException(nameof(failedThreshold), "Threshold should be greater than 0");
        if (consecutiveSuccessThreshold < 1)
            throw new ArgumentOutOfRangeException(nameof(consecutiveSuccessThreshold), "Threshold should be greater than 0");
        if (timeout.TotalMilliseconds < 1)
            throw new ArgumentOutOfRangeException(nameof(timeout), "Timeout should be greater than 0");

        FailureThreshold = failedThreshold;
        ConsecutiveSuccessThreshold = consecutiveSuccessThreshold;
        Timeout = timeout;
        MoveToClosedState();
    }

    public void AttemptCall(Action protectedCode)
    {
        // The lock guards state checks and transitions only;
        // the protected code itself runs outside the lock.
        lock (monitor)
        {
            state.ProtectedCodeIsAboutToBeCalled();
        }

        try
        {
            protectedCode();
        }
        catch (Exception e)
        {
            lock (monitor)
            {
                state.ActUponException(e);
            }
            throw; // rethrow so the caller still sees the original failure
        }

        lock (monitor)
        {
            state.ProtectedCodeHasBeenCalled();
        }
    }

    // Manual reset: force the circuit closed
    public void Close()
    {
        lock (monitor)
        {
            MoveToClosedState();
        }
    }

    // Manual trip: force the circuit open
    public void Open()
    {
        lock (monitor)
        {
            MoveToOpenState();
        }
    }
}

Unit tests using NUnit verify the state transitions. In the example test case below, Stub, CallXAmountOfTimes, and AssertThatExceptionIsThrown are test helpers that are not shown; Stub(10) is presumably a fake whose DoStuff method throws ApplicationException on its first 10 calls and succeeds afterwards.
[Test]
public void ClosesIfProtectedCodeSucceedsInHalfOpenState()
{
    var stub = new Stub(10);
    // Circuit breaker: 10 failures → Open, 5 s → Half‑Open, 15 consecutive successes → Closed
    var circuitBreaker = new CircuitBreaker(10, 15, TimeSpan.FromMilliseconds(5000));
    Assert.That(circuitBreaker.IsClosed);

    // Trigger 10 failures
    CallXAmountOfTimes(() => AssertThatExceptionIsThrown<ApplicationException>(() => circuitBreaker.AttemptCall(stub.DoStuff)), 10);
    Assert.AreEqual(10, circuitBreaker.FailureCount);
    Assert.That(circuitBreaker.IsOpen);

    Thread.Sleep(6000); // wait longer than the 5 s timeout for the transition to Half‑Open
    Assert.That(circuitBreaker.IsHalfOpen);

    // 15 successful calls
    CallXAmountOfTimes(() => circuitBreaker.AttemptCall(stub.DoStuff), 15);
    Assert.AreEqual(15, circuitBreaker.ConsecutiveSuccessCount);
    Assert.AreEqual(0, circuitBreaker.FailureCount);
    Assert.That(circuitBreaker.IsClosed);
}

This test demonstrates that after a series of failures the circuit opens, that it moves to Half‑Open once the timeout expires, and that a sufficient number of successful calls closes the circuit again.
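On the calling side, a rejected call usually needs a degraded answer rather than an unhandled exception. The sketch below shows one way a caller might wrap AttemptCall with a fallback. It is an assumption, not part of the article's code: BreakerFallback and CallOrFallback are hypothetical names, the OpenCircuitException defined here is only a stand‑in for the article's undefined exception type, and the breaker is passed as an Action<Action> delegate so the example compiles on its own.

```csharp
using System;

// Stand-in for the article's exception type, whose definition is not shown.
public class OpenCircuitException : Exception
{
    public OpenCircuitException()
        : base("The circuit is open; the call was not attempted.") { }
}

public static class BreakerFallback
{
    // attemptCall: typically circuitBreaker.AttemptCall
    // protectedCode: the remote call that produces a value
    // fallback: value to use when the circuit rejects the call
    public static T CallOrFallback<T>(
        Action<Action> attemptCall,
        Func<T> protectedCode,
        Func<T> fallback)
    {
        T result = default(T);
        try
        {
            // AttemptCall takes an Action, so capture the result in a local
            attemptCall(() => { result = protectedCode(); });
            return result;
        }
        catch (OpenCircuitException)
        {
            // The breaker rejected the call: degrade instead of failing
            return fallback();
        }
    }
}
```

A call site might look like BreakerFallback.CallOrFallback(circuitBreaker.AttemptCall, () => client.GetPrice(), () => cachedPrice), where client and cachedPrice are placeholders. Only the rejection is converted into a fallback; exceptions thrown by the protected code itself still propagate, so the breaker's counters and the caller's own error handling keep working.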
Conclusion
In modern applications that depend on external services, uncontrolled failures can quickly exhaust resources and cause cascading outages. The circuit breaker pattern encapsulates remote calls with a state machine, allowing immediate failure responses, controlled retries, and monitoring hooks, thereby improving system stability and reliability.