Build a Production‑Ready Retry + Circuit Breaker Decorator in Python

This article shows how to create a Python decorator that automatically retries flaky API calls with exponential backoff and activates a circuit‑breaker to prevent cascading failures, providing a lightweight, zero‑intrusion solution for stable automated testing pipelines.

Test Development Learning Exchange
Test Development Learning Exchange
Test Development Learning Exchange
Build a Production‑Ready Retry + Circuit Breaker Decorator in Python

In API automated testing, intermittent failures such as 502 Bad Gateway or timeouts often cause flaky test results. Manually rerunning tests is inefficient, so a programmatic solution that automatically retries and protects downstream services is needed.

Core Goals

Automatic retry for recoverable errors (default 3 attempts).

Exponential backoff (1 s → 2 s → 4 s) to avoid overwhelming the service.

Circuit‑breaker protection: after N consecutive failures the circuit opens and rejects further calls.

Automatic recovery: after a cooldown period (e.g., 30 s) the circuit half‑opens and tests the service again.

Zero intrusion: apply a single decorator to test functions without changing business logic.

Circuit‑Breaker States

The breaker has three states:

Closed : normal operation, failures are counted.

Open : failure threshold exceeded, all requests are rejected immediately.

Half‑Open : after the recovery timeout, a single trial request is allowed; success returns to Closed , failure goes back to Open .

Implementation

import time
import functools
from datetime import datetime, timedelta
from typing import Callable, Any

# Global storage for circuit‑breaker instances (keyed by function name)
_circuit_breakers = {}

class CircuitBreaker:
    def __init__(self, failure_threshold=3, recovery_timeout=30):
        self.failure_threshold = failure_threshold  # failures to trigger open
        self.recovery_timeout = recovery_timeout    # seconds to stay open
        self.failure_count = 0
        self.last_failure_time = None
        self.state = "CLOSED"  # CLOSED, OPEN, HALF_OPEN

    def can_execute(self) -> bool:
        if self.state == "CLOSED":
            return True
        if self.state == "OPEN":
            if datetime.now() - self.last_failure_time > timedelta(seconds=self.recovery_timeout):
                self.state = "HALF_OPEN"
                return True
            return False
        if self.state == "HALF_OPEN":
            return True
        return False

    def record_success(self):
        self.failure_count = 0
        self.state = "CLOSED"

    def record_failure(self):
        self.failure_count += 1
        self.last_failure_time = datetime.now()
        if self.failure_count >= self.failure_threshold:
            self.state = "OPEN"

def retry_with_circuit_breaker(
    max_retries: int = 3,
    delay: float = 1.0,
    backoff: float = 2.0,
    exceptions=(TimeoutError, ConnectionError, Exception),
    failure_threshold: int = 3,
    recovery_timeout: int = 30,
):
    """Decorator that adds automatic retry + circuit‑breaker to a function.
    :param max_retries: maximum retry attempts
    :param delay: initial delay in seconds
    :param backoff: multiplier for exponential backoff
    :param exceptions: exception types that trigger a retry
    :param failure_threshold: failures needed to open the circuit
    :param recovery_timeout: seconds the circuit stays open
    """
    def decorator(func: Callable) -> Callable:
        func_key = f"{func.__module__}.{func.__name__}"
        @functools.wraps(func)
        def wrapper(*args, **kwargs) -> Any:
            if func_key not in _circuit_breakers:
                _circuit_breakers[func_key] = CircuitBreaker(
                    failure_threshold=failure_threshold,
                    recovery_timeout=recovery_timeout,
                )
            cb = _circuit_breakers[func_key]
            if not cb.can_execute():
                raise Exception(f"Circuit breaker is open ({func_key}), try later")
            current_delay = delay
            last_exception = None
            for attempt in range(max_retries + 1):
                try:
                    result = func(*args, **kwargs)
                    cb.record_success()
                    return result
                except exceptions as e:
                    last_exception = e
                    cb.record_failure()
                    if attempt < max_retries and cb.state != "OPEN":
                        print(f"⚠️ {func_key} attempt {attempt + 1} failed: {e}, retry after {current_delay}s")
                        time.sleep(current_delay)
                        current_delay *= backoff
                    else:
                        break
            raise last_exception
        return wrapper
    return decorator

Usage Example

import requests

@retry_with_circuit_breaker(
    max_retries=2,
    delay=1,
    backoff=2,
    exceptions=(requests.Timeout, requests.ConnectionError, ValueError),
    failure_threshold=2,
    recovery_timeout=20,
)
def get_user_info(user_id: int):
    resp = requests.get(f"https://api.example.com/users/{user_id}", timeout=3)
    if resp.status_code >= 500:
        raise ValueError(f"Server error: {resp.status_code}")
    resp.raise_for_status()
    return resp.json()

def test_fetch_user():
    user = get_user_info(123)
    assert user["id"] == 123

Simulation Scenario

Assume get_user_info(123) is unstable:

T+0 s – first request returns 502; decorator retries after 1 s.

T+1 s – second request times out; retry after 2 s.

T+3 s – third request again returns 502; failure threshold reached, circuit opens.

T+10 s – a new request is blocked immediately with “circuit breaker is open”.

T+35 s – after the recovery timeout the circuit half‑opens, the next request succeeds and the circuit returns to Closed .

In CI/CD this means early flaky failures are isolated, preventing a cascade of red tests.

Advanced Tips

Fine‑grained circuit breaking per URL by including the URL in func_key.

Integrate logging/metrics (e.g., Prometheus, ELK) inside record_failure for observability.

Support async functions by detecting iscoroutinefunction and returning an async wrapper.

Combine with pytest‑rerunfailures for whole‑test retries; this decorator focuses on individual API calls.

Precautions

Do not retry non‑idempotent operations such as payments or SMS sending.

Choose sensible thresholds: too low triggers frequent opens, too high reduces protection.

The provided implementation stores state in‑process; for multi‑process or distributed environments use a shared store like Redis.

Conclusion

Adding this decorator to your test utilities gives automatic retry with exponential backoff, fast‑fail circuit breaking, and self‑healing recovery, turning flaky API tests into stable, reliable checks with just one line of code.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

backendretrycircuit breakerDecoratorautomated-testing
Test Development Learning Exchange
Written by

Test Development Learning Exchange

Test Development Learning Exchange

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.