How to Prevent Proxy Overload and Cascading Failures in Backend Systems
This article explains what overload means in server development, analyzes a concrete proxy overload case, identifies root causes, and presents a multi‑layer protection strategy—including self‑protection, downstream safeguards, concurrency control, and improved circuit‑breaker designs—to keep services stable under heavy traffic.
What is Overload
In server development, overload occurs when the incoming request volume exceeds the system’s maximum processing capacity, causing response times to increase dramatically and potentially leading to a cascading failure (snowball effect).
Overload Symptoms
During overload, each request takes longer to respond; if no protection is in place, accumulated time‑outs create a vicious cycle where the system appears completely unavailable.
Case Study: Proxy Overload
A reverse proxy forwards requests from upstream clients to downstream modules. Module A uses a Reactor pattern with a fixed pool of threads that forward requests to Module B, so its capacity is roughly proportional to thread count divided by downstream latency. Under normal conditions, Module A can handle 1,000 QPS, but when Module B's processing latency rises from 10 ms to 40 ms, Module A's effective capacity drops fourfold to 250 QPS while upstream traffic remains at 800 QPS.
Consequently, requests pile up in the kernel's Recv‑Q at 550 per second, filling the buffer in about 4 seconds. A request at the back of the queue then waits roughly 91 seconds before being processed, far beyond any client timeout, so the proxy's effective external throughput drops to zero.
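The arithmetic above can be sketched as a back-of-the-envelope model. The Recv‑Q depth used here is an illustrative assumption, not a figure from the original incident:

```python
# Back-of-the-envelope model of the proxy overload described above.
# The Recv-Q capacity (in requests) is an assumed, illustrative value.

def overload_model(capacity_qps, arrival_qps, buffer_requests):
    """Return (surplus per second, seconds to fill buffer, queue wait)."""
    surplus = arrival_qps - capacity_qps          # requests piling up each second
    fill_time = buffer_requests / surplus         # time until the Recv-Q is full
    queue_wait = buffer_requests / capacity_qps   # wait for a request at the back
    return surplus, fill_time, queue_wait

surplus, fill_time, wait = overload_model(
    capacity_qps=250,        # Module A after B's latency rises to 40 ms
    arrival_qps=800,         # upstream traffic
    buffer_requests=2200,    # assumed Recv-Q depth, for illustration only
)
print(surplus, fill_time, wait)  # 550 4.0 8.8
```

Whatever the actual buffer depth, the pattern is the same: once arrival rate exceeds capacity, queue wait grows with backlog size and quickly exceeds any reasonable client timeout.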
Root Causes of Overload
Downstream module B experiences large‑scale failures or latency spikes.
Upstream traffic surges (e.g., a cache stampede or a flash‑sale spike).
Other modules on the same machine consume excessive CPU or network resources.
All of these reduce to the same condition: request volume > processing capacity.
Proxy Overload Protection Design
1. Protect the Proxy Itself
Use resource isolation to prevent a single downstream failure from affecting the whole proxy. Two approaches:
Thread‑level isolation: Allocate a fixed number of threads to each downstream service; when a service reaches its limit, reject new requests.
Deployment‑level isolation: Deploy separate proxy instances for high‑traffic or high‑SLA downstream services, achieving physical separation.
In practice, deployment isolation is often simpler and more effective.
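Thread-level isolation can be sketched as a per-downstream concurrency budget with fail-fast rejection. The class and service names below are illustrative, not from the original system:

```python
# Sketch of thread-level isolation: each downstream service gets a fixed
# concurrency budget; requests beyond the budget are rejected immediately
# instead of queueing and starving other services' threads.
import threading

class IsolatedPool:
    def __init__(self, limits):
        # limits, e.g. {"service-b": 10, "service-c": 4}
        self._slots = {name: threading.Semaphore(n) for name, n in limits.items()}

    def submit(self, service, fn, *args):
        sem = self._slots[service]
        if not sem.acquire(blocking=False):   # budget exhausted: fail fast
            return None                        # caller sees an immediate rejection
        try:
            return fn(*args)
        finally:
            sem.release()

pool = IsolatedPool({"service-b": 2})
print(pool.submit("service-b", lambda: "ok"))  # ok
```

The key property is that a slow downstream can exhaust only its own budget; requests bound for healthy downstreams still find free slots.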
2. Protect Downstream Services
Choose an appropriate load‑balancing strategy to route traffic to healthier instances.
Configure retry policies carefully: retry only on connection failures, not on timeouts, since retrying requests against an already slow service multiplies its load.
Apply concurrency control: set a maximum concurrent request limit per proxy instance (single‑machine control) or globally via a shared counter (e.g., Redis). Each method has trade‑offs regarding simplicity and coordination.
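The single-machine variant of concurrency control can be sketched as a cap on in-flight requests. For the global variant, a shared counter (e.g., Redis INCR/DECR on a key) would replace the local counter at the cost of extra coordination; the names below are illustrative:

```python
# Minimal single-machine concurrency limiter: cap the number of in-flight
# requests this proxy instance sends downstream, shedding the excess.
import threading
from contextlib import contextmanager

class ConcurrencyLimit:
    def __init__(self, max_in_flight):
        self._lock = threading.Lock()
        self._in_flight = 0
        self._max = max_in_flight

    @contextmanager
    def slot(self):
        with self._lock:
            if self._in_flight >= self._max:
                raise RuntimeError("over capacity")  # shed the request
            self._in_flight += 1
        try:
            yield
        finally:
            with self._lock:
                self._in_flight -= 1

limit = ConcurrencyLimit(max_in_flight=100)
with limit.slot():
    pass  # forward the request downstream here
```

Shedding at the limit keeps queue wait bounded, which is exactly what the Recv‑Q scenario above lacked.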
3. Ensure Downstream Recovery
Recovery requires request volume to fall below processing capacity. Two main techniques:
Rate‑limit or shed load aggressively (degrade service).
Use a circuit‑breaker that monitors downstream latency and error rate, opening when thresholds are exceeded and gradually restoring traffic when the service stabilizes.
Traditional circuit‑breakers stop all traffic when opened, which can cause brief outages. An improved approach throttles traffic instead of cutting it off completely, then gradually ramps up as latency improves.
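A throttling breaker of this kind can be sketched as a pass rate that ramps down while the downstream is slow and back up as it recovers. The thresholds, step size, and floor below are illustrative assumptions:

```python
# Sketch of a throttling (rather than on/off) circuit breaker: instead of
# rejecting all traffic when unhealthy, it admits a fraction of requests
# and adjusts that fraction as downstream latency degrades or improves.
import random

class ThrottlingBreaker:
    def __init__(self, latency_slo_ms=50.0, step=0.1):
        self.pass_rate = 1.0              # fraction of requests admitted
        self.latency_slo_ms = latency_slo_ms
        self.step = step

    def allow(self):
        # Probabilistically admit requests at the current pass rate.
        return random.random() < self.pass_rate

    def record(self, latency_ms):
        if latency_ms > self.latency_slo_ms:
            # Downstream is slow: throttle harder, but keep a trickle of
            # probe traffic so recovery can be detected.
            self.pass_rate = max(0.1, self.pass_rate - self.step)
        else:
            # Downstream looks healthy: ramp traffic back up gradually.
            self.pass_rate = min(1.0, self.pass_rate + self.step)

breaker = ThrottlingBreaker()
for _ in range(5):
    breaker.record(latency_ms=200)   # sustained slowness
print(round(breaker.pass_rate, 1))   # 0.5
```

Unlike an open/closed breaker, this design never drops to zero traffic, so the downstream keeps receiving probe requests and the proxy avoids the brief total outage that a hard open state causes.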
Other Protection Ideas
Beyond code‑level measures, product and operations teams should guide users during overload (e.g., friendly error pages) and implement comprehensive alerting and auto‑scaling mechanisms to expand capacity when thresholds are reached.
Baidu Maps Tech Team
Want to see the Baidu Maps team's technical insights, learn how top engineers tackle tough problems, or join the team? Follow the Baidu Maps Tech Team to get the answers you need.