How to Tackle Interface-Level Failures: Degrade, Circuit Break, Rate‑Limit & Queue
The article explains why interface-level failures occur, distinguishes internal and external causes, and presents four practical strategies—degradation, circuit breaking, rate limiting, and queuing—to keep core services running and protect most users under heavy load.
An interface-level failure is one where the system has not crashed and the network is intact, yet requests are not handled properly: responses are slow, many requests time out, or calls return errors.
These issues are usually caused by excessive system pressure or high load, for example slow queries that exhaust database resources, leading to connection, read, or write timeouts.
Root causes include:
Internal causes: infinite loops in the program, an API that triggers slow database queries, bugs that exhaust memory, and similar problems.
External causes: hacker attacks, flash-sale traffic spikes, slow responses from third-party APIs, and other outside factors.
The core principle for handling interface-level failures is to protect the core business and the majority of users first.
Strategies
1. Degradation
Reduce or disable certain business functions or APIs, keeping core services running.
The idea is “sacrifice the rook to save the king”: give up less important functions in order to protect the core business.
Examples: a forum can be degraded to read‑only mode; an app’s log‑upload API can be turned off temporarily.
Common degradation methods:
System backdoor degradation – provide a special endpoint that, when called with parameters, triggers degradation.
Independent degradation system – use a separate system or distributed configuration center to broadcast degradation commands to all services.
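The forum example above can be sketched with a small in-process switch. This is only an illustration under stated assumptions: the class names DegradationSwitch and ForumService and the feature key "forum.posting" are hypothetical, and the code that would actually flip the switch (a backdoor endpoint or a configuration-center listener) is left out.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry of features that are currently degraded.
// A backdoor endpoint or a config-center listener would call degrade()/restore().
class DegradationSwitch {
    private final Set<String> degraded = ConcurrentHashMap.newKeySet();

    void degrade(String feature) { degraded.add(feature); }

    void restore(String feature) { degraded.remove(feature); }

    boolean isDegraded(String feature) { return degraded.contains(feature); }
}

// Hypothetical forum service: posting is treated as non-core and can be
// switched off, while reading stays available (read-only mode).
class ForumService {
    static final String POSTING = "forum.posting"; // assumed feature key

    private final DegradationSwitch degradation;

    ForumService(DegradationSwitch degradation) { this.degradation = degradation; }

    String readThread(long id) {
        return "thread-" + id; // core function, never degraded in this sketch
    }

    String createPost(String body) {
        if (degradation.isDegraded(POSTING)) {
            return "REJECTED: forum is temporarily read-only";
        }
        return "CREATED";
    }
}
```

Checking the switch at the entry of each non-core function keeps the degraded path cheap: a rejected request costs one set lookup and never reaches the database.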
2. Circuit Breaker
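Degradation deals with faults in the system itself; a circuit breaker deals with faults in the external systems it depends on. When a downstream service B fails or responds very slowly, service A stops calling it and returns an error immediately, so that A's own threads are not tied up waiting on B.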
Implementation requires a unified API call layer that samples and aggregates call statistics; if calls to B are scattered across the codebase, there is no single place to measure them or cut them off.
The key is threshold design: for example, trip the breaker if more than 30% of requests within one minute take longer than one second.
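A minimal sketch of such a threshold follows. The class name, the minimum-sample guard (minCalls), and the cool-down period (openMillis) are assumptions added for illustration; the text above only specifies the window, the slow-call duration, and the ratio. Production breakers usually also have a half-open state that lets a few probe requests through, which this sketch omits.

```java
import java.util.function.LongSupplier;

// Minimal sketch of a circuit breaker that trips on the ratio of slow calls.
// All names and the cool-down behaviour are assumptions made for illustration.
class SlowCallCircuitBreaker {
    private final long windowMillis;        // statistics window, e.g. 60_000 (one minute)
    private final long slowThresholdMillis; // calls slower than this count as slow, e.g. 1_000
    private final double slowRatioToTrip;   // e.g. 0.30
    private final int minCalls;             // assumption: do not trip on a tiny sample
    private final long openMillis;          // assumption: how long to fail fast before retrying
    private final LongSupplier clock;       // injected millisecond clock, eases testing

    private long windowStart;
    private int totalCalls;
    private int slowCalls;
    private long openedAt = -1;             // -1 means the breaker is closed

    SlowCallCircuitBreaker(long windowMillis, long slowThresholdMillis, double slowRatioToTrip,
                           int minCalls, long openMillis, LongSupplier clock) {
        this.windowMillis = windowMillis;
        this.slowThresholdMillis = slowThresholdMillis;
        this.slowRatioToTrip = slowRatioToTrip;
        this.minCalls = minCalls;
        this.openMillis = openMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    // Service A asks this before calling service B; false means "fail fast".
    synchronized boolean allowRequest() {
        if (openedAt < 0) {
            return true;
        }
        long now = clock.getAsLong();
        if (now - openedAt < openMillis) {
            return false;
        }
        // Cool-down elapsed: close the breaker and start a fresh window.
        openedAt = -1;
        resetWindow(now);
        return true;
    }

    // Called by the unified call layer after each call to B completes.
    synchronized void record(long durationMillis) {
        if (openedAt >= 0) {
            return; // breaker is open; nothing to record
        }
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            resetWindow(now);
        }
        totalCalls++;
        if (durationMillis > slowThresholdMillis) {
            slowCalls++;
        }
        if (totalCalls >= minCalls && (double) slowCalls / totalCalls > slowRatioToTrip) {
            openedAt = now; // trip: subsequent calls fail fast
        }
    }

    private void resetWindow(long now) {
        windowStart = now;
        totalCalls = 0;
        slowCalls = 0;
    }
}
```

With System::currentTimeMillis as the clock, the example threshold would be constructed as new SlowCallCircuitBreaker(60_000, 1_000, 0.30, 10, 5_000, System::currentTimeMillis), where the last two numeric values are placeholders to tune.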
3. Rate Limiting
Degradation focuses on functional priority, while rate limiting controls incoming traffic pressure.
Only allow traffic the system can handle; excess requests are dropped.
Methods:
Request‑based limiting (external perspective):
Limit total volume – e.g., cap total users in a live room to 1 million.
Limit volume per time window – e.g., allow only 10,000 users per minute, or 100,000 requests per second.
Resource-based limiting (internal perspective): identify the critical resources inside the system (connections, file handles, threads, request queues) and cap their usage. For example, with Netty, buffer incoming requests in a queue of at most 10,000 entries and reject anything beyond that, or reject new requests while CPU usage is above 80%.
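Both perspectives can be sketched with the standard library alone. The first class is a fixed-window counter, which is only one of several possible algorithms for time-based limits (token bucket and sliding window are common alternatives). The second uses a bounded ArrayBlockingQueue in place of a Netty pipeline and takes the CPU load from an injected supplier. All class names and the way the CPU load is obtained are assumptions for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.DoubleSupplier;
import java.util.function.LongSupplier;

// Request-based limiting: allow at most `limit` requests per time window.
// Fixed-window counter; hypothetical class name.
class FixedWindowRateLimiter {
    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock; // injected millisecond clock, eases testing
    private long windowStart;
    private int count;

    FixedWindowRateLimiter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // a new window begins, counter starts over
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false; // over the limit: the caller drops the request
    }
}

// Resource-based limiting: cap the pending-request queue and refuse new
// work while the (assumed) CPU load reading is above a threshold.
class BoundedRequestBuffer {
    private final BlockingQueue<String> pending;
    private final double maxCpuLoad;      // e.g. 0.80
    private final DoubleSupplier cpuLoad; // assumption: returns load in [0, 1]

    BoundedRequestBuffer(int capacity, double maxCpuLoad, DoubleSupplier cpuLoad) {
        this.pending = new ArrayBlockingQueue<>(capacity);
        this.maxCpuLoad = maxCpuLoad;
        this.cpuLoad = cpuLoad;
    }

    boolean submit(String request) {
        if (cpuLoad.getAsDouble() > maxCpuLoad) {
            return false; // system too busy
        }
        return pending.offer(request); // non-blocking; false when the queue is full
    }

    String next() {
        return pending.poll(); // null when nothing is waiting
    }
}
```

A fixed window is simple but can admit up to twice the limit around a window boundary; if that burst matters, a sliding window or token bucket is the usual replacement.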
4. Queuing
Queuing is a variant of rate limiting: instead of rejecting excess traffic, it places the requests in a queue to wait, as the 12306 railway ticketing system does.
Example: a queuing system for a Double 11 (Singles' Day) flash sale, made up of three modules:
Queue module – receives purchase requests and creates a queue per product.
Scheduler module – dynamically checks service capacity and pulls requests from the queue when resources are free.
Service module – processes the business logic and returns results.
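The three modules can be sketched as plain classes. This is a single-process illustration only: in a real flash-sale system each module would be a separate service and the queues would live in a message broker or similar store. The class names, the PurchaseRequest fields, the result string, and the freeSlots supplier that stands in for the capacity check are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.IntSupplier;

// Hypothetical purchase request.
class PurchaseRequest {
    final String userId;
    final String productId;

    PurchaseRequest(String userId, String productId) {
        this.userId = userId;
        this.productId = productId;
    }
}

// Queue module: receives purchase requests and keeps one FIFO queue per product.
class QueueModule {
    private final Map<String, Queue<PurchaseRequest>> queues = new HashMap<>();

    synchronized void enqueue(PurchaseRequest request) {
        queues.computeIfAbsent(request.productId, k -> new ArrayDeque<>()).add(request);
    }

    synchronized PurchaseRequest poll(String productId) {
        Queue<PurchaseRequest> queue = queues.get(productId);
        return queue == null ? null : queue.poll();
    }

    synchronized int waiting(String productId) {
        Queue<PurchaseRequest> queue = queues.get(productId);
        return queue == null ? 0 : queue.size();
    }
}

// Service module: runs the business logic and returns a result.
class ServiceModule {
    String process(PurchaseRequest request) {
        return "ORDER_OK:" + request.userId + ":" + request.productId;
    }
}

// Scheduler module: checks how much capacity is free and pulls that many
// requests from the queue. freeSlots is an assumed capacity probe.
class SchedulerModule {
    private final QueueModule queueModule;
    private final ServiceModule serviceModule;
    private final IntSupplier freeSlots;

    SchedulerModule(QueueModule queueModule, ServiceModule serviceModule, IntSupplier freeSlots) {
        this.queueModule = queueModule;
        this.serviceModule = serviceModule;
        this.freeSlots = freeSlots;
    }

    // One scheduling round for a product; callers would run this periodically.
    List<String> dispatch(String productId) {
        List<String> results = new ArrayList<>();
        int slots = freeSlots.getAsInt();
        for (int i = 0; i < slots; i++) {
            PurchaseRequest request = queueModule.poll(productId);
            if (request == null) {
                break; // queue drained
            }
            results.add(serviceModule.process(request));
        }
        return results;
    }
}
```

Because the scheduler pulls work instead of having it pushed, the service module never sees more requests than it reported capacity for; excess requests simply wait in their product queue.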
Summary
The four common strategies for handling interface-level failures are degradation (for faults in the system itself), circuit breaking (for faults in external systems), rate limiting (for traffic pressure), and queuing (a variant of rate limiting that makes requests wait instead of rejecting them).
Content compiled from “Learning Architecture from Scratch”.
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.
