Mastering API Protection: Rate Limiting, Caching, and Degradation for E‑Commerce Spikes
This guide explains how to safeguard e‑commerce APIs when demand for a product suddenly surges, using rate‑limiting algorithms (leaky bucket, token bucket, sliding window), Nginx and Java semaphore controls, distributed throttling with message queues, service degradation strategies, and caching techniques to maintain stability.
Consider a scenario: traffic to a product API suddenly spikes. What should be done?
For example, after the mascot "Bing Dwen Dwen" trended, tens of thousands of users rushed to place orders on Taobao without any cache warm‑up or preparation, leading to overload.
In high‑concurrency e‑commerce systems, protecting an interface typically involves three measures: caching, rate limiting, and service degradation.
Assume the interface has already passed risk control, filtering out half of the bot requests, leaving only genuine user orders.
Service Rate Limiting
Rate limiting aims to throttle concurrent requests either by limiting request speed or by limiting the number of requests within a time window; once the limit is reached, the service can reject, queue, wait, or degrade the request.
Rate Limiting Algorithms
1. Leaky Bucket Algorithm
The leaky bucket algorithm puts incoming requests into a bucket; if the bucket is full (reaching the limit), requests are discarded or handled by other strategies. The bucket releases requests at a fixed rate, ensuring the service consumption speed never exceeds the defined threshold.
The idea is that regardless of how many requests arrive, the interface’s consumption speed is always less than or equal to the outflow rate.
This can be implemented using a message queue.
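As a minimal in-process sketch of the idea (not the message-queue variant), the bucket can be modeled as a water level that drains at a fixed rate. The class name `LeakyBucket` and its parameters are hypothetical, and the current time is passed in explicitly so the behavior is easy to reason about.

```java
/** Leaky bucket as a meter: the level drains at a fixed rate and each request adds one unit. */
class LeakyBucket {
    private final long capacity;        // maximum number of units the bucket can hold
    private final double leakPerMilli;  // outflow rate in units per millisecond
    private double water;               // current level
    private long lastMillis;            // time of the last drain calculation

    LeakyBucket(long capacity, double leakPerSecond, long startMillis) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerSecond / 1000.0;
        this.lastMillis = startMillis;
    }

    /** Returns true if the request fits in the bucket, false if it would overflow. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Drain whatever leaked out since the previous call.
        double leaked = (nowMillis - lastMillis) * leakPerMilli;
        water = Math.max(0.0, water - leaked);
        lastMillis = nowMillis;
        if (water + 1 > capacity) {
            return false; // bucket full: discard or apply another strategy
        }
        water += 1;
        return true;
    }
}
```

In production code the caller would typically pass `System.currentTimeMillis()` as the timestamp.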
2. Token Bucket Algorithm
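The token bucket algorithm adds one token to a bucket every v (v = time period / limit); for example, a limit of 100 requests per second gives v = 10 ms. When a request arrives, it tries to take a token; if it succeeds, the request passes, otherwise the limit strategy is triggered.
The difference from the leaky bucket is that the token bucket allows bursty traffic: tokens saved up while traffic is idle can be spent all at once, up to the bucket's capacity.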
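The refill rule can be sketched as follows. This is an illustration under assumptions: the class name `TokenBucket` is hypothetical, the bucket capacity is assumed to equal the limit, and timestamps are passed in by the caller.

```java
/** Token bucket: one token is added every v milliseconds, up to a fixed capacity. */
class TokenBucket {
    private final long capacity;          // maximum burst size (assumed equal to the limit)
    private final long refillIntervalMs;  // v: interval between two tokens
    private long tokens;
    private long lastRefillMs;

    TokenBucket(long limit, long periodMs, long startMs) {
        this.capacity = limit;
        this.refillIntervalMs = Math.max(1, periodMs / limit); // v = time period / limit
        this.tokens = limit; // start full so an initial burst is allowed
        this.lastRefillMs = startMs;
    }

    /** Takes one token if available; returns false when the bucket is empty. */
    synchronized boolean tryAcquire(long nowMs) {
        long newTokens = (nowMs - lastRefillMs) / refillIntervalMs;
        if (newTokens > 0) {
            tokens = Math.min(capacity, tokens + newTokens);
            // Advance by whole intervals only, keeping the remainder for the next call.
            lastRefillMs += newTokens * refillIntervalMs;
        }
        if (tokens == 0) {
            return false; // no token: trigger the limit strategy
        }
        tokens--;
        return true;
    }
}
```

Because the bucket starts full, a burst of up to `limit` requests passes immediately, which is the behavior that distinguishes it from the leaky bucket.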
3. Sliding Window Algorithm
The sliding window algorithm divides a time period into N small intervals, records the request count for each interval, and discards expired intervals as time slides.
For example, with a 1‑minute window split into two 30‑second sub‑windows, the first sub‑window may record 75 requests and the second 100, for a total of 175. Whenever the sum across all live sub‑windows exceeds the threshold (e.g., 100), the limit strategy is triggered.
Sentinel implements rate limiting this way; the TCP sliding window is a related flow‑control mechanism built on the same idea.
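A minimal sketch of such a counter is shown below. The class name `SlidingWindowLimiter` is hypothetical, and unlike the raw counts in the example above, this version rejects a request as soon as the live sub-windows already sum to the threshold.

```java
import java.util.Arrays;

/** Sliding window counter: N sub-window slots reused in a ring as time advances. */
class SlidingWindowLimiter {
    private final int threshold;
    private final long subWindowMs;
    private final long[] counts;     // request count per slot
    private final long[] slotStart;  // start time of the sub-window stored in each slot

    SlidingWindowLimiter(int threshold, long windowMs, int subWindows) {
        this.threshold = threshold;
        this.subWindowMs = windowMs / subWindows;
        this.counts = new long[subWindows];
        this.slotStart = new long[subWindows];
        Arrays.fill(slotStart, -1L);
    }

    synchronized boolean tryAcquire(long nowMs) {
        long currentStart = nowMs - (nowMs % subWindowMs);
        int idx = (int) ((nowMs / subWindowMs) % counts.length);
        if (slotStart[idx] != currentStart) {
            // The slot holds an expired sub-window: discard it and reuse the slot.
            slotStart[idx] = currentStart;
            counts[idx] = 0;
        }
        long oldestLive = currentStart - subWindowMs * (counts.length - 1);
        long total = 0;
        for (int i = 0; i < counts.length; i++) {
            if (slotStart[i] >= oldestLive) {
                total += counts[i]; // only sub-windows still inside the window count
            }
        }
        if (total >= threshold) {
            return false; // threshold reached: trigger the limit strategy
        }
        counts[idx]++;
        return true;
    }
}
```

With a threshold of 100 and two 30-second sub-windows, 75 requests in the first sub-window leave room for only 25 in the second; the budget frees up again once the first sub-window slides out.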
Ingress Layer Rate Limiting
Nginx Rate Limiting
Nginx uses the leaky bucket algorithm for rate limiting.
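It can limit access based on client characteristics such as IP or User‑Agent. IP is the more reliable key because the source address of an established TCP connection is hard to forge, whereas User‑Agent is a request header that any client can set freely.
IP‑based limiting is provided by the limit_req module; see the Nginx documentation for Module ngx_http_limit_req_module.
Tengine offers the same module; see ngx_http_limit_req_module in The Tengine Web Server documentation.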
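A per-IP limit with the limit_req module might look like the configuration fragment below. The zone name `per_ip`, the rate and burst values, the `/api/order` path, and the `backend` upstream address are all illustrative assumptions, not values from the original article.

```nginx
http {
    # Hypothetical upstream standing in for the real order service.
    upstream backend {
        server 127.0.0.1:8080;
    }

    # Key requests by client IP; 10 MB of shared state; 10 requests per second on average.
    limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

    server {
        listen 80;

        location /api/order {
            # Delay up to 20 excess requests; anything beyond that is rejected (503 by default).
            limit_req zone=per_ip burst=20;
            proxy_pass http://backend;
        }
    }
}
```

Excess requests inside the burst are delayed so that they leave at the configured rate, which is the leaky bucket behavior described above.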
Local Interface Rate Limiting
Semaphore
Java’s Semaphore from the concurrency library can easily control the number of simultaneous accesses to a resource. It acquires a permit before processing and releases it afterward.
Example:
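import java.util.concurrent.Semaphore;

public class OrderProcessor { // class name is illustrative
    // At most 40 concurrent callers; fair mode hands out permits in FIFO order.
    private final Semaphore permit = new Semaphore(40, true);

    public void process() {
        try {
            permit.acquire();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            return; // no permit was acquired, so there is nothing to release
        }
        try {
            // TODO: handle business logic
        } finally {
            permit.release(); // runs only after a successful acquire
        }
    }
}
Refer to the java.util.concurrent.Semaphore source code for the concrete implementation.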
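Blocking on `acquire()` makes callers wait indefinitely under load. A variant that rejects after a bounded wait is sketched below; the class name `BoundedHandler` and the method `tryProcess` are hypothetical, while `Semaphore.tryAcquire(long, TimeUnit)` is the standard JDK call.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Runs tasks under a concurrency cap and rejects them when no permit frees up in time. */
class BoundedHandler {
    private final Semaphore permits;

    BoundedHandler(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent, true);
    }

    /** Returns true if the task ran, false if it was rejected or the wait was interrupted. */
    boolean tryProcess(Runnable task, long waitMs) {
        boolean acquired = false;
        try {
            acquired = permits.tryAcquire(waitMs, TimeUnit.MILLISECONDS);
            if (!acquired) {
                return false; // limit reached: the caller can reject or degrade
            }
            task.run();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            return false;
        } finally {
            if (acquired) {
                permits.release(); // release only what was actually acquired
            }
        }
    }
}
```

A short wait keeps request threads from piling up behind a saturated resource, at the cost of rejecting more requests during a spike.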
Distributed Interface Rate Limiting
Using Message Queues
Either an MQ middleware or a Redis List used as a message queue can serve as a buffering queue based on the leaky bucket principle.
When request volume reaches a certain threshold, the queue buffers incoming requests, and the service consumes them at a pace that matches its own throughput.
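The buffering behavior can be sketched in a single process, with a bounded `ArrayBlockingQueue` standing in for the MQ middleware or Redis List. The class name `RequestBuffer` and its methods are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Bounded buffer between a bursty producer and a consumer with fixed throughput. */
class RequestBuffer<T> {
    private final BlockingQueue<T> queue; // in-process stand-in for a message queue

    RequestBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Producer side: returns false when the buffer is full so the caller can reject or degrade. */
    boolean submit(T request) {
        return queue.offer(request);
    }

    /** Consumer side: handles at most batchSize requests per tick, in arrival order. */
    int drain(int batchSize, Consumer<T> handler) {
        int handled = 0;
        T next;
        while (handled < batchSize && (next = queue.poll()) != null) {
            handler.accept(next);
            handled++;
        }
        return handled;
    }
}
```

Calling `drain` on a fixed schedule, for example from a `ScheduledExecutorService`, caps consumption at `batchSize` requests per tick regardless of how fast requests arrive.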
Service Degradation
After risk control, if the request concurrency rises sharply, a fallback plan can be activated to degrade the service.
Degradation is typically applied to services or tasks that are not critical or urgent, allowing them to be delayed or paused.
Degradation Strategies
Stop Edge Services
For example, during Taobao’s Double‑11 promotion, queries for orders older than three months might be disabled to preserve core service availability.
Reject Requests
When request volume exceeds the threshold or many failures occur, some requests can be outright rejected.
Rejection Policies
Random rejection: randomly drop requests that exceed the limit.
Reject older requests: prioritize newer requests and drop earlier ones.
Reject non‑core requests: maintain a whitelist of core services and reject everything else.
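The three policies can be sketched as small helper functions. All names here (`RejectionPolicies`, `allowCore`, `allowRandom`, `addDroppingOldest`) are hypothetical, and the random roll is passed in by the caller so the decision is reproducible.

```java
import java.util.Deque;
import java.util.Set;

/** Illustrative helpers for the three rejection policies. */
class RejectionPolicies {
    /** Reject non-core requests: only whitelisted service names pass while degraded. */
    static boolean allowCore(Set<String> coreServices, String service) {
        return coreServices.contains(service);
    }

    /** Random rejection: drops roughly dropRatio of requests; roll is a value in [0, 1). */
    static boolean allowRandom(double dropRatio, double roll) {
        return roll >= dropRatio;
    }

    /** Reject older requests: keeps the newest maxPending items; returns the evicted one, or null. */
    static <T> T addDroppingOldest(Deque<T> pending, T request, int maxPending) {
        pending.addLast(request);
        return pending.size() > maxPending ? pending.pollFirst() : null;
    }
}
```

In practice the roll could come from `ThreadLocalRandom.current().nextDouble()`, and the whitelist would be loaded from configuration so it can change without a redeploy.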
Recovery Strategies
After degradation, additional consumer services can be registered to absorb the surge, and traffic can be gradually restored to the degraded servers.
Specific implementation details can be found in related articles.
Data Caching
When a protected interface experiences a sudden surge, the following steps can be taken:
Use a distributed lock to block access.
Cache hot data in a caching middleware during the short burst.
After releasing the lock, prioritize operations on cached data.
Send the operation results to a consumer via a message queue for asynchronous processing.
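The flow above can be sketched in a single process. Everything here is a stand-in chosen for illustration: a `ReentrantLock` replaces the distributed lock, a `ConcurrentHashMap` replaces the caching middleware, an in-memory queue replaces the message queue, and the names `HotItemGuard`, `warm`, and `order` are hypothetical.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.ToIntFunction;

/** Lock, cache the hot item, operate on the cache, then queue results for async persistence. */
class HotItemGuard {
    private final ReentrantLock lock = new ReentrantLock();                       // stand-in for a distributed lock
    private final Map<String, AtomicInteger> cache = new ConcurrentHashMap<>();   // stand-in for a cache middleware
    final Queue<String> outbox = new ConcurrentLinkedQueue<>();                   // stand-in for a message queue

    /** Under the lock, loads the hot item's stock into the cache once. */
    void warm(String itemId, ToIntFunction<String> loadStockFromDb) {
        lock.lock();
        try {
            cache.computeIfAbsent(itemId, id -> new AtomicInteger(loadStockFromDb.applyAsInt(id)));
        } finally {
            lock.unlock();
        }
    }

    /** Operates on the cached stock and queues the result for a downstream consumer. */
    boolean order(String itemId) {
        AtomicInteger stock = cache.get(itemId);
        if (stock == null) {
            return false; // item has not been cached yet
        }
        // Decrement only while stock remains, so the count never goes negative.
        int before = stock.getAndUpdate(s -> s > 0 ? s - 1 : s);
        if (before <= 0) {
            return false; // sold out
        }
        outbox.add("ORDER:" + itemId);
        return true;
    }
}
```

A consumer would drain `outbox` and write the orders to the database at its own pace, so the database never sees the raw burst.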
Cache Issues
Assume an inventory interface has only 100 items in the database. If all requests start hitting the cache, the cache can still become a bottleneck.
Read‑Write Separation
One approach is read‑write separation using Redis master‑slave replication, with Sentinel handling failover: writes go to the master and reads are served by the replicas. Reads far outnumber writes in this scenario, and once inventory reaches zero, read operations can fail fast.
Load Balancing
Another idea is to split the inventory across multiple cache instances. Inspired by ConcurrentHashMap's counterCells, 100 items could be divided into 10 caches, each handling 10 items, with requests load‑balanced among them.
However, if most users hash to the same cache, that cache sells out while the others sit idle with unsold stock, so users receive inaccurate “out‑of‑stock” responses.
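The split can be sketched with an array of counters standing in for the separate cache instances. Probing the other shards before reporting sold out is one possible mitigation for the skew problem; it is an assumption of this sketch rather than something the original prescribes, and the names `ShardedInventory` and `tryDeduct` are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Inventory split across shards, with fallback probing to avoid false sold-out answers. */
class ShardedInventory {
    private final AtomicInteger[] shards; // each counter stands in for one cache instance

    ShardedInventory(int totalStock, int shardCount) {
        shards = new AtomicInteger[shardCount];
        for (int i = 0; i < shardCount; i++) {
            // Spread the remainder over the first shards so the total is preserved.
            int share = totalStock / shardCount + (i < totalStock % shardCount ? 1 : 0);
            shards[i] = new AtomicInteger(share);
        }
    }

    /** Tries the user's home shard first, then probes the others before reporting sold out. */
    boolean tryDeduct(int userHash) {
        int home = Math.floorMod(userHash, shards.length);
        for (int i = 0; i < shards.length; i++) {
            AtomicInteger shard = shards[(home + i) % shards.length];
            int before = shard.getAndUpdate(s -> s > 0 ? s - 1 : s);
            if (before > 0) {
                return true;
            }
        }
        return false; // every shard is empty
    }

    /** Sums the stock left across all shards. */
    int remaining() {
        int sum = 0;
        for (AtomicInteger shard : shards) {
            sum += shard.get();
        }
        return sum;
    }
}
```

The fallback costs extra lookups once the home shard is empty, so with real cache instances it trades some latency near sell-out for accurate answers.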
Page Cache
Many software architectures use a page‑cache approach, similar to Linux kernel disk writes or MySQL flushing, where short‑term write operations are aggregated and performed in the cache before being persisted.
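The aggregation idea can be sketched as a buffer that accumulates writes in memory and flushes the combined delta in one operation. The class name `WriteBuffer` and the `persist` callback are hypothetical; in a real system the callback would issue the database update.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

/** Aggregates many small writes in memory and persists them as one combined update. */
class WriteBuffer {
    private final AtomicLong pending = new AtomicLong(); // delta not yet persisted

    /** Fast path: records the write in memory only. */
    void record(long delta) {
        pending.addAndGet(delta);
    }

    /** Slow path: atomically takes the accumulated delta and hands it to the persistent store. */
    long flush(LongConsumer persist) {
        long delta = pending.getAndSet(0);
        if (delta != 0) {
            persist.accept(delta);
        }
        return delta;
    }
}
```

Writes recorded after the last flush are lost if the process crashes, which is the same durability trade-off that page caches make.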
MaGe Linux Operations
