Defending Against Million‑QPS Attacks: Rate Limiting, Fingerprinting & Real‑Time Rules
This article explains how to protect systems from massive malicious traffic reaching millions of queries per second by combining gateway rate limiting, distributed circuit breaking, device fingerprinting, behavior analysis, dynamic rule engines, and real‑time risk scoring, illustrated with Nginx‑Lua, Sentinel, Drools, and Flink examples.
Preface
Today's topic is a challenge that keeps many developers awake at night: how do you protect your system when malicious traffic hits like a tsunami?
Many of us have lived through an API throttling nightmare, but attacks at the scale of millions of QPS are a different battle.
This article explores how to defend an API against traffic at that scale.
Why is a million QPS so deadly?
Illustrated below is the impact of a million QPS attack:
Attackers use three core weapons:
IP Flooding: a pool of 100k+ proxy IPs rotating dynamically, rendering traditional per-IP rate limiting ineffective.
Device Cloning: forged browser fingerprints that mimic real devices.
Protocol-Level Precision Attacks: crafted HTTP requests that bypass basic WAF rules.
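To see why a rotating IP pool defeats per-IP limits, a quick back-of-the-envelope calculation helps. The per-IP limit of 10 QPS below is an assumed figure for illustration; only the 100k pool size comes from the text.

```python
def aggregate_qps(num_ips: int, per_ip_limit_qps: float) -> float:
    """Upper bound on traffic that passes a purely per-IP rate limiter."""
    return num_ips * per_ip_limit_qps


# 100k proxy IPs (from the text), each staying under an assumed 10 QPS per-IP limit
passed = aggregate_qps(100_000, 10)
print(f"{passed:,.0f} QPS pass without any single IP being throttled")
```

Every address stays below its individual limit, yet the total reaches a million QPS, which is why the later defense lines key on something other than the IP.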
The chain reaction that can crash a system includes:
Thread pool 100% occupied → new requests timeout.
Database connections exhausted → SQL execution blocked.
Redis response times surge → cache penetration and cache avalanche.
Microservice circuit‑breaker cascade → services unavailable.
First Defense Line: Basic Rate Limiting and Circuit Breaking
1. Gateway Rate Limiting
Implement rate limiting at the gateway, typically using Nginx + Lua.
Example Nginx configuration:
<code># Requires a shared memory zone declared in the http {} block:
#   lua_shared_dict payment_limit 10m;
location /api/payment {
    access_by_lua_block {
        local limit_req = require "resty.limit.req"
        -- leaky bucket: 1000 req/s sustained rate + 2000 burst
        -- first argument is the name of the lua_shared_dict zone
        local lim, err = limit_req.new("payment_limit", 1000, 2000)
        if not lim then
            ngx.log(ngx.ERR, "Failed to init limiter: ", err)
            return ngx.exit(500)
        end
        -- limit by client IP
        local key = ngx.var.remote_addr
        local delay, err = lim:incoming(key, true)
        if not delay then
            if err == "rejected" then
                ngx.header.content_type = "application/json"
                ngx.status = 429
                ngx.say([[{"code":429,"msg":"Too many requests"}]])
                return ngx.exit(429)
            end
            ngx.log(ngx.ERR, "Rate limit error: ", err)
            return ngx.exit(500)
        end
        -- within the burst allowance: hold the request to smooth it to the configured rate
        if delay >= 0.001 then
            ngx.sleep(delay)
        end
    }
}
</code>
Code explanation:
Uses the resty.limit.req module from lua-resty-limit-traffic, which ships with OpenResty.
Leaky bucket algorithm: 1000 requests per second sustained, plus a burst allowance of 2000 excess requests.
Rate limit per client IP.
Exceeding the rate plus burst returns HTTP 429 with a JSON error.
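The resty.limit.req module follows the leaky-bucket approach of Nginx's limit_req. The sketch below is a simplified Python model of that idea, not the library's code: the class name and the injected `now` clock are assumptions made so the behavior is easy to follow and test.

```python
class LeakyBucketLimiter:
    """Per-key leaky bucket: drains `rate` requests/second, tolerates `burst` excess requests."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self._state = {}  # key -> (excess requests, time of last accepted request)

    def incoming(self, key: str, now: float):
        """Return the delay in seconds for this request, or None if it is rejected."""
        record = self._state.get(key)
        if record is None:
            excess = 0.0
        else:
            prev_excess, last_seen = record
            # The bucket drains at `rate` per second; this request adds one unit.
            excess = max(prev_excess - self.rate * (now - last_seen) + 1.0, 0.0)
        if excess > self.burst:
            return None  # over rate + burst: reject and leave the state unchanged
        self._state[key] = (excess, now)
        return excess / self.rate


limiter = LeakyBucketLimiter(rate=2, burst=1)
print(limiter.incoming("203.0.113.7", now=0.0))  # 0.0  -> served immediately
print(limiter.incoming("203.0.113.7", now=0.0))  # 0.5  -> inside the burst, delayed
print(limiter.incoming("203.0.113.7", now=0.0))  # None -> rejected (HTTP 429 in the gateway)
print(limiter.incoming("203.0.113.7", now=1.0))  # 0.0  -> bucket has drained
```

Requests inside the burst allowance are delayed rather than dropped, which smooths short spikes; only traffic beyond rate plus burst is rejected.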
2. Distributed Circuit Breaking
For high traffic, add a distributed circuit‑breaker such as a Sentinel cluster.
Sentinel cluster flow control configuration example:
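<code>import com.alibaba.csp.sentinel.slots.block.ClusterRuleConstant;
import com.alibaba.csp.sentinel.slots.block.RuleConstant;
import com.alibaba.csp.sentinel.slots.block.flow.ClusterFlowConfig;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRule;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRuleManager;
import java.util.Collections;
import javax.annotation.PostConstruct;

// Must be registered as a Spring bean for @PostConstruct to run
public class SentinelConfig {
    @PostConstruct
    public void initFlowRules() {
        // create a flow rule with cluster mode enabled
        FlowRule rule = new FlowRule();
        rule.setResource("createOrder");             // protected resource
        rule.setGrade(RuleConstant.FLOW_GRADE_QPS);  // QPS limit
        rule.setCount(50000);                        // 50k QPS cluster threshold
        rule.setClusterMode(true);                   // enable cluster mode
        rule.setClusterConfig(new ClusterFlowConfig()
            .setFlowId(123L)                                              // globally unique ID
            .setThresholdType(ClusterRuleConstant.FLOW_THRESHOLD_GLOBAL)  // global threshold
        );
        // load rule
        FlowRuleManager.loadRules(Collections.singletonList(rule));
    }
}
</code>
Flow diagram: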
Implementation principle:
Token server centrally manages cluster traffic quota.
Gateway nodes request tokens from the token server in real time.
When total QPS exceeds the threshold, the token server stops granting tokens, so the limit applies to the cluster as a whole rather than to each node separately.
Avoids imbalance caused by single‑node rate limiting.
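The token-server principle can be sketched in a few lines. This is a simplified model under stated assumptions, not Sentinel's implementation: the `TokenServer` class, the fixed one-second windows, and the injected `now` clock are all hypothetical.

```python
class TokenServer:
    """Grants at most `threshold_per_second` tokens per one-second window, cluster-wide."""

    def __init__(self, threshold_per_second: int):
        self.threshold = threshold_per_second
        self._window = None   # index of the current one-second window
        self._granted = 0     # tokens handed out in that window

    def request_token(self, now: float) -> bool:
        window = int(now)
        if window != self._window:
            # A new second has started: reset the shared quota.
            self._window = window
            self._granted = 0
        if self._granted >= self.threshold:
            return False
        self._granted += 1
        return True


# Three gateway nodes share one quota of 3 requests per second.
server = TokenServer(threshold_per_second=3)
for node, t in [("gw-1", 0.1), ("gw-2", 0.2), ("gw-3", 0.3), ("gw-1", 0.9), ("gw-2", 1.0)]:
    print(node, t, "pass" if server.request_token(t) else "blocked")
```

Because the quota lives in one place, it does not matter which node the traffic lands on: a node receiving most of the requests can use most of the quota, which is the imbalance a per-node limit cannot handle.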
Second Defense Line: Device Fingerprinting and Behavior Analysis
1. Browser Fingerprint Generation
Frontend can generate a fingerprint in the browser; even if the IP changes, the same device yields the same fingerprint.
Implementation using Canvas and WebGL:
<code>// Frontend device fingerprint generation
// md5() is assumed to come from a hashing library loaded on the page
function generateDeviceFingerprint() {
    // 1. Collect basic device info
    const baseInfo = [
        navigator.userAgent,
        navigator.platform,
        screen.width + 'x' + screen.height,
        navigator.language
    ].join('|');
    // 2. Generate Canvas fingerprint
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d');
    ctx.fillStyle = '#f60';
    ctx.fillRect(0, 0, 100, 30);
    ctx.fillStyle = '#069';
    ctx.font = '16px Arial';
    ctx.fillText('Defense is art', 10, 20);
    const canvasData = canvas.toDataURL();
    // 3. Generate WebGL fingerprint
    // A canvas that already has a 2D context returns null for 'webgl', so use a second canvas
    const glCanvas = document.createElement('canvas');
    const gl = glCanvas.getContext('webgl');
    let renderer = 'unknown';
    if (gl) {
        // The extension is not available in every browser
        const debugInfo = gl.getExtension('WEBGL_debug_renderer_info');
        if (debugInfo) {
            renderer = gl.getParameter(debugInfo.UNMASKED_RENDERER_WEBGL);
        }
    }
    // 4. Combine into final fingerprint
    return md5(baseInfo + canvasData + renderer);
}
</code>
Fingerprint characteristics:
Stability: >98% consistency on the same device.
Uniqueness: <0.1% collision across different devices.
Stealth: invisible to users and not reset by clearing cookies.
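On the server side, the fingerprint becomes the key for counting requests, so rotating IPs no longer splits one device into many clients. A minimal sketch, assuming the client sends its collected attributes in a `device` field (a hypothetical request shape) and swapping the article's md5 for SHA-256 from the standard library:

```python
import hashlib
from collections import Counter


def fingerprint(attrs: dict) -> str:
    """Hash device attributes in a canonical (sorted) order so key order does not matter."""
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def count_by_device(requests: list) -> Counter:
    """Count requests per device fingerprint, ignoring the source IP."""
    return Counter(fingerprint(r["device"]) for r in requests)


device = {"ua": "Mozilla/5.0", "screen": "1920x1080", "renderer": "ANGLE (Intel)"}
requests = [
    {"ip": "198.51.100.1", "device": device},
    {"ip": "198.51.100.2", "device": device},
    {"ip": "198.51.100.3", "device": device},
]
print(count_by_device(requests))  # one fingerprint with a count of 3
```

Per-IP counting would see three clients with one request each; per-fingerprint counting sees one device with three requests.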
2. Behavior Analysis Model
Analyze user behavior such as mouse movements.
Example Python model:
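<code>import numpy as np

def calc_linearity(move_events):
    """Straight-line distance divided by path length (1.0 = perfectly straight).
    This definition is an assumption; the original helper is not shown."""
    xs = np.array([e['x'] for e in move_events], dtype=float)
    ys = np.array([e['y'] for e in move_events], dtype=float)
    path = np.sum(np.hypot(np.diff(xs), np.diff(ys)))
    direct = np.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    return float(direct / path) if path > 0 else 1.0

def analyze_mouse_behavior(move_events, risk_model):
    """
    Analyze mouse movement features.
    :param move_events: list of {'x':..., 'y':..., 't':...}
    :param risk_model: pretrained classifier; a scikit-learn style predict_proba() is assumed
    :return: anomaly probability (0-1)
    """
    if len(move_events) < 3:
        raise ValueError("need at least 3 events to compute accelerations")
    # 1. Compute speed sequence
    speeds = []
    for i in range(1, len(move_events)):
        prev = move_events[i-1]
        curr = move_events[i]
        dx = curr['x'] - prev['x']
        dy = curr['y'] - prev['y']
        distance = (dx**2 + dy**2) ** 0.5
        time_diff = curr['t'] - prev['t']
        speed = distance / max(0.001, time_diff)  # guard against zero time gaps
        speeds.append(speed)
    # 2. Compute acceleration changes
    accelerations = [speeds[i] - speeds[i-1] for i in range(1, len(speeds))]
    # 3. Extract key features
    features = {
        'speed_mean': np.mean(speeds),
        'speed_std': np.std(speeds),
        'acc_max': max(accelerations),
        'acc_std': np.std(accelerations),
        'linearity': calc_linearity(move_events)
    }
    # 4. Predict with pretrained model (probability of the "anomalous" class)
    return float(risk_model.predict_proba([list(features.values())])[0][1])
</code>
Behavior feature dimensions: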
Movement Speed: bots have constant speed, humans vary.
Acceleration: bots show saw-tooth acceleration patterns.
Trajectory Linearity: bots tend to move in straight lines.
Operation Interval: bots have highly consistent intervals.
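The operation-interval feature can be expressed as a single number. One possible measure, offered here as an assumption rather than the article's method, is the coefficient of variation of the gaps between consecutive events: values near zero mean machine-like regularity.

```python
import statistics


def interval_cv(timestamps: list):
    """Coefficient of variation of gaps between consecutive events (0 = perfectly regular).

    Returns None when there are too few events or the mean gap is not positive.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None
    mean_gap = statistics.mean(gaps)
    if mean_gap <= 0:
        return None
    return statistics.pstdev(gaps) / mean_gap


bot_clicks = [0, 100, 200, 300, 400]      # milliseconds, evenly spaced
human_clicks = [0, 80, 310, 390, 900]     # milliseconds, irregular
print(interval_cv(bot_clicks))    # 0.0
print(interval_cv(human_clicks))  # noticeably above zero
```

A threshold on this value would need tuning against real traffic; a determined attacker can add jitter, so it works best as one input among several.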
Third Defense Line: Dynamic Rule Engine
1. Real‑Time Rule Configuration
Use a dynamic rule engine such as Drools to define risk rules.
Drools rule example for high‑frequency access to a sensitive API:
<code>rule "High Frequency Coupon Acquisition"
    salience 100
    no-loop true
when
    $req : Request(
        path == "/api/coupon/acquire",
        $uid : userId != null,
        $ip : clientIp
    )
    // aggregate the same user's other requests to this endpoint held in working memory
    // (assumes older Request facts are retracted or expired)
    accumulate(
        Request(
            userId == $uid,
            path == "/api/coupon/acquire",
            this != $req,
            $ts : timestamp
        );
        $count : count($ts),
        $minTime : min($ts),
        $maxTime : max($ts)
    )
    // fire when more than 30 of them fall within a 10-second span
    eval($count > 30 && ($maxTime - $minTime) < 10000)
then
    insert(new BlockEvent($uid, $ip, "High Frequency Coupon"));
    $req.setBlock(true);
end
</code>
Rule engine advantages:
Real-time effect: new rules push within seconds.
Complex conditions: supports multi-dimensional joint judgments.
Dynamic updates: no service restart required.
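The condition behind the coupon rule, more than 30 requests from one user inside 10 seconds, is a sliding-window count. The sketch below shows that check outside a rule engine; the class name and the millisecond timestamps passed in by the caller are assumptions, and unlike the Drools rule it counts the current request as well.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Flags a user once they exceed `max_requests` within the last `window_ms` milliseconds."""

    def __init__(self, window_ms: int = 10_000, max_requests: int = 30):
        self.window_ms = window_ms
        self.max_requests = max_requests
        self._events = defaultdict(deque)  # user_id -> timestamps inside the window

    def record(self, user_id: str, ts_ms: int) -> bool:
        """Record one request and return True if the user should be blocked."""
        window = self._events[user_id]
        window.append(ts_ms)
        # Drop timestamps that have fallen out of the window.
        while window and ts_ms - window[0] >= self.window_ms:
            window.popleft()
        return len(window) > self.max_requests


counter = SlidingWindowCounter()
flags = [counter.record("user-42", i * 100) for i in range(31)]  # 31 requests in 3 seconds
print(flags[-1])                            # True: the 31st request trips the rule
print(counter.record("user-42", 20_000))    # False: the window has emptied
```

Expiring old timestamps on every call keeps memory bounded per user, which matters when the rule is evaluated on every request.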
2. Multi‑Dimensional Correlation Analysis Model
Illustrated below is a risk scoring mechanism that combines IP risk, device risk, behavior anomaly, and historical profile.
Scoring formula:
<code>RiskScore =
IPWeight * IPScore +
DeviceWeight * DeviceScore +
BehaviorWeight * AnomalyDegree +
HistoryWeight * HistoricalRisk
</code>
Ultimate Defense Architecture
Summary diagram of the million‑QPS defense architecture:
Core component breakdown:
Traffic Scrubbing Layer (CDN)
Filters static resource requests.
Absorbs >70% of traffic spikes.
Security Layer (Gateway Cluster)
Device fingerprinting for each request.
Distributed rate limiting at cluster level.
Rule engine for real‑time risk judgment.
Real‑Time Risk Layer (Flink)
<code>// Flink real-time risk processing (Java DataStream API)
riskStream
    .keyBy(req -> req.getDeviceId())                             // group by device
    .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))  // 10-second tumbling window
    .aggregate(new RiskAggregator())                             // aggregate risk metrics
    .map(riskData -> {
        double score = riskModel.predict(riskData);
        if (score > RISK_THRESHOLD) {
            // block high-risk request
            blockRequest(riskData.getRequestId());
        }
        return riskData;
    });
</code>
Data Support Layer
Redis stores real‑time risk profiles.
Flink computes behavior feature metrics.
Rule management console for dynamic strategy adjustments.
Hard‑Learned Lessons
1. The Trap of IP Whitelists
Scenario: adding partner IPs to a whitelist.
Disaster: attackers compromise partner servers and launch attacks.
Solution: validate requests with device fingerprinting and behavior analysis.
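One way to read this lesson: a whitelisted IP should only zero out the IP component of the risk score, while device and behavior signals are still evaluated. The sketch below applies the weighted formula from the scoring section; the weights, the 0.5 threshold, and the default score for unknown IPs are made-up values for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    ip: float = 0.2
    device: float = 0.3
    behavior: float = 0.3
    history: float = 0.2


def risk_score(ip_score, device_score, anomaly, historical, w=Weights()):
    """Weighted sum of the four risk dimensions, each expected in the range 0-1."""
    return (w.ip * ip_score + w.device * device_score
            + w.behavior * anomaly + w.history * historical)


def should_block(ip, device_score, anomaly, historical, whitelist,
                 threshold=0.5, unknown_ip_score=0.5):
    """Whitelisting clears only the IP component; the other signals still count."""
    ip_score = 0.0 if ip in whitelist else unknown_ip_score
    return risk_score(ip_score, device_score, anomaly, historical) >= threshold


partners = {"192.0.2.10"}
# Compromised partner server: trusted IP, but bot-like device and behavior signals.
print(should_block("192.0.2.10", 0.9, 0.9, 0.5, partners))  # True
# Normal partner traffic passes.
print(should_block("192.0.2.10", 0.1, 0.1, 0.0, partners))  # False
```

With this shape, a compromised partner server loses its free pass as soon as its traffic starts to look automated.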
2. Static Rate‑Limit Threshold Pitfalls
Scenario: fixed 5,000 QPS limit.
Problem: legitimate traffic during promotions exceeds limit and gets blocked.
Optimization: dynamically adjust thresholds based on historical traffic.
<code>// Dynamic threshold adjustment algorithm
public class DynamicThreshold {
    // Adjust based on last week's same-time traffic
    public static int calculateThreshold(String api) {
        // 1. Get historical QPS (getHistoricalQps is a helper not shown here)
        double base = getHistoricalQps(api);
        // 2. Apply today's growth factor (getGrowthFactor is likewise not shown)
        double growth = getGrowthFactor();
        // 3. Add 20% headroom above the forecast so legitimate peaks are not throttled
        return (int) (base * growth * 1.2);
    }
}
</code>
3. Ignoring Bandwidth Costs
Disaster: a 10 Gbps attack pushed the monthly bandwidth bill 200% over budget.
Countermeasures:
Front‑end CDN to absorb static traffic.
Enable cloud provider DDoS protection services.
Configure bandwidth auto‑circuit‑breaker.
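A bandwidth circuit breaker can be as simple as a rolling average compared against a budgeted rate. The sketch below is one possible shape, with a hypothetical class name, an assumed 10-second window, and a caller-supplied clock; what happens when it trips (switching to a scrubbing service, null-routing, alerting) is left to the operator.

```python
from collections import deque


class BandwidthBreaker:
    """Trips when average egress over the last `window_s` seconds exceeds `limit_bits_per_s`."""

    def __init__(self, limit_bits_per_s: float, window_s: float = 10.0):
        self.limit = limit_bits_per_s
        self.window_s = window_s
        self._samples = deque()  # (timestamp, bytes sent)
        self._total_bytes = 0

    def record(self, n_bytes: int, now: float) -> bool:
        """Record bytes sent at time `now`; return True if the breaker is open."""
        self._samples.append((now, n_bytes))
        self._total_bytes += n_bytes
        # Discard samples older than the window.
        while self._samples and now - self._samples[0][0] >= self.window_s:
            _, old_bytes = self._samples.popleft()
            self._total_bytes -= old_bytes
        average_bps = self._total_bytes * 8 / self.window_s
        return average_bps > self.limit


breaker = BandwidthBreaker(limit_bits_per_s=1e9)   # budget: 1 Gbps
print(breaker.record(1_000_000_000, now=0.0))      # False: 0.8 Gbps average
print(breaker.record(500_000_000, now=1.0))        # True: 1.2 Gbps average
print(breaker.record(0, now=20.0))                 # False: old samples expired
```

Averaging over a window avoids tripping on a single large response while still reacting within seconds to a sustained flood.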
True defense is not about preventing attacks entirely, but making the attacker pay far more than they gain; when your defense cost is lower than the attack cost, the battle ends.
IT Services Circle