Master Multi‑Dimensional Bandwidth Throttling in Spring Boot 3 with Token Bucket
This guide explains how to implement precise, multi‑dimensional network bandwidth throttling in Spring Boot 3 using a manually coded token‑bucket algorithm, HandlerInterceptor, HttpServletResponseWrapper, and RateLimitedOutputStream to control download, video streaming, and API traffic.
Overview
This article presents a complete solution for multi‑dimensional network bandwidth throttling in Spring Boot 3. It manually implements the token‑bucket algorithm and integrates it via a custom HandlerInterceptor, an HttpServletResponseWrapper, and a RateLimitedOutputStream to precisely control output speed for file downloads, video streams, and other scenarios.
Why Bandwidth Throttling Is Needed
Unlike typical API rate limiting, which restricts request count (e.g., 100 requests per minute), bandwidth throttling limits the amount of data transferred (e.g., 200 KB/s). It is valuable for:
File download services: free users are limited to 200 KB/s, VIP users to 2 MB/s.
Video streaming: different resolutions map to different bandwidth caps (480p → 500 KB/s, 1080p → 3 MB/s).
API protection: large‑payload endpoints (e.g., report export) can consume the entire outbound bandwidth if not limited.
Core Principle: Token Bucket Algorithm
The token‑bucket algorithm models a bucket that receives tokens at a fixed rate; each byte of data sent consumes one token. When the bucket is empty, the sender must wait until enough tokens have been refilled.
Key Parameters
Bucket capacity: the maximum burst size. With a capacity of 200 KB, at most 200 KB can be sent at once before the sender has to wait for new tokens.
Refill rate: the long‑term average speed. Adding 200 KB of tokens per second yields an average of 200 KB/s.
Chunk size: the size of each write operation; smaller chunks produce smoother throttling.
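To make the interplay of the three parameters concrete, here is a minimal sketch of a token bucket that is driven by caller-supplied timestamps instead of the real clock, so its behaviour is deterministic. The class name `SimulatedTokenBucket` and the `simulate` helper are hypothetical and are not part of the article's source code.

```java
/** Token bucket driven by explicit timestamps (hypothetical demo class). */
final class SimulatedTokenBucket {
    private final long capacity;    // maximum tokens (bytes) the bucket can hold
    private final long refillRate;  // tokens (bytes) added per second
    private long tokens;            // tokens currently available
    private long lastRefillNanos;   // time up to which tokens have been credited

    SimulatedTokenBucket(long capacity, long refillRate, long startNanos) {
        this.capacity = capacity;
        this.refillRate = refillRate;
        this.tokens = capacity;     // start full so an initial burst is possible
        this.lastRefillNanos = startNanos;
    }

    /** Consumes permits at time nowNanos; returns the wait in nanoseconds (0 if none). */
    long acquire(long permits, long nowNanos) {
        long elapsed = Math.max(0L, nowNanos - lastRefillNanos);
        long newTokens = elapsed * refillRate / 1_000_000_000L;
        if (newTokens > 0) {
            tokens = Math.min(capacity, tokens + newTokens);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= permits) {
            tokens -= permits;
            return 0L;
        }
        long deficit = permits - tokens;
        long waitNanos = deficit * 1_000_000_000L / refillRate;
        tokens = 0;
        lastRefillNanos = nowNanos + waitNanos; // the wait pays for the deficit
        return waitNanos;
    }

    /** Sends totalBytes in chunks and returns the accumulated wait time in nanoseconds. */
    static long simulate(long capacity, long refillRate, int chunkSize, long totalBytes) {
        SimulatedTokenBucket bucket = new SimulatedTokenBucket(capacity, refillRate, 0L);
        long clock = 0L;
        long totalWait = 0L;
        for (long sent = 0; sent < totalBytes; sent += chunkSize) {
            long wait = bucket.acquire(Math.min(chunkSize, totalBytes - sent), clock);
            clock += wait;          // pretend the sender slept for exactly that long
            totalWait += wait;
        }
        return totalWait;
    }

    public static void main(String[] args) {
        // 400 KB payload, 200 KB capacity, 200 KB/s refill, 4 KB chunks
        long waitNanos = simulate(200 * 1024, 200 * 1024, 4096, 400 * 1024);
        System.out.println("total wait: " + waitNanos / 1_000_000 + " ms"); // total wait: 1000 ms
    }
}
```

In this run the first 200 KB leave immediately (the burst allowed by the capacity), and each of the remaining fifty 4 KB chunks waits 20 ms, which adds up to one second for the second 200 KB.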
Algorithm Flow
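Before sending data:
1. Compute the time elapsed since the last refill.
2. Compute the number of new tokens as elapsed time × refill rate.
3. Update the token count in the bucket, capped at the bucket capacity.
When sending data:
1. Check whether enough tokens are available.
2. If so, deduct the tokens and send the data.
3. If not, compute the wait time as (missing tokens / refill rate), wait for exactly that long, then send.
Technical Design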
Overall Flow
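Request flow:
┌──────────────────────────────────────────────────────────────────┐
│ 1. DispatcherServlet dispatches the request                      │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│ 2. BandwidthLimitInterceptor.preHandle()                         │
│    - Parses the @BandwidthLimit annotation                       │
│    - Gets the shared TokenBucket from BandwidthLimitManager      │
│    - Creates a BandwidthLimitResponseWrapper and stores it as a  │
│      request attribute                                           │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│ 3. Controller handles the request                                │
│    - Gets the wrapped response via                               │
│      BandwidthLimitHelper.getLimitedResponse()                   │
│    - Writes data to the response stream (throttling applies      │
│      automatically)                                              │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│ 4. BandwidthLimitInterceptor.afterCompletion()                   │
│    - Releases resources and closes the stream                    │
└──────────────────────────────────────────────────────────────────┘
Why Choose HandlerInterceptor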
A Spring MVC application can intercept requests in two places: a servlet Filter or a HandlerInterceptor. A filter runs before the DispatcherServlet has resolved the handler, so it cannot tell which controller method will serve the request. An interceptor receives the resolved HandlerMethod, so it can read the method‑level @BandwidthLimit annotation and apply per‑method throttling.
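The lookup itself boils down to reading a runtime annotation from a resolved method. The sketch below shows that step with plain reflection; `DemoBandwidthLimit`, `DemoController` and `AnnotationLookup` are hypothetical stand-ins, not the article's classes. Inside a real interceptor the handler argument would typically be a Spring `HandlerMethod`, whose `getMethodAnnotation(...)` plays the role of `Method.getAnnotation(...)` here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical, simplified stand-in for the article's @BandwidthLimit. */
@Retention(RetentionPolicy.RUNTIME) // RUNTIME retention is required for reflection
@Target(ElementType.METHOD)
@interface DemoBandwidthLimit {
    long value();
}

/** Hypothetical controller with one throttled and one unthrottled method. */
class DemoController {
    @DemoBandwidthLimit(500)
    public void downloadFile() { }

    public void health() { }
}

final class AnnotationLookup {
    /** Returns the configured limit, or -1 if the method carries no annotation. */
    static long limitOf(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName);
            DemoBandwidthLimit limit = method.getAnnotation(DemoBandwidthLimit.class);
            return limit == null ? -1L : limit.value();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(limitOf(DemoController.class, "downloadFile")); // 500
        System.out.println(limitOf(DemoController.class, "health"));       // -1
    }
}
```

A method without the annotation yields no limit, which corresponds to the interceptor letting the request pass through unthrottled.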
Core Component Responsibilities
@BandwidthLimit: Declarative annotation that configures limit parameters.
BandwidthLimitInterceptor: Intercepts the request, parses the annotation, and creates the response wrapper.
BandwidthLimitManager: Manages token buckets for all dimensions (global/API/user/IP).
BandwidthLimitResponseWrapper: Wraps HttpServletResponse and replaces the output stream.
RateLimitedOutputStream: Implements throttling logic using the shared TokenBucket.
TokenBucket: Core token‑bucket algorithm implementation.
BandwidthLimitHelper: Retrieves the wrapped response from request attributes.
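How a manager might hand out one shared bucket per dimension and key can be sketched with a `ConcurrentHashMap`. This is an illustration under assumptions, not the project's actual BandwidthLimitManager: `DemoBucket`, `DemoLimitType`, `DemoBucketManager` and the key format are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical placeholder for TokenBucket; only the settings are stored. */
final class DemoBucket {
    final long capacity;
    final long refillRate;

    DemoBucket(long capacity, long refillRate) {
        this.capacity = capacity;
        this.refillRate = refillRate;
    }
}

enum DemoLimitType { GLOBAL, API, USER, IP }

final class DemoBucketManager {
    private final ConcurrentMap<String, DemoBucket> buckets = new ConcurrentHashMap<>();

    /** Returns the shared bucket for (type, key), creating it on first use. */
    DemoBucket getBucket(DemoLimitType type, String key, long capacity, long refillRate) {
        // GLOBAL ignores the key so that every request maps to the same bucket.
        String mapKey = type == DemoLimitType.GLOBAL ? "GLOBAL" : type + ":" + key;
        // computeIfAbsent is atomic, so concurrent requests cannot create
        // two different buckets for the same key.
        return buckets.computeIfAbsent(mapKey, k -> new DemoBucket(capacity, refillRate));
    }

    int size() {
        return buckets.size();
    }

    public static void main(String[] args) {
        DemoBucketManager manager = new DemoBucketManager();
        DemoBucket first = manager.getBucket(DemoLimitType.API, "/download/file", 512_000, 512_000);
        DemoBucket second = manager.getBucket(DemoLimitType.API, "/download/file", 512_000, 512_000);
        System.out.println(first == second); // true
    }
}
```

Because USER and IP keys are unbounded, a long-running service would probably also need to evict idle buckets; that concern is left out of this sketch.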
Multi‑Dimensional Throttling Implementation
Global Limit (GLOBAL)
All requests share a single bucket, protecting the server’s total outbound bandwidth. Example: limit the whole service to 10 MB/s regardless of concurrent downloads.
@BandwidthLimit(value = 200, unit = BandwidthUnit.KB, type = LimitType.GLOBAL)
@GetMapping("/download/global")
public void downloadGlobal(HttpServletRequest request, HttpServletResponse response) throws IOException {
    // The request is needed to look up the wrapped response stored by the interceptor
    HttpServletResponse limitedResponse = BandwidthLimitHelper.getLimitedResponse(request, response);
    // write data to limitedResponse.getOutputStream() ...
}
API Dimension (API)
Each API path has an independent bucket, so different endpoints do not affect each other.
@BandwidthLimit(value = 500, unit = BandwidthUnit.KB, type = LimitType.API)
@GetMapping("/download/file")
public void downloadFile(HttpServletResponse response) throws IOException {
    // file download logic
}

@BandwidthLimit(value = 2048, unit = BandwidthUnit.KB, type = LimitType.API)
@GetMapping("/stream/video")
public void streamVideo(HttpServletResponse response) throws IOException {
    // video streaming logic
}
User Dimension (USER)
Limits are applied per user identifier (e.g., request header X-User-Id). The free and vip parameters enable differentiated service levels.
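How the free and vip values could be turned into an effective rate is sketched below. The class `TierRateResolver`, the header values "free" and "vip", and the fallback to the annotation's default value are assumptions made for illustration; the actual project may resolve the tier differently.

```java
import java.util.Locale;

/** Hypothetical helper that maps a user type to a rate. */
final class TierRateResolver {
    /**
     * Picks a rate in KB/s from the user type. Unknown or missing types
     * fall back to defaultKb (the annotation's plain value).
     */
    static long resolveKb(String userType, long defaultKb, long freeKb, long vipKb) {
        if (userType == null) {
            return defaultKb;
        }
        switch (userType.trim().toLowerCase(Locale.ROOT)) {
            case "vip":
                return vipKb;
            case "free":
                return freeKb;
            default:
                return defaultKb;
        }
    }

    /** Converts KB/s to bytes per second, using 1 KB = 1024 bytes. */
    static long toBytesPerSecond(long kb) {
        return kb * 1024L;
    }

    public static void main(String[] args) {
        long vipKb = resolveKb("VIP", 200, 200, 2048);
        System.out.println(toBytesPerSecond(vipKb)); // 2097152
    }
}
```

The endpoint itself only declares both tiers on the annotation: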
@BandwidthLimit(value = 200, unit = BandwidthUnit.KB, type = LimitType.USER, free = 200, vip = 2048)
@GetMapping("/download/user")
public void downloadByUser(@RequestHeader("X-User-Type") String userType,
                           HttpServletResponse response) throws IOException {
    // automatically apply 200 KB/s for free users or 2 MB/s for VIPs
}
IP Dimension (IP)
Limits traffic per client IP address, preventing a single IP from monopolizing bandwidth. Supports proxy headers such as X-Forwarded-For and X-Real-IP.
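Resolving the client IP behind a proxy can be sketched as a pure function over the header values. `ClientIpResolver` is a hypothetical helper, and the precedence shown (X-Forwarded-For, then X-Real-IP, then the socket address) is an assumption rather than the project's documented behaviour.

```java
/** Hypothetical helper that derives the throttling key for the IP dimension. */
final class ClientIpResolver {
    /**
     * @param forwardedFor value of the X-Forwarded-For header, may be null
     * @param realIp       value of the X-Real-IP header, may be null
     * @param remoteAddr   address of the directly connected peer
     */
    static String resolve(String forwardedFor, String realIp, String remoteAddr) {
        if (forwardedFor != null) {
            // X-Forwarded-For is a comma-separated chain; the first entry
            // is the address the first proxy saw as the client.
            int comma = forwardedFor.indexOf(',');
            String first = (comma >= 0 ? forwardedFor.substring(0, comma) : forwardedFor).trim();
            if (!first.isEmpty()) {
                return first;
            }
        }
        if (realIp != null && !realIp.trim().isEmpty()) {
            return realIp.trim();
        }
        return remoteAddr;
    }

    public static void main(String[] args) {
        System.out.println(resolve("203.0.113.7, 10.0.0.1", null, "10.0.0.2")); // 203.0.113.7
    }
}
```

Both headers can be set by the client, so they are only trustworthy when the application sits behind a proxy that overwrites them; otherwise a client could pick a fresh bucket on every request. With the key resolved, an endpoint only needs the annotation: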
@BandwidthLimit(value = 300, unit = BandwidthUnit.KB, type = LimitType.IP)
@GetMapping("/download/ip")
public void downloadByIp(HttpServletResponse response) throws IOException {
    // each distinct IP is limited to 300 KB/s
}
Key Code Implementations
Token Bucket Core Algorithm
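public synchronized void acquire(long permits) {
    // 1. Refill tokens based on elapsed time
    refill();
    // 2. Enough tokens: consume them and return immediately
    if (tokens >= permits) {
        tokens -= permits;
        return;
    }
    // 3. Not enough: wait exactly as long as the missing tokens take to arrive
    long deficit = permits - tokens;
    long waitNanos = (deficit * 1_000_000_000L) / refillRate;
    sleepNanos(waitNanos);
    // 4. Consume after waiting. The wait has paid for the deficit, so restart the
    //    refill clock; otherwise the next refill() would credit the sleep time again.
    tokens = 0;
    lastRefillTime = System.nanoTime();
}

private void refill() {
    long now = System.nanoTime();
    long elapsedNanos = now - lastRefillTime;
    long newTokens = (elapsedNanos * refillRate) / 1_000_000_000L;
    if (newTokens > 0) {
        // Advance the clock only when whole tokens were added, so that very
        // short intervals (which round down to 0 tokens) are not thrown away.
        tokens = Math.min(capacity, tokens + newTokens);
        lastRefillTime = now;
    }
}
Response Wrapper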
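public class BandwidthLimitResponseWrapper extends HttpServletResponseWrapper {
    private final TokenBucket sharedTokenBucket;
    private final long bandwidthBytesPerSecond;
    private ServletOutputStream limitedOutputStream; // created lazily on first use
    // Constructor omitted in this excerpt; it stores the arguments passed by the interceptor.

    @Override
    public ServletOutputStream getOutputStream() throws IOException {
        if (sharedTokenBucket == null) {
            // No bucket configured: fall back to the unthrottled stream instead of returning null
            return super.getOutputStream();
        }
        if (limitedOutputStream == null) {
            limitedOutputStream = new RateLimitedOutputStream(
                    super.getOutputStream(),
                    sharedTokenBucket,
                    bandwidthBytesPerSecond);
        }
        return limitedOutputStream;
    }
}
Interceptor Creating Wrapped Response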
@Override
public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
    BandwidthLimit annotation = findAnnotation(handler);
    if (annotation != null) {
        // type, key, capacity, rate, bandwidthBytesPerSecond and chunkSize are derived
        // from the annotation and the request (that code is omitted in this excerpt)
        TokenBucket bucket = limitManager.getBucket(type, key, capacity, rate);
        BandwidthLimitResponseWrapper wrappedResponse = new BandwidthLimitResponseWrapper(
                response, bucket, bandwidthBytesPerSecond, chunkSize);
        // The controller later retrieves the wrapper through BandwidthLimitHelper
        request.setAttribute("BandwidthLimitWrappedResponse", wrappedResponse);
    }
    return true;
}
Controller Using Limited Response
@GetMapping("/download/global")
public void downloadGlobal(HttpServletRequest request, HttpServletResponse response) throws IOException {
    HttpServletResponse limitedResponse = BandwidthLimitHelper.getLimitedResponse(request, response);
    limitedResponse.setContentType("application/octet-stream");
    limitedResponse.setHeader("Content-Disposition", "attachment; filename=test.bin");
    // data is the byte[] payload to send (its construction is omitted in this excerpt)
    limitedResponse.getOutputStream().write(data);
}
Parameter Tuning Guide
Bucket Capacity Selection
Capacity determines how much burst traffic can be absorbed. Recommended settings:
Rate × 0.5 → strict control; bursts are limited to half a second of traffic.
Rate × 1.0 → default; allows a 1‑second burst.
Rate × 2.0 → permits a 2‑second burst, useful for fast first‑screen loads.
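As a rough worked estimate of what these settings mean for a single download, assume the bucket starts full, the sender always has data ready, and chunk granularity is ignored. With payload size N, capacity C and refill rate R (symbols introduced here for illustration), the transfer time T and the effective average rate are approximately:

```latex
T(N) = \max\!\left(0,\ \frac{N - C}{R}\right), \qquad \bar{r}(N) = \frac{N}{T(N)} \quad (N > C)

\text{With } R = 200\ \text{KB/s and } N = 1000\ \text{KB}:
\begin{aligned}
C = 0.5R = 100\ \text{KB} &: \quad T = \tfrac{1000 - 100}{200} = 4.5\ \text{s}, & \bar{r} &\approx 222\ \text{KB/s} \\
C = 1.0R = 200\ \text{KB} &: \quad T = \tfrac{1000 - 200}{200} = 4.0\ \text{s}, & \bar{r} &= 250\ \text{KB/s} \\
C = 2.0R = 400\ \text{KB} &: \quad T = \tfrac{1000 - 400}{200} = 3.0\ \text{s}, & \bar{r} &\approx 333\ \text{KB/s}
\end{aligned}
```

Short transfers therefore run noticeably above the nominal rate when the capacity is large, while for long transfers the average approaches R regardless of capacity.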
Chunk Size Selection
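Chunk size affects smoothness, because each chunk is written only after its tokens have been acquired. A practical starting point is chunkSize = bandwidth / 50, i.e. about 20 ms of data per write. At higher rates the recommended ranges below are smaller than the formula's result (1 MB/s ÷ 50 ≈ 20 KB), so treat the formula as an upper bound:
200 KB/s → 1‑4 KB chunks.
1 MB/s → 4‑8 KB chunks.
>5 MB/s → 8‑16 KB chunks.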
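The effect of chunk size can be illustrated with a small stream wrapper that splits every write into chunks and requests tokens before each one. This is a simplified sketch built on `java.io.FilterOutputStream`, not the article's RateLimitedOutputStream (which has to extend ServletOutputStream); `ChunkedThrottledStream`, the `LongConsumer` acquirer and the helper methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

/** Splits writes into chunks and asks for tokens before each chunk (hypothetical). */
final class ChunkedThrottledStream extends FilterOutputStream {
    private final int chunkSize;
    private final LongConsumer acquirer; // receives the byte count before each write

    ChunkedThrottledStream(OutputStream out, int chunkSize, LongConsumer acquirer) {
        super(out);
        this.chunkSize = chunkSize;
        this.acquirer = acquirer;
    }

    @Override
    public void write(int b) throws IOException {
        acquirer.accept(1);
        out.write(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        int end = off + len;
        while (off < end) {
            int n = Math.min(chunkSize, end - off);
            acquirer.accept(n); // a real implementation would block here, e.g. in TokenBucket.acquire
            out.write(buf, off, n);
            off += n;
        }
    }

    /** Longest pause a single chunk can cause once the bucket is empty. */
    static long maxPauseMillis(int chunkSize, long bytesPerSecond) {
        return chunkSize * 1000L / bytesPerSecond;
    }

    /** Records the chunk sizes produced when totalBytes are written in one call. */
    static List<Long> chunkSizesFor(int totalBytes, int chunkSize) {
        List<Long> acquired = new ArrayList<>();
        try (ChunkedThrottledStream stream = new ChunkedThrottledStream(
                new ByteArrayOutputStream(), chunkSize, n -> acquired.add(n))) {
            stream.write(new byte[totalBytes]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return acquired;
    }

    public static void main(String[] args) {
        System.out.println(chunkSizesFor(10_000, 4096));        // [4096, 4096, 1808]
        System.out.println(maxPauseMillis(4096, 200 * 1024));   // 20
        System.out.println(maxPauseMillis(65_536, 200 * 1024)); // 320
    }
}
```

At 200 KB/s a 4 KB chunk stalls the stream for at most 20 ms at a time, whereas a 64 KB chunk would produce visible 320 ms pauses, which is why smaller chunks feel smoother at low rates.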
Automatic vs Manual Chunk Size
// automatic (recommended)
@BandwidthLimit(value = 200, unit = BandwidthUnit.KB, chunkSize = -1)
// manual specification
@BandwidthLimit(value = 200, unit = BandwidthUnit.KB, chunkSize = 4096)
Conclusion
The article demonstrates a complete, production‑ready solution for multi‑dimensional bandwidth throttling in Spring Boot 3, built on the token‑bucket algorithm, HandlerInterceptor, and response wrapping. It supports global, API, user, and IP dimensions and provides concrete tuning advice for bucket capacity and chunk size.
Source code: https://github.com/yuboon/java-examples/tree/master/springboot-netspeed-limit
