How to Make Spring Cloud Gateway Handle a Million Concurrent Requests

This article explains how Spring Cloud Gateway leverages a reactive, non‑blocking architecture, OS‑level tuning, zero‑copy networking, and built‑in rate‑limiting and circuit‑breaker features to reliably sustain million‑level concurrent traffic in production environments.

Architect Chen

In high-traffic systems the gateway is a critical component. Spring Cloud Gateway (SCG) is built on Spring WebFlux and Reactor Netty, which suits it to very large numbers of concurrent connections because it avoids the traditional thread-per-request model.

Asynchronous Non‑Blocking Model

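SCG processes routing, filter chains, and response forwarding within a reactive pipeline, so threads are not blocked while waiting on I/O. A small, fixed set of event-loop threads can therefore serve far more connections than a thread-per-request server, and thread exhaustion under heavy load is avoided, provided custom filters do not make blocking calls on those threads.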

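# Event-loop threads are not a server.netty.* property. Reactor Netty reads them from
# a JVM system property, e.g. -Dreactor.netty.ioWorkerCount=16
# (this article recommends CPU cores * 2; the default is the core count, minimum 4).
# The ceiling on inbound connections comes from OS limits such as file descriptors,
# not from a gateway property.
server:
  netty:
    idle-timeout: 30s               # Close idle inbound connections
spring:
  cloud:
    gateway:
      httpclient:
        pool:
          type: fixed               # max-connections only applies to a fixed pool
          max-connections: 100000   # Backend HTTP connection pool
          max-idle-time: 30s        # Evict idle backend connections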
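
To make the non-blocking model concrete, the sketch below simulates many slow backend calls served by a single thread. It uses only JDK classes (CompletableFuture and a scheduled executor) as a stand-in for Reactor Netty's event loop; the class name NonBlockingSketch, the method serveAll, and the simulated latency are assumptions made for illustration, not SCG APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class; a JDK-only analogy, not part of SCG or Reactor Netty. */
final class NonBlockingSketch {

    /**
     * Simulates `requests` backend calls that each take `latencyMs`, all driven by
     * one scheduler thread, and returns the total wall-clock time in milliseconds.
     */
    static long serveAll(int requests, long latencyMs) {
        ScheduledExecutorService eventLoop = Executors.newSingleThreadScheduledExecutor();
        try {
            long start = System.nanoTime();
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                CompletableFuture<String> response = new CompletableFuture<>();
                // Register a completion callback for the simulated I/O.
                // No thread sleeps or waits while the "backend" is busy.
                eventLoop.schedule(() -> {
                    response.complete("response-" + id);
                }, latencyMs, TimeUnit.MILLISECONDS);
                pending.add(response);
            }
            // Only the caller waits here, once, for the whole batch.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            eventLoop.shutdown();
        }
    }

    public static void main(String[] args) {
        long elapsed = serveAll(10_000, 100);
        System.out.println("10000 simulated requests on 1 thread took " + elapsed + " ms");
    }
}
```

With a blocking thread-per-request design, one thread would need 10,000 × 100 ms to work through the same batch sequentially. Here the waits overlap, so the total stays close to a single request's latency plus scheduling overhead.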

High‑Performance Core Techniques

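Beyond the framework, OS tuning is essential. Zero-copy techniques reduce CPU load in two ways: Netty's CompositeByteBuf presents several buffers as one without copying their contents in user space, and file transfers through FileRegion (sendfile) let the kernel move data without passing it through user space. Each connection consumes a file descriptor, so system limits such as ulimit -n (maximum open file descriptors) and TCP settings such as net.core.somaxconn (accept-queue length) must be raised before the network stack can accept that many connections.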
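
The two zero-copy ideas can be illustrated with JDK classes alone. The sketch below is an analogy, not Netty code: a gathering write stands in for CompositeByteBuf (the application never merges header and body into one new buffer), and FileChannel.transferTo stands in for a FileRegion transfer. The class ZeroCopySketch, its method names, and the temporary files are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo class using only the JDK; it does not call Netty's API. */
final class ZeroCopySketch {

    /** Writes header and body with a gathering write; no merged buffer is built first. */
    static void writeParts(Path target, ByteBuffer header, ByteBuffer body) throws IOException {
        try (FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE)) {
            ByteBuffer[] parts = {header, body};
            while (header.hasRemaining() || body.hasRemaining()) {
                out.write(parts); // drains both buffers in order
            }
        }
    }

    /** Channel-to-channel copy; many operating systems can do this without a user-space copy. */
    static void transfer(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE)) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
        }
    }

    /** Runs both steps on temporary files and returns the content of the final copy. */
    static String demo() {
        try {
            Path first = Files.createTempFile("zero-copy-src", ".bin");
            Path second = Files.createTempFile("zero-copy-dst", ".bin");
            try {
                writeParts(first, ascii("HEADER|"), ascii("BODY"));
                transfer(first, second);
                return new String(Files.readAllBytes(second), StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(first);
                Files.deleteIfExists(second);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static ByteBuffer ascii(String text) {
        return ByteBuffer.wrap(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints HEADER|BODY
    }
}
```

Whether transferTo actually avoids the user-space copy depends on the operating system and the channel types involved, so treat the benefit as something to measure rather than assume.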

[Figure: zero-copy illustration]

Rate Limiting (Avalanche Protection)

In flash-sale or traffic-spike scenarios, rate limiting at the ingress is mandatory to protect downstream services and databases. SCG provides a RequestRateLimiter filter, whose Redis-backed implementation uses a token-bucket algorithm, and a CircuitBreaker filter that integrates with Resilience4j through Spring Cloud CircuitBreaker.

[Figure: rate limiting diagram]

spring:
  cloud:
    gateway:
      routes:
        - id: my_route
          uri: lb://my-service
          predicates:
            - Path=/api/**
          filters:
            # Rate limiter and circuit breaker are two separate filters
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 10   # tokens added per second (steady rate)
                redis-rate-limiter.burstCapacity: 20   # bucket size, allows bursts of 20
            - name: CircuitBreaker
              args:
                name: myCircuitBreaker
                fallbackUri: forward:/fallback

The Redis-backed rate limiter can keep a separate bucket per IP, user, or API (the key is chosen by a KeyResolver bean) to control traffic bursts precisely, while the circuit breaker trips automatically on downstream timeouts or high error rates and returns a fallback response.
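
To show how replenishRate and burstCapacity interact, here is a minimal in-memory token bucket. It is a sketch of the algorithm only, not SCG's RedisRateLimiter, which keeps its state in Redis and evaluates it with a Lua script. The class TokenBucketSketch, its methods, and the injected clock are hypothetical names introduced for illustration.

```java
import java.util.function.LongSupplier;

/** Hypothetical in-memory token bucket; SCG's real limiter keeps this state in Redis. */
final class TokenBucketSketch {
    private final long replenishRate;       // tokens added per second
    private final long burstCapacity;       // maximum tokens the bucket can hold
    private final LongSupplier clockSeconds; // injected clock, in whole seconds
    private long tokens;
    private long lastRefill;

    TokenBucketSketch(long replenishRate, long burstCapacity, LongSupplier clockSeconds) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.clockSeconds = clockSeconds;
        this.tokens = burstCapacity;         // start with a full bucket
        this.lastRefill = clockSeconds.getAsLong();
    }

    /** Refills for the elapsed seconds, then takes one token if any is available. */
    synchronized boolean tryAcquire() {
        long now = clockSeconds.getAsLong();
        long elapsed = Math.max(0, now - lastRefill);
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefill = now;
        if (tokens >= 1) {
            tokens--;
            return true;
        }
        return false;
    }

    /** Sends `attempts` requests back to back and counts how many are allowed. */
    static int countAllowed(TokenBucketSketch bucket, int attempts) {
        int allowed = 0;
        for (int i = 0; i < attempts; i++) {
            if (bucket.tryAcquire()) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        long[] now = {0};                     // fake clock so the demo is deterministic
        TokenBucketSketch bucket = new TokenBucketSketch(10, 20, () -> now[0]);
        System.out.println(countAllowed(bucket, 25)); // prints 20: the initial burst
        now[0] = 1;                           // one second later, 10 tokens are back
        System.out.println(countAllowed(bucket, 25)); // prints 10
    }
}
```

With replenishRate 10 and burstCapacity 20, a client can spend 20 requests at once but only 10 per second on average afterwards, which is the behavior the route configuration above asks for.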

Conclusion

Combining SCG's reactive, non‑blocking core, OS‑level optimizations, zero‑copy networking, and built‑in rate‑limiting and circuit‑breaker capabilities provides a robust solution that can sustain million‑level concurrent requests without degrading stability.
