Can Spring Boot Handle 500k QPS? A Proven Architecture Demonstrates the Answer
The article debunks the myth that Spring Boot cannot scale, showing that with WebFlux, Netty, Reactive Redis, GraalVM Native Image and horizontal scaling a Spring Boot service can reliably achieve 500,000 queries per second, backed by concrete benchmark data and a full demo implementation.
Two long‑standing extreme views about Spring Boot exist: it is only for CRUD and cannot handle high concurrency, or it can magically reach millions of QPS with a few tweaks. Both are false. The default Spring Boot configuration cannot sustain 500 k QPS, but a properly designed architecture can.
Many modern systems—AI inference APIs, real‑time risk control, IoT event reporting, edge data aggregation, global SaaS API gateways—share a common demand: massive request volume, lightweight logic, and latency sensitivity. This workload matches the strengths of WebFlux combined with Redis.
The direct answer to “Can Spring Boot reach 500 k QPS?” is: not with the default stack. Tomcat + Spring MVC will not get there; blocking I/O with JDBC has a low ceiling; and the default thread-per-request model runs into thread and GC pressure long before that load. By switching to Spring WebFlux, Netty (non-blocking I/O), Project Reactor, Redis/Kafka, GraalVM Native Image and horizontal scaling, Spring Boot can become the core of a high-concurrency system.
Step 1: Replace Tomcat with WebFlux + Netty
Tomcat's servlet model dedicates one worker thread to each in-flight request, so high concurrency means many threads and heavy context-switch overhead. Netty's event-driven, non-blocking architecture serves many concurrent connections from a small, fixed set of event-loop threads, which leaves more CPU time for useful work.
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>
Step 2: Eliminate All Blocking Points
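In a Netty-based system a handful of event-loop threads serve all connections, so a single blocking call stalls every request queued on that thread and throughput drops sharply. The article lists typical blockers and their reactive replacements: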
JDBC / JPA → R2DBC
File I/O → Reactive file APIs
RestTemplate → WebClient
Thread.sleep, synchronized → non‑blocking reactive patterns
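When a blocking call cannot be replaced right away, the usual fallback is to move it off the event-loop thread onto a dedicated pool. The sketch below shows that idea with plain JDK executors so it runs without extra dependencies; in Reactor the analogous idiom is Mono.fromCallable(...).subscribeOn(Schedulers.boundedElastic()). The class name, the blockingLookup stand-in and the pool size are assumptions made for illustration, not code from the article.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; names are illustrative, not from the article.
final class BlockingOffloadSketch {

    // Stand-in for a legacy blocking call (for example a JDBC query).
    // The sleep simulates time spent waiting on I/O.
    static String blockingLookup(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "value-for-" + key;
    }

    // Schedules the blocking call on the dedicated pool and returns at once,
    // so the calling (event-loop) thread is free to serve other requests.
    static CompletableFuture<String> lookupAsync(String key, ExecutorService blockingPool) {
        return CompletableFuture.supplyAsync(() -> blockingLookup(key), blockingPool);
    }

    public static void main(String[] args) {
        // Assumed pool size; size it to the blocking resource, not to CPU count.
        ExecutorService blockingPool = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture<String> result =
                    lookupAsync("user:42", blockingPool).thenApply(String::toUpperCase);
            // join() is acceptable in a demo main; a real handler would return the future.
            System.out.println(result.join());
        } finally {
            blockingPool.shutdown();
        }
    }
}
```

This only contains the damage: the pool still caps throughput at its thread count, which is why the article prefers truly non-blocking drivers where they exist.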
Step 3: Let the Database Handle Only Asynchronous Writes
At 500 k QPS the database becomes the bottleneck; Redis should serve as the first entry point. The database is used for async batch writes with eventual consistency.
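The asynchronous batch-write idea can be sketched as a bounded write-behind buffer: request handlers enqueue events without waiting, and a background task drains them in batches toward the database. This is a minimal in-memory illustration under stated assumptions; in the article's architecture the buffer role would be played by Redis or Kafka, and the class name, capacity, batch size and the batchWriter callback (a stand-in for a bulk INSERT) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical write-behind buffer; an in-memory queue stands in for Redis/Kafka,
// so buffered events are lost if the process crashes.
final class WriteBehindBuffer<T> {

    private final BlockingQueue<T> queue;
    private final int batchSize;
    private final Consumer<List<T>> batchWriter; // stand-in for a bulk database write

    WriteBehindBuffer(int capacity, int batchSize, Consumer<List<T>> batchWriter) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.batchSize = batchSize;
        this.batchWriter = batchWriter;
    }

    // Non-blocking enqueue; returns false when the buffer is full so the caller can shed load.
    boolean offer(T event) {
        return queue.offer(event);
    }

    // Drains up to batchSize events, passes them to the writer, and returns how many were flushed.
    int flushOnce() {
        List<T> batch = new ArrayList<>(batchSize);
        queue.drainTo(batch, batchSize);
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
        }
        return batch.size();
    }

    public static void main(String[] args) {
        WriteBehindBuffer<String> buffer = new WriteBehindBuffer<>(
                1_000, 100, batch -> System.out.println("writing batch of " + batch.size()));
        for (int i = 0; i < 250; i++) {
            buffer.offer("event-" + i);
        }
        // In a service, a scheduled task would call flushOnce() periodically instead.
        while (buffer.flushOnce() > 0) {
            // keep draining until the buffer is empty
        }
    }
}
```

Readers see the new value in Redis immediately while the database catches up a batch later, which is the eventual consistency the step describes.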
Demo Project Structure
/usr/local/app/high-qps-demo
├── src/main/java
│   └── com/icoderoad
│       ├── Application.java
│       ├── config/RedisConfig.java
│       ├── api/CacheController.java
│       └── service/CacheService.java
└── src/main/resources/application.yml
Key Code Snippets
Redis Reactive Dependency
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis-reactive</artifactId>
</dependency>
Reactive Redis Configuration
package com.icoderoad.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.ReactiveRedisConnectionFactory;
import org.springframework.data.redis.core.ReactiveStringRedisTemplate;

@Configuration
public class RedisConfig {

    // Named reactiveStringRedisTemplate so that Spring Boot's auto-configured
    // bean of the same name backs off and only one template is injectable.
    @Bean
    public ReactiveStringRedisTemplate reactiveStringRedisTemplate(ReactiveRedisConnectionFactory factory) {
        return new ReactiveStringRedisTemplate(factory);
    }
}
Cache Service (non-blocking)
package com.icoderoad.service;

import org.springframework.data.redis.core.ReactiveStringRedisTemplate;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Mono;

@Service
public class CacheService {

    private final ReactiveStringRedisTemplate redisTemplate;

    public CacheService(ReactiveStringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    // Emits the cached value, or the literal "EMPTY" when the key is absent.
    public Mono<String> getValue(String key) {
        return redisTemplate.opsForValue().get(key).defaultIfEmpty("EMPTY");
    }

    // Emits true once Redis acknowledges the write.
    public Mono<Boolean> setValue(String key, String value) {
        return redisTemplate.opsForValue().set(key, value);
    }
}
WebFlux Controller (stateless)
package com.icoderoad.api;

import com.icoderoad.service.CacheService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
public class CacheController {

    private final CacheService cacheService;

    public CacheController(CacheService cacheService) {
        this.cacheService = cacheService;
    }

    // Returns the Mono directly; WebFlux subscribes to it and writes the
    // response when Redis replies, without holding a thread in between.
    @GetMapping("/cache/{key}")
    public Mono<String> get(@PathVariable String key) {
        return cacheService.getValue(key);
    }
}
application.yml (essential parameters)
server:
  port: 8080
spring:
  data:
    # Spring Boot 3.x prefix; on Spring Boot 2.x these keys live under spring.redis
    redis:
      host: 127.0.0.1
      port: 6379
      timeout: 2s
logging:
  level:
    root: ERROR
Performance Results
In the reported benchmark, a single non-native instance reaches 80–120 k QPS with P99 latency under 3 ms. With GraalVM Native Image, 10 instances at 50 k QPS each yield roughly 500 k QPS in aggregate.
👉 10 instances × 50 k QPS ≈ 500 k QPS
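The same arithmetic can be turned into a small sizing helper that also leaves headroom, since running every instance at its measured peak leaves nothing for traffic spikes or a failed node. This is a sketch only: the class name and the 70 % target utilization are assumptions for illustration, and the per-instance figure is the article's 50 k QPS.

```java
// Hypothetical sizing helper; the 70 % utilization target is an assumed example value.
final class CapacityPlan {

    // Number of instances needed to serve targetQps when each instance is loaded
    // to utilizationPercent of its measured peak. Uses integer ceiling division.
    static int instancesNeeded(long targetQps, long perInstancePeakQps, int utilizationPercent) {
        if (targetQps < 0 || perInstancePeakQps <= 0
                || utilizationPercent <= 0 || utilizationPercent > 100) {
            throw new IllegalArgumentException("invalid capacity inputs");
        }
        long usableQps = perInstancePeakQps * utilizationPercent / 100;
        if (usableQps == 0) {
            throw new IllegalArgumentException("usable QPS per instance rounds to zero");
        }
        return (int) ((targetQps + usableQps - 1) / usableQps);
    }

    public static void main(String[] args) {
        System.out.println(instancesNeeded(500_000, 50_000, 100)); // prints 10
        System.out.println(instancesNeeded(500_000, 50_000, 70));  // prints 15
    }
}
```

At full utilization the result matches the article's 10 instances; with 30 % headroom the same target needs 15.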
Internal Communication
For inter‑service calls, the article recommends gRPC + HTTP/2 over REST, and HTTP/3/QUIC at the edge, delivering lower latency, higher throughput and reduced CPU consumption.
Horizontal Scaling
In this architecture no single Java instance is expected to sustain 500 k QPS on its own. Scaling out as 5 × 100 k or 10 × 50 k instances is straightforward in a cloud-native environment.
Native Image as a QPS Amplifier
The article credits Spring Boot 3 + GraalVM Native Image with millisecond-level startup, a memory reduction of more than 60 %, and a 5–10× QPS boost. System-level tuning is also essential, for example net.core.somaxconn=65535, net.ipv4.ip_local_port_range="10000 65535" and net.core.netdev_max_backlog=4096; otherwise the OS drops or refuses connections before the service reaches its own limit.
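One way to catch missing kernel tuning before a load test is a small preflight check that compares the live values against the minimums quoted above. The sketch below is an assumption-laden illustration: the class and method names are hypothetical, it covers only the two single-integer keys (not ip_local_port_range, which holds two values), and it relies on Linux exposing sysctl keys as files under /proc/sys with dots replaced by slashes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical preflight check; not part of the article's demo project.
final class SysctlPreflight {

    // Returns the keys whose current value is missing or lower than the required minimum.
    static List<String> belowMinimum(Map<String, Long> current, Map<String, Long> minimums) {
        List<String> failing = new ArrayList<>();
        for (Map.Entry<String, Long> entry : minimums.entrySet()) {
            Long actual = current.get(entry.getKey());
            if (actual == null || actual < entry.getValue()) {
                failing.add(entry.getKey());
            }
        }
        return failing;
    }

    // Maps a sysctl key such as net.core.somaxconn to its /proc/sys file.
    static Path procPath(String key) {
        return Path.of("/proc/sys", key.replace('.', '/'));
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> minimums = new LinkedHashMap<>();
        minimums.put("net.core.somaxconn", 65535L);
        minimums.put("net.core.netdev_max_backlog", 4096L);

        Map<String, Long> current = new LinkedHashMap<>();
        for (String key : minimums.keySet()) {
            Path path = procPath(key);
            if (Files.isReadable(path)) { // absent on non-Linux hosts; reported as failing
                current.put(key, Long.parseLong(Files.readString(path).trim()));
            }
        }
        System.out.println("Below recommended minimum: " + belowMinimum(current, minimums));
    }
}
```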
Real‑World Deployable Architecture
Edge
  └─ Cloudflare / Fastly
LB
  └─ NGINX / Envoy
App
  └─ 10 × Spring Boot WebFlux (Native)
Data
  ├─ Redis Cluster
  ├─ Kafka
  └─ Async DB
What Kills Performance
Tomcat
JDBC
RestTemplate
Heavy business logic
Global locks
Massive object creation
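Of the items above, global locks are often the easiest to remove. As one illustration, a hot request counter guarded by a single synchronized block can be replaced with java.util.concurrent.atomic.LongAdder, which spreads updates across internal cells to reduce contention. The class name and the thread and iteration counts below are assumptions chosen for the demo, not figures from the article.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical lock-free hit counter; replaces a counter guarded by one global lock.
final class RequestCounter {

    private final LongAdder hits = new LongAdder();

    // Safe to call from many threads at once without taking a lock.
    void record() {
        hits.increment();
    }

    // Sums the internal cells; exact once all writers have finished.
    long total() {
        return hits.sum();
    }

    // Increments one counter from several threads and returns the final total.
    static long countConcurrently(int threads, int incrementsPerThread) {
        RequestCounter counter = new RequestCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.record();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(8, 100_000)); // prints 800000
    }
}
```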
Conclusion
Spring Boot can power high‑concurrency systems if you adopt WebFlux + Netty, place Redis at the front, keep the entire call chain non‑blocking, leverage GraalVM Native Image, and scale horizontally. The default Tomcat‑based stack, blocking I/O, and monolithic design remain the real bottlenecks.
LuTiao Programming
