
Can Spring Boot Handle 500k QPS? A Proven Architecture Demonstrates the Answer

The article debunks the myth that Spring Boot cannot scale, showing that with WebFlux, Netty, Reactive Redis, GraalVM Native Image and horizontal scaling a Spring Boot service can reliably achieve 500,000 queries per second, backed by concrete benchmark data and a full demo implementation.

LuTiao Programming

Dec 26, 2025

Two extreme views of Spring Boot persist: that it is only good for CRUD and cannot handle high concurrency, or that a few tweaks will magically take it to millions of QPS. Both are false. The default Spring Boot configuration cannot sustain 500 k QPS, but a properly designed architecture can.

Many modern systems—AI inference APIs, real‑time risk control, IoT event reporting, edge data aggregation, global SaaS API gateways—share a common demand: massive request volume, lightweight logic, and latency sensitivity. This workload matches the strengths of WebFlux combined with Redis.

Direct answer to “Can Spring Boot reach 500 k QPS?”: not with the defaults. Tomcat + Spring MVC cannot get there, blocking I/O + JDBC has a low ceiling, and the default thread‑per‑request model creates heavy memory and GC pressure under load. By switching to Spring WebFlux on Netty (non‑blocking I/O) with Project Reactor, putting Redis/Kafka in front of the database, compiling with GraalVM Native Image, and scaling horizontally, Spring Boot can become the core of a high‑concurrency system.

Step 1: Replace Tomcat with WebFlux + Netty

Tomcat’s servlet model ties up one worker thread for each in‑flight request, so high concurrency means large thread pools and heavy context‑switch overhead. Netty’s event‑driven, non‑blocking architecture uses far fewer threads (typically on the order of one event loop per CPU core), keeps CPU utilization higher, and supports many more concurrent connections.

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>
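To make the thread-model argument concrete, here is a rough back-of-the-envelope sketch based on Little's Law (requests in flight = arrival rate × latency). The 200-thread figure is Spring Boot's default Tomcat worker limit (server.tomcat.threads.max); the 100 ms blocking latency and the class and method names are assumptions chosen only for illustration, not measurements from the article.

```java
// Back-of-the-envelope thread arithmetic; all inputs are illustrative assumptions.
final class ThreadBudget {

    // Little's Law: how many requests are in flight at a given rate and latency.
    // In a thread-per-request model this is also the number of busy threads.
    static long inFlight(long qps, long latencyMillis) {
        return qps * latencyMillis / 1000;
    }

    // Throughput ceiling of a blocking pool: each thread can complete
    // 1000 / latencyMillis requests per second.
    static long maxQps(long threads, long latencyMillis) {
        return threads * 1000 / latencyMillis;
    }

    public static void main(String[] args) {
        // 200 blocking threads, 100 ms per request: ceiling of 2000 QPS.
        System.out.println(maxQps(200, 100));
        // 50 k QPS with 100 ms of blocking per request: 5000 busy threads.
        System.out.println(inFlight(50_000, 100));
    }
}
```

Under these assumptions a blocking pool either caps out far below the target or needs thousands of threads, which is the overhead the event-loop model is meant to avoid.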

Step 2: Eliminate All Blocking Points

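In a Netty‑based system, blocking an event‑loop thread stalls every connection assigned to that thread, so a single blocking call on the hot path can sharply reduce overall QPS. The article lists typical blockers and their reactive replacements: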

JDBC / JPA → R2DBC

File I/O → Reactive file APIs

RestTemplate → WebClient

Thread.sleep, synchronized → non‑blocking reactive patterns
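The difference between a blocking wait and a non-blocking one can be sketched without any framework. The example below uses the JDK's CompletableFuture as a dependency-free stand-in for Reactor's Mono; it is not the article's code, and the class name, the 50 ms delay, and the "value:" prefix are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustrative only: simulates a slow lookup in blocking and non-blocking style.
final class NonBlockingLookupDemo {

    // Blocking style: the calling thread is parked for the whole wait.
    static String blockingLookup(String key) throws InterruptedException {
        Thread.sleep(50);
        return "value:" + key;
    }

    // Non-blocking style: returns immediately; the value is produced later
    // by a timer-driven executor, so the caller's thread stays free.
    static CompletableFuture<String> nonBlockingLookup(String key) {
        Executor afterDelay = CompletableFuture.delayedExecutor(50, TimeUnit.MILLISECONDS);
        return CompletableFuture.supplyAsync(() -> "value:" + key, afterDelay);
    }

    // Starts n lookups back to back, then gathers the results.
    // The join() here blocks only to collect output for the demo.
    static List<String> lookupAll(int n) {
        List<CompletableFuture<String>> futures = IntStream.range(0, n)
                .mapToObj(i -> nonBlockingLookup("k" + i))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}
```

In a WebFlux handler the same idea applies with Mono and Flux: the handler returns a publisher immediately and never waits on the event-loop thread.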

Step 3: Let the Database Handle Only Asynchronous Writes

At 500 k QPS the database becomes the bottleneck; Redis should serve as the first entry point. The database is used for async batch writes with eventual consistency.
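One way to picture asynchronous batch writes is a write-behind buffer: the request path only enqueues a record, and a background job later drains the queue into the database in batches. The sketch below is a minimal in-memory illustration under that assumption; WriteBehindBuffer, its batch size, and the batchWriter callback (standing in for a batched INSERT) are hypothetical names, and a production system would more likely buffer in Kafka or Redis than in process memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Hypothetical write-behind buffer: enqueue on the hot path, flush in batches later.
final class WriteBehindBuffer<T> {
    private final ConcurrentLinkedQueue<T> pending = new ConcurrentLinkedQueue<>();
    private final int batchSize;
    private final Consumer<List<T>> batchWriter; // stands in for a batched DB write

    WriteBehindBuffer(int batchSize, Consumer<List<T>> batchWriter) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.batchWriter = batchWriter;
    }

    // Called per request: a lock-free enqueue that never touches the database.
    void enqueue(T record) {
        pending.add(record);
    }

    // Called periodically off the request path. Drains everything currently
    // queued, writing at most batchSize records per call to batchWriter.
    // Returns the number of records written.
    int flush() {
        int written = 0;
        List<T> batch = new ArrayList<>(batchSize);
        T next;
        while ((next = pending.poll()) != null) {
            batch.add(next);
            if (batch.size() == batchSize) {
                batchWriter.accept(batch);
                written += batch.size();
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
            written += batch.size();
        }
        return written;
    }
}
```

Because records sit in the buffer until the next flush, readers see the database catch up only after a delay, which is the eventual consistency trade-off the step describes.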

Demo Project Structure

/usr/local/app/high-qps-demo
├── src/main/java
│   └── com/icoderoad
│       ├── Application.java
│       ├── config/RedisConfig.java
│       ├── api/CacheController.java
│       └── service/CacheService.java
└── src/main/resources/application.yml

Key Code Snippets

Redis Reactive Dependency

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-redis-reactive</artifactId>
</dependency>

Reactive Redis Configuration

package com.icoderoad.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.ReactiveRedisConnectionFactory;
import org.springframework.data.redis.core.ReactiveStringRedisTemplate;

@Configuration
public class RedisConfig {
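    // Named reactiveStringRedisTemplate so that Spring Boot's auto-configured bean of
    // the same name backs off and only one ReactiveStringRedisTemplate is registered.
    @Bean
    public ReactiveStringRedisTemplate reactiveStringRedisTemplate(ReactiveRedisConnectionFactory factory) {
        return new ReactiveStringRedisTemplate(factory);
    }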
}

Cache Service (non‑blocking)

package com.icoderoad.service;

import org.springframework.data.redis.core.ReactiveStringRedisTemplate;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Mono;

@Service
public class CacheService {
    private final ReactiveStringRedisTemplate redisTemplate;
    public CacheService(ReactiveStringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }
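    // Emits the cached value, or the literal "EMPTY" when the key is absent.
    public Mono<String> getValue(String key) {
        return redisTemplate.opsForValue().get(key).defaultIfEmpty("EMPTY");
    }
    // Emits true once Redis has stored the value; nothing runs until subscribed.
    public Mono<Boolean> setValue(String key, String value) {
        return redisTemplate.opsForValue().set(key, value);
    }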
}

WebFlux Controller (stateless)

package com.icoderoad.api;

import com.icoderoad.service.CacheService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
public class CacheController {
    private final CacheService cacheService;
    public CacheController(CacheService cacheService) {
        this.cacheService = cacheService;
    }
    @GetMapping("/cache/{key}")
    public Mono<String> get(@PathVariable String key) {
        return cacheService.getValue(key);
    }
}

application.yml (essential parameters)

server:
  port: 8080
spring:
  data:
    redis:            # Spring Boot 3.x prefix; Spring Boot 2.x used spring.redis.*
      host: 127.0.0.1
      port: 6379
      timeout: 2s
logging:
  level:
    root: ERROR

Performance Results

The article reports that a single non‑Native instance sustains 80–120 k QPS with P99 latency under 3 ms. For the 500 k target it budgets 50 k QPS per GraalVM Native Image instance and scales out to 10 instances.

👉 10 instances × 50 k QPS ≈ 500 k QPS
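The sizing arithmetic can be written as a small helper. This is only a sketch: CapacityPlan and its parameters are hypothetical names, and the utilization percentage (how much of each instance's measured QPS you are willing to use) is an added assumption, not something the article specifies.

```java
// Illustrative capacity arithmetic; names and the headroom parameter are assumptions.
final class CapacityPlan {

    // Number of instances needed to serve targetQps when each instance is
    // loaded to at most utilizationPercent of its measured perInstanceQps.
    static long instancesNeeded(long targetQps, long perInstanceQps, int utilizationPercent) {
        if (targetQps < 0 || perInstanceQps <= 0
                || utilizationPercent <= 0 || utilizationPercent > 100) {
            throw new IllegalArgumentException("invalid capacity inputs");
        }
        long usableQps = perInstanceQps * utilizationPercent / 100;
        if (usableQps == 0) {
            throw new IllegalArgumentException("usable QPS per instance rounds to zero");
        }
        // Integer ceiling division: round up so the target is always covered.
        return (targetQps + usableQps - 1) / usableQps;
    }

    public static void main(String[] args) {
        System.out.println(instancesNeeded(500_000, 50_000, 100)); // 10
        System.out.println(instancesNeeded(500_000, 50_000, 70));  // 15
    }
}
```

With every instance run flat out, the article's 10 × 50 k figure falls out directly; leaving 30% headroom per instance raises the count to 15.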

Internal Communication

For inter‑service calls, the article recommends gRPC (which runs over HTTP/2) instead of REST, with HTTP/3/QUIC at the edge, for lower latency, higher throughput and reduced CPU consumption.

Horizontal Scaling

No single Java service can sustain 500 k QPS alone. Scaling patterns such as 5 × 100 k or 10 × 50 k instances are straightforward in a cloud‑native environment.

Native Image as a QPS Amplifier

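According to the article, Spring Boot 3 + GraalVM Native Image provides millisecond‑level startup, more than 60% lower memory use, and a 5–10× QPS boost; a throughput gain of that size depends heavily on the workload and should be confirmed by benchmarking against the JVM build. System‑level tuning (e.g., net.core.somaxconn=65535, net.ipv4.ip_local_port_range="10000 65535", net.core.netdev_max_backlog=4096) is also essential; otherwise the OS drops or refuses connections before the service reaches its limit.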

Real‑World Deployable Architecture

Edge
 └─ Cloudflare / Fastly
LB
 └─ NGINX / Envoy
App
 └─ 10 × Spring Boot WebFlux (Native)
Data
 ├─ Redis Cluster
 ├─ Kafka
 └─ Async DB

What Kills Performance

Tomcat

JDBC

RestTemplate

Heavy business logic

Global locks

Massive object creation

Conclusion

Spring Boot can power high‑concurrency systems if you adopt WebFlux + Netty, place Redis at the front, keep the entire call chain non‑blocking, leverage GraalVM Native Image, and scale horizontally. The default Tomcat‑based stack, blocking I/O, and monolithic design remain the real bottlenecks.
