Boost API Speed 10× with a Three‑Level Cache Pyramid in Spring Boot 3
This article explains why adding Redis alone may still be slow, introduces a three‑level cache pyramid (Caffeine L1, Redis L2, DB L3) built with Spring Boot 3, and provides complete configuration, code, warm‑up, monitoring, and benchmark results that reduce response time from 28 ms to 2 ms while cutting CPU usage by 35%.
1. Why Redis alone can still be slow
Typical optimization steps are to cut database I/O with a cache, cut network I/O with a local cache, and cut serialization with zero‑copy. A remote Redis round‑trip costs 1‑2 ms, but under high concurrency the CPU context switch, serialization, and network jitter can amplify this to 5‑10 ms, whereas a local cache hit takes only tens of nanoseconds.
2. Three‑level cache pyramid
Using Spring Boot 3 we build a pyramid: L1 Caffeine (local) → L2 Redis (remote) → L3 Database . The solution includes back‑pressure, warm‑up, hot‑key handling, large‑key sharding, and requires no extra components – just copy‑paste and run.
3. Data‑heat distribution
In a single‑machine scenario with 10 k QPS, each 1 % increase in L1 hit rate reduces CPU usage by about 3 %.
4. Environment & dependencies (only three)
<!-- pom.xml -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>caffeine</artifactId>
<version>3.1.8</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>Run the application directly with java -jar without additional components.
5. Configuration: enable Caffeine and Redis together
spring:
cache:
type: caffeine # default uses L1
caffeine:
spec: maximumSize=10000,expireAfterWrite=60s
redis:
host: 127.0.0.1
port: 6379
timeout: 200ms
lettuce:
pool:
max-active: 646. Core encapsulation – three‑level cache template
@Component
@Slf4j
public class CacheTemplate<K, V> {
private final Cache<K, V> local = Caffeine.newBuilder()
.maximumSize(10_000)
.expireAfterWrite(Duration.ofSeconds(60))
.recordStats()
.build();
@Autowired
private RedisTemplate<K, V> redisTemplate;
/**
* Pyramid lookup
*/
public V get(K key, Supplier<V> dbFallback) {
// L1 local
V v = local.getIfPresent(key);
if (v != null) {
log.debug("L1 hit {}", key);
return v;
}
// L2 Redis
v = redisTemplate.opsForValue().get(key);
if (v != null) {
local.put(key, v); // back‑fill L1
log.debug("L2 hit {}", key);
return v;
}
// L3 DB
v = dbFallback.get();
if (v != null) {
set(key, v); // write‑through
}
return v;
}
/**
* Write‑through (L1 + L2)
*/
public void set(K key, V value) {
local.put(key, value);
redisTemplate.opsForValue().set(key, value, Duration.ofMinutes(5));
}
/**
* Delete (L1 + L2)
*/
public void evict(K key) {
local.invalidate(key);
redisTemplate.delete(key);
}
@Scheduled(fixedDelay = 30_000)
public void printStats() {
log.info("L1 hitRate={}", local.stats().hitRate());
}
}7. Business usage – one‑line cache call
@RestController
@RequestMapping("/api/item")
@RequiredArgsConstructor
public class ItemController {
private final CacheTemplate<Long, ItemDTO> cache;
private final ItemRepository itemRepository;
@GetMapping("/{id}")
public ItemDTO getItem(@PathVariable Long id) {
return cache.get(id, () -> itemRepository.findById(id).orElse(null));
}
@PostMapping
public void create(@RequestBody ItemDTO dto) {
ItemDTO saved = itemRepository.save(dto);
cache.set(saved.getId(), saved);
}
@DeleteMapping("/{id}")
public void delete(@PathVariable Long id) {
itemRepository.deleteById(id);
cache.evict(id);
}
}8. Observed results
Log output after startup:
L1 hit 0.83
L2 hit 0.15
DB hit 0.02Response time dropped from 28 ms to 2 ms , and CPU usage decreased by about 35 % .
9. High‑concurrency pitfalls (four common issues)
The article lists four typical problems that appear under high load, such as cache stampede, large keys, TTL expiration spikes, and insufficient capacity, and provides mitigation strategies.
10. Local warm‑up & back‑pressure
At application start, hot keys are pre‑loaded asynchronously to avoid cold‑cache penetration:
@EventListener(ApplicationReadyEvent.class)
public void warm() {
List<Long> hotIds = itemRepository.findHotIds(PageRequest.of(0, 200));
hotIds.parallelStream().forEach(id ->
cache.set(id, itemRepository.findById(id).orElse(null)));
}Parallel streams control concurrency using the default ForkJoinPool.commonPool().
11. Benchmark results
Environment: Mac M2 8 GB, 4 concurrent threads, 60 s
Tool: wrk2 -R 5000 -d 60s -c 50 Benchmark image:
12. Monitoring & alerting
Caffeine provides built‑in statistics; Micrometer exports them to Prometheus:
MeterBinder caffeineMetrics = registry ->
CaffeineMetrics.monitor(registry, local, "l1_cache");Grafana panels to watch: l1_cache_hit_rate < 70% → alert l1_cache_eviction_count spikes → capacity issue Redis keyspace_hits / (hits+misses) < 50% → large key or cache penetration
13. Extension – multi‑cache annotation
Spring Cache natively supports a single cache; a custom MultiCacheable annotation enables layered caching:
@Target(METHOD)
@Retention(RUNTIME)
public @interface MultiCacheable {
String[] cacheNames(); // {"l1", "l2"}
String key();
}An AOP interceptor processes caches in order L1 → L2 → DB, keeping business code untouched.
14. Conclusion
The pyramid model separates data by heat, placing hot data in L1, warm data in L2, and cold data in the DB.
Back‑pressure and random TTL prevent cache avalanche.
Warm‑up and comprehensive monitoring make the system observable.
When all three steps are applied, API latency can improve tenfold.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architect's Guide
Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
