
Preventing and Recovering from Service Overload Caused by Cache Failures

This article analyzes how introducing caches can cause service overload, examines five cache-get patterns, and proposes prevention, recovery, and flow-control strategies for both the client and server sides to ensure system stability.


Caches are widely used in modern systems, but introducing one can create hidden overload risks when external request traffic spikes, leading to request queuing, service unavailability, and an eventual system crash. The article presents a case where a client system (A) relies heavily on a cache before calling a server system (B), and describes three primary overload causes: proxy failures, cache failures, and cache-miss recovery storms.

Five cache‑get handling patterns are discussed: (1) simple timeout mode, (2) conventional timeout mode, (3) simple refresh mode, (4) conventional refresh mode, and (5) refresh‑renewal mode. The differences lie in whether each thread triggers a remote fetch, whether threads wait for an ongoing fetch, and whether stale values are returned during refresh.
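The refresh-renewal mode can be sketched as follows. This is a minimal illustration of the idea described above, not the article's implementation: class and method names are invented for the example, and the stale value is returned immediately while exactly one background task performs the remote fetch, so no caller thread blocks on a refresh.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Sketch of the refresh-renewal pattern: serve the stale value, renew asynchronously.
class RefreshRenewalCache<V> {
    private volatile V value;
    private volatile long loadedAt;
    private final long ttlMillis;
    private final Supplier<V> loader; // the remote fetch, e.g. a call to server B
    private final AtomicBoolean refreshing = new AtomicBoolean(false);
    private final ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // background refresher must not keep the JVM alive
        return t;
    });

    RefreshRenewalCache(long ttlMillis, Supplier<V> loader) {
        this.ttlMillis = ttlMillis;
        this.loader = loader;
    }

    V get() {
        V v = value;
        if (v == null) { // first call: no value yet, load synchronously once
            v = loader.get();
            value = v;
            loadedAt = System.currentTimeMillis();
            return v;
        }
        boolean stale = System.currentTimeMillis() - loadedAt > ttlMillis;
        // If stale, at most one thread wins the CAS and refreshes in the background;
        // every caller, including the winner, returns the old value immediately.
        if (stale && refreshing.compareAndSet(false, true)) {
            pool.submit(() -> {
                try {
                    value = loader.get();
                    loadedAt = System.currentTimeMillis();
                } finally {
                    refreshing.set(false);
                }
            });
        }
        return v;
    }
}
```

Because only the CAS winner triggers a remote fetch, a burst of concurrent reads on an expired key produces one call to the backend instead of a recovery storm.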

For the client side, the article recommends the conventional timeout or refresh modes, especially the asynchronous refresh-renewal mode, to reduce the chance of overload when the cache becomes unavailable or stale. It also outlines strategies for handling distributed cache outages, such as logging and using default values, probabilistic request forwarding, or querying a health-check interface on the server.
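The probabilistic-forwarding fallback mentioned above can be sketched like this. The class name, the ratio, and the default-value behavior are assumptions for illustration: during a cache outage, only a small fraction of requests reach server B, while the rest receive a default value, keeping B within capacity.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

// Sketch of probabilistic request forwarding during a distributed-cache outage.
class DegradedReader<V> {
    private final double forwardRatio; // e.g. 0.05 forwards roughly 5% of requests
    private final V defaultValue;
    private final Supplier<V> remoteCall; // direct call to server B

    DegradedReader(double forwardRatio, V defaultValue, Supplier<V> remoteCall) {
        this.forwardRatio = forwardRatio;
        this.defaultValue = defaultValue;
        this.remoteCall = remoteCall;
    }

    V read() {
        // A small random sample of requests still hits the backend,
        // so some callers get fresh data without overloading B.
        if (ThreadLocalRandom.current().nextDouble() < forwardRatio) {
            return remoteCall.get();
        }
        // The majority fall back to a safe default (and would log the degradation).
        return defaultValue;
    }
}
```

The ratio would be tuned so that the forwarded volume stays below server B's known capacity, and it could be raised gradually as B recovers.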

On the server side, two main overload‑protection techniques are presented: flow control (based on traffic thresholds or host health) and service degradation (disabling non‑critical APIs). Flow control can be implemented at reverse proxies (e.g., Nginx) or via service‑governance systems, while degradation should be combined with other measures because overload often exceeds capacity by an order of magnitude.
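Threshold-based flow control can be sketched with a fixed-window counter. This is a deliberately minimal illustration of the idea, not the article's mechanism: production deployments would typically use Nginx's rate limiting or a service-governance layer, as noted above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of threshold-based flow control: a fixed-window request counter.
class FixedWindowLimiter {
    private final long limitPerWindow;
    private final long windowMillis;
    private final AtomicLong windowStart = new AtomicLong(0);
    private final AtomicLong count = new AtomicLong(0);

    FixedWindowLimiter(long limitPerWindow, long windowMillis) {
        this.limitPerWindow = limitPerWindow;
        this.windowMillis = windowMillis;
    }

    boolean tryAcquire() {
        long now = System.currentTimeMillis();
        long start = windowStart.get();
        // One winner rolls the window forward and resets the counter.
        if (now - start >= windowMillis && windowStart.compareAndSet(start, now)) {
            count.set(0);
        }
        // Requests over the threshold are rejected immediately instead of
        // queuing, which is what protects the server from overload.
        return count.incrementAndGet() <= limitPerWindow;
    }
}
```

Rejecting fast matters here: queuing excess requests would only delay the crash, while shedding them keeps the in-flight load bounded.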

Additional recommendations include dynamic scaling, gradual traffic ramp‑up after a crash, and careful monitoring of cache key lifetimes to avoid memory‑leak issues in refresh‑based caches. The conclusion emphasizes adopting asynchronous refresh‑renewal caching on the client, probabilistic fallback when the cache is down, and proper flow‑control thresholds on the server to prevent and mitigate overload.

Tags: backend, distributed systems, cache, resilience, flow control, service overload
Written by

Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
