Designing a System That Can Survive Sudden Spikes of One Million QPS

The article analyzes why simply adding Redis nodes cannot handle a sudden million‑QPS surge, then presents three practical solutions—key sharding, multi‑level caching with hot‑key detection, and distributed‑lock‑based fallback—to build a resilient high‑concurrency backend.

Lobster Programming

May 25, 2026

When a news event such as a celebrity breakup goes viral, the associated service can experience an instantaneous surge of up to one million queries per second (QPS). The article examines how to architect a backend that can absorb such spikes without collapsing.

1. Conventional thinking – adding more Redis nodes

Redis is a core component in high-concurrency systems, and a single Redis instance can handle roughly 100,000 QPS. The naive response is to scale the Redis cluster horizontally to 20 machines, which on paper gives about two million QPS of aggregate capacity. However, Redis Cluster maps each key to a fixed hash slot, and each slot is served by a single master node, so all traffic for a hot key still lands on one node while the others sit mostly idle. When that node cannot sustain a million QPS it fails, the requests it was absorbing fall through to the downstream MySQL, and the whole system goes down.
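
To see why extra nodes do not help, it is useful to compute where a key lands. Redis Cluster assigns each key to one of 16384 hash slots using CRC16 of the key modulo 16384. The sketch below reproduces that calculation with Python's `binascii.crc_hqx` and ignores hash tags (`{...}`). The key name `news:celebrity_breakup`, the 20-node layout, and the even slot ranges in `node_for_slot` are assumptions for illustration; real clusters assign slot ranges when the cluster is configured.

```python
import binascii

HASH_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def key_slot(key: str) -> int:
    """CRC16 (XMODEM variant) of the key, modulo 16384. Hash tags are ignored here."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % HASH_SLOTS


def node_for_slot(slot: int, num_nodes: int) -> int:
    """Hypothetical layout: slots split into equal contiguous ranges across nodes."""
    return slot * num_nodes // HASH_SLOTS


hot_key = "news:celebrity_breakup"  # hypothetical hot key

# Every request for the same key computes the same slot, hence the same node,
# no matter how many nodes the cluster has.
nodes_hit = {node_for_slot(key_slot(hot_key), num_nodes=20) for _ in range(10_000)}
print(len(nodes_hit))  # prints 1
```

Raising `num_nodes` changes which node owns the slot, but never the fact that exactly one node owns it.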

2. Data‑sharding solution

Because a hot key is bound to a single shard, the article proposes splitting it into many physical keys, e.g., hot_key_1 … hot_key_100. On each read the client appends a random number from 1 to 100 to the original key, so requests are spread across the sub-keys and, as long as those sub-keys hash to different nodes, across the whole cluster. The key naming has to be designed in advance so that the 100 sub-keys actually land evenly on the Redis nodes, which hashing alone does not guarantee. The approach also introduces consistency challenges, because an update to the logical hot key has to modify all 100 physical keys.
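
The read and write paths of this scheme can be sketched as follows. A plain dictionary stands in for Redis, and the helper names (`sub_keys`, `read_any`, `write_all`) as well as the 20-node even slot layout are assumptions made for the example, not part of the original article.

```python
import binascii
import random

HASH_SLOTS = 16384
NUM_SUB_KEYS = 100  # hot_key_1 ... hot_key_100, as in the text


def key_slot(key: str) -> int:
    """Redis Cluster slot: CRC16 (XMODEM) of the key mod 16384, hash tags ignored."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % HASH_SLOTS


def sub_keys(base: str, n: int = NUM_SUB_KEYS) -> list:
    """All physical keys that back one logical hot key."""
    return [f"{base}_{i}" for i in range(1, n + 1)]


def read_any(store: dict, base: str, n: int = NUM_SUB_KEYS):
    """Read path: pick one sub-key at random so reads spread over all copies."""
    return store.get(f"{base}_{random.randint(1, n)}")


def write_all(store: dict, base: str, value: str, n: int = NUM_SUB_KEYS) -> None:
    """Write path: every copy must be updated, which is the consistency cost."""
    for key in sub_keys(base, n):
        store[key] = value


store = {}  # stand-in for the Redis cluster
write_all(store, "hot_key", "v1")
print(read_any(store, "hot_key"))  # prints v1

# How many of 20 hypothetical nodes (equal slot ranges) do the sub-keys touch?
nodes = {key_slot(k) * 20 // HASH_SLOTS for k in sub_keys("hot_key")}
print(len(nodes), "of 20 nodes hold at least one copy")
```

The last two lines are a quick way to check placement before relying on it: if the count is well below the node count, the suffix scheme needs adjusting.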

3. Multi‑level cache solution

A mature design avoids sending all traffic to Redis by introducing a multi‑level cache. The first level is a local in‑process cache (e.g., Caffeine). When a request arrives, the service first checks the local cache; a hit returns the data immediately, shielding Redis from the bulk of the load. Only a small fraction of requests miss the local cache and fall through to Redis.
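
A minimal sketch of that lookup order, under stated assumptions: `LocalCache` is a tiny TTL dictionary standing in for a library such as Caffeine, a plain dictionary stands in for Redis, and the class names, the injectable `clock`, and the `remote_reads` counter are hypothetical, added only to make the effect measurable.

```python
import time


class LocalCache:
    """Tiny in-process TTL cache; a stand-in for a real local cache library."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def put(self, key, value) -> None:
        self._data[key] = (self._clock() + self._ttl, value)


class TwoLevelCache:
    """Level 1: local memory. Level 2: remote store (a dict standing in for Redis)."""

    def __init__(self, local: LocalCache, remote: dict):
        self.local = local
        self.remote = remote
        self.remote_reads = 0  # how many requests actually reached level 2

    def get(self, key):
        value = self.local.get(key)
        if value is not None:
            return value  # local hit: the remote store is never contacted
        self.remote_reads += 1
        value = self.remote.get(key)
        if value is not None:
            self.local.put(key, value)  # populate level 1 for later requests
        return value


cache = TwoLevelCache(LocalCache(ttl_seconds=3.0), {"hot_key": "payload"})
for _ in range(1000):
    cache.get("hot_key")
print(cache.remote_reads)
```

With a 3-second TTL, each service instance reaches the remote store for a given key roughly once per TTL period, regardless of how many requests that instance serves.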

Because sudden spikes are unpredictable, large-scale systems deploy a hot-key detection framework. If accesses to a key exceed a threshold (e.g., 1,000 within one second), the system marks the key as hot and announces it to all service nodes via a message queue (MQ), and each node pre-loads it into its local cache. This proactive propagation keeps the hot key from overwhelming Redis.
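
The detection rule can be sketched with a per-key fixed-window counter. This is a single-process illustration only: production frameworks typically aggregate counts across instances, and the `HotKeyDetector` class, its `on_hot` callback (standing in for the MQ publish), and the injectable `clock` are assumptions of this example.

```python
import time


class HotKeyDetector:
    """Marks a key as hot once it exceeds `threshold` accesses in one window."""

    def __init__(self, threshold=1000, window_seconds=1.0, on_hot=None,
                 clock=time.monotonic):
        self._threshold = threshold
        self._window = window_seconds
        self._on_hot = on_hot  # stand-in for publishing the key to an MQ topic
        self._clock = clock
        self._counts = {}      # key -> (window_start, count)
        self.hot_keys = set()

    def record(self, key: str) -> bool:
        """Count one access; return True if the key is currently marked hot."""
        now = self._clock()
        start, count = self._counts.get(key, (now, 0))
        if now - start >= self._window:
            start, count = now, 0  # window elapsed: start a fresh one
        count += 1
        self._counts[key] = (start, count)
        if count > self._threshold and key not in self.hot_keys:
            self.hot_keys.add(key)
            if self._on_hot is not None:
                self._on_hot(key)  # fires once per key, on first detection
        return key in self.hot_keys


announced = []
detector = HotKeyDetector(threshold=1000, on_hot=announced.append)
for _ in range(1001):
    detector.record("hot_key")
print(announced)
```

Each subscriber that receives the announcement would then load the key into its local cache; that step is omitted here.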

Local caching introduces consistency concerns. Common remedies are a very short TTL (e.g., three seconds), so stale data expires quickly, or an invalidation message broadcast through MQ. When a cache entry has to be rebuilt, a distributed lock ensures that only one request across all service instances queries the database while the others wait or receive a degraded response, preventing a thundering-herd effect on MySQL.
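
The lock-protected rebuild can be sketched as below. To keep the example runnable without a Redis server, `FakeLockStore` imitates the semantics of Redis `SET key token NX PX ttl` plus a token-checked release in memory; that class, the `get_with_rebuild` helper, and all parameter values are assumptions of this sketch rather than the article's implementation.

```python
import threading
import time
import uuid


class FakeLockStore:
    """In-memory stand-in for a Redis lock: set-if-absent with expiry, token-checked release."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._locks = {}  # lock name -> (owner token, expires_at)

    def acquire(self, name: str, token: str, ttl: float) -> bool:
        now = time.monotonic()
        with self._mutex:
            held = self._locks.get(name)
            if held is not None and held[1] > now:
                return False  # someone else holds an unexpired lock
            self._locks[name] = (token, now + ttl)
            return True

    def release(self, name: str, token: str) -> None:
        with self._mutex:
            held = self._locks.get(name)
            if held is not None and held[0] == token:  # only the owner may release
                del self._locks[name]


def get_with_rebuild(key, cache, locks, load_from_db,
                     retries=50, wait=0.01, fallback=None):
    value = cache.get(key)
    if value is not None:
        return value
    token = uuid.uuid4().hex
    lock_name = "lock:" + key
    if locks.acquire(lock_name, token, ttl=5.0):
        try:
            value = cache.get(key)  # re-check: another caller may have just rebuilt it
            if value is None:
                value = load_from_db(key)  # the only path that reaches the database
                cache[key] = value
            return value
        finally:
            locks.release(lock_name, token)
    for _ in range(retries):  # lock not obtained: wait for the owner to fill the cache
        time.sleep(wait)
        value = cache.get(key)
        if value is not None:
            return value
    return fallback  # degraded response if the rebuild takes too long
```

The lock TTL guards against an owner that crashes mid-rebuild, and the token check prevents a slow owner from releasing a lock that has since expired and been taken by someone else.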

For non-core traffic, a global circuit breaker (e.g., Alibaba's Sentinel flow-control library, not to be confused with Redis Sentinel) can degrade or reject requests, so that core traffic keeps priority under extreme load.
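
The prioritization idea can be illustrated with a simple admission gate. Note that this is priority-based load shedding, a deliberately simpler technique than a full circuit breaker, and it does not use or imitate Sentinel's API. The `PriorityGate` class, the `handle` helper, and the capacity numbers are hypothetical.

```python
import threading


class PriorityGate:
    """Caps in-flight requests and reserves part of the capacity for core traffic."""

    def __init__(self, capacity: int, core_reserved: int):
        if not 0 <= core_reserved <= capacity:
            raise ValueError("core_reserved must be between 0 and capacity")
        self._capacity = capacity
        self._non_core_limit = capacity - core_reserved
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_enter(self, is_core: bool) -> bool:
        # Non-core requests hit their lower limit first, so they are shed earlier.
        limit = self._capacity if is_core else self._non_core_limit
        with self._lock:
            if self._in_flight >= limit:
                return False
            self._in_flight += 1
            return True

    def leave(self) -> None:
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1


def handle(gate: PriorityGate, is_core: bool, work, degraded="service busy"):
    """Run `work` if admitted, otherwise return a degraded response immediately."""
    if not gate.try_enter(is_core):
        return degraded
    try:
        return work()
    finally:
        gate.leave()


gate = PriorityGate(capacity=10, core_reserved=4)
print(handle(gate, is_core=True, work=lambda: "article body"))
```

With these numbers, non-core requests are rejected once six requests are in flight, while core requests are admitted until all ten slots are taken.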

Summary

Adding more Redis nodes does not help when a hot key is fixed to a single shard.

A hot‑key detection mechanism is required to quickly identify and isolate spikes.

Multi‑level caching offloads the majority of traffic to local memory, reducing Redis pressure.

Distributed locks protect the database during cache miss storms.

Circuit‑breaker degradation prioritizes core requests over non‑core ones.
