What QPS Defines High Concurrency? Strategies & Architecture Explained
This article defines high concurrency, outlines QPS thresholds for small, medium, high, and ultra‑high traffic, and presents practical solutions such as multi‑level caching, load balancing, database sharding, and message‑queue traffic shaping to build robust backend systems.
What is High Concurrency
High concurrency refers to a system's ability to handle a large number of user requests within a unit of time, typically measured by QPS (queries per second).
What counts as high concurrency depends on the business scenario. For example, Alibaba reported a peak of 583,000 orders per second during its 2020 Double 11 shopping festival.
QPS Thresholds for High Concurrency
QPS is a common yardstick for a system's concurrency level. As a rough rule of thumb, it can be divided into four tiers:
Small scale : QPS < 100 – low concurrency; a single server is usually enough.
Medium scale : 100 ≤ QPS < 1,000 – requires some optimization and a distributed architecture.
High concurrency : 1,000 ≤ QPS ≤ 10,000 – needs caching, asynchronous processing, database sharding, etc.
Ultra‑high concurrency : QPS > 10,000, sometimes > 100,000 – typical for large internet platforms.
In practice, a QPS of 1,000 or more is generally considered high concurrency, and anything above 10,000 is ultra‑high.
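The tiers above can be sketched as a small helper. This is only an illustration of the thresholds listed in this article. The function names `average_qps` and `classify_qps` are hypothetical, and treating exactly 1,000 as "high" is an assumption made here to resolve the boundary.

```python
def average_qps(request_count: int, window_seconds: float) -> float:
    """Average queries per second over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return request_count / window_seconds


def classify_qps(qps: float) -> str:
    """Map a QPS value to the tier names used in this article."""
    if qps < 0:
        raise ValueError("QPS cannot be negative")
    if qps < 100:
        return "small"
    if qps < 1_000:
        return "medium"
    if qps <= 10_000:  # assumption: the boundary value 1,000 counts as "high"
        return "high"
    return "ultra-high"
```

For instance, 90,000 requests observed over one minute average 1,500 QPS, which falls in the "high" tier. Peak QPS is usually well above the average, so capacity planning should use the peak.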
Multi‑Level Caching
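Caching reduces direct database access and speeds up responses. A request checks the closest cache layer first and only falls through to the database on a miss.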
Local cache : In‑process memory cache.
Distributed cache : Redis, Memcached, etc., shared across services.
CDN cache : Global nodes cache static resources.
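A minimal sketch of the local plus distributed lookup order follows. To stay self-contained it uses a plain dict as a stand-in for a shared cache such as Redis, and a caller-supplied `loader` function as a stand-in for the database query. The class name `TwoLevelCache` and its parameters are hypothetical, and the sketch ignores invalidation and remote-cache expiry.

```python
import time
from typing import Callable, Dict, Tuple


class TwoLevelCache:
    """Read path: local in-process cache -> shared cache -> database loader."""

    def __init__(self, remote: Dict[str, str],
                 loader: Callable[[str], str],
                 local_ttl: float = 5.0) -> None:
        self._local: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)
        self._remote = remote        # stand-in for Redis/Memcached (assumption)
        self._loader = loader        # stand-in for the database query (assumption)
        self._local_ttl = local_ttl  # short TTL limits staleness of local copies

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._local.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # level 1 hit: no network call
        value = self._remote.get(key)
        if value is None:
            value = self._loader(key)            # miss on both levels: hit the DB
            self._remote[key] = value            # populate the shared cache
        self._local[key] = (now + self._local_ttl, value)
        return value
```

Because the shared layer is visible to every service instance, a second instance can serve the same key without touching the database, while the short local TTL bounds how stale an in-process copy can become.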
Load Balancing
Load balancing distributes user requests across multiple servers so that no single server becomes a bottleneck.
Hardware load balancers : High‑performance devices (e.g., F5, A10) for extreme traffic.
Software load balancers : Open‑source solutions such as Nginx, HAProxy, and Apache HTTP Server (via mod_proxy_balancer).
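To illustrate the idea of spreading requests across servers, here is a minimal round‑robin selector. Round‑robin is only one of several common strategies, and real load balancers also add health checks and weighting. The class name `RoundRobinBalancer` and the backend addresses are hypothetical.

```python
import itertools
import threading
from typing import List


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation, one per request."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))
        self._lock = threading.Lock()  # keeps the rotation consistent across threads

    def next_backend(self) -> str:
        with self._lock:
            return next(self._cycle)


# Hypothetical backend addresses, for illustration only.
balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
targets = [balancer.next_backend() for _ in range(4)]
```

After four requests the rotation wraps around, so the fourth request goes back to the first backend.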
Database Sharding (Splitting Databases and Tables)
Sharding splits data across multiple databases or tables to reduce the load on any single database.
Vertical sharding : Separate databases by business module (e.g., users vs. orders).
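Horizontal sharding : Split the rows of one large table across multiple tables or databases by a shard key (e.g., user ID or order ID), typically by key range or by hash/modulo.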
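Horizontal routing can be sketched in two common forms: range‑based and modulo‑based. The range boundaries, the database count, the table count, and the naming scheme `user_db_N.user_M` below are all assumptions chosen for illustration, not values from any real deployment.

```python
import bisect

# Exclusive upper bounds of each range shard (hypothetical values):
# shard 0 holds IDs [0, 1_000_000), shard 1 holds [1_000_000, 2_000_000), ...
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]


def range_shard(user_id: int) -> int:
    """Range-based routing: find the first range whose upper bound exceeds the ID."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)
    if index >= len(RANGE_UPPER_BOUNDS):
        raise ValueError("user_id is beyond the last configured range")
    return index


def modulo_table(user_id: int, db_count: int = 4, tables_per_db: int = 8) -> str:
    """Modulo-based routing: spread IDs evenly over db_count * tables_per_db tables."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    slot = user_id % (db_count * tables_per_db)
    db_index = slot // tables_per_db
    table_index = slot % tables_per_db
    return f"user_db_{db_index}.user_{table_index}"
```

Range routing keeps neighbouring IDs together, which makes range scans and adding new shards easy but can create hot spots on the newest range. Modulo routing spreads load evenly but requires moving data when the shard count changes.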
Message Middleware for Traffic Shaping
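Message queues such as Kafka, RabbitMQ, and RocketMQ decouple services and smooth traffic spikes: producers write bursts of requests into the queue, and consumers drain it at a rate the downstream system can sustain.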
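The smoothing effect can be shown with a small tick‑based simulation. It uses Python's in‑process `queue.Queue` purely as a stand‑in for a real broker, and the function name `shape_traffic` and its parameters are hypothetical. `max_backlog` should be positive, since `queue.Queue` treats a size of 0 as unbounded.

```python
import queue
from typing import Dict, List


def shape_traffic(bursts: List[int], capacity_per_tick: int,
                  max_backlog: int) -> Dict[str, object]:
    """Simulate peak shaving: bursts[i] requests arrive in tick i,
    and the consumer handles at most capacity_per_tick requests per tick."""
    backlog: "queue.Queue[int]" = queue.Queue(maxsize=max_backlog)
    processed_per_tick: List[int] = []
    rejected = 0
    request_id = 0

    for arrivals in bursts:
        # Producer side: enqueue the burst, rejecting once the queue is full.
        for _ in range(arrivals):
            try:
                backlog.put_nowait(request_id)
            except queue.Full:
                rejected += 1
            request_id += 1

        # Consumer side: drain at a fixed, sustainable rate.
        done = 0
        while done < capacity_per_tick:
            try:
                backlog.get_nowait()
            except queue.Empty:
                break
            done += 1
        processed_per_tick.append(done)

    return {
        "processed_per_tick": processed_per_tick,
        "rejected": rejected,
        "left_in_queue": backlog.qsize(),
    }
```

With `shape_traffic([10, 0, 0, 0], capacity_per_tick=3, max_backlog=100)`, the burst of 10 requests is worked off as 3, 3, 3, and 1 per tick, so the downstream service never sees more than 3 requests in a tick. The bounded backlog is a design choice: rejecting early when the queue is full is usually safer than letting it grow without limit.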
These architectural techniques are often combined to handle high‑concurrency scenarios.
Mike Chen's Internet Architecture
Over ten years of BAT architecture experience, shared generously!
