How to Define and Tackle High Concurrency: Strategies and Code Samples
This article explains what constitutes high concurrency, categorizes load levels, and presents practical solutions such as load balancing, database sharding, query optimization, caching, message queues, and rate‑limiting, complete with code examples for implementing these techniques in backend systems.
High concurrency refers to a system's ability to handle a large number of simultaneous requests, a common challenge in e‑commerce, finance, social media, and streaming services. Scenarios like Alibaba's Double‑11 can involve hundreds of thousands of orders per second.
Concurrency levels are often classified, as a rough rule of thumb rather than a formal standard, by the number of simultaneous connections:
Low concurrency: fewer than 100 simultaneous connections.
Medium concurrency: about 100‑1,000 connections.
High concurrency: 1,000‑10,000 connections.
Ultra‑high concurrency: over 10,000 connections.
Extreme concurrency: millions of connections, as seen in large social platforms.
Large consumer‑facing systems typically need to handle tens of thousands to millions of concurrent connections.
Key techniques for handling high concurrency include load balancing, database sharding, query optimization, caching, message queues, and rate‑limiting.
Load Balancing
Distributed load balancers such as Nginx, HAProxy, or LVS spread requests across multiple servers, enabling dynamic scaling and automatic failover. Common algorithms are:
Round‑robin: sequentially assign requests to back‑end servers.
Random: pick a server at random.
Weighted round‑robin: assign weights based on server capacity.
Least connections: target the server with the fewest active connections.
IP hash: route requests from the same client IP to the same server.
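As a rough illustration of the algorithms listed above, the following Java sketch implements each selection rule over an in-memory server list. The class name LoadBalancerSketch, its method names, and the server labels are hypothetical. Real balancers such as Nginx or HAProxy implement these strategies internally and are configured rather than coded this way, and the weighted variant here is the naive form that simply repeats a server in the rotation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: names are illustrative and not taken from any real load balancer.
final class LoadBalancerSketch {
    private final List<String> servers;
    private final AtomicInteger cursor = new AtomicInteger();
    private final Map<String, AtomicInteger> active = new ConcurrentHashMap<>();

    LoadBalancerSketch(List<String> servers) {
        if (servers.isEmpty()) throw new IllegalArgumentException("at least one server is required");
        this.servers = List.copyOf(servers);
        for (String s : this.servers) active.putIfAbsent(s, new AtomicInteger());
    }

    // Naive weighted round-robin: a server with weight w appears w times in the rotation.
    static LoadBalancerSketch weighted(Map<String, Integer> weights) {
        List<String> rotation = new ArrayList<>();
        weights.forEach((server, weight) -> {
            for (int i = 0; i < weight; i++) rotation.add(server);
        });
        return new LoadBalancerSketch(rotation);
    }

    // Round-robin: walk the list in order and wrap around at the end.
    String roundRobin() {
        return servers.get(Math.floorMod(cursor.getAndIncrement(), servers.size()));
    }

    // Random: pick any server with equal probability.
    String random() {
        return servers.get(ThreadLocalRandom.current().nextInt(servers.size()));
    }

    // IP hash: the same client IP always maps to the same server.
    String ipHash(String clientIp) {
        return servers.get(Math.floorMod(clientIp.hashCode(), servers.size()));
    }

    // Least connections: pick the server with the fewest active connections.
    String leastConnections() {
        String best = servers.get(0);
        for (String s : servers) {
            if (active.get(s).get() < active.get(best).get()) best = s;
        }
        return best;
    }

    // Callers report connection lifecycle so that leastConnections() has data to work with.
    void connectionOpened(String server) { active.get(server).incrementAndGet(); }
    void connectionClosed(String server) { active.get(server).decrementAndGet(); }
}
```

Note that IP hash with a plain modulo remaps most clients whenever the server list changes, which is one reason production setups often prefer consistent hashing.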
Database Sharding
To avoid database bottlenecks, horizontal sharding splits tables across multiple databases, often using a hash of a key such as user_id. Example SQL statements:
CREATE TABLE orders_0 ( ... );
CREATE TABLE orders_1 ( ... );
CREATE TABLE orders_2 ( ... );
Routing logic might store a user's orders in orders_{user_id % 3}. In Java, frameworks like Sharding‑JDBC simplify configuration:
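import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Outline only: @EnableSharding and ShardingDataSource are kept as written in the
// original and should be checked against the Sharding-JDBC (Apache ShardingSphere)
// version in use, since its configuration API differs between releases.
@Configuration
@EnableSharding
public class ShardingConfig {
    @Bean
    public ShardingDataSource shardingDataSource() {
        // Configure the data sources and sharding rules here,
        // e.g., hash routing by user_id.
        throw new UnsupportedOperationException("sharding rules not configured yet");
    }
}
Database Optimization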
Avoid complex joins and full‑table scans; use appropriate indexes, break queries into simpler parts, and replace sub‑queries with joins. Batch inserts/updates reduce overhead, and external caches such as Redis or Memcached offload frequent reads.
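The read-offloading role of an external cache mentioned above is commonly implemented with the cache-aside pattern. The sketch below is a minimal illustration under stated assumptions: a ConcurrentHashMap stands in for Redis or Memcached, a Function stands in for the database query, and the class name CacheAsideSketch is hypothetical. A real deployment would also need expiry and a policy for concurrent writes, both omitted here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache-aside sketch; the map stands in for an external cache.
final class CacheAsideSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> database; // stand-in for a database lookup

    CacheAsideSketch(Function<K, V> database) {
        this.database = database;
    }

    // Read path: try the cache first and fall back to the database on a miss.
    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) return cached;
        V loaded = database.apply(key);
        if (loaded != null) cache.put(key, loaded); // populate the cache for later reads
        return loaded;
    }

    // Write path: after updating the database, drop the stale cache entry.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Repeated reads of the same key then reach the database only once until the entry is invalidated.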
Message Queues
Message queues (e.g., Kafka, RocketMQ) buffer spikes in traffic, decoupling request ingestion from processing. For example, an e‑commerce payment service can enqueue payment requests and process them at a controlled rate, preventing overload.
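The buffering idea in the payment example can be sketched in-process with a bounded queue. This is only an analogy under stated assumptions: java.util.concurrent.ArrayBlockingQueue stands in for a broker such as Kafka or RocketMQ, and the class name PaymentQueueSketch, its methods, and the payment IDs are hypothetical. A real broker adds persistence, partitioning, and delivery guarantees that this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: an in-memory queue standing in for a message broker.
final class PaymentQueueSketch {
    private final BlockingQueue<String> queue;
    private final List<String> processed = new ArrayList<>();

    PaymentQueueSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Ingestion side: returns immediately; false means the buffer is full.
    boolean enqueue(String paymentId) {
        return queue.offer(paymentId);
    }

    // Processing side: handle at most maxBatch requests per call, which caps
    // the downstream load no matter how large the incoming spike was.
    int processBatch(int maxBatch) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        processed.addAll(batch); // placeholder for real payment handling
        return batch.size();
    }

    int backlog() {
        return queue.size();
    }

    List<String> processed() {
        return List.copyOf(processed);
    }
}
```

A scheduler calling processBatch at a fixed interval would give the controlled processing rate described above, while enqueue keeps accepting requests until the buffer fills.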
Rate Limiting and Degradation
When system load is high, rate‑limiting algorithms such as token bucket, leaky bucket, or sliding‑window counters cap the rate of accepted requests, while degradation temporarily disables or lowers the priority of non‑essential features so that core functionality remains responsive.
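One common rate-limiting algorithm is the token bucket, and it combines naturally with degradation by returning a cheaper fallback when no token is available. The following is a minimal single-process sketch under stated assumptions: the class name TokenBucketSketch, its parameters, and the injectable nanosecond clock (used so the behavior can be exercised deterministically) are all hypothetical, and a distributed system would need shared state that this sketch does not provide.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical token-bucket sketch with a degradation hook.
final class TokenBucketSketch {
    private final long capacity;          // maximum burst size
    private final double tokensPerSecond; // steady refill rate
    private final LongSupplier nanoClock; // e.g. System::nanoTime in production
    private double tokens;
    private long lastRefill;

    TokenBucketSketch(long capacity, double tokensPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start with a full bucket
        this.lastRefill = nanoClock.getAsLong();
    }

    // Refill according to elapsed time, then spend one token if available.
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double refill = (now - lastRefill) * tokensPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    // Degradation: run the full handler when a token is available,
    // otherwise return the cheaper fallback (for example, a cached or empty result).
    <T> T callOrDegrade(Supplier<T> full, Supplier<T> fallback) {
        return tryAcquire() ? full.get() : fallback.get();
    }
}
```

In production the clock argument would typically be System::nanoTime, and the fallback would be whatever reduced response the non-essential feature can tolerate.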
These strategies collectively improve performance, stability, and availability of high‑concurrency systems.
Mike Chen's Internet Architecture
Over ten years of BAT architecture experience, shared generously!
