15 Proven Strategies to Design High‑Concurrency Systems

This article outlines fifteen practical techniques—including horizontal scaling, microservice decomposition, database sharding, connection pooling, caching, CDN, message queues, Elasticsearch, circuit breaking, rate limiting, and load testing—to help engineers build robust, high‑concurrency systems that can handle massive traffic spikes.

dbaplus Community

Feb 25, 2023

Overview

Designing a high‑concurrency system means guaranteeing overall availability while handling a large number of simultaneous user requests and sudden traffic spikes. The following fifteen techniques are frequently discussed in technical interviews and constitute a practical checklist for building scalable, fault‑tolerant services.

1. Horizontal Scaling (Divide and Conquer)

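Deploy multiple stateless instances behind a load balancer (e.g., Nginx or a layer‑4 load balancer). Each node processes a fraction of the traffic, which removes the single point of failure inherent in a single‑machine deployment and increases aggregate request‑handling capacity. The load balancer itself should be deployed redundantly so that it does not become the new single point of failure.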
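As a rough illustration of how a load balancer spreads requests over stateless nodes, the following minimal sketch implements round‑robin selection. The class name and the instance addresses are hypothetical; production balancers such as Nginx also handle health checks, weights, and retries, which are omitted here.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hands out backend addresses in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        # Each call returns the next backend in the rotation.
        return next(self._cycle)


# Hypothetical instance addresses, for illustration only.
balancer = RoundRobinBalancer(["app-1:8080", "app-2:8080", "app-3:8080"])
assignments = Counter(balancer.pick() for _ in range(9))
```

With nine requests and three nodes, each node receives three requests, so adding a node lowers the share each one has to serve.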

2. Microservice Decomposition

Split a monolithic application into independent services based on business domains (e.g., user, order, product). Each service runs in its own process/container, allowing independent scaling, isolated failures, and clearer ownership of resources.

3. Database Sharding and Partitioning

When a single MySQL instance reaches its limits on disk, memory, or connections (for example, the "Too many connections" error), distribute data across multiple database instances (sharding) and split large tables into smaller ones (partitioning). Rough rules of thumb, which vary with hardware, schema, and query patterns: a table of around 10 M rows may warrant partitioning, and around 10 TB of total data often justifies sharding. Both steps reduce per‑node load and keep query latency low.
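One common way to decide where a row lives is to hash the sharding key. The sketch below assumes a user ID as the key, 4 databases, and 8 tables per database; these counts and the `order_db_N` / `orders_N` names are illustrative assumptions, not values from the article.

```python
import zlib

NUM_DATABASES = 4        # assumed number of database shards
TABLES_PER_DATABASE = 8  # assumed number of tables inside each shard


def route(user_id: int):
    """Map a user ID to a (database, table) pair using a stable hash."""
    # crc32 gives the same result in every process, so all application
    # nodes agree on where a given user's rows are stored.
    digest = zlib.crc32(str(user_id).encode("utf-8"))
    slot = digest % (NUM_DATABASES * TABLES_PER_DATABASE)
    db_index = slot // TABLES_PER_DATABASE
    table_index = slot % TABLES_PER_DATABASE
    return f"order_db_{db_index}", f"orders_{table_index}"
```

Plain modulo routing is simple but forces data migration when the shard count changes; consistent hashing or a lookup table are the usual alternatives when resharding is expected.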

4. Connection Pooling

Reuse existing connections instead of creating a new one for each request. Apply pooling to:

Database connections (e.g., HikariCP, DBCP)

HTTP client connections (e.g., Apache HttpClient pool)

Redis clients (e.g., JedisPool)

Thread pools (e.g., java.util.concurrent.ExecutorService) similarly limit thread‑creation overhead and improve parallel task execution.
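The idea behind all of these pools is the same: create a fixed number of expensive objects once, then lend them out and take them back. The following is a minimal sketch built on a blocking queue; `SimplePool` and `fake_connect` are hypothetical names, and real pools such as HikariCP add validation, eviction, and metrics on top.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Fixed-size pool that lends out pre-created connections."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # pay the creation cost once, up front

    @contextmanager
    def connection(self, timeout=1.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return the connection to the pool


created = []


def fake_connect():
    """Stand-in for an expensive connect call (illustrative only)."""
    conn = object()
    created.append(conn)
    return conn


pool = SimplePool(fake_connect, size=2)
for _ in range(10):
    with pool.connection() as conn:
        pass  # run a query here
```

Ten requests are served by only two connections, and the timeout bounds how long a caller waits when the pool is exhausted.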

5. Master‑Slave Replication

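As a rough order of magnitude, a single MySQL master handles about 500 TPS and 10 k QPS; actual figures depend heavily on hardware, schema, and query mix. Adding read‑only slaves offloads read‑heavy traffic and protects the master from overload. MySQL replication is asynchronous by default, so slaves can lag behind the master by seconds or even minutes under heavy write load. A read routed to a slave immediately after a write may therefore return stale data, and reads that must see the latest write should go to the master.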
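A read/write splitting rule can be sketched as follows. The `ReadWriteRouter` class, the `require_fresh` flag, and the host names are assumptions made for illustration; real deployments usually delegate this to a proxy or to the data‑access framework.

```python
import itertools


class ReadWriteRouter:
    """Sends writes to the master and spreads reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def target_for(self, sql, require_fresh=False):
        is_read = sql.lstrip().lower().startswith("select")
        # Writes, and reads that cannot tolerate replication lag,
        # go to the master.
        if not is_read or require_fresh or self._slaves is None:
            return self.master
        return next(self._slaves)


# Hypothetical host names.
router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
```

Passing `require_fresh=True` for a read that follows the user's own write is one simple way to avoid showing stale data caused by replication lag.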

6. Caching

Store frequently accessed data in memory to reduce backend load and latency. Common caches:

Redis (single‑node can serve tens of thousands of QPS)

Local JVM caches (Caffeine, Guava Cache)

Memcached

Key cache pitfalls to handle:

Cache‑DB consistency

Cache avalanche (mass expiration)

Cache penetration (requests for nonexistent keys)

Cache stampede (thundering herd: a hot key expires and many concurrent requests hit the database at once)
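Three of these pitfalls have well‑known mitigations that fit in a small cache‑aside wrapper: random TTL jitter against avalanche, caching a "missing" marker against penetration, and a lock around reloads against stampede. The sketch below uses an in‑process dictionary as the cache; `CacheAside`, `loader`, and the TTL values are hypothetical, and a single global lock is a deliberate simplification (per‑key locks scale better).

```python
import random
import threading
import time

_MISSING = object()  # marker cached for keys that do not exist in the database


class CacheAside:
    def __init__(self, loader, ttl=60.0, jitter=10.0, clock=time.monotonic):
        self._loader = loader    # function that reads from the database
        self._ttl = ttl
        self._jitter = jitter
        self._clock = clock      # injectable so tests can control time
        self._data = {}          # key -> (value, expires_at)
        self._lock = threading.Lock()

    def get(self, key):
        # The lock ensures only one thread reloads an expired entry
        # (stampede protection).
        with self._lock:
            now = self._clock()
            entry = self._data.get(key)
            if entry is not None and entry[1] > now:
                value = entry[0]
            else:
                loaded = self._loader(key)
                # Cache the miss too, so repeated lookups of a nonexistent
                # key do not reach the database (penetration protection).
                value = _MISSING if loaded is None else loaded
                # Random jitter spreads expirations out (avalanche protection).
                expires = now + self._ttl + random.uniform(0, self._jitter)
                self._data[key] = (value, expires)
        return None if value is _MISSING else value
```

Cache‑DB consistency is not addressed here; a common approach is to delete the cached entry after each database update so the next read reloads it.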

7. CDN for Static Assets

Serve images, CSS, JavaScript, and other static files via a Content Delivery Network. Edge nodes deliver content close to users, reducing latency and offloading backend servers.

8. Message Queues for Traffic Smoothing

Introduce a queue (e.g., Kafka, RabbitMQ, RocketMQ) to buffer bursts. If the application can process 2 k requests/s but receives 5 k, the queue absorbs the excess and releases work at a controlled rate. Overflow policies include dropping messages or returning error responses.
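The arithmetic in this example can be made concrete with a small simulation. It uses the 5 k/s arrival rate and 2 k/s processing rate from the text; the queue capacity of 10 000 messages and the one‑second time step are assumptions chosen for illustration, and the overflow policy modeled is rejection.

```python
def simulate(arrival_per_s, service_per_s, capacity, seconds):
    """Step a bounded queue forward one second at a time."""
    backlog = 0    # messages waiting in the queue
    rejected = 0   # messages refused because the queue was full
    processed = 0  # messages the application has consumed
    for _ in range(seconds):
        accepted = min(arrival_per_s, capacity - backlog)
        rejected += arrival_per_s - accepted
        backlog += accepted
        done = min(backlog, service_per_s)
        processed += done
        backlog -= done
    return {"processed": processed, "rejected": rejected, "backlog": backlog}


# 5 k requests/s arriving, 2 k/s processed, assumed capacity of 10 000.
spike = simulate(arrival_per_s=5000, service_per_s=2000, capacity=10000, seconds=5)
```

The backlog grows by 3 000 messages per second, so the assumed queue fills during the third second; after that, anything beyond the 2 k/s drain rate is rejected. A queue therefore smooths short bursts but cannot compensate for a sustained overload.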

9. Elasticsearch for Search‑Heavy Loads

Use Elasticsearch as a distributed, horizontally scalable search engine. It handles large data volumes and high query concurrency without the need to scale relational databases for search‑specific workloads.

10. Circuit Breaker and Degradation

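Wrap downstream calls with a circuit breaker (e.g., Hystrix, Resilience4j). When the failure rate or latency of a downstream service exceeds a threshold, the breaker opens: subsequent calls return a fallback response immediately instead of waiting on the failing service, which prevents cascading failures (service avalanche). After a cool‑down period the breaker lets a trial request through and closes again if it succeeds.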
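The open, half‑open, and closed behavior can be sketched in a few lines. This is a simplified, single‑threaded illustration with a hypothetical `CircuitBreaker` class that counts consecutive failures; it is not how Hystrix or Resilience4j are implemented, and those libraries use sliding windows and failure rates instead.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock      # injectable so tests can control time
        self._failures = 0       # consecutive failures seen so far
        self._opened_at = None   # None means the circuit is closed

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                return fallback()  # open: fail fast without calling func
            # Cool-down elapsed (half-open): let this one call through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            half_open = self._opened_at is not None
            if half_open or self._failures >= self._threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            return fallback()
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the breaker is open, the failing service receives no traffic at all, which gives it room to recover and keeps caller threads from piling up on timeouts.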

11. Rate Limiting

Protect limited resources (CPU, memory, network) by discarding excess requests during spikes. Implementations:

Guava RateLimiter (local token‑bucket)

Redis‑based distributed token bucket

Alibaba Sentinel (distributed flow control)
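The token‑bucket algorithm mentioned above can be sketched as follows. This is a minimal single‑process illustration with a hypothetical `TokenBucket` class; it is not Guava's implementation, and a distributed variant would keep the token count in a shared store such as Redis.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained rate of `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock              # injectable so tests can control time
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller should reject the request, e.g. with HTTP 429
```

The capacity sets how large a burst is tolerated, while the rate sets the long‑run throughput; tuning the two separately is the main reason to prefer a token bucket over a fixed counter per second.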

12. Asynchronous Processing

Replace synchronous calls with asynchronous workflows, typically via a message queue. For example, a flash‑sale request is placed on a queue, the user receives an immediate "processing" response, and the order is finalized later, freeing threads for new requests.
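The flash‑sale flow can be approximated in a single process with an in‑memory queue and a background worker. In the sketch below, `queue.Queue` stands in for a real message broker, and `submit_order`, `worker`, and the status strings are hypothetical names chosen for illustration.

```python
import queue
import threading

order_queue = queue.Queue()  # stand-in for a message broker
results = {}                 # stand-in for the order table


def submit_order(order_id):
    """Request handler: enqueue the work and answer immediately."""
    order_queue.put(order_id)
    return {"order_id": order_id, "status": "processing"}


def worker():
    """Background consumer that finalizes orders at its own pace."""
    while True:
        order_id = order_queue.get()
        try:
            if order_id is None:  # shutdown signal
                return
            results[order_id] = "confirmed"
        finally:
            order_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [submit_order(i) for i in range(3)]
order_queue.join()     # wait until the worker has handled every order
order_queue.put(None)  # ask the worker to stop
thread.join()
```

The handler returns before any order is finalized, so the client needs a way to learn the outcome later, typically by polling an order‑status endpoint or receiving a notification.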

13. API Optimizations

Reduce payload size by using a compact serialization format (e.g., Protocol Buffers), compressing JSON responses, and omitting fields the client does not need. Smaller payloads cut bandwidth and serialization time per request, which increases the number of requests that can be served per second.
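The effect of trimming fields and compressing a response is easy to measure. The sketch below uses only the standard library; the record contents, the field names, and the `encode` helper are made up for illustration.

```python
import gzip
import json

# Made-up record; the last field is one the client never reads.
record = {
    "id": 1001,
    "name": "example product",
    "price": 19.99,
    "description": "long descriptive text " * 50,
    "internal_audit_log": ["entry"] * 100,
}


def encode(payload, fields=None, compress=False):
    """Serialize to compact JSON, optionally keeping only some fields."""
    if fields is not None:
        payload = {key: payload[key] for key in fields}
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return gzip.compress(body) if compress else body


full = encode(record)
trimmed = encode(record, fields=["id", "name", "price"])
with_text = encode(record, fields=["id", "name", "price", "description"])
with_text_gzip = encode(record, fields=["id", "name", "price", "description"], compress=True)

sizes = {
    "full": len(full),
    "trimmed": len(trimmed),
    "with_text": len(with_text),
    "with_text_gzip": len(with_text_gzip),
}
```

Compression pays off mainly on larger, repetitive bodies; for very small responses the gzip header and CPU cost can outweigh the savings, so many servers only compress above a size threshold.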

14. Load Testing to Identify Bottlenecks

Before release, run stress tests with tools such as JMeter or LoadRunner. Measure:

Maximum concurrent users

Response time distribution

Resource utilization (CPU, memory, network, I/O)

Identify whether bottlenecks reside in the network, reverse proxy (Nginx), application code, database, or cache layers, then apply targeted mitigations.
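When reading the response time distribution, percentiles are usually more informative than the average. The following sketch computes nearest‑rank percentiles over a list of latencies; the sample values are invented for illustration and would normally come from the load‑testing tool's result file.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value covering pct% of samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Invented latencies in milliseconds; two slow outliers among fast requests.
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 13, 900]

summary = {
    "mean": sum(latencies_ms) / len(latencies_ms),
    "p50": percentile(latencies_ms, 50),
    "p90": percentile(latencies_ms, 90),
    "p99": percentile(latencies_ms, 99),
}
```

Here the median is 13 ms while the mean is 125.6 ms, pulled up by two slow requests; the p90 and p99 values (250 ms and 900 ms) show the tail that some users actually experience.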

15. Scaling Out and Traffic Switching

For sudden spikes, add more nodes (e.g., extra MySQL or Redis replicas) and optionally shift traffic between data centers or availability zones. Traffic routing can be controlled via DNS, load‑balancer weights, or service‑mesh policies.
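Weight‑based traffic switching can be illustrated with a weighted random choice. The data center names and weights below are hypothetical, and `pick_target` is a sketch of the idea only; real switching is configured in DNS, the load balancer, or the service mesh.

```python
import random


def pick_target(weights, rng=random):
    """Choose a target with probability proportional to its weight."""
    targets = list(weights)
    return rng.choices(targets, weights=[weights[t] for t in targets], k=1)[0]


# Hypothetical data centers: shift 10% of traffic to dc-b as a first step.
gradual = {"dc-a": 90, "dc-b": 10}

# Full failover: a weight of 0 removes dc-a from rotation entirely.
failover = {"dc-a": 0, "dc-b": 100}

rng = random.Random(42)  # seeded so the demonstration is repeatable
sample = [pick_target(gradual, rng) for _ in range(10000)]
share_b = sample.count("dc-b") / len(sample)
```

Moving the weights in small steps while watching error rates and latency limits the impact if the newly added capacity turns out to be unhealthy.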
