
High Concurrency: Principles, Impacts, and Practical Solutions for Backend Systems

This comprehensive guide explains the nature of high concurrency, distinguishes it from parallelism, outlines its potential consequences across application, database, and service layers, and presents a systematic set of mitigation strategies—including rate limiting, asynchronous processing, redundancy, caching, and queue‑based designs—supported by real‑world case studies and code examples.

Java Captain

High concurrency has become an unavoidable topic for backend developers, especially in scenarios such as e‑commerce flash sales, social app interactions, and other traffic spikes that generate massive simultaneous requests.

The article defines high concurrency as a large number of client requests arriving at the same moment, requiring the server to respond quickly. It clarifies the difference between concurrency (logical simultaneity over a time interval) and parallelism (physical simultaneity on multiple execution units), noting that concurrency is a broader concept while parallelism is a special case.

It discusses how to judge whether a system is experiencing "high" concurrency, emphasizing that the threshold is relative to the business context, hardware resources, and performance goals rather than an absolute QPS number.
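To see why the threshold is relative rather than an absolute QPS number, it helps to relate capacity to hardware and latency. The sketch below (formula and numbers are our own illustration, not from the article) shows the back-of-the-envelope arithmetic for a simple synchronous service where each worker thread handles one request at a time:

```java
// Back-of-the-envelope capacity estimate for a synchronous service:
// each worker thread serves one request at a time, so throughput is
// bounded by thread count and average latency. Illustrative numbers only.
public class CapacityEstimate {
    // theoretical QPS = instances * threadsPerInstance * (1000 / avgLatencyMs)
    public static double theoreticalQps(int instances, int threadsPerInstance, double avgLatencyMs) {
        return instances * threadsPerInstance * (1000.0 / avgLatencyMs);
    }

    public static void main(String[] args) {
        // 4 instances, 200 worker threads each, 50 ms average response time
        double qps = theoreticalQps(4, 200, 50.0);
        System.out.println("Theoretical ceiling: " + qps + " QPS"); // 16000.0
    }
}
```

The same traffic that overwhelms a 2-thread service on one box is trivial for this configuration, which is why "high" is always relative to the deployment.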

The consequences of insufficient concurrency handling are illustrated in a detailed table covering application‑layer response delays, database lock contention and data inconsistency, and service‑layer resource exhaustion leading to server crashes.

To address these issues, the article proposes three overarching strategies: "cut" (limit what enters the system), "buffer" (defer and smooth what remains), and "multiply" (add capacity). It expands them into seven concrete techniques, including rate limiting, asynchronous processing, and redundancy in the form of clustering, caching, and static page generation. Each technique is described with practical examples, such as token-bucket rate limiting, message-queue smoothing, Redis pre-deduction for flash-sale inventory, and static page generation.
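As one sketch of the rate-limiting technique, a token bucket can be implemented in a few lines. The class below is our own illustration, not code from the article; the clock is passed in as a parameter purely to keep the sketch deterministic:

```java
// Minimal token-bucket rate limiter sketch. Tokens refill at a fixed rate up
// to a cap; a request proceeds only if it can take a token, so traffic above
// the configured rate is rejected ("cut") instead of piling up.
public class TokenBucket {
    private final long capacity;      // max tokens the bucket can hold
    private final double refillPerMs; // tokens added per millisecond
    private double tokens;            // current token count
    private long lastRefillMs;        // timestamp of the last refill

    public TokenBucket(long capacity, double refillPerSecond, long nowMs) {
        this.capacity = capacity;
        this.refillPerMs = refillPerSecond / 1000.0;
        this.tokens = capacity;
        this.lastRefillMs = nowMs;
    }

    public synchronized boolean tryAcquire(long nowMs) {
        // top up the bucket for the time elapsed since the last call
        tokens = Math.min(capacity, tokens + (nowMs - lastRefillMs) * refillPerMs);
        lastRefillMs = nowMs;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;  // token granted: let the request through
        }
        return false;     // bucket empty: reject or degrade the request
    }
}
```

In production one would typically use a library implementation (or a Redis-backed limiter for a cluster) rather than hand-rolling this, but the refill-then-take shape is the same.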

Real‑world optimization cases are presented, including:

Adopting stateless API services with Nginx load balancing to enable horizontal scaling.

Identifying and fixing a CPU‑intensive loop that queried large datasets per request, then mitigating it with API scaling and code refactoring.

Improving cache‑aside patterns by redesigning cache keys and adding empty‑value caching to prevent cache penetration.

Implementing master‑slave database separation to offload read‑heavy workloads.

Handling cache avalanche (mass simultaneous key expiry) and cache‑penetration scenarios during peak events.

Using Redis for pre‑deduction of inventory to avoid database lock contention during flash‑sale redemption.
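The master‑slave separation mentioned in the cases above amounts to a small routing layer in the data-access code. The class below is a hypothetical illustration (the names and the round-robin policy are ours, and it ignores replication lag, which real deployments must account for):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of master-slave read/write splitting: writes always go to the
// master; reads round-robin across replicas to offload read-heavy traffic.
public class ReadWriteRouter {
    private final String master;
    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger();

    public ReadWriteRouter(String master, List<String> replicas) {
        this.master = master;
        this.replicas = replicas;
    }

    public String route(boolean isWrite) {
        if (isWrite || replicas.isEmpty()) {
            return master; // writes (and reads with no replica) hit the master
        }
        // spread reads evenly over the replica pool
        int i = Math.floorMod(next.getAndIncrement(), replicas.size());
        return replicas.get(i);
    }
}
```

Reads that must observe a just-committed write (read-your-own-writes) are usually routed to the master as well, which this sketch omits.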

Code examples illustrate the cache‑aside pattern and the Redis pre‑deduction workflow:

// Cache‑aside pattern: check the cache first, fall back to the database
var redisKey = "rankinglist:" + DateTime.Now.ToString("yyyyMMdd");
var rankingListCache = redis.Get(redisKey); // 1. try the cache
if (rankingListCache != null) {
    return rankingListCache;                // cache hit
}
var data = db.RankingList.GetList();        // 2. cache miss: heavy DB query
if (data.Any()) {
    redis.Set(redisKey, data, 3600);        // 3. cache valid data for one hour
}
return data;                                // empty list when nothing was found

After redesigning the cache key to be based on ranking type and caching empty results for a short period, the system avoided massive DB hits during midnight cache expiration.
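A minimal sketch of the empty-value caching idea, using an in-memory map in place of Redis (the class and sentinel names are ours, and TTL handling is omitted for brevity; with Redis the sentinel would get a short expiry):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Empty-value caching to stop cache penetration: when the DB has no row for a
// key, cache a sentinel so repeated lookups for the same missing key are
// answered from the cache instead of hammering the database.
public class EmptyValueCache {
    private static final String EMPTY = "__EMPTY__"; // sentinel for "known missing"
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    public int dbHits = 0; // counts how often we fall through to the DB

    public String get(String key, Function<String, String> dbLookup) {
        String cached = cache.get(key);
        if (cached != null) {
            return EMPTY.equals(cached) ? null : cached; // hit, possibly "missing"
        }
        dbHits++;                                        // miss: query the DB
        String value = dbLookup.apply(key);
        cache.put(key, value != null ? value : EMPTY);   // cache even a miss
        return value;
    }
}
```

The sentinel's TTL should be short, so that a key which later gains real data is not reported missing for long.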

For flash‑sale inventory, the baseline approach decrements stock directly in the database, relying on row locks and a stock check to prevent overselling:

UPDATE TableA SET Stock = Stock - 1 WHERE Stock > 0;

To further reduce lock contention, a Redis‑based pre‑deduction scheme is introduced:

// 1. Warm-up: sync current stock from the DB into Redis before the event
var redisKey = "stock:" + productId;
redis.Set(redisKey, db.GetStock(productId));

// 2. Per purchase request: atomic decrement in Redis instead of a DB row lock
if (redis.Decr(redisKey) >= 0) {
    // success: stock reserved, create the order
} else {
    // counter went negative: out of stock, reject the request
}

// 3. When Redis stock reaches 0, sync the sold-out state back to the DB
db.UpdateStock(productId, 0);
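The reason pre-deduction avoids overselling is that the Redis decrement is atomic: no matter how requests interleave, successful deductions can never exceed the stock. A hypothetical simulation (our own code, with an AtomicInteger standing in for Redis DECR) demonstrates the invariant:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Simulates the pre-deduction step: many concurrent buyers, limited stock,
// and an atomic counter in place of the Redis "stock:" key. The number of
// successful orders never exceeds the stock, regardless of interleaving.
public class PreDeductionDemo {
    public static int run(int stock, int buyers) throws InterruptedException {
        AtomicInteger remaining = new AtomicInteger(stock); // stands in for Redis
        AtomicInteger orders = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(buyers);
        for (int i = 0; i < buyers; i++) {
            new Thread(() -> {
                // same shape as: if (redis.Decr(redisKey) >= 0)
                if (remaining.decrementAndGet() >= 0) {
                    orders.incrementAndGet();    // success: create the order
                } else {
                    remaining.incrementAndGet(); // out of stock: restore counter
                }
                done.countDown();
            }).start();
        }
        done.await();
        return orders.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100, 1000)); // prints 100: no overselling
    }
}
```

With the database UPDATE, every one of those 1000 requests would have contended for the same row lock; here the hot path touches only the in-memory counter.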

The article concludes that high concurrency optimization is an ongoing "attack‑defense" battle, requiring a balanced mix of technical measures and business‑level decisions, and that there is no single silver‑bullet solution.

Tags: Redis, Caching, High Concurrency, Database Optimization, Rate Limiting, Asynchronous Processing, Backend Performance
Written by Java Captain

Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.
