
High Concurrency Architecture and Strategies for Scalable Backend Systems

This article presents a comprehensive guide to designing high‑concurrency backend solutions, covering server architecture, load balancing, database clustering, caching layers, message queues, asynchronous processing, service‑oriented design, redundancy, and automation to ensure reliable performance under massive user traffic.


High Concurrency Overview

High concurrency often occurs in scenarios with a large number of active users, such as flash‑sale events and timed red‑packet collection.

To keep business operations smooth and provide a good user experience, we need to estimate the expected concurrency based on business scenarios and design a suitable high‑concurrency handling scheme.

Drawing on years of e‑commerce development, the author summarizes the pitfalls encountered under high load and shares them here for reference.

Server Architecture

As a business matures, the server architecture evolves from a single node to a cluster and eventually to distributed services.

A high‑concurrency service requires a solid architecture: load balancers, master‑slave database clusters, master‑slave NoSQL caches, and CDN for static assets.

Typical components include:

Servers
- Load balancing (e.g., Nginx, Alibaba Cloud SLB)
- Resource monitoring
- Distributed deployment

Databases
- Master‑slave separation and clustering
- DBA table and index optimization
- Distributed deployment

NoSQL
- Master‑slave clustering (Redis, MongoDB, Memcache)

CDN
- Static files (HTML, CSS, JS, images)

Concurrency Testing

High‑concurrency business requires load testing to estimate the maximum supported traffic.

Testing can be performed on third‑party platforms (e.g., Alibaba Cloud performance testing) or self‑hosted servers using tools such as Apache JMeter, Visual Studio Load Test, or Microsoft Web Application Stress Tool.

Practical Schemes

General Scheme

Daily user traffic is large but dispersed; occasional spikes occur when users gather.

Typical scenarios: user sign‑in, user center, order queries, etc.

Key ideas:

Prefer cache reads; fall back to DB only when cache miss occurs.

Distribute user data across cache shards using a hash of the user ID.

Cache the result after a DB query so subsequent requests avoid hitting the database.

Beware of race conditions that may cause duplicate point awards.
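The shard-selection idea above can be sketched with a stable hash of the user ID; `shard_for` is a hypothetical helper, not a library API:

```python
import hashlib

def shard_for(user_id: str, shard_count: int) -> int:
    # Python's built-in hash() is salted per process, so a cryptographic
    # digest keeps the user -> shard mapping stable across restarts.
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count
```

Because the mapping is deterministic, the same user always reads and writes the same cache shard, which keeps per-shard data consistent without coordination.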

Examples:

Sign‑in: Compute a user‑specific cache key and check the Redis hash. If found, return the sign‑in info; if not, query the DB, sync the result to Redis, and return it. Perform all DB writes within a transaction.
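The sign‑in flow is a classic cache‑aside read. A minimal sketch, with plain dicts standing in for Redis and the database (the data and key names are illustrative):

```python
# In-memory stand-ins for the Redis hash and the database table.
redis_hash: dict = {}
db_signins = {"user:1001": "2024-05-01"}  # hypothetical persisted sign-in dates

def get_signin_info(user_id: str):
    """Cache-aside read: try the cache first, fall back to the DB on a miss,
    then backfill the cache so the next request is served from memory."""
    cached = redis_hash.get(user_id)
    if cached is not None:
        return cached
    row = db_signins.get(user_id)      # cache miss: hit the database
    if row is not None:
        redis_hash[user_id] = row      # sync the result back to Redis
    return row
```

After the first miss populates the cache, repeated reads for the same user never touch the database.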

Order list: Cache only the first page (40 items) in Redis. Serve page 1 from the cache; query the DB for other pages.

User center: Cache the user profile after the DB lookup.

Other business: For shared cache data, consider admin‑driven updates or DB‑level locking to avoid massive DB hits under concurrency.

Message Queue

Flash‑sale and similar activities generate massive concurrent requests.

Scenario: timed red‑packet collection.

Using a message queue (e.g., Redis list) allows the system to enqueue user participation records and process them asynchronously with multiple consumer threads, preventing DB overload.
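A minimal sketch of the enqueue-then-consume pattern, using the standard library's `queue.Queue` as a stand‑in for a Redis list (the API would `LPUSH` participation records; workers would `BRPOP` them):

```python
import queue
import threading

jobs = queue.Queue()        # stand-in for the Redis list
processed = []              # stand-in for rows persisted to the DB
lock = threading.Lock()

def worker():
    """Consumer thread: drain participation records and persist them
    at a pace the database can sustain."""
    while True:
        user_id = jobs.get()
        if user_id is None:            # sentinel: shut this worker down
            jobs.task_done()
            break
        with lock:
            processed.append(user_id)  # stand-in for the real DB write
        jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for i in range(100):                   # the API only enqueues and returns
    jobs.put(f"user:{i}")
for _ in threads:                      # one sentinel per consumer
    jobs.put(None)
for t in threads:
    t.join()
```

The request handler returns as soon as the record is enqueued; the number of consumer threads, not the burst of incoming requests, determines the write pressure on the database.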

First‑Level Cache

When cache server connections become a bottleneck, a first‑level cache on the application server can store hot data with short TTL, reducing the number of connections to the NoSQL cache layer.
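A first‑level cache can be as simple as an in‑process map with per‑entry expiry. A sketch, assuming a short TTL is acceptable staleness for hot data (`LocalCache` is a hypothetical class, not a library):

```python
import time

class LocalCache:
    """Tiny in-process cache with a short TTL; entries older than ttl
    seconds are treated as misses, so stale hot data expires quickly."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}   # key -> (value, stored_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]       # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic())
```

Reads served from this layer never open a connection to the NoSQL cache, which is exactly the bottleneck this technique relieves; the cost is that each application server may briefly hold a slightly stale copy.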

Static Data

For data that changes infrequently, generate static JSON/XML/HTML files and serve them via CDN. Clients first request from CDN; only on cache miss do they fall back to the backend.
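Generating the static file is straightforward; a sketch with a hypothetical activity payload, where the output directory plays the role of the CDN origin:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical rarely-changing payload, e.g. an activity's configuration.
activity = {"id": 42, "title": "Red Packet Rain", "starts_at": "20:00"}

def publish_static(data: dict, out_dir: Path) -> Path:
    """Render the data to a static JSON file; in production the output
    directory would be the CDN origin that edge nodes pull from."""
    path = out_dir / f"activity_{data['id']}.json"
    path.write_text(json.dumps(data), encoding="utf-8")
    return path

with tempfile.TemporaryDirectory() as cdn_origin:
    published = publish_static(activity, Path(cdn_origin))
    round_tripped = json.loads(published.read_text(encoding="utf-8"))
```

Regenerating the file only when the underlying data changes means the backend is touched once per change, regardless of how many clients read the result.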

Other Techniques

Clients can send a version identifier; the server returns data only when the version differs, saving bandwidth.
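The version check amounts to a conditional fetch, analogous to an HTTP 304 Not Modified response. A sketch (`fetch_if_changed` and its tuple return shape are assumptions for illustration):

```python
def fetch_if_changed(client_version: str, server_version: str, payload: dict):
    """Return (current_version, payload) only when versions differ;
    otherwise signal 'not modified' so the client keeps its local copy."""
    if client_version == server_version:
        return server_version, None    # nothing changed: send no body
    return server_version, payload
```

When most requests carry an up-to-date version, the server answers with a tiny header-sized response instead of the full payload, saving bandwidth on every unchanged read.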

Layering, Segmentation, and Distribution

Large websites need long‑term planning: layer the system, split core business into modules, and deploy those modules across distributed servers.

Layering: separate application, service, and data layers.

Segmentation: break complex domains (e.g., user center) into smaller modules.

Distribution: deploy each module on independent servers, use load balancers, DB clusters, and CDN.

Cluster

Deploy multiple identical application servers behind a load balancer; use master‑slave DB clusters for high availability and scalability.
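At its core, spreading traffic over identical servers is a rotation. A minimal round‑robin sketch; real balancers such as Nginx or SLB layer health checks and weighting on top of this idea:

```python
import itertools

class RoundRobinBalancer:
    """Cycle incoming requests across a fixed pool of identical
    application servers."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        # Each call hands back the next server in rotation.
        return next(self._cycle)
```

Because every node runs the same application, any server can answer any request, and capacity scales by adding nodes to the pool.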

Asynchronous Processing

Database operations are the main bottleneck under high load. By offloading persistence to asynchronous workers (e.g., via a message queue), the API can respond quickly while the background process handles DB writes.

Cache

Cache frequently accessed, rarely changing data in memory stores (Redis, Memcache) or in‑process memory, and optionally use client‑side version checks to avoid unnecessary requests.

Service‑Oriented Architecture

Extract core functionalities into independent services (SOA or micro‑services) with their own databases and caches, enabling loose coupling, high availability, and easy scaling.

Redundancy and Automation

Provide standby servers, database backups, and automated monitoring/alerting to quickly replace failed components and reduce manual intervention.

Conclusion

High‑concurrency architecture evolves continuously; solid foundational design simplifies future expansion and ensures system resilience.

Written by Architect's Guide, dedicated to sharing programmer‑architect skills (Java backend, system, microservice, and distributed architectures) to help you become a senior architect.