Mastering High‑Concurrency Architecture: From Metrics to Read/Write Splitting and Caching
This article walks through the essential attributes of high-concurrency system design (high performance, high availability, scalability), introduces quantitative metrics for each, classifies read-heavy and write-heavy scenarios, and details practical solutions: database read/write splitting, local and distributed caching, mitigations for cache penetration and cache avalanche, and a CQRS architecture suitable for billion-user applications.
Key Points of High‑Concurrency Architecture Design
High concurrency is a core requirement for billion‑user applications; its essential attributes are high performance, high availability, and scalability.
Necessary Conditions for a High‑Concurrency System
High performance means strong parallel-processing capability, doing more work with less hardware, and response times fast enough to keep users satisfied.
High availability ensures the system runs stably over long periods without frequent crashes or downtime.
Scalability allows horizontal expansion to handle growing or burst traffic.
Metrics for Evaluating High‑Concurrency Systems
Performance is best measured by response-time percentiles (PCTn) rather than simple averages, because averages hide tail latency; typical targets are an average response time under 200 ms and PCT99 ≤ 1 s (a percentile-computation sketch follows these metrics).
Availability is expressed in "nines" (e.g., 99.95 % uptime, which allows roughly 4.4 hours of downtime per year) and is tracked with monitoring and alerts.
Scalability is defined as the throughput-increase ratio divided by the node-increase ratio: for example, doubling the node count (+100 %) while throughput grows by 80 % yields a scaling efficiency of 0.8, and 70-80 % is generally considered acceptable.
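To make PCTn concrete, here is a minimal Go sketch (not from the original article) that computes a latency percentile with the nearest-rank method; the Percentile helper and its signature are illustrative choices, not an established API:

```go
package metrics

import (
	"math"
	"sort"
	"time"
)

// Percentile returns the latency at percentile p (0 < p <= 100) using the
// nearest-rank method; with p = 99, 99% of requests finished at or below
// the returned value.
func Percentile(latencies []time.Duration, p float64) time.Duration {
	if len(latencies) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), latencies...) // copy; keep input intact
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	idx := int(math.Ceil(float64(len(sorted))*p/100.0)) - 1
	if idx < 0 {
		idx = 0
	}
	return sorted[idx]
}
```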
Classification of High‑Concurrency Scenarios
Requests are divided into read‑heavy, write‑heavy, and balanced scenarios, each requiring specific solutions.
Database Read/Write Splitting
Separate the master (write) and slave (read) databases: writes go to the master and are replicated, typically asynchronously, to the slaves, which serve the read traffic.
Routing can sit in a proxy that inspects and forwards SQL statements, or be embedded in the application through frameworks such as GORM or ShardingJDBC.
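As an illustration of the application-embedded approach, GORM's dbresolver plugin registers replica connections and routes SELECTs to them while writes stay on the master. A minimal sketch, with placeholder DSNs and host names:

```go
package storage

import (
	"gorm.io/driver/mysql"
	"gorm.io/gorm"
	"gorm.io/plugin/dbresolver"
)

func openDB() (*gorm.DB, error) {
	// The default (master) connection handles writes.
	db, err := gorm.Open(mysql.Open("user:pass@tcp(master:3306)/app"), &gorm.Config{})
	if err != nil {
		return nil, err
	}
	// Register replicas: SELECTs are load-balanced across them,
	// while INSERT/UPDATE/DELETE continue to hit the master.
	err = db.Use(dbresolver.Register(dbresolver.Config{
		Replicas: []gorm.Dialector{
			mysql.Open("user:pass@tcp(replica1:3306)/app"),
			mysql.Open("user:pass@tcp(replica2:3306)/app"),
		},
		Policy: dbresolver.RandomPolicy{}, // pick a replica at random per query
	}))
	return db, err
}
```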
Handling Master‑Slave Replication Lag
Options include synchronous (or semi-synchronous) replication, forced reads from the master for operations that must observe the latest write (read-your-own-write), and session-based read routing that pins a session's reads to the master for a short window after it writes.
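With GORM's dbresolver, forcing a query to the master is a one-line clause. A small sketch, where User is a hypothetical model:

```go
package storage

import (
	"gorm.io/gorm"
	"gorm.io/plugin/dbresolver"
)

// User is a hypothetical model used only for illustration.
type User struct {
	ID   uint
	Name string
}

// GetUserAfterWrite reads from the master instead of a replica, so a
// record written a moment ago is visible even if replication lags.
func GetUserAfterWrite(db *gorm.DB, id uint) (User, error) {
	var u User
	err := db.Clauses(dbresolver.Write).First(&u, id).Error
	return u, err
}
```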
Caching Strategies
A local cache keeps data in process memory for the fastest possible access, but it cannot be shared across processes or services, ties the implementation to one language, and is lost on restart.
A distributed cache such as Redis provides shared, language-agnostic storage with optional persistence, high availability, and horizontal scalability.
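Either tier is typically used with a read-through pattern: check the cache, fall back to the database on a miss, then populate the cache. A minimal go-redis sketch, where loadFunc stands in for the real database query:

```go
package cache

import (
	"context"
	"errors"
	"time"

	"github.com/redis/go-redis/v9"
)

// loadFunc stands in for the real database query; hypothetical.
type loadFunc func(ctx context.Context, key string) (string, error)

// GetWithCache is a minimal read-through pattern: try Redis first,
// fall back to the database on a miss, then populate the cache.
func GetWithCache(ctx context.Context, rdb *redis.Client, key string,
	ttl time.Duration, load loadFunc) (string, error) {

	val, err := rdb.Get(ctx, key).Result()
	if err == nil {
		return val, nil // cache hit
	}
	if !errors.Is(err, redis.Nil) {
		return "", err // real Redis error, not just a miss
	}
	val, err = load(ctx, key) // cache miss: hit the database
	if err != nil {
		return "", err
	}
	// Best-effort write-back; a failure here only costs a future miss.
	_ = rdb.Set(ctx, key, val, ttl).Err()
	return val, nil
}
```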
Cache Eviction Policies
FIFO – first‑in‑first‑out (low hit rate).
LFU – least‑frequently‑used.
LRU – least-recently-used; evicts the entry that has gone unused the longest (a minimal implementation follows this list).
More advanced policies such as W‑TinyLFU are also used in practice.
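As a reference point for how these policies work, here is a minimal LRU implementation in Go (not from the article), using the standard container/list plus a map for O(1) lookups:

```go
package cache

import "container/list"

// entry pairs a key with its value so eviction can delete the map slot.
type entry struct {
	key   string
	value any
}

// LRU is a minimal least-recently-used cache: a doubly linked list keeps
// recency order and a map gives O(1) lookup into the list.
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> list node
}

func NewLRU(capacity int) *LRU {
	return &LRU{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

func (c *LRU) Get(key string) (any, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el) // touching a key makes it most recent
	return el.Value.(*entry).value, true
}

func (c *LRU) Put(key string, value any) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back() // least recently used lives at the back
		if oldest != nil {
			c.order.Remove(oldest)
			delete(c.items, oldest.Value.(*entry).key)
		}
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
}
```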
Cache Problems and Solutions
Cache penetration: requests for data that exists in neither the cache nor the database always fall through to the database; mitigate it by caching an empty placeholder for such keys or by using a Bloom filter to reject invalid keys up front (see the combined sketch below).
Cache avalanche: a large number of keys expiring at the same moment overloads the database; mitigate it by adding random jitter to TTLs and by running Redis in a highly available topology such as Sentinel or Redis Cluster.
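The two mitigations compose naturally in one lookup path. A minimal sketch, assuming go-redis; nullPlaceholder, jitteredTTL, and the load callback are all illustrative names, not an established API:

```go
package cache

import (
	"context"
	"errors"
	"math/rand"
	"time"

	"github.com/redis/go-redis/v9"
)

const nullPlaceholder = "<nil>" // sentinel for "key does not exist in the DB"

var ErrNotFound = errors.New("record not found")

// jitteredTTL spreads expirations to avoid an avalanche of simultaneous misses.
// Assumes base is at least a few seconds so the jitter range stays positive.
func jitteredTTL(base time.Duration) time.Duration {
	return base + time.Duration(rand.Int63n(int64(base/5))) // base + up to 20%
}

// GetOrLoad caches both real values and an empty placeholder for missing
// keys, so repeated lookups of nonexistent data never reach the database.
func GetOrLoad(ctx context.Context, rdb *redis.Client, key string,
	load func(ctx context.Context, key string) (string, error)) (string, error) {

	val, err := rdb.Get(ctx, key).Result()
	if err == nil {
		if val == nullPlaceholder {
			return "", ErrNotFound // known-missing key, answered from cache
		}
		return val, nil
	}
	if !errors.Is(err, redis.Nil) {
		return "", err
	}
	val, err = load(ctx, key)
	if errors.Is(err, ErrNotFound) {
		// Short TTL: the placeholder must not outlive a later insert for long.
		_ = rdb.Set(ctx, key, nullPlaceholder, time.Minute).Err()
		return "", err
	}
	if err != nil {
		return "", err
	}
	_ = rdb.Set(ctx, key, val, jitteredTTL(10*time.Minute)).Err()
	return val, nil
}
```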
CQRS (Command Query Responsibility Segregation)
CQRS separates command (write) and query (read) paths. Writes go to the master database, change events are sent through a message queue, and reads are served from read replicas or the cache.
In the distributed-cache variant, the database acts as the write store, Redis acts as the read store, and binlog listeners (for MySQL, a CDC tool such as Canal) serve as the message channel.
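A read-model updater in this setup is essentially a loop that applies change events to Redis. A minimal Go sketch, where ChangeEvent and the events channel are hypothetical stand-ins for whatever the binlog listener and message queue actually deliver:

```go
package cqrs

import (
	"context"
	"encoding/json"
	"time"

	"github.com/redis/go-redis/v9"
)

// ChangeEvent is a hypothetical shape for a row-change message emitted by
// a binlog listener onto a message queue.
type ChangeEvent struct {
	Table string          `json:"table"`
	PK    string          `json:"pk"`
	Op    string          `json:"op"` // "insert", "update", "delete"
	Row   json.RawMessage `json:"row"`
}

// SyncReadModel drains change events and keeps the Redis read store in
// step with the write database; events is a stand-in for the real queue.
func SyncReadModel(ctx context.Context, rdb *redis.Client, events <-chan ChangeEvent) {
	for {
		select {
		case <-ctx.Done():
			return
		case ev := <-events:
			key := ev.Table + ":" + ev.PK
			switch ev.Op {
			case "delete":
				rdb.Del(ctx, key)
			default: // insert/update: overwrite the cached projection
				rdb.Set(ctx, key, string(ev.Row), 24*time.Hour)
			}
		}
	}
}
```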
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
