Mastering High‑Performance, High‑Concurrency, High‑Availability Backend Systems
This article shares a backend engineer's practical methodology for building systems that simultaneously achieve high performance, high concurrency, and high availability, covering performance optimization, scaling strategies, fault‑tolerance techniques, and real‑world case studies from B‑ and C‑side logistics platforms.
Overview
The article presents a practical methodology for building backend systems that satisfy the three‑high requirements—high performance, high concurrency, and high availability. It distinguishes between technical complexity (typical of C‑end services) and business complexity (typical of B‑end or M‑end services) and shows how to address both through systematic design, scaling, and fault‑tolerance techniques.
High‑Performance
Methodology
Performance is limited by three factors: computation, communication, and storage. Typical bottlenecks include heavy GC pauses, slow downstream services, large tables, and inefficient SQL. Optimisation should be approached from both read and write perspectives.
Read Optimisation – Cache + Database
Read‑heavy scenario: Write synchronously to the database, then invalidate or delete the cache entry. The database handles writes; the cache serves reads, providing low‑latency access.
Write‑heavy scenario: Update the cache synchronously, then propagate the change to the database asynchronously. The cache absorbs write traffic, while the database is eventually consistent. This pattern is used for JD Logistics order‑relation documents.
Typical workflow:
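// Pseudo‑code: cache, db, and asyncQueue stand for the application's own clients.

// Read‑heavy flow, read path (cache‑aside): try the cache, fall back to the database on a miss
function getItem(id) {
  let value = cache.get(id);
  if (value == null) {
    value = db.query("SELECT * FROM table WHERE id = ?", id);
    cache.set(id, value); // populate the cache for subsequent reads
  }
  return value;
}

// Read‑heavy flow, write path: write to the database first, then invalidate the cache entry
// (saveItem is an illustrative name)
function saveItem(id, data) {
  db.update(id, data);
  cache.delete(id); // the next read repopulates the cache
}

// Write‑heavy flow: update the cache synchronously, persist to the database asynchronously
function updateItem(id, data) {
  cache.set(id, data); // fast response to the caller
  asyncQueue.push(() => db.update(id, data)); // the database becomes eventually consistent
}
Write Optimisation – Asynchronous Processing for Flash‑Sale (秒杀) Scenarios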
During traffic spikes, the order‑taking API must return instantly. The request is accepted, the order ID is returned, and the actual order creation is delegated to a message queue. Stock is cached; after a successful stock deduction, an SMS is sent to the user.
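The flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not the platform's actual implementation: an array stands in for the message queue, a Map stands in for the cached stock, and the names acceptOrder, processQueue, stockCache, and sentSms are all hypothetical.

```javascript
// In-memory stand-ins (assumptions): a Map for the stock cache, an array for the message queue.
const stockCache = new Map([["sku-1", 2]]);
const orderQueue = [];
const sentSms = [];
let nextOrderId = 1;

// API path: accept the request, enqueue it, and return an order ID immediately.
function acceptOrder(userId, sku) {
  const orderId = `order-${nextOrderId++}`;
  orderQueue.push({ orderId, userId, sku });
  return orderId;
}

// Consumer path: deduct cached stock, then record the order and notify the user.
function processQueue() {
  const results = [];
  while (orderQueue.length > 0) {
    const msg = orderQueue.shift();
    const remaining = stockCache.get(msg.sku) ?? 0;
    if (remaining <= 0) {
      results.push({ orderId: msg.orderId, status: "SOLD_OUT" });
      continue;
    }
    stockCache.set(msg.sku, remaining - 1); // stock deduction succeeded
    sentSms.push(`SMS to ${msg.userId}: ${msg.orderId} confirmed`);
    results.push({ orderId: msg.orderId, status: "CREATED" });
  }
  return results;
}
```

The caller only ever waits for acceptOrder; everything slow (stock deduction, order creation, SMS) happens on the consumer side, which can be scaled or throttled independently of the API.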
High‑Concurrency
Methodology
Horizontal scaling – add machines and database shards. Common during large promotions (e.g., 618, Double‑11).
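Vertical scaling – in this article's usage, splitting the data tier (separate databases, sharding, read/write separation) so that no single instance hits its connection limit. Note that the term more commonly refers to adding CPU, memory, or disk to a single machine.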
Depth scaling (regional isolation) – deploy independent service units per geographic region (e.g., a Beijing unit serves Beijing users) to reduce latency and avoid single‑point bottlenecks.
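Regional isolation ultimately comes down to a routing decision made at the edge. The sketch below shows one way that lookup might look; the unit names, the region field on the user object, and the fallback unit are assumptions made for illustration.

```javascript
// Hypothetical routing table: region code -> independent service unit.
const UNIT_BY_REGION = new Map([
  ["beijing", "unit-beijing"],
  ["shanghai", "unit-shanghai"],
]);

// Assumed fallback for users whose region has no dedicated unit.
const FALLBACK_UNIT = "unit-central";

// Picks the unit that should serve this user's requests.
function resolveUnit(user) {
  return UNIT_BY_REGION.get(user.region) ?? FALLBACK_UNIT;
}
```

For example, resolveUnit({ region: "beijing" }) selects the Beijing unit, while a user from an unlisted region falls back to the central unit.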
Practical Application – DDD in Retail Logistics
Domain‑Driven Design (DDD) is applied after a deep understanding of business processes. Core domains include:
Product Service
Order
Payment & Settlement
Fulfilment
Both forward (merchant places order → logistics assigns courier → delivery) and reverse (user requests after‑sale return) flows are modelled. The resulting micro‑service decomposition follows the DDD bounded contexts, enabling independent scaling and clearer ownership.
High‑Availability
Application Layer
Rate limiting – protects services from traffic spikes. Common algorithms:
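Counter (fixed window) – simple, but not smooth: a burst that straddles two windows can briefly reach twice the limit.
Sliding window – counts requests over a moving time window, which removes the boundary bursts of a fixed counter; requests over the limit are rejected.
Leaky bucket – queues requests and releases them at a constant rate, smoothing bursts.
Token bucket – tokens are added at a fixed rate up to a bucket capacity and each request consumes one, so short bursts up to the capacity are allowed while the long‑term rate stays bounded.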
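As an illustration of the last algorithm, here is a minimal token-bucket sketch. It assumes a fixed capacity and a fixed refill rate; the class name, parameter names, and injectable clock are choices made for this example rather than any particular library's API.

```javascript
class TokenBucket {
  // capacity: maximum burst size; refillPerSec: sustained rate;
  // now: injectable clock returning milliseconds (defaults to Date.now).
  constructor(capacity, refillPerSec, now = Date.now) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = now();
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryAcquire() {
    const t = this.now();
    const elapsedSec = (t - this.lastRefill) / 1000;
    // Refill lazily based on elapsed time, never exceeding the capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Refilling lazily on each call avoids a background timer; a bucket created with new TokenBucket(100, 50) would tolerate a burst of 100 requests and then admit roughly 50 per second.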
Circuit breaking & degradation – prevents downstream failures from cascading. Circuit breakers stop calls to unhealthy services; degradation provides graceful fall‑backs (manual switch via configuration centre or automatic via Hystrix).
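The breaker-plus-fallback idea can be sketched as a small state machine. This is a simplified, synchronous illustration and not Hystrix's API: the class, its options (failureThreshold, resetAfterMs, now), and the three state names are assumptions for the example, and a real client would wrap asynchronous calls.

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 5000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.state = "CLOSED";
    this.failures = 0;
    this.openedAt = 0;
  }

  // Calls fn while the circuit allows it; otherwise (or on failure) returns fallback().
  call(fn, fallback) {
    if (this.state === "OPEN") {
      if (this.now() - this.openedAt < this.resetAfterMs) {
        return fallback(); // degrade without touching the unhealthy service
      }
      this.state = "HALF_OPEN"; // cool-down elapsed: let one trial call through
    }
    try {
      const result = fn();
      this.failures = 0;
      this.state = "CLOSED";
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
        this.state = "OPEN";
        this.openedAt = this.now();
      }
      return fallback();
    }
  }
}
```

While the circuit is open the downstream service receives no traffic at all, which gives it room to recover; the fallback is where a degraded response (cached data, a default value, a friendly error) would be produced.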
Timeout funnel – set progressively shorter timeouts from upstream to downstream services to avoid upstream thread exhaustion.
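One way to derive such a funnel is to start from the entry-point timeout and subtract, at each hop, the time that hop reserves for its own work. The helper below is a sketch under that assumption; the function name and the per-hop reserve values are hypothetical.

```javascript
// Returns the timeout (ms) for each hop, from the entry point down the call chain.
// reserveMsPerHop[i] is the time hop i keeps for its own processing before calling hop i + 1.
function timeoutFunnel(entryTimeoutMs, reserveMsPerHop) {
  const timeouts = [entryTimeoutMs];
  for (const reserve of reserveMsPerHop) {
    const next = timeouts[timeouts.length - 1] - reserve;
    if (next <= 0) {
      throw new RangeError("timeout budget exhausted before the last hop");
    }
    timeouts.push(next);
  }
  return timeouts;
}
```

With an entry timeout of 1000 ms and reserves of 200 ms and 300 ms, timeoutFunnel(1000, [200, 300]) yields 1000, 800, and 500 ms, so a downstream call always gives up before its caller does.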
Retry strategy – limit retry count and ensure idempotency. Excessive retries cause storm effects, especially in long call chains.
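Both halves of that rule can be shown together: a bounded retry loop with exponential backoff on the caller side, and an idempotency key on the receiver side so that a repeated request has no additional effect. The sketch is synchronous for brevity (a real client would await the call and the delay), and the names retry, applyOnce, and the injectable sleep function are assumptions.

```javascript
// Caller side: at most maxAttempts tries, with exponentially growing delays in between.
function retry(operation, { maxAttempts = 3, baseDelayMs = 100, sleep = () => {} } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        sleep(baseDelayMs * 2 ** (attempt - 1)); // 100 ms, 200 ms, 400 ms, ...
      }
    }
  }
  throw lastError; // give up instead of retrying forever
}

// Receiver side: remember results by idempotency key so a repeated request is applied once.
const processed = new Map();
function applyOnce(idempotencyKey, effect) {
  if (!processed.has(idempotencyKey)) {
    processed.set(idempotencyKey, effect());
  }
  return processed.get(idempotencyKey);
}
```

The idempotency key matters most when the first attempt actually succeeded but its response was lost: the retry then returns the stored result rather than, say, deducting stock a second time.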
Isolation – multiple dimensions:
System‑level (different services for different business levels).
Environment (dev, test, pre‑prod, prod).
Data (tenant‑based, schema‑based, or separate databases).
Core vs. non‑core flow (critical services receive more resources).
Read/write (CQRS, master‑slave DB).
Thread‑pool isolation.
Storage Layer
Replication types
Master‑slave – writes go to master; reads can be served by slaves.
Multi‑master – any master can accept writes; changes are propagated.
Leaderless – writes to any node; reads from multiple nodes with conflict resolution.
Sharding (partitioning) – each record belongs to exactly one partition. Methods include range‑based and hash‑based.
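The two partitioning methods can be contrasted with a short sketch. Both functions are illustrative: the string hash is a simple polynomial hash chosen for the example (production systems typically use a stronger hash), and the function names and the shape of the boundary list are assumptions.

```javascript
// Hash-based: spreads keys evenly, but a range scan has to touch every partition.
function hashPartition(key, partitionCount) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the hash in unsigned 32-bit range
  }
  return h % partitionCount;
}

// Range-based: upperBounds is a sorted list of exclusive upper bounds, one per partition
// except the last; neighbouring values stay together, but hot ranges can overload a partition.
function rangePartition(value, upperBounds) {
  const idx = upperBounds.findIndex((bound) => value < bound);
  return idx === -1 ? upperBounds.length : idx;
}
```

With bounds [10, 20], values below 10 land in partition 0, values from 10 up to 19 in partition 1, and everything else in partition 2; either way each record maps to exactly one partition.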
Redis Cluster – 16,384 slots; keys are hashed to slots, which are assigned to shards with master‑slave pairs. If a master fails, a slave is promoted.
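The key-to-slot mapping is CRC16 of the key modulo 16,384, where a non-empty {hash tag} inside the key, if present, is hashed instead of the whole key. The sketch below reimplements that rule for illustration (CRC16 in its XMODEM variant); the function names are this example's own, and in practice the client library or the CLUSTER KEYSLOT command performs the calculation.

```javascript
// CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), computed bit by bit.
function crc16(bytes) {
  let crc = 0;
  for (const byte of bytes) {
    crc ^= byte << 8;
    for (let i = 0; i < 8; i++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// If the key contains a non-empty {...} section, only that section is hashed,
// which lets related keys be placed in the same slot.
function hashTagOrKey(key) {
  const open = key.indexOf("{");
  if (open !== -1) {
    const close = key.indexOf("}", open + 1);
    if (close > open + 1) return key.slice(open + 1, close);
  }
  return key;
}

function keySlot(key) {
  const bytes = new TextEncoder().encode(hashTagOrKey(key));
  return crc16(bytes) % 16384;
}
```

Because of the hash-tag rule, keys such as {user1000}.following and {user1000}.followers resolve to the same slot, which is what makes multi-key operations on them possible in a cluster.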
Elasticsearch index – the number of primary shards is fixed when the index is created; the replica count can be changed later. Writes are applied to the primary shard and then copied to its replicas; replicas provide redundancy and can also serve searches.
Kafka topic – each topic is partitioned; each partition has a leader (read/write) and followers (replicas) for fault tolerance.
Deployment Layer
Deployment evolves from single‑machine to multi‑region, using redundancy and load balancing for disaster recovery.
Current production uses Docker containers across multiple data‑centers. Service groups are separated by business importance and traffic volume. Databases and Redis are deployed in dual‑data‑center setups; Elasticsearch runs in a single data‑center.
Conclusion
Building a “three‑high” backend system requires a balanced focus on performance, concurrency, and availability. Key techniques include:
Cache‑first read paths and asynchronous write‑back for write‑heavy workloads.
Message‑queue‑driven order processing for flash‑sale spikes.
Horizontal, vertical, and depth scaling combined with DDD‑driven micro‑service boundaries.
Rate‑limiting, circuit breaking, timeout funnels, bounded retries, and multi‑layer isolation for resilience.
Replication and sharding (MySQL, Redis, Elasticsearch, Kafka) to avoid single points of failure.
Multi‑region containerised deployment with redundant data‑centers.
Applying these patterns enables backend platforms to serve both B‑end and C‑end workloads at massive scale while maintaining low latency (TP99/TP999) and high availability.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
dbaplus Community