How to Build High‑Concurrency, High‑Performance, High‑Availability Apps on the Cloud
This article explains the concepts of high concurrency, high performance, and high availability, and shows how to design the IaaS, PaaS, and SaaS layers of a cloud‑native system so that internet services stay resilient during massive traffic spikes such as the Double 11 shopping festival.
High Concurrency
High concurrency is the ability of a system to process many requests at the same time, and it is a key consideration in modern distributed system design. It is commonly measured by response time, throughput, QPS (queries per second), TPS (transactions per second), and concurrent user count.
Response Time: the time a system takes to respond to a request.
Throughput: the number of requests processed per second.
QPS: queries per second, similar to throughput.
TPS: transactions per second.
Concurrent Users: number of users simultaneously using the system.
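As a rough illustration of how these metrics relate, the sketch below derives average response time and throughput from a list of request timestamps, then estimates the mean number of in‑flight requests with Little's law (throughput × average response time). The `Request` record and the `summarize` helper are hypothetical names introduced here, and in‑flight requests are only an approximation of concurrent users.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    start: float  # seconds from the start of the measurement window
    end: float    # seconds at which the response was completed


def summarize(requests: List[Request], window_seconds: float) -> Dict[str, float]:
    """Derive basic load metrics from completed requests in one window."""
    latencies = [r.end - r.start for r in requests]
    avg_response_time = sum(latencies) / len(latencies)
    throughput = len(requests) / window_seconds  # requests per second
    # Little's law: mean in-flight requests = throughput * avg response time
    mean_in_flight = throughput * avg_response_time
    return {
        "avg_response_time": avg_response_time,
        "throughput": throughput,
        "mean_in_flight": mean_in_flight,
    }


sample = [
    Request(0.0, 0.5),
    Request(0.25, 0.75),
    Request(1.0, 1.5),
    Request(1.5, 2.0),
]
metrics = summarize(sample, window_seconds=2.0)
print(metrics)
```

The same relationship explains why shortening response time raises the load a fixed pool of workers can sustain.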
High Performance
High performance means fast processing speed with low memory and CPU usage. Performance and concurrency are tightly coupled; improving one often improves the other. Optimizations differ for CPU‑bound and I/O‑bound workloads, and scaling resources should be balanced against utilization.
Avoid leaving the CPU idle while it waits on blocking I/O.
Minimize lock contention in multithreaded code.
Reduce overhead from creating and destroying excessive processes or threads.
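A minimal sketch of two of these points, assuming an I/O‑bound workload: a bounded thread pool overlaps the waiting time of several blocking calls and reuses its threads instead of creating one per request. The `fetch` function is a hypothetical stand‑in for a real network or disk call, and the delay is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def fetch(item_id: int, delay: float = 0.05) -> int:
    """Hypothetical blocking I/O call; sleep() stands in for the wait."""
    time.sleep(delay)
    return item_id * 2


def fetch_all(ids: Iterable[int], max_workers: int = 8) -> List[int]:
    # The pool caps the number of threads and reuses them across tasks.
    # A long-lived service would typically create the pool once at startup.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order; while one thread is
        # blocked on I/O, the others keep making progress.
        return list(pool.map(fetch, ids))


started = time.perf_counter()
results = fetch_all([1, 2, 3, 4, 5])
elapsed = time.perf_counter() - started
print(results, f"{elapsed:.3f}s")
```

Run sequentially, the five calls would take about 0.25 seconds in total; with the pool they overlap. For CPU‑bound work the trade‑offs are different, which is why the two workload types are tuned separately.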
High Availability
High availability describes a system designed to minimize downtime, aiming for service availability close to 100 %. Many companies target “four nines” (99.99 %), which translates to about 52.6 minutes of annual downtime.
Achieving high availability typically involves eliminating single points of failure, providing data redundancy, and designing fault‑tolerant architectures.
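The four‑nines figure quoted above can be checked with a few lines of arithmetic. This sketch assumes an average year of 365.25 days; the function name is chosen here for illustration.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes in an average year


def annual_downtime_minutes(availability_percent: float) -> float:
    """Downtime budget per year implied by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = annual_downtime_minutes(target)
    print(f"{target}% -> {minutes:.1f} minutes per year")
```

Each additional nine shrinks the yearly downtime budget by a factor of ten, which is why the step from three to four nines usually requires redundancy rather than faster manual recovery.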
Building High‑Availability Internet Applications on a Cloud Platform
High availability can be addressed at three layers:
Resource HA (IaaS): Ensure compute, storage, and network resources are redundant across zones and regions.
Application HA (PaaS): Deploy application clusters (master‑slave or peer) that rely on a coordination service such as ZooKeeper or on a consensus algorithm such as Raft, and manage the cluster lifecycle (upgrade, scaling, backup).
Service HA (SaaS): Provide multi‑region SaaS services with strategies such as two‑site three‑center, disaster‑recovery, and active‑active architectures.
IaaS Service HA
Compute resources achieve HA by deploying identical instances in different zones within a region. Storage HA is provided by cross‑zone multi‑replica designs (e.g., NeonSan with RDMA). Network HA includes elastic IPs, VPCs, VxNets, load balancers, and NAT gateways distributed across zones.
PaaS HA
Application clusters can be master‑slave or peer. Master‑slave clusters often use ZooKeeper for leader election, with load balancers binding a write VIP to the master and a read VIP to the slaves. Peer clusters rely on load balancers for traffic distribution.
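The read/write split described above can be sketched as a small router. This is an illustration only: the `ReadWriteRouter` class and the node names are hypothetical, the election itself (for example through ZooKeeper) is not shown, and in a real deployment the VIP binding is done by the load balancer rather than in application code.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Send writes to the master and spread reads over the slaves."""

    def __init__(self, master: str, slaves: List[str]) -> None:
        self.master = master
        self.slaves = list(slaves)
        self._reads = itertools.cycle(self.slaves)

    def route(self, is_write: bool) -> str:
        # With no slaves left, the master serves reads as well.
        if is_write or not self.slaves:
            return self.master
        return next(self._reads)  # round-robin over the slaves

    def promote(self, new_master: str) -> None:
        """Apply the result of an election after the old master failed."""
        self.slaves.remove(new_master)
        self.master = new_master  # the failed master is dropped, not demoted
        self._reads = itertools.cycle(self.slaves)


router = ReadWriteRouter("db-1", ["db-2", "db-3"])
print(router.route(is_write=True))   # write goes to the master
print(router.route(is_write=False))  # read goes to a slave
```

After a failover, calling `promote` with the elected node moves the write target while reads continue on the remaining slaves.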
Cluster lifecycle management (upgrade, scaling, backup, recovery) is essential. QingCloud offers the AppCenter framework for easy creation, scaling, and elastic auto‑scaling of clusters, and the QingCloud Kubernetes Engine (QKE) integrates Kubernetes with cloud resources for a flexible container service.
SaaS HA
SaaS high availability focuses on multi‑region deployment. At the traffic layer, GSLB, dual EIPs, and dual load balancers provide redundancy. At the application layer, PaaS clusters are deployed across zones. At the data layer, databases are also multi‑zone.
Common architectures include:
Two‑site three‑center: a master service plus multiple slaves in the same city and at least one slave in a remote city.
Disaster‑recovery: independent full‑stack clusters with asynchronous data replication, operating in hot‑standby mode.
Active‑active: both sites serve live traffic, with data partitioned by region and synchronized asynchronously.
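To make the active‑active idea concrete, the sketch below models two sites that each accept writes for their own region and ship changes to the peer asynchronously. The `Site` class, the region names, and the in‑memory outbox are all assumptions made for illustration; a real system would use database replication or a message queue and would need conflict handling.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class Site:
    """One active site: serves its home region and queues changes for peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}
        self.outbox: Deque[Tuple[str, str]] = deque()

    def write(self, key: str, value: str) -> None:
        self.data[key] = value             # commit locally first
        self.outbox.append((key, value))   # replicate later, off the request path

    def replicate_to(self, peer: "Site") -> int:
        """Drain pending changes to the peer; returns how many were shipped."""
        shipped = 0
        while self.outbox:
            key, value = self.outbox.popleft()
            peer.data[key] = value
            shipped += 1
        return shipped


def home_site(user_region: str, sites: Dict[str, Site]) -> Site:
    # Data is partitioned by region, so each user has exactly one write site.
    return sites[user_region]


sites = {"north": Site("north"), "south": Site("south")}
home_site("north", sites).write("user:1", "cart=3")
print("south sees it before sync:", "user:1" in sites["south"].data)
sites["north"].replicate_to(sites["south"])
print("south sees it after sync:", "user:1" in sites["south"].data)
```

The window between the local commit and the replication run is where the two sites disagree, which is the consistency cost of asynchronous synchronization.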
Key considerations for SaaS HA:
Cache must be 100 % rebuildable.
Business data sharding.
Failover time for master‑slave services.
Bandwidth for data synchronization.
An API gateway can protect SaaS services by providing rate limiting, circuit breaking, and graceful degradation.
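The three protections can be sketched in a few dozen lines. The example below assumes a token bucket for rate limiting and a consecutive‑failure counter for circuit breaking, with a caller‑supplied fallback as the degraded response; these are common choices rather than the design of any particular gateway, and all class names and parameters are illustrative.

```python
import time
from typing import Any, Callable, Optional


class TokenBucket:
    """Rate limiter: bursts up to `capacity`, refilled at `rate` tokens/second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill according to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class CircuitBreaker:
    """Opens after `threshold` consecutive failures, retries after `cooldown`."""

    def __init__(self, threshold: int, cooldown: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, backend: Callable[[], Any], fallback: Callable[[], Any]) -> Any:
        if (self.opened_at is not None
                and self.clock() - self.opened_at < self.cooldown):
            return fallback()  # open: degrade without touching the backend
        try:
            result = backend()  # closed, or half-open trial after cooldown
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        self.failures = 0
        self.opened_at = None
        return result


bucket = TokenBucket(rate=100.0, capacity=10.0)
breaker = CircuitBreaker(threshold=3, cooldown=5.0)
if bucket.allow():
    print(breaker.call(lambda: "fresh result", lambda: "cached result"))
```

Passing the clock as a parameter keeps both classes testable without real waiting.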
Q&A
How do you migrate services between cloud platforms?
Start with the data: set up online data synchronization to the target platform, then rebuild the services there, and finally switch traffic to the new location.
If a local service becomes unavailable, how do you switch to a remote site?
Switching typically involves scripts that handle traffic redirection and data loading (e.g., cache warm‑up) to avoid cache stampedes and database overload.
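A cache warm‑up step in such a script might look like the sketch below, which preloads a list of hot keys in throttled batches before traffic is redirected. The function name, the batch size, and the pause are assumptions for illustration; the cache is modeled as a plain dictionary and `load_from_db` is a hypothetical loader supplied by the caller.

```python
import time
from typing import Any, Callable, Dict, Iterable


def warm_cache(hot_keys: Iterable[str],
               load_from_db: Callable[[str], Any],
               cache: Dict[str, Any],
               batch_size: int = 100,
               pause_seconds: float = 0.0) -> int:
    """Preload missing hot keys in batches; returns how many were loaded."""
    # Drop duplicates (keeping order) and skip keys that are already cached.
    missing = [k for k in dict.fromkeys(hot_keys) if k not in cache]
    loaded = 0
    for i in range(0, len(missing), batch_size):
        for key in missing[i:i + batch_size]:
            cache[key] = load_from_db(key)
            loaded += 1
        if pause_seconds:
            time.sleep(pause_seconds)  # throttle so the database is not flooded
    return loaded


database = {"item:1": "a", "item:2": "b", "item:3": "c"}
cache: Dict[str, Any] = {"item:1": "a"}
count = warm_cache(database.keys(), database.__getitem__, cache, batch_size=2)
print(f"loaded {count} keys before switching traffic")
```

Running this before the traffic switch means the first wave of user requests hits a populated cache instead of falling through to the database all at once.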
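How do you achieve data consistency on top of high availability and high performance?
Availability, performance, and consistency form an “impossible triangle”, similar in spirit to the CAP theorem (consistency, availability, partition tolerance): you cannot maximize all three at once and have to trade them off against each other. In practice, many systems settle for eventual consistency.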
How do you compare QingCloud PaaS services when no server specifications are given?
QingCloud PaaS is evaluated by service plans and performance metrics such as QPS, so customers select capacity by the performance they need (e.g., 200k QPS) rather than by server size.
Qingyun Technology Community
