System Capacity Design and Evaluation: Principles, Methods, and Case Study
This article explains how to estimate and plan system capacity. It covers QPS, concurrency, peak‑traffic calculation, and load testing, and walks through the practical steps with examples, so that services can handle both regular and surge workloads.
Background
An organization holds an annual sports meet with a 2000 m run. Normally 40 men and 20 women register; with ten runners per race, that requires six races. Each race is allocated 30 minutes (20 min running + 10 min preparation/cleanup), so the six races exactly fill the 15:00 to 18:00 slot before the evening award ceremony.
This year the 4000 m race was cancelled and 2000 m registration grew by 50 participants, to 110. Eleven races were now needed but only six fit in the slot, so nearly half of the runners had to be rescheduled to the following weekend, which led to complaints.
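The arithmetic behind the sports-meet example can be sketched as follows. This is only an illustration of the numbers given above; the function and variable names are made up for this sketch.

```python
import math

def races_needed(runners: int, runners_per_race: int) -> int:
    """Races required so that every runner gets a slot (round up)."""
    return math.ceil(runners / runners_per_race)

def races_that_fit(window_minutes: int, minutes_per_race: int) -> int:
    """Whole races that fit in the available time window."""
    return window_minutes // minutes_per_race

WINDOW_MINUTES = 3 * 60    # 15:00 to 18:00
MINUTES_PER_RACE = 30      # 20 min run + 10 min preparation/cleanup
RUNNERS_PER_RACE = 10

capacity_races = races_that_fit(WINDOW_MINUTES, MINUTES_PER_RACE)
normal_races = races_needed(40 + 20, RUNNERS_PER_RACE)
surge_races = races_needed(40 + 20 + 50, RUNNERS_PER_RACE)

# Runners who cannot be scheduled on the day once the window is full.
overflow_runners = max(0, surge_races - capacity_races) * RUNNERS_PER_RACE

print(capacity_races, normal_races, surge_races, overflow_runners)
```

The normal year uses the window exactly (six races for six slots), which is why a surge of 50 runners has nowhere to go.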
Concept
Capacity design is the process of estimating the capacity a system needs, using various strategies; it is a core skill for architects.
Capacity design requires quantitative description of data volume, concurrency, bandwidth, user counts, message size, storage, CPU, memory, etc. The following example focuses on concurrency.
Analysis Process
Understanding Some Principles
TPS (Transactions Per Second) and QPS (Queries Per Second) measure throughput; concurrency is the number of simultaneous requests a system can handle.
Peak‑QPS calculation: 1) Assume 80 % of daily traffic concentrates in 20 % of the time (the peak period). 2) Formula: peak QPS = (total PV × 80 %) / (total seconds in the measured window × 20 %).
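A minimal sketch of the 80/20 formula, assuming the shares are parameters that default to 80 % and 20 %. The function name and the sample figures (1 000 000 PV over a 10‑hour window) are hypothetical and only serve to exercise the formula.

```python
def peak_qps(total_pv: float, total_seconds: float,
             traffic_share: float = 0.8, time_share: float = 0.2) -> float:
    """Peak QPS = (total PV x traffic share) / (total seconds x time share)."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 < time_share <= 1 or not 0 < traffic_share <= 1:
        raise ValueError("shares must be in (0, 1]")
    return (total_pv * traffic_share) / (total_seconds * time_share)

# Hypothetical input: 1,000,000 PV spread over a 10-hour window.
example = peak_qps(1_000_000, 10 * 3600)
print(round(example, 1))
```

With the default shares the result is always four times the plain average over the same window, since 80 % / 20 % = 4.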
Relations: QPS = concurrency / average response time; concurrency = QPS × average response time.
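The two relations are the same equation rearranged, which the sketch below makes explicit. It assumes response time is expressed in seconds; the function names and the sample values (2 000 QPS at 50 ms) are illustrative only.

```python
def concurrency_from_qps(qps: float, avg_response_s: float) -> float:
    """Requests in flight at once = QPS x average response time (seconds)."""
    return qps * avg_response_s

def qps_from_concurrency(concurrency: float, avg_response_s: float) -> float:
    """QPS = concurrency / average response time (seconds)."""
    if avg_response_s <= 0:
        raise ValueError("avg_response_s must be positive")
    return concurrency / avg_response_s

# Illustrative values: 2,000 QPS with a 50 ms average response time.
in_flight = concurrency_from_qps(2000, 0.05)
print(round(in_flight))                               # requests in flight
print(round(qps_from_concurrency(in_flight, 0.05)))   # back to the QPS
```

One practical consequence: at a fixed concurrency limit, halving the average response time doubles the QPS the system can sustain.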
When to Evaluate System Capacity
Three typical scenarios: 1) Temporary traffic spikes (e.g., 618, Double‑11, promotions). 2) Initial capacity assessment for a newly launched system. 3) Growth of existing system’s functionality, data, or active users requiring re‑evaluation.
Evaluation Steps (using concurrency as example)
1. Analyze Daily Total Visits
Gather realistic daily PV/UV from product or operations, or estimate for a new system.
2. Estimate Average QPS
Assume the active period is about 11 hours, i.e. 39 600 s ≈ 4 × 10⁴ seconds. Average QPS = daily visits / (4 × 10⁴).
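As a rough sketch of this step, assuming the 11‑hour active window from the text. The daily visit count used below (100 million) is a made-up figure for illustration, not a number from the article.

```python
ACTIVE_HOURS = 11
ACTIVE_SECONDS_EXACT = ACTIVE_HOURS * 3600   # 39,600 s
ACTIVE_SECONDS_ROUNDED = 4e4                 # the rounded value used in the text

def average_qps(daily_visits: float,
                active_seconds: float = ACTIVE_SECONDS_ROUNDED) -> float:
    """Average QPS over the active part of the day."""
    return daily_visits / active_seconds

# Hypothetical daily volume: 100 million visits.
daily_visits = 100_000_000
print(average_qps(daily_visits))                                 # rounded window
print(round(average_qps(daily_visits, ACTIVE_SECONDS_EXACT), 1)) # exact window
```

The rounding from 39 600 s to 4 × 10⁴ s changes the result by about 1 %, which is well inside the uncertainty of any traffic estimate.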
3. Estimate Peak‑Interval QPS
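Read the peak off the traffic curve if one is available, or fall back on the 80/20 rule. In the example, peak QPS ≈ 2.58 × average QPS. (For comparison, the 80/20 rule applied to the same window implies a ratio of 80 % / 20 % = 4.)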
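When an hourly traffic curve is available, the peak-to-average ratio can be read from it directly. The sketch below assumes hourly visit counts over the active window; the sample curve is invented for illustration and does not reproduce the 2.58 figure from the example.

```python
from typing import Sequence

def peak_to_average_ratio(hourly_visits: Sequence[float]) -> float:
    """Busiest hour divided by the mean hour over the same window."""
    if not hourly_visits:
        raise ValueError("need at least one hourly sample")
    mean = sum(hourly_visits) / len(hourly_visits)
    if mean <= 0:
        raise ValueError("traffic curve has no visits")
    return max(hourly_visits) / mean

def peak_hour_qps(hourly_visits: Sequence[float]) -> float:
    """Average QPS during the busiest hour."""
    return max(hourly_visits) / 3600

# Invented curve: visits per hour across an 11-hour active window.
curve = [50_000, 80_000, 120_000, 150_000, 100_000, 90_000,
         110_000, 160_000, 200_000, 140_000, 100_000]

print(round(peak_to_average_ratio(curve), 2))
print(round(peak_hour_qps(curve), 1))
```

Hourly buckets smooth out short bursts, so a curve with finer granularity (per minute, say) will generally show a higher peak than this estimate.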
4. Determine Single‑Instance Maximum QPS
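Load‑test a single instance (for example with nGrinder or JMeter), increasing the load until response time exceeds 2 s to locate the breaking point. Then take as the usable limit the throughput at which response time stays ≤ 1 s. Example: web‑tier instance limit ≈ 2 000 QPS.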
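Once a stepped load test has produced (QPS, response time) pairs, picking the usable limit is a simple scan. The sketch below does not call nGrinder or JMeter; it assumes the results have already been exported, and the sample measurements are hypothetical values chosen to land on the 2 000 QPS figure from the example.

```python
from typing import NamedTuple, Optional, Sequence

class LoadStep(NamedTuple):
    qps: float             # throughput achieved at this step
    avg_response_s: float  # average response time observed, in seconds

def max_sustainable_qps(steps: Sequence[LoadStep],
                        target_response_s: float = 1.0) -> Optional[float]:
    """Highest QPS reached before response time first exceeds the target."""
    best = None
    for step in sorted(steps, key=lambda s: s.qps):
        if step.avg_response_s > target_response_s:
            break  # everything past the first violation is treated as overload
        best = step.qps
    return best

# Hypothetical stepped load-test results for one web instance.
results = [
    LoadStep(500, 0.08),
    LoadStep(1000, 0.15),
    LoadStep(1500, 0.40),
    LoadStep(2000, 0.90),
    LoadStep(2500, 2.30),  # past the 2 s breaking point
]

print(max_sustainable_qps(results))
```

Stopping at the first violation, rather than taking the maximum over all passing steps, avoids trusting a step that happened to pass after the instance was already saturated.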
5. Confirm with Redundancy
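Given peak QPS ≈ 7 500 and per‑instance capacity ≈ 2 000, at least four web instances are needed (7 500 / 2 000 = 3.75, rounded up). Four instances would run at about 94 % of their combined capacity at peak, so add instances beyond this minimum to provide redundancy.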
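The instance count is a ceiling division, optionally with headroom on top of the peak. In the sketch below the `headroom` parameter and the 30 % value are assumptions made for illustration; the text only gives the minimum of four instances.

```python
import math

def instances_needed(peak_qps: float, per_instance_qps: float,
                     headroom: float = 0.0) -> int:
    """Instances required to serve peak_qps, with optional spare capacity.

    headroom=0.3 means planning for 30 % more than the estimated peak.
    """
    if per_instance_qps <= 0:
        raise ValueError("per_instance_qps must be positive")
    return math.ceil(peak_qps * (1 + headroom) / per_instance_qps)

def utilization(peak_qps: float, per_instance_qps: float, instances: int) -> float:
    """Fraction of total capacity consumed at peak."""
    return peak_qps / (per_instance_qps * instances)

minimum = instances_needed(7500, 2000)               # bare minimum from the text
with_margin = instances_needed(7500, 2000, 0.3)      # assumed 30 % headroom

print(minimum, round(utilization(7500, 2000, minimum), 4))
print(with_margin, round(utilization(7500, 2000, with_margin), 4))
```

At the bare minimum, losing a single instance during the peak leaves 6 000 QPS of capacity against 7 500 QPS of demand, which is the practical argument for the extra instance.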
Case Study: Book‑Reservation System
Applying the 80/20 rule to a system with 1 500 000 PV over a 9‑hour window gives peak QPS = (1 500 000 × 80 %) / (9 × 3 600 s × 20 %) ≈ 185 and concurrency ≈ 100. After adjusting for pessimistic and optimistic estimates, the recommended test load is 200+ concurrent users with a response time of 50 to 100 ms.
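The case-study figures can be checked with a few lines. The peak QPS follows directly from the 80/20 formula. The average response time behind "concurrency ≈ 100" is not stated in the text, so the value computed below is only what that figure would imply under concurrency = QPS × average response time; treat it as an inference, not as the author's input.

```python
PV = 1_500_000
WINDOW_SECONDS = 9 * 3600          # 9-hour window = 32,400 s

# 80/20 rule: 80 % of the traffic arrives in 20 % of the window.
peak_qps = (PV * 0.8) / (WINDOW_SECONDS * 0.2)

STATED_CONCURRENCY = 100           # figure quoted in the case study

# Inferred, not stated in the source: response time implied by the two figures.
implied_response_s = STATED_CONCURRENCY / peak_qps

print(round(peak_qps))             # peak QPS
print(round(implied_response_s, 2))  # implied average response time, seconds
```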
Summary
System capacity should be evaluated ahead of temporary traffic surges, at initial launch, and whenever the baseline (functionality, data, or active users) grows. The steps are: analyze total daily visits, estimate average and peak QPS, load‑test to find the per‑instance limit, and size the deployment with redundancy.
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.