System Capacity Design and Evaluation: Principles, Methods, and Case Study

This article explains how to estimate and design system capacity—including traffic, concurrency, QPS, and resource requirements—by illustrating concepts, analytical formulas, peak‑load assessment techniques, performance testing, and a practical case study of a book‑reservation system.

IT Architects Alliance

Jun 6, 2021

Background

The organization holds an annual sports meet with a 2000 m race. Normally 60 participants (40 men, 20 women) run on a single rubber track, 10 per heat, requiring at least six heats. Each heat is allocated 30 minutes (20 minutes race time plus preparation and cleanup), totaling 3 hours.

When the 4000 m race was cancelled this year, 50 additional participants signed up for the 2000 m race, exceeding the planned capacity and forcing half of the participants to be rescheduled to the following weekend, causing complaints.

This real‑world story illustrates the importance of capacity design: when business demand changes, failure to anticipate the impact leads to operational issues.
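The arithmetic behind the story can be sketched in a few lines. The function name and defaults below are illustrative assumptions; only the heat size, the 30‑minute slot, and the participant counts come from the story above.

```python
import math

def total_hours(participants: int, per_heat: int = 10, minutes_per_heat: int = 30) -> float:
    """Hours of track time needed so that every participant runs once."""
    heats = math.ceil(participants / per_heat)
    return heats * minutes_per_heat / 60

planned = total_hours(60)       # 6 heats on the single track
actual = total_hours(60 + 50)   # 11 heats after the extra sign-ups
print(f"planned: {planned} h, actual: {actual} h")
```

With 110 runners the same track needs 5.5 hours instead of the planned 3, which is the gap the organizers failed to anticipate.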

Concept

Capacity design is the process of estimating the capacity a system needs, using various estimation strategies; it is a core skill for architects.

Capacity requirements include data volume, concurrency, bandwidth, user counts (registered, active, online), message length, image size, storage, CPU, memory, etc.

The article uses concurrency as an example to demonstrate the analysis process.

Analysis Process

Understanding Some Principles

TPS (Transactions Per Second) – number of transactions processed each second.

QPS (Queries Per Second) – a common throughput metric indicating how many requests a server handles per second.

Concurrency – the number of simultaneous requests a system can handle, reflecting load capacity.

Peak‑QPS calculation (80/20 rule): 1) Assume 80% of daily visits occur in 20% of the time (the peak period). 2) Formula: Peak QPS = (Total PV × 80%) / (Seconds per day × 20%).
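The 80/20 formula above translates directly into a small helper. This is a minimal sketch: the function name, the parameter names, and the 10 million PV input are illustrative assumptions, and the default of 86 400 seconds treats the whole calendar day as the traffic window.

```python
def peak_qps(total_pv: float,
             window_seconds: float = 86_400,
             traffic_share: float = 0.8,
             time_share: float = 0.2) -> float:
    """Peak QPS when `traffic_share` of the visits land in `time_share` of the window."""
    return (total_pv * traffic_share) / (window_seconds * time_share)

# Hypothetical input: 10 million page views spread over a full day.
example = peak_qps(10_000_000)
print(f"peak QPS ≈ {example:.0f}")
```

Passing a shorter `window_seconds` (for instance, only the hours in which users are active) raises the result, so the choice of window matters as much as the 80/20 split itself.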

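PV (Page View) counts every page load; UV (Unique Visitor) counts distinct visitors within a period, so one visitor can generate many PVs. Throughput is the amount of work completed per unit of time (QPS and TPS are both throughput measures), and response time is the time between sending a request and receiving its response.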

Relationships: QPS = Concurrency / Average Response Time; Concurrency = QPS × Average Response Time.
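These two relationships are the same identity read in both directions (it is commonly known as Little's Law). A minimal sketch, with hypothetical function names and example values chosen only for illustration:

```python
def concurrency_from_qps(qps: float, avg_response_s: float) -> float:
    """Requests in flight at once = arrival rate × time each request stays in the system."""
    return qps * avg_response_s

def qps_from_concurrency(concurrency: float, avg_response_s: float) -> float:
    """Throughput sustained by a fixed number of in-flight requests."""
    if avg_response_s <= 0:
        raise ValueError("average response time must be positive")
    return concurrency / avg_response_s

# Hypothetical service: 400 requests per second, each taking 0.25 s on average.
in_flight = concurrency_from_qps(400, 0.25)
throughput = qps_from_concurrency(in_flight, 0.25)
print(in_flight, throughput)
```

The response time must be expressed in seconds for the units to line up with QPS.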

When to Evaluate System Capacity

Three main scenarios require timely capacity assessment:

Temporary traffic spikes (e.g., the 618 and Double‑11 shopping promotions).

Initial system capacity estimation for a newly launched service.

Growth in baseline capacity due to added features, increased data flow, or higher daily active users.

Capacity evaluation covers data volume, concurrency, bandwidth, CPU, memory, disk, etc. The article proceeds with concurrency as a concrete example.

Evaluation Steps

1. Analyze Daily Total Visits

Estimate daily PV/UV based on historical data or product/operations forecasts. Example: an activity pushes 20 million messages in one hour, with a 10% click‑through, resulting in 2 million additional visits.

2. Estimate Average QPS

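Divide total visits by the length of the window in which they arrive. For a full day, assume an 11‑hour active window (≈ 40 000 seconds): average QPS = Total Visits / 40 000. For a short burst, divide by the burst length instead. Example: the 2 million push visits from step 1 arrive within one hour, so average QPS = 2 000 000 / 3 600 ≈ 555.6 QPS. The Baidu homepage example works out to 1 250 QPS.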
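Steps 1 and 2 can be checked with a short calculation. The sketch below uses the push‑campaign figures from step 1; the variable and function names are assumptions made for illustration. It also shows how strongly the chosen window affects the average.

```python
pushed_messages = 20_000_000
click_through_pct = 10

# Step 1: extra visits generated by the push campaign.
extra_visits = pushed_messages * click_through_pct // 100

def average_qps(total_visits: float, window_seconds: float) -> float:
    """Average request rate over the window in which the visits arrive."""
    return total_visits / window_seconds

# Step 2: the same visits averaged over two different windows.
one_hour = average_qps(extra_visits, 3_600)     # burst arrives within the push hour
active_day = average_qps(extra_visits, 40_000)  # spread over an 11-hour active day
print(f"{extra_visits} visits: {one_hour:.1f} QPS over 1 h, {active_day:.1f} QPS over the day")
```

Averaging a one‑hour burst over the whole day understates the load by roughly a factor of eleven, so the window should match when the traffic actually arrives.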

3. Estimate Peak‑Period QPS

Use traffic‑curve analysis or the 80/20 rule. In a sample cloud system, average QPS is 2 900; peak QPS is about 2.58 × average ≈ 7 482 QPS.
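When a traffic curve is available, the peak factor can be read from it rather than assumed. The sketch below derives the factor from hourly request counts; the hourly data is entirely hypothetical and is not taken from the sample cloud system. The last lines reproduce the article's 2 900 × 2.58 figure.

```python
# Hypothetical requests per hour for one day (24 values, midnight to midnight).
hourly_requests = (
    [1_800_000] * 7 + [7_200_000] * 3 + [14_400_000] * 2 + [10_800_000] * 2
    + [12_600_000] * 4 + [9_000_000] * 2 + [18_000_000] * 2 + [5_400_000] * 2
)

avg_qps = sum(hourly_requests) / (len(hourly_requests) * 3_600)
busiest_hour_qps = max(hourly_requests) / 3_600
peak_factor = busiest_hour_qps / avg_qps
print(f"average {avg_qps:.1f} QPS, busiest hour {busiest_hour_qps:.1f} QPS, factor {peak_factor:.2f}")

# Applying the factor quoted in the text to the quoted average.
article_peak = round(2_900 * 2.58)
print(article_peak)
```

Hourly buckets smooth out shorter bursts, so a curve with finer granularity (per minute or per second) generally yields a higher and more realistic peak factor.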

4. Determine Single‑Instance QPS Limit

Load testing (e.g., nGrinder, JMeter) shows a web‑tier instance caps at 2 500 QPS before response time exceeds 2 seconds; a safer target is 2 000 QPS.
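Once a load test has produced response times at several load levels, picking the limit is mechanical. The sketch below is tool‑agnostic: the measurements are hypothetical readings, not real nGrinder or JMeter output, and the 0.8 safety factor is an assumption chosen because it reproduces the 2 500 to 2 000 step described above.

```python
# (offered QPS, observed response time in seconds): hypothetical load-test readings.
measurements = [
    (500, 0.12), (1_000, 0.20), (1_500, 0.45),
    (2_000, 0.90), (2_500, 1.90), (3_000, 3.40),
]

def instance_limit(results, max_response_s: float = 2.0) -> int:
    """Highest tested QPS reached before the response time first exceeds the threshold."""
    limit = None
    for qps, response_s in sorted(results):
        if response_s > max_response_s:
            break
        limit = qps
    if limit is None:
        raise ValueError("no load level met the response-time threshold")
    return limit

limit = instance_limit(measurements)
safe_target = round(limit * 0.8)  # assumed 20% headroom below the measured limit
print(limit, safe_target)
```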

5. Confirm Final Capacity with Redundancy

Given a peak of 7 500 QPS and a safe per‑instance capacity of 2 000 QPS, at least four web instances are needed to handle the load.
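The instance count is a ceiling division of peak load by safe per‑instance capacity. In the sketch below, the `spare` parameter is an assumption added to illustrate redundancy (for example, one extra instance so that a single failure does not push the rest past their safe target); the article itself only states the minimum of four.

```python
import math

def instances_needed(peak_qps: float, safe_qps_per_instance: float, spare: int = 0) -> int:
    """Minimum instances to carry the peak at the safe per-instance rate, plus spares."""
    return math.ceil(peak_qps / safe_qps_per_instance) + spare

minimum = instances_needed(7_500, 2_000)
with_spare = instances_needed(7_500, 2_000, spare=1)
print(minimum, with_spare)
```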

Case Study: Book Reservation System

Using the 80/20 rule over a 9‑hour daily service window (32 400 seconds), with total PV = 1 500 000, the peak QPS is (1 500 000 × 0.8) / (32 400 × 0.2) ≈ 185 QPS. Assuming an average response time of 0.5 s, concurrency = 185 × 0.5 ≈ 92.5, rounded up to 100, then raised to 200 for safety.
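The case‑study numbers can be reproduced end to end. In this sketch the rounding to the next hundred and the 2× safety multiplier are assumptions chosen to match the 100 and 200 figures in the text; the article does not state how it arrived at them.

```python
import math

total_pv = 1_500_000
window_seconds = 9 * 3_600        # 32 400 s
avg_response_s = 0.5

peak_qps = (total_pv * 0.8) / (window_seconds * 0.2)
concurrency = peak_qps * avg_response_s

rounded = math.ceil(concurrency / 100) * 100   # assumed: round up to the next hundred
safety_multiplier = 2                          # assumed: double for headroom
target = rounded * safety_multiplier

print(f"peak ≈ {peak_qps:.1f} QPS, concurrency ≈ {concurrency:.1f}, target = {target}")
```

The unrounded concurrency comes out slightly above the 92.5 quoted in the text, because the text rounds the peak to 185 QPS before multiplying.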

A table shows pessimistic, normal, and optimistic concurrency estimates (30 % → 80, 40 % → 100, 30 % → 300). Performance testing suggests supporting 200+ concurrent users with response times of 50‑100 ms.
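One way to read the three scenarios is as a probability‑weighted estimate. This interpretation is an assumption: the sketch treats the percentages as likelihoods and the second number in each pair as a concurrency figure, which the article does not state explicitly.

```python
# (scenario, assumed probability in percent, estimated concurrency)
scenarios = [
    ("pessimistic", 30, 80),
    ("normal", 40, 100),
    ("optimistic", 30, 300),
]

total_weight = sum(weight for _, weight, _ in scenarios)
expected = sum(weight * value for _, weight, value in scenarios) / total_weight
print(f"weighted concurrency estimate: {expected}")
```

Under that reading the weighted estimate sits below the 200 concurrent users targeted in performance testing, leaving some headroom.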

Summary

Capacity evaluation should be performed when facing temporary traffic surges, during initial system rollout, or when baseline capacity grows.

The steps are: 1) Analyze total daily visits; 2) Estimate average QPS; 3) Assess peak‑period QPS via traffic curves or the 80/20 rule; 4) Conduct performance stress testing to find instance limits; 5) Adjust based on redundancy and actual measurements.

Applying these methods to the opening sports‑meet example would have allowed the organizers to redesign capacity in advance and avoid the scheduling issues.
