How to Design System Capacity: From Real-World Event Planning to QPS Estimation

This article explains how to assess and design system capacity by translating a real‑world sports event scenario into concrete metrics such as daily visits, average and peak QPS, concurrency, and instance limits, while outlining practical steps, formulas, and a book‑reservation case study.

ITFLY8 Architecture Home

Background

Every year our organization holds a sports meet that includes a 2000 m race. Typically 40 men and 20 women register. Only ten runners can compete at once, so at least six heats are needed (60 ÷ 10). Each heat takes 30 minutes (a 20-minute race plus preparation and cleanup), so the race normally fills a three-hour slot.

When the 4000 m race was cancelled this year, registrations for the 2000 m race rose by 50. That exceeded the capacity originally planned, so half of the participants had to race the following weekend, which drew complaints.

This story illustrates the importance of capacity design: when business demand changes, failing to anticipate the impact leads to disruption.

Concept

Capacity design is the technical process of estimating, through structured analysis, how much load a system must be able to handle; it is a core skill for architects.

Capacity design requires concrete data such as data volume, concurrency, bandwidth, registered and active user counts, message size, image size, storage, CPU, and memory.

We will use concurrency as an example to demonstrate the analysis.

Analysis Process

Understanding Key Metrics

TPS (Transactions Per Second) measures transaction throughput.

QPS (Queries Per Second) measures request throughput.

Concurrency is the number of requests a system is handling at the same time.

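Peak QPS calculation: by the 80/20 rule, 80% of daily traffic is assumed to arrive in 20% of the time (the peak window).

Formula: Peak QPS = (Total PV × 80%) / (Seconds per day × 20%)

Here "seconds per day" means the seconds in which traffic actually arrives; the case study below uses a 9-hour window (32 400 seconds).

Definitions: PV = page views, UV = unique visitors, throughput = requests processed per unit time, RT = average response time.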
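The peak QPS formula translates directly into a small helper. This is a minimal sketch rather than the author's implementation: the function name `peak_qps`, its parameter names, and the example input (1 000 000 page views over a full 86 400-second day) are assumptions chosen for illustration.

```python
def peak_qps(total_pv: float, window_seconds: float,
             traffic_share: float = 0.8, time_share: float = 0.2) -> float:
    """Estimate peak QPS with the 80/20 rule.

    traffic_share of the traffic is assumed to arrive in
    time_share of the window (defaults: 80% in 20%).
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if not 0 < traffic_share <= 1 or not 0 < time_share <= 1:
        raise ValueError("shares must be in (0, 1]")
    return (total_pv * traffic_share) / (window_seconds * time_share)


# Hypothetical input: 1,000,000 page views spread over a full day.
estimate = peak_qps(1_000_000, 86_400)
print(f"{estimate:.1f}")  # prints 46.3
```

Keeping the two shares as parameters makes it easy to try a different split if the measured traffic curve does not follow 80/20.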

Relationship: QPS = Concurrency / Average RT, or equivalently Concurrency = QPS × Average RT, with Average RT expressed in seconds.
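The relationship between QPS, concurrency, and response time (a form of Little's law) can be sketched as two one-line conversions. The function names are hypothetical, and the figures in the example (185 QPS, 0.5 s average RT) are the ones used later in the case study.

```python
def concurrency_from_qps(qps: float, avg_rt_seconds: float) -> float:
    """Concurrency = QPS x average response time (in seconds)."""
    return qps * avg_rt_seconds


def qps_from_concurrency(concurrency: float, avg_rt_seconds: float) -> float:
    """QPS = concurrency / average response time (in seconds)."""
    if avg_rt_seconds <= 0:
        raise ValueError("avg_rt_seconds must be positive")
    return concurrency / avg_rt_seconds


in_flight = concurrency_from_qps(185, 0.5)
print(in_flight)                             # prints 92.5
print(qps_from_concurrency(in_flight, 0.5))  # prints 185.0
```

The main pitfall is units: a response time given in milliseconds has to be converted to seconds before it goes into either formula.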

When to Evaluate System Capacity

1. Temporary traffic spikes (e.g., the 618 and Double 11 shopping festivals, or holiday promotions).

2. Initial system capacity assessment before launch.

3. Changes in capacity baseline as features grow, data volume increases, or daily active users rise.

Capacity evaluation includes data volume, concurrency, bandwidth, CPU, memory, and disk.

Evaluation Steps

1. Analyze Daily Total Visits

Gather realistic daily PV/UV numbers from the product or operations team, or estimate them if the system is new.

2. Estimate Average QPS

Assume traffic is concentrated in about 11 active hours per day (≈ 40 000 seconds). Average QPS = Daily visits ÷ 40 000.
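As a quick sketch of this step, the division can be wrapped in a helper with the 40 000-second window as a default. The function name and the daily-visit figure are assumptions: 116 000 000 visits is a hypothetical input chosen only because it yields the 2 900 QPS used in the next step.

```python
ACTIVE_SECONDS_PER_DAY = 40_000  # roughly 11 active hours, as in the text


def average_qps(daily_visits: float,
                active_seconds: float = ACTIVE_SECONDS_PER_DAY) -> float:
    """Average QPS over the active part of the day."""
    if active_seconds <= 0:
        raise ValueError("active_seconds must be positive")
    return daily_visits / active_seconds


# Hypothetical daily total; not a figure from the article.
print(average_qps(116_000_000))  # prints 2900.0
```

Dividing by the active window instead of all 86 400 seconds gives a higher, more conservative average, because it does not dilute the traffic across idle night hours.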

3. Estimate Peak QPS

Use the traffic curve if one is available, otherwise the 80/20 rule. Example: with an average QPS of 2 900 and a peak about 2.58 times the average, peak QPS ≈ 2 900 × 2.58 = 7 482.
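When a traffic curve is available, the peak-to-average ratio can be read from it instead of assumed. The sketch below shows one possible way to do that from hourly request counts; the helper names and the four-hour sample curve are hypothetical, and the article does not say how its 2.58 ratio was obtained.

```python
def peak_to_average_ratio(hourly_requests: list[float]) -> float:
    """Ratio of the busiest hour to the mean hour in the curve."""
    if not hourly_requests:
        raise ValueError("hourly_requests must not be empty")
    mean = sum(hourly_requests) / len(hourly_requests)
    if mean <= 0:
        raise ValueError("traffic curve must contain some traffic")
    return max(hourly_requests) / mean


def peak_qps_from_average(average_qps: float, ratio: float) -> float:
    """Scale the average QPS by a peak-to-average ratio."""
    return average_qps * ratio


# Hypothetical curve: requests counted in four consecutive hours.
print(peak_to_average_ratio([100, 200, 300, 200]))  # prints 1.5

# The article's own example: average 2 900 QPS, ratio 2.58.
print(round(peak_qps_from_average(2900, 2.58)))     # prints 7482
```

An hourly curve smooths out short bursts, so a ratio derived this way is likely a lower bound on the true per-second peak; finer-grained data gives a safer estimate.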

4. Determine Single‑Instance QPS Limit

Conduct load testing (e.g., with nGrinder or JMeter). Our standard: a response time above 2 s indicates a bottleneck, and we aim for 1 s or less, so we take the single-instance limit to be the highest QPS at which response time still meets that target.
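One way to turn load-test results into a single-instance limit is to walk the tested load levels in ascending order and stop at the first one that misses the response-time target. The sketch below assumes the results have already been exported as (QPS, average RT in seconds) pairs; the data is invented for illustration and is not the output format of nGrinder or JMeter.

```python
def safe_qps_limit(results: list[tuple[float, float]],
                   rt_target_seconds: float = 1.0) -> float:
    """Highest tested QPS reached before the RT target is first exceeded.

    results: (offered_qps, avg_rt_seconds) pairs from a load test.
    """
    limit = None
    for qps, rt in sorted(results):  # ascending by offered QPS
        if rt > rt_target_seconds:
            break                    # first breach: stop climbing
        limit = qps
    if limit is None:
        raise ValueError("no tested load level met the response-time target")
    return limit


# Hypothetical load-test results for one web instance.
measurements = [(500, 0.2), (1000, 0.4), (2000, 0.9), (2500, 1.6), (3000, 2.4)]
print(safe_qps_limit(measurements))  # prints 2000
```

Stopping at the first breach, instead of taking the maximum over all passing levels, avoids trusting a lucky measurement taken beyond the point where the instance started to degrade.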

5. Confirm with Redundancy

If peak QPS is 7 500 and a single web instance can safely handle 2 000 QPS, at least four instances are needed (7 500 ÷ 2 000 = 3.75, rounded up). Any instances kept for redundancy come on top of that minimum.
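The instance count is a ceiling division, optionally padded with spare instances. This is a minimal sketch: the function name and the `spare_instances` parameter are assumptions, and the article does not prescribe how many spares to keep.

```python
import math


def required_instances(peak_qps: float, per_instance_qps: float,
                       spare_instances: int = 0) -> int:
    """Minimum instances to carry peak_qps, plus optional spares."""
    if per_instance_qps <= 0:
        raise ValueError("per_instance_qps must be positive")
    if spare_instances < 0:
        raise ValueError("spare_instances must not be negative")
    return math.ceil(peak_qps / per_instance_qps) + spare_instances


print(required_instances(7500, 2000))                     # prints 4
print(required_instances(7500, 2000, spare_instances=1))  # prints 5
```

With exactly four instances, losing one leaves 6 000 QPS of capacity against a 7 500 QPS peak, which is why a spare is worth considering.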

Case Study: Book Reservation System

Using the 80/20 rule with total PV = 1 500 000 and a 9-hour daily window (32 400 seconds), of which 20% is the peak window: peak QPS = (1 500 000 × 80%) / (32 400 × 20%) ≈ 185. Concurrency = QPS × Avg RT = 185 × 0.5 s ≈ 92.5, rounded up to 100 and then raised to 200 for testing.

Performance testing confirmed the system supports > 200 concurrent users with response times of 50–100 ms.
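The case-study arithmetic can be reproduced in a few lines. All inputs below come from the text; the variable names are illustrative, and the final test target of 200 is copied from the article, not derived, since the text does not state the rule behind it.

```python
import math

total_pv = 1_500_000        # daily page views (from the case study)
window_seconds = 9 * 3600   # 9-hour daily window = 32 400 s
avg_rt_seconds = 0.5        # assumed average response time

# 80/20 rule: 80% of the traffic in 20% of the window.
peak = (total_pv * 0.8) / (window_seconds * 0.2)
peak_rounded = round(peak)

# Concurrency = QPS x average RT, then rounded up to the next hundred.
concurrency = peak_rounded * avg_rt_seconds
concurrency_rounded = math.ceil(concurrency / 100) * 100

test_target = 200           # value stated in the article, not computed here

print(peak_rounded)         # prints 185
print(concurrency)          # prints 92.5
print(concurrency_rounded)  # prints 100
```

The measured response times of 50–100 ms are well below the 0.5 s used in the estimate, so the 92.5 figure is a conservative upper estimate of in-flight requests at 185 QPS.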

Summary

System capacity design should be performed ahead of temporary traffic spikes, before the initial launch, and whenever the baseline changes due to growth.

The steps are: analyze daily visits, calculate average QPS, estimate peak QPS (traffic curve or 80/20 rule), perform load testing to find instance limits, and adjust based on redundancy.

Applying these steps to the opening sports event example shows that early capacity re‑evaluation could have prevented the scheduling issues.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

ITFLY8 Architecture Home

ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.
