
Designing System Capacity: From Event Scenarios to Precise QPS Planning

This article explains how to assess and design system capacity by working through real-world scenarios, starting with a company sports event. It covers calculating required concurrency, estimating average and peak QPS with the 80/20 rule, load testing, and determining instance counts so the system stays reliable under traffic spikes.

ITFLY8 Architecture Home

Jun 16, 2021


Background

The organization holds an annual sports meet with a 2000 m run. Typically 40 male and 20 female participants register (60 in total), and only ten runners can compete at once, so at least six races are required. Each race, including preparation and cleanup, takes about 30 minutes, so the event needs three hours and is scheduled from 3 pm to 6 pm, before the evening awards ceremony. This year the 4000 m run was cancelled, which added 50 registrations to the 2000 m run. The resulting 110 runners needed 11 races, or five and a half hours, and no longer fit the three-hour window. Half of the participants were postponed to the following weekend, which illustrates the importance of accurate capacity design.
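The race arithmetic above can be checked with a minimal sketch. The function and constant names are illustrative and not part of the original article; the numbers come from the scenario.

```python
import math

RUNNERS_PER_RACE = 10        # only ten runners can compete at once
MINUTES_PER_RACE = 30        # includes preparation and cleanup
WINDOW_MINUTES = 3 * 60      # 3 pm to 6 pm


def races_needed(participants: int) -> int:
    """Number of races required, rounding up for a partly filled race."""
    return math.ceil(participants / RUNNERS_PER_RACE)


def minutes_needed(participants: int) -> int:
    """Total track time required for all races."""
    return races_needed(participants) * MINUTES_PER_RACE


usual = 40 + 20              # typical registrations
this_year = usual + 50       # after the 4000 m run was cancelled

print(races_needed(usual), minutes_needed(usual))          # 6 180
print(races_needed(this_year), minutes_needed(this_year))  # 11 330
print(minutes_needed(this_year) <= WINDOW_MINUTES)         # False
```

The same pattern (demand divided by per-slot capacity, rounded up, compared against the available window) reappears later when sizing web instances.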

Concept

Capacity design is the process of estimating, in a systematic way, how much load a system must handle; it is a core skill for architects. It involves quantifying data volume, concurrency, bandwidth, registered and active user counts, message length, image size, storage, CPU, memory, and similar metrics. The example in this article focuses on concurrency.

Analysis Process

Understanding Principles

TPS (Transactions Per Second) measures how many transactions a system processes each second.

QPS (Queries Per Second) measures request throughput; it indicates how many requests a server handles per second.

Concurrency is the number of simultaneous requests a system can handle, reflecting load capacity.

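Peak QPS can be estimated with the 80/20 rule, which assumes that 80% of traffic arrives in 20% of the time. The formula is

Peak QPS = (Total PV × 80%) / (Seconds in the measured period × 20%)

The measured period is the window over which PV is counted: a full day, or only the active hours if traffic is confined to them.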

Relationships: QPS = Concurrency / Average Response Time and Concurrency = QPS × Average Response Time.
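As a minimal sketch, the 80/20 formula and the two relationships can be written as small functions. The function names and the input values (daily PV, response time) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def peak_qps_8020(total_pv: float, period_seconds: float) -> float:
    """80/20 rule: 80% of the traffic arrives in 20% of the period."""
    return (total_pv * 0.8) / (period_seconds * 0.2)


def concurrency(qps: float, avg_response_s: float) -> float:
    """Requests in flight at once = QPS x average response time (seconds)."""
    return qps * avg_response_s


def qps_from_concurrency(concurrent: float, avg_response_s: float) -> float:
    """Inverse relationship: QPS = concurrency / average response time."""
    return concurrent / avg_response_s


# Hypothetical inputs, not taken from the article.
daily_pv = 8_640_000
seconds_per_day = 24 * 60 * 60
avg_response_s = 0.2

peak = peak_qps_8020(daily_pv, seconds_per_day)
print(round(peak))                               # 400
print(round(concurrency(peak, avg_response_s)))  # 80
```

Note that the 80/20 rule always yields a peak of four times the average (0.8 / 0.2), so it is a rule of thumb rather than a measurement; here the average is 100 QPS and the peak is 400.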

When to Evaluate System Capacity

Three typical scenarios require timely capacity assessment:

Temporary traffic spikes (e.g., 618, Double 11, New Year promotions) where traffic may increase severalfold.

Initial system capacity assessment when a new system is launched.

Changes in capacity baseline as features grow, data volume rises, and daily active users increase.

Evaluation Steps

1. Analyze Daily Total Visits

Estimate daily PV/UV from historical data or product/operations forecasts. For a new system, predict possible PV and UV.

2. Estimate Average QPS

Divide the expected total visits during the active period by the number of seconds in that period. Example: 20 million push messages with a 10% click-through rate yield 2 million visits. If those visits arrive within one hour, the average QPS is 2 million / 3600 s, or about 556.
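The division can be sketched as follows, using the example's 2,000,000 visits spread over one hour. The function name is illustrative.

```python
def average_qps(total_visits: int, active_seconds: int) -> float:
    """Average requests per second over the active period."""
    if active_seconds <= 0:
        raise ValueError("active_seconds must be positive")
    return total_visits / active_seconds


visits = 2_000_000            # visits expected during the active period
one_hour = 60 * 60            # active period in seconds

print(round(average_qps(visits, one_hour), 1))   # 555.6
```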

3. Estimate Peak QPS

Use traffic curves or the 80/20 rule. In a sample cloud system with an average daily QPS of 2 900, the peak was about 2.58 × average, giving a peak QPS of roughly 7 482.
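One way to read a peak-to-average ratio off a traffic curve is sketched below. The hourly request counts are hypothetical and stand in for real monitoring data; only the 2 900 average and the 2.58 ratio come from the article.

```python
def average_qps(hourly_requests: list[int]) -> float:
    """Average QPS across all hours in the series."""
    return sum(hourly_requests) / (len(hourly_requests) * 3600)


def peak_hour_qps(hourly_requests: list[int]) -> float:
    """Average QPS within the busiest hour."""
    return max(hourly_requests) / 3600


def peak_from_ratio(avg_qps: float, ratio: float) -> float:
    """Peak QPS estimated as average QPS times an observed ratio."""
    return avg_qps * ratio


# Hypothetical 24-hour curve: quiet night, busy midday, moderate rest of day.
hourly_requests = [1_800_000] * 8 + [9_000_000] * 4 + [3_600_000] * 12

ratio = peak_hour_qps(hourly_requests) / average_qps(hourly_requests)
print(round(ratio, 2))                        # 2.31

# The article's own numbers: average 2 900 QPS, observed ratio 2.58.
print(round(peak_from_ratio(2900, 2.58)))     # 7482
```

An hourly bucket averages out short bursts, so the true per-second peak may be higher than this estimate; finer-grained data gives a tighter ratio.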

4. Determine Single‑Instance QPS Limit

Load testing with tools such as nGrinder or JMeter showed that a single web instance can sustain about 2 500 QPS before response time exceeds 2 seconds. The planning target is reduced to 2 000 QPS per instance to leave a safety margin.
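The selection of a per-instance target from load-test results can be sketched like this. The step data is invented for illustration and is not output from nGrinder or JMeter; the 80% factor is simply the ratio of the article's 2 000 target to its 2 500 limit.

```python
MAX_RESPONSE_S = 2.0     # response-time ceiling used in the article
SAFETY_PERCENT = 80      # 2 000 / 2 500, derived from the article's numbers

# Hypothetical load-test steps: (offered QPS, observed response time in seconds).
load_test_steps = [
    (500, 0.12),
    (1000, 0.20),
    (1500, 0.45),
    (2000, 0.90),
    (2500, 1.90),
    (3000, 3.40),
]


def sustainable_qps(steps: list[tuple[int, float]], max_response_s: float) -> int:
    """Highest tested QPS whose response time stays within the ceiling."""
    passing = [qps for qps, response_s in steps if response_s <= max_response_s]
    if not passing:
        raise ValueError("no load-test step met the response-time ceiling")
    return max(passing)


def target_qps(limit_qps: int, safety_percent: int) -> int:
    """Planning target after keeping some headroom below the measured limit."""
    return limit_qps * safety_percent // 100


limit = sustainable_qps(load_test_steps, MAX_RESPONSE_S)
print(limit)                               # 2500
print(target_qps(limit, SAFETY_PERCENT))   # 2000
```

This assumes response time rises as offered load rises, which is typical but worth confirming against the actual test curve.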

5. Confirm Final Instance Count

With a peak of about 7 500 QPS and 2 000 QPS per instance, 7 500 / 2 000 = 3.75, so at least four web instances are needed. This example assumes the cache and DB clusters are robust enough to handle their share of the load.
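The instance count is a ceiling division, sketched below. The `spare` parameter is a hypothetical addition for redundancy and is not a figure from the article.

```python
import math


def instances_needed(peak_qps: float, per_instance_qps: float, spare: int = 0) -> int:
    """Instances required to serve peak_qps, plus optional spare instances."""
    if per_instance_qps <= 0:
        raise ValueError("per_instance_qps must be positive")
    return math.ceil(peak_qps / per_instance_qps) + spare


print(instances_needed(7500, 2000))            # 4
print(instances_needed(7500, 2000, spare=1))   # 5, with one spare for failover
```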

Case Study: Book Reservation System

Using the 80/20 rule, a system with 1 500 000 total PV over 9 active hours (32 400 seconds) yields a peak QPS of (1 500 000 × 80%) / (32 400 × 20%) ≈ 185. With an assumed average response time of 0.5 s, concurrency = 185 × 0.5 ≈ 92.5, which is rounded up to 100 and then raised to 200 as the test target. Performance testing confirmed response times of 50‑100 ms at 200+ concurrent users.
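The case-study arithmetic can be reproduced step by step. Treating the jump from 100 to 200 as a doubling for test headroom is an interpretation; the article only states the resulting figures.

```python
import math

total_pv = 1_500_000
active_seconds = 9 * 60 * 60          # 32 400 seconds
avg_response_s = 0.5                  # response time used in the case study

# 80/20 rule: 80% of the PV arrives in 20% of the active period.
peak_qps = (total_pv * 0.8) / (active_seconds * 0.2)
print(round(peak_qps))                # 185

# Concurrency = QPS x average response time.
concurrent = round(peak_qps) * avg_response_s
print(concurrent)                     # 92.5

# Round up to the next hundred, then double as the load-test target (assumed).
rounded = math.ceil(concurrent / 100) * 100
test_target = rounded * 2
print(rounded, test_target)           # 100 200
```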

Summary

System capacity design should be performed during:

Temporary traffic surges.

Initial system launch.

When the capacity baseline shifts due to growth.

The steps include analyzing total daily visits, estimating average and peak QPS, conducting performance tests to find instance limits, and adjusting instance numbers based on redundancy and actual results. Applying these methods to the sports‑event example shows that early capacity redesign could have prevented the scheduling issues.
