
Mastering Internet Performance Engineering and Capacity Planning

This article presents a comprehensive methodology for internet performance engineering, covering non‑functional quality goals, detailed metrics for application servers, databases, caches and message queues, a practical technical review outline, and a real‑world capacity‑planning case study with both maximal and minimal resource solutions.

21CTO

Background

The article introduces a methodology for reviewing internet‑scale systems, emphasizing the need for engineers and architects to design for non‑functional quality such as performance, availability, scalability, and security before functional features are completed.

Goals

Overview of Non‑Functional Quality Requirements

Core qualities include high performance, availability, scalability, extensibility, and security. Additional qualities cover monitorability, testability, robustness, maintainability, reusability, and usability.

Specific Metrics

Application Server

Load‑balancing strategy and daily request volume

High‑availability strategy and interface peak loads

IO model (NIO/BIO) and average response time

Thread‑pool model, max response time, concurrent users, request size
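Several of the application-server metrics above are linked by Little's law: the average number of in-flight requests equals throughput multiplied by average response time. The sketch below shows how daily request volume, a peak factor, and average response time give a rough concurrency figure for sizing a thread pool. The function names and the sample numbers (10 million requests per day, a peak factor of 5, 200 ms response time) are illustrative assumptions, not values from the article.

```python
SECONDS_PER_DAY = 86_400


def average_rps(daily_requests: int) -> float:
    """Average requests per second implied by a daily request volume."""
    return daily_requests / SECONDS_PER_DAY


def in_flight_requests(throughput_rps: float, avg_response_s: float) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    return throughput_rps * avg_response_s


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the calculation.
    daily_requests = 10_000_000
    peak_factor = 5          # assumed ratio of peak rate to average rate
    avg_response_s = 0.2     # assumed 200 ms average response time

    peak_rps = average_rps(daily_requests) * peak_factor
    concurrency = in_flight_requests(peak_rps, avg_response_s)
    print(f"peak rate: {peak_rps:.1f} req/s")
    print(f"average in-flight requests at peak: {concurrency:.1f}")
```

The result is an average, so a real thread pool would normally be sized with headroom above it to absorb bursts and slow requests.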

Database

Replication model and current data volume

Failover strategy and daily data growth

Disaster‑recovery strategy and read peak

Archiving strategy, write peak, read/write separation, sharding, cache usage, transaction volume and consistency level

Cache

Replication model, cache size, hot‑cold data ratio

Failover strategy, cache item count, cache penetration risk

Persistence strategy, expiration time, large object handling

Eviction strategy, data structure, distributed‑lock support

Thread model, read/write peaks, scripting support

Warm‑up (pre‑heat) method, write peak, race‑condition avoidance

Hash sharding strategy and implementation method

Message Queue

Replication model and daily data increment

Failover strategy and message expiration

Persistence strategy, read/write peaks, reliable delivery

Sharding strategy, write peak, average and max latency

Technical Review Outline

The outline guides reviewers through current state (business and technical background), requirements (business and performance), solution description (architecture, logic, data, exception handling), performance evaluation, pros/cons, risk assessment, workload estimation, and comparison of alternative solutions.

Classic Capacity and Performance Case

Scenario

A logistics system with two primary requirements: maintaining member address data, and processing logistics orders asynchronously while polling a third‑party interface for order status.

Assumptions

2 billion members, growing by 50,000 per day

4 million daily orders, peak 14 million orders on promotion days

5‑fold capacity redundancy, 30‑year address retention, 3‑year order retention, third‑party query limit 5 000 QPS
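The assumptions above can be turned into rough planning numbers with simple arithmetic. The sketch below is one way to do that. It takes the figures exactly as listed in this section, uses the roughly 1 KB per record guideline from the article's reference standards, and assumes a 365-day year and order traffic spread evenly across a day. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365          # assumption: leap days ignored
BYTES_PER_RECORD = 1_000     # "~1 KB per DB record" guideline from the article

# Figures taken from the assumptions listed above.
MEMBERS_NOW = 2_000_000_000
MEMBER_GROWTH_PER_DAY = 50_000
ORDERS_PER_DAY = 4_000_000
PEAK_ORDERS_PER_DAY = 14_000_000
THIRD_PARTY_QPS_LIMIT = 5_000


def members_after(years: int) -> int:
    """Member count after the given number of years of linear growth."""
    return MEMBERS_NOW + MEMBER_GROWTH_PER_DAY * DAYS_PER_YEAR * years


def retained_orders(years: int) -> int:
    """Orders held at steady state for a given retention period."""
    return ORDERS_PER_DAY * DAYS_PER_YEAR * years


def average_rate(events_per_day: int) -> float:
    """Average events per second if traffic were spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


def polling_pass_seconds(orders: int, qps_limit: int) -> float:
    """Shortest time to query every order once under a QPS limit."""
    return orders / qps_limit


if __name__ == "__main__":
    print("members after 30 years:", members_after(30))
    print("orders retained for 3 years:", retained_orders(3))
    print("order storage (bytes):", retained_orders(3) * BYTES_PER_RECORD)
    print(f"peak-day average order rate: {average_rate(PEAK_ORDERS_PER_DAY):.1f}/s")
    seconds = polling_pass_seconds(PEAK_ORDERS_PER_DAY, THIRD_PARTY_QPS_LIMIT)
    print(f"one polling pass over a peak day's orders: {seconds:.0f} s")
```

Real traffic is bursty, so the peak-day average rate is a lower bound on the instantaneous peak. The 5-fold redundancy factor is applied on top of whichever peak figure is chosen.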

Solution 1 – Maximal Performance

Member address: 4 ports × 32 databases × 4 tables (4 master + 8 slave), 11 Redis nodes, 2 application servers

Logistics order & record: 16 ports × 32 databases × 8 tables (16 master + 16 slave), 1 Kafka broker, 2 processing nodes, 3 application servers
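The "ports × databases × tables" notation can be multiplied out to see how many physical tables a layout provides and how many rows it can hold. The sketch below does this for the two Solution 1 layouts. It reads the notation as databases per port and tables per database, which is an assumption because the article does not say so explicitly. It uses the article's MySQL guideline of 50 million rows per table. The `ShardLayout` class is a hypothetical helper.

```python
from dataclasses import dataclass

ROWS_PER_TABLE = 50_000_000  # MySQL guideline from the article's reference standards


@dataclass(frozen=True)
class ShardLayout:
    ports: int
    databases: int  # assumption: databases per port
    tables: int     # assumption: tables per database

    def table_count(self) -> int:
        """Total physical tables under the per-port, per-database reading."""
        return self.ports * self.databases * self.tables

    def row_capacity(self, rows_per_table: int = ROWS_PER_TABLE) -> int:
        """Rows the layout can hold before any table exceeds the guideline."""
        return self.table_count() * rows_per_table


address = ShardLayout(ports=4, databases=32, tables=4)
orders = ShardLayout(ports=16, databases=32, tables=8)

if __name__ == "__main__":
    for name, layout in (("address", address), ("orders", orders)):
        print(name, layout.table_count(), "tables,",
              layout.row_capacity(), "rows")
```

If the notation instead means totals across all ports, the table counts shrink by the port factor, so the reading should be confirmed against the original source before the numbers are relied on.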

Solution 2 – Minimal Resources

Single database port with sharding prepared for future scaling

1 port × 32 databases × 16 tables for addresses (1 master + 1 slave)

1 port × 128 databases × 32 tables for orders (1 master + 1 slave)
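One way to read "sharding prepared for future scaling" is that the logical database and table for each record are fixed from day one, and only the mapping of databases to ports changes as hardware is added. The sketch below illustrates that idea for the address layout (32 databases, 16 tables each). It assumes numeric member IDs and plain modulo routing. The `route` function and the rule for assigning databases to ports are illustrative assumptions, not the article's implementation.

```python
from typing import NamedTuple


class Route(NamedTuple):
    port: int
    database: int
    table: int


def route(member_id: int, ports: int = 1,
          databases: int = 32, tables: int = 16) -> Route:
    """Map a numeric ID to (port, database, table).

    The logical shard depends only on the ID and the fixed database and
    table counts, so adding ports later moves whole databases without
    re-splitting any table.
    """
    if member_id < 0:
        raise ValueError("member_id must be non-negative")
    shard = member_id % (databases * tables)
    database, table = divmod(shard, tables)
    port = database % ports  # assumed placement of databases on ports
    return Route(port, database, table)


if __name__ == "__main__":
    before = route(12345, ports=1)
    after = route(12345, ports=4)
    print(before)
    print(after)
    # Database and table are unchanged; only the port differs.
    assert (before.database, before.table) == (after.database, after.table)
```

Under this scheme, growing from one port to four means relocating entire databases to new instances, which is usually simpler than rehashing rows between tables.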

Reference Standards

General guidelines: capacity = 5 × peak, 30‑year data retention after sharding, third‑party interface limit 5 000 QPS, ~1 KB per DB record.

MySQL: 1 000 QPS read per port, 700 TPS write per port, 50 million rows per table.

Redis: 40 000 QPS read, 40 000 TPS write per port, 32 GB memory per node.

Kafka: 30 000 QPS read, 5 000 TPS write per node.

DB2: 20 000 read/write peak per node, 100 million rows per table.
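The per-node limits above combine with the "capacity = 5 × peak" guideline to give a node count for a given peak load. The sketch below shows that calculation for the MySQL, Redis and Kafka figures. The `nodes_needed` function, the dictionary layout, and the sample peak values are hypothetical. Replicas needed for high availability are not included.

```python
import math

# Per-node (or per-port) throughput guidelines quoted in the article.
PER_NODE_LIMITS = {
    "mysql": {"read": 1_000, "write": 700},
    "redis": {"read": 40_000, "write": 40_000},
    "kafka": {"read": 30_000, "write": 5_000},
}
REDUNDANCY = 5  # "capacity = 5 x peak" guideline


def nodes_needed(component: str, operation: str, peak_per_second: float,
                 redundancy: int = REDUNDANCY) -> int:
    """Smallest node count whose combined limit covers redundancy x peak."""
    limit = PER_NODE_LIMITS[component][operation]
    return max(1, math.ceil(peak_per_second * redundancy / limit))


if __name__ == "__main__":
    # Hypothetical peaks, chosen only to illustrate the calculation.
    print("MySQL ports for 1,000 writes/s:", nodes_needed("mysql", "write", 1_000))
    print("MySQL ports for 1,000 reads/s:", nodes_needed("mysql", "read", 1_000))
    print("Redis nodes for 100,000 reads/s:", nodes_needed("redis", "read", 100_000))
```

Throughput is only one constraint. Row-count limits per table and memory per node can demand more nodes than this calculation returns, so the larger of the results should drive the plan.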

Conclusion

The article consolidates non‑functional quality goals, provides a detailed metric checklist for each component, offers a reusable technical review outline, and demonstrates capacity planning through a realistic logistics case, helping practitioners design scalable, high‑performance internet systems.
