Master Back‑of‑Envelope Calculations for System Design: Quick Estimation Techniques
This article explains system design fundamentals and shows how to use back‑of‑the‑envelope calculations to quickly estimate load, storage, cache, and bandwidth requirements, helping engineers make informed architectural decisions with simple math and practical examples.
System Design
System design is the process of designing the elements of a software system, including components, architecture, interfaces, modules, and database types, and defining how those components interact through data flow and access.
It is a very broad topic, and this article only scratches the surface.
Software system architecture is a conceptual model that defines a system’s structure, behavior, and views; an architectural description formally represents the system to support reasoning about its functionality, structure, and behavior.
What Is a Back‑of‑the‑Envelope Calculation?
A “Back‑of‑the‑Envelope calculation” is a quick, rough estimation method that uses simple arithmetic to obtain an approximate value for a complex problem. It was popularized by physicist Enrico Fermi to gauge the order of magnitude of a problem, and it is often used in system‑design interviews to assess capacity and performance requirements.
Example: estimate 5 × 9.667. A reasonable answer is any value between 45 and 50, or simply 50, obtained by rounding: 5 × 9.667 ≈ 5 × 10 = 50. The exact answer is 48.335.
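The rounding trick above can be written out explicitly; this minimal sketch just compares the rounded estimate against the exact product:

```python
# Order-of-magnitude estimation: round awkward factors to a convenient
# number, multiply, then (optionally) compare with the exact result.
approx = 5 * 10        # 9.667 rounded up to 10 -> estimate of 50
exact = 5 * 9.667      # exact product: 48.335

print(approx)              # 50
print(round(exact, 3))     # 48.335
```

For a back-of-the-envelope answer, the estimate of 50 is close enough; the exact value only confirms the order of magnitude.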
What Is “Back‑of‑the‑Envelope” in System Design?
Back‑of‑the‑Envelope estimation is crucial in system design because it helps choose appropriate configurations and technologies, identify request/response sizes, database size, cache size, number of micro‑services, load balancers, and network bandwidth requirements.
Estimating the system’s scale before high‑level and low‑level design aids later decisions such as database technology, scaling, sharding, load balancing, and caching, and improves overall performance.
Four main areas require rough estimation:
Load estimation (traffic estimation)
Database storage estimation
Cache estimation
Bandwidth estimation
Understanding whether a system is read‑intensive or write‑intensive also guides design and estimation. For example, TinyURL is read‑intensive, while a web crawler is write‑intensive. Read‑intensive systems benefit more from caching than write‑intensive ones.
1. Load Estimation (Traffic Estimation)
Load estimation helps identify the number of requests a system must handle, expressed as daily active users (DAU) or requests per second.
Assume a write‑to‑read ratio of 1:100 and 1 million write calls per day. Since 1 million requests per day ≈ 12 requests per second, writes come to roughly 12 req/s, and reads come to 12 req/s × 100 ≈ 1,200 req/s. Memory tip: remembering that 1 million daily requests ≈ 12 requests per second makes these conversions instant.
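The daily-to-per-second conversion above can be sketched as a small helper; the function name `per_second` and the traffic numbers are just the article's example values:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second(daily_requests: int) -> float:
    """Convert a daily request count into requests per second."""
    return daily_requests / SECONDS_PER_DAY

writes_per_day = 1_000_000
read_ratio = 100  # 1:100 write-to-read ratio

write_rps = per_second(writes_per_day)               # ~11.6, round to 12
read_rps = per_second(writes_per_day * read_ratio)   # ~1,157, round to 1,200

print(f"writes: ~{write_rps:.0f} req/s, reads: ~{read_rps:.0f} req/s")
```

Note how the exact figure (≈1,157 req/s) rounds to 1,200 req/s for estimation purposes, matching the "1 million/day ≈ 12/s" shortcut.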
2. Database Storage Estimation
Database storage estimation is vital because databases are the core component of any application.
Key questions include how much space is needed to store 1 million write requests, how much space 1 million web pages require, and how many 512 GB machines are needed.
Example calculations (assuming each write request is 1 KB and data is kept for 10 years):
1,000,000 × 1 KB = 1 GB (daily)
1 GB × 365 days ≈ 365 GB, rounded to 360 GB for quick math (yearly)
360 GB × 10 ≈ 3.6 TB (10 years)

Adding 0.4 TB for audit, user, and security data yields roughly 4 TB of total storage for 10 years.
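The storage arithmetic above can be chained in a few lines; this sketch uses round decimal units (1 GB = 1,000,000 KB, 1 TB = 1,000 GB), which is close enough for estimation:

```python
# Storage estimate: 1 KB per write, 1 million writes/day, retained 10 years.
writes_per_day = 1_000_000
kb_per_write = 1

daily_gb = writes_per_day * kb_per_write / 1_000_000   # 1 GB/day
yearly_gb = daily_gb * 365                             # ~365 GB/year
ten_year_tb = yearly_gb * 10 / 1_000                   # ~3.65 TB
total_tb = ten_year_tb + 0.4                           # + audit/user/security data

print(f"~{total_tb:.2f} TB over 10 years")             # ~4 TB after rounding
```

The exact figure (≈4.05 TB) rounds to the article's 4 TB; at 512 GB per machine, that is about 8 machines for raw capacity before replication.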
3. Cache Estimation
Cache estimation lacks strict rules; some applications use 10‑30 % of database storage as cache, while others allocate 20‑30 % of frequently accessed data.
Example: with 1 GB of new data per day, a naive 20 % cache of stored data would be 20 % × 1 GB = 200 MB. A more realistic approach considers read traffic: with a 1:100 write‑to‑read ratio, 1 million writes and 100 million reads per day at 1 KB each, total reads are 100 GB per day. A 20 % cache of that read traffic is:

20 % × 100 GB = 20 GB

4. Bandwidth Estimation
Bandwidth estimation evaluates network bandwidth needs, answering questions such as required upstream/downstream speeds and peak‑hour demand.
Assuming a 1:100 write‑to‑read ratio, 1 million daily writes and 100 million daily reads at 1 KB per request, read traffic totals 100 GB per day, or roughly 100 GB / 86,400 s ≈ 1.2 MB/s; the bandwidth requirement is dominated by reads.
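The cache and bandwidth estimates above share the same inputs, so they can be sketched together; the 20 % cache fraction and traffic numbers are the article's example assumptions:

```python
# Cache and bandwidth estimates for read-dominated traffic.
reads_per_day = 100_000_000   # 1 M writes x 100 (1:100 write-to-read ratio)
kb_per_request = 1
SECONDS_PER_DAY = 86_400

read_gb_per_day = reads_per_day * kb_per_request / 1_000_000  # 100 GB/day
cache_gb = 0.20 * read_gb_per_day                             # 20 GB cache

# Bandwidth: spread the daily read volume evenly over the day.
bandwidth_mb_s = read_gb_per_day * 1_000 / SECONDS_PER_DAY    # ~1.2 MB/s

print(f"cache: {cache_gb:.0f} GB, read bandwidth: ~{bandwidth_mb_s:.1f} MB/s")
```

Averaging over the whole day understates peak-hour load, so in practice you would multiply the ~1.2 MB/s figure by a peak factor (2–3× is a common rule of thumb).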
How to Quickly Master Estimation?
The only answer is: practice, practice, and practice!
All estimations rely on basic mathematics and unit conversion, assuming you understand software engineering and system‑design principles.
Paper‑and‑pencil mistakes are cheap; early correction avoids costly refactoring later.
Speed and intuition come from repeated training, just like programming or problem solving.
Strong estimation ability = continuous practice × basic math × unit conversion; internalizing quick calculations lets you anticipate bottlenecks, avoid over‑design, and build cost‑effective systems that balance reliability and expense.
Big Data Technology Tribe
Focused on computer science and cutting‑edge tech, we distill complex knowledge into clear, actionable insights. We track tech evolution, share industry trends and deep analysis, helping you keep learning, boost your technical edge, and ride the digital wave forward.