
Mastering Software Performance: From Axioms to Capacity Planning

This article explains fundamental performance concepts—defining response time and throughput, using axiomatic methods, analyzing bottlenecks with sequence diagrams and profiling, applying Amdahl’s Law, and guiding capacity planning to build reliable, high‑performance applications.

Efficient Ops

Thinking Clearly About Performance – This article is a translation of an essay on performance problems that the translator first read three years ago; it impressed him enough to become his first translation project.

Whenever the author encounters a performance issue, he recalls this article, because it does not focus on specific tools (the "technique" level) but instead builds a high-level understanding (the "principle" level) that can be applied to any technology stack.

Abstract

For developers, technical managers, architects, system analysts and project managers, building high‑performance complex software is extremely difficult. However, by understanding a few basic principles, solving and preventing performance problems becomes simpler and more reliable. This article presents those principles, covering goals, terminology, tools and decisions, and shows how to combine them to create long‑lasting high‑performance applications. Some examples come from Oracle experience, but the scope is not limited to Oracle products.

Table of Contents

Axiomatic Method

What Is Performance?

Response Time vs. Throughput

Percentage Metrics

Problem Diagnosis

Sequence Diagrams

Performance Profiling

Amdahl’s Law

Skewness

Minimizing Risk

Efficiency

Load

Queue Delay

Turning Point

Turning‑Point Correlation

Capacity Planning

Random Arrival

Correlation Delay

Performance Testing

Measurement

Performance as a Feature

Conclusion: Public Debate on Turning Points

References

1. Axiomatic Method

When the author joined Oracle in 1989, performance tuning (often called Oracle tuning) was difficult. Only a few people claimed expertise, and many consulted them; the author himself was unprepared for the field. Years later, when he tuned MySQL, the experience felt much like the Oracle work he had done 20 years earlier.

The author compares learning performance tuning to learning algebra at age 13, relying on “mathematical intuition” to solve equations like 3x + 4 = 13. Most people lack that intuition, often resorting to trial‑and‑error.

Trial‑and‑error works for simple equations but is slow and fails when the equation changes slightly. The author did not think deeply about a better method until age 15, when James R. Harkey introduced an axiomatic approach.

Harkey taught the author a step‑by‑step axiomatic method for solving algebraic equations, emphasizing recording both the steps and the thought process. The author’s homework looked like the following:

<code>3.1x + 4 = 13        given
3.1x     = 9         subtract 4 from both sides
x        ≈ 2.903     divide both sides by 3.1
</code>

This method consists of a series of logical, provable, auditable small steps and applies to algebra, geometry, trigonometry and calculus.

The author later created a similar rigorous axiomatic method for Oracle performance tuning and eventually extended it to all software performance optimization.

Our goal is to help you think clearly about how to optimize the performance of your software system.

2. What Is Performance?

Searching “Performance” on Google yields billions of results ranging from bike races to employee review processes. In the context of software, performance is “the amount of time a computer program takes to execute a given task.”

A “task” is a business‑oriented unit of work that can be nested. When a user talks about performance, they usually refer to the time the system takes to execute a series of tasks.

Response time is the duration of a task, measured per task, e.g., the time a Google search takes (0.24 s).

Another metric is throughput, the number of tasks completed in a given time interval, e.g., requests per second. Different stakeholders care about different metrics.

3. Response Time vs. Throughput

Generally, response time and throughput are inversely related, but the relationship is not exact. Consider a benchmark that reports 1000 tasks per second; the average response time is not simply 1/1000 s because parallelism and queuing affect the result.

Example 1 illustrates that a system with 1000 parallel service channels can deliver 1000 tasks per second even though each request takes a full second; knowing only the throughput, the average response time could lie anywhere between 0 and 1 s. Therefore, you must measure response time directly.

Example 2 shows a client requiring 100 tps on a single‑core CPU where each task takes 0.001 s. If tasks arrive randomly from many users, contention can prevent achieving the required throughput, demonstrating that response time and throughput must be measured independently.
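The point of Example 1 can be checked with a toy calculation (the function name is illustrative, and it assumes each channel serves tasks back to back):

```python
# Throughput alone does not pin down response time: the same 1000 tasks/s
# can come from one fast channel or from many slow ones.
def per_task_response_time(throughput_tps, channels):
    """Response time per task if `channels` parallel channels jointly
    sustain the given throughput, each serving tasks back to back."""
    return channels / throughput_tps

# One channel completing 1000 tasks/s: each task takes 1 ms.
assert per_task_response_time(1000, 1) == 0.001
# 1000 parallel channels at the same system throughput: each task takes 1 s.
assert per_task_response_time(1000, 1000) == 1.0
```

With nothing but the throughput figure, both systems look identical, yet their users experience response times that differ by a factor of 1000.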

4. Percentage Metrics

Instead of stating “average response time < r seconds,” it is often better to use percentile‑based statements. Example 3 compares two lists with the same average response time (1 s) but different 90th‑percentile values (0.987 s vs. 1.273 s). The list with a higher 90th‑percentile indicates a larger proportion of dissatisfied users.

Using percentages aligns with customer expectations, e.g., “99.9 % of tracking shipments must complete within 0.5 s.”
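The comparison in Example 3 is easy to reproduce. The two lists below are hypothetical (not the essay's original data) but are built the same way: identical means, different 90th percentiles.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: smallest value covering p% of the sample."""
    s = sorted(values)
    k = max(0, math.ceil(p / 100 * len(s)) - 1)
    return s[k]

A = [0.92, 0.94, 0.96, 0.98, 0.99, 1.00, 1.02, 1.04, 1.07, 1.08]
B = [0.80, 0.82, 0.85, 0.87, 0.90, 0.93, 0.95, 1.00, 1.40, 1.48]

# Same average response time...
assert abs(sum(A) / len(A) - 1.0) < 1e-9
assert abs(sum(B) / len(B) - 1.0) < 1e-9
# ...but very different tail behavior: B leaves more users waiting.
assert percentile(A, 90) == 1.07
assert percentile(B, 90) == 1.40
```

An average hides the tail; a percentile target states directly how many users are allowed to have a bad experience.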

5. Problem Diagnosis

Performance problems are often described in terms of response time, e.g., “Task X used to finish in < 1 s but now takes 20 s.” A good diagnosis starts by clearly defining the desired goal and quantifying it, such as “95 % of executions should be under 1 s.”

6. Sequence Diagrams

Sequence diagrams (UML) visualize the order of interactions between objects and are useful for illustrating response time. The article includes several diagrams (omitted here for brevity).

7. Performance Profiling

When many calls are involved, a table-based performance profile is more practical than a sequence diagram. Example data shows that 70.8 % of response time is spent in DB:Fetch(), highlighting where optimization effort should focus.
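A profile of this kind is just a ranked table of where the time went. A minimal sketch with hypothetical numbers (chosen so that DB:Fetch() accounts for 70.8 % of a 10 s response time):

```python
# Hypothetical measured durations per call type, in seconds.
profile = {
    "DB:Fetch()": 7.08,
    "App:await_db_net_io()": 1.92,
    "OS:write()": 0.60,
    "other": 0.40,
}

total = sum(profile.values())  # 10.0 s total response time
pct = {call: 100 * secs / total for call, secs in profile.items()}

# Rank contributions so the biggest optimization target is obvious.
for call, p in sorted(pct.items(), key=lambda kv: -kv[1]):
    print(f"{call:24s} {p:5.1f} %")

assert round(pct["DB:Fetch()"], 1) == 70.8
```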

8. Amdahl’s Law

Amdahl’s Law states that the performance gain from speeding up a component depends on how often that component is used. If a component accounts for only 5 % of total response time, the maximum possible improvement is 5 %.
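Amdahl's Law reduces to a one-line formula: if a component consumes fraction f of response time and is sped up by a factor k, the fraction of total time saved is f·(1 − 1/k). A sketch:

```python
def fraction_saved(f, k):
    """Amdahl's Law: fraction of total response time saved when a
    component taking fraction f of the time is sped up by factor k."""
    return f * (1 - 1 / k)

# A component using 5% of response time, made infinitely fast,
# can save at most 5% of the total.
assert fraction_saved(0.05, float("inf")) == 0.05
# Merely doubling its speed saves only 2.5%.
assert abs(fraction_saved(0.05, 2) - 0.025) < 1e-12
```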

9. Skewness

Skewness measures the non-uniformity of a list of values. Example 6 shows that halving the number of DB:fetch() calls does not necessarily halve the response time, because the distribution of call costs matters.

<code>A = {1, 1, 1, 1}
B = {3.7, 0.1, 0.1, 0.1}
</code>

In list B, removing the two longest calls cuts response time from 4.0 s to 0.2 s, whereas removing the two shortest calls leaves 3.8 s and has little effect. In the uniform list A, removing any two calls simply halves the 4 s total.
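The arithmetic behind Example 6 is worth spelling out (lists copied from above; the helper name is illustrative):

```python
A = [1.0, 1.0, 1.0, 1.0]   # uniform call costs, 4.0 s total
B = [3.7, 0.1, 0.1, 0.1]   # skewed call costs, 4.0 s total

def after_removing_longest(calls, n):
    """Total response time after eliminating the n most expensive calls."""
    return round(sum(sorted(calls)[:len(calls) - n]), 1)

# Uniform list: removing any two of the four calls halves the total.
assert after_removing_longest(A, 2) == 2.0
# Skewed list: removing the two longest calls eliminates 95% of the time...
assert after_removing_longest(B, 2) == 0.2
# ...but removing the two shortest calls barely helps: 3.8 s remain.
assert round(sum(sorted(B)[2:]), 1) == 3.8
```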

10. Minimizing Risk

Changing one part of a system can break another. The author shares an anecdote about adjusting Oracle network packet size only for problematic Java applications to avoid global impact.

11. Efficiency

Improving efficiency means reducing wasted work, such as issuing a single prepared statement for bulk inserts instead of thousands of individual statements, or filtering results early to avoid unnecessary buffer accesses.
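As a concrete (illustrative) instance of the bulk-insert point, Python's built-in sqlite3 module lets one prepared statement serve an entire batch via executemany, instead of re-issuing one INSERT per row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
rows = [(i, f"v{i}") for i in range(1000)]

# Wasteful: parse/execute overhead paid once per row.
# for r in rows:
#     conn.execute("INSERT INTO t VALUES (?, ?)", r)

# Efficient: one statement, prepared once, executed for the whole batch.
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
conn.commit()

assert conn.execute("SELECT COUNT(*) FROM t").fetchone()[0] == 1000
```

The work the database must do per row is the same; what disappears is the repeated per-statement overhead, which is exactly the kind of waste the section describes.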

12. Load

Load is the resource competition caused by concurrent tasks. Higher load increases queue delay and correlation delay, leading to longer response times, similar to traffic congestion.

13. Queue Delay

Queue delay is the time a task waits for a service opportunity. The article presents the M/M/m queuing model, where response time R = Service time S + Queue delay Q.

<code>R = S + Q
</code>
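For M/M/m the queue delay Q has a closed form, the Erlang C formula. A minimal sketch, assuming Poisson arrivals and exponential service times:

```python
import math

def mmm_response_time(lam, S, m):
    """M/M/m: R = S + Q, with Q from the Erlang C formula.
    lam: arrival rate (tasks/s), S: mean service time (s), m: channels."""
    a = lam * S                          # offered load in erlangs
    rho = a / m                          # per-channel utilization
    if rho >= 1:
        return float("inf")              # queue grows without bound
    num = a ** m / math.factorial(m)
    den = (1 - rho) * sum(a ** k / math.factorial(k) for k in range(m)) + num
    p_wait = num / den                   # probability an arrival must queue
    Q = p_wait * S / (m - a)             # mean queue delay
    return S + Q

# Sanity check against M/M/1, where R = S / (1 - rho).
assert abs(mmm_response_time(0.5, 1.0, 1) - 2.0) < 1e-9
# Queue delay explodes as utilization approaches 1.
assert abs(mmm_response_time(0.9, 1.0, 1) - 10.0) < 1e-9
```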

14. Turning Point

The turning point is the load level where throughput is maximized while response time degradation remains small. It is the point where the line from the origin is tangent to the response‑time curve.
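That tangent-line definition can be found numerically: the knee is the utilization ρ that minimizes R(ρ)/ρ. A sketch, using the M/M/1 response-time curve R(ρ) = S/(1 − ρ):

```python
def knee_utilization(R, grid=1000):
    """Utilization rho minimizing R(rho)/rho: the point where a line
    from the origin is tangent to the response-time curve."""
    return min((R(rho) / rho, rho)
               for rho in (i / grid for i in range(1, grid)))[1]

S = 1.0
mm1 = lambda rho: S / (1 - rho)   # M/M/1 response-time curve

# For M/M/1 the knee falls at 50% utilization; systems with more
# service channels have higher knees.
assert abs(knee_utilization(mm1) - 0.5) < 0.01
```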

15. Turning‑Point Correlation

Every resource (CPU, disk, network) has its own turning point, typically lower than theoretical values due to imperfect scalability. Staying below the turning point for random‑arrival workloads prevents severe performance swings.

16. Capacity Planning

Capacity planning uses the turning point to define how much resource capacity is needed to handle peak load without exceeding the turning point.
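The planning rule reduces to simple arithmetic: provision enough channels that peak utilization stays below the knee. The numbers below are purely illustrative, including the assumed knee value:

```python
import math

def channels_needed(peak_lam, S, knee_rho):
    """Smallest channel count m keeping utilization lam*S/m below the knee."""
    return math.ceil(peak_lam * S / knee_rho)

# Hypothetical workload: 400 tasks/s at peak, 10 ms service time,
# knee assumed at 75% utilization -> 4 erlangs / 0.75 -> 6 channels.
assert channels_needed(400, 0.010, 0.75) == 6
```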

17. Random Arrival

Random arrival of tasks creates bursts that can exceed the turning point, causing queue delay spikes. Short bursts (e.g., less than 8 seconds) are usually tolerable.
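The burst phenomenon shows up in a toy Monte-Carlo sketch (all numbers illustrative): even when average load is well below capacity, Poisson arrivals routinely exceed it for short windows.

```python
import random

random.seed(42)
lam, capacity, seconds = 8, 10, 1000   # 8 arrivals/s on average, knee at 10/s

def arrivals_in_one_second(rate):
    """Count Poisson arrivals in a 1 s window via exponential gaps."""
    t, n = 0.0, 0
    while True:
        t += random.expovariate(rate)
        if t > 1.0:
            return n
        n += 1

bursts = sum(arrivals_in_one_second(lam) > capacity for _ in range(seconds))
# Average utilization is only 80%, yet some seconds still exceed capacity.
assert 0 < bursts < seconds
```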

18. Correlation Delay

Correlation delay arises from contention on shared resources (e.g., enqueue, buffer busy waits, latch release). It cannot be modeled by the ideal M/M/m model because service channels are not truly independent.

19. Performance Testing

Testing must balance effort and coverage; insufficient testing leaves hidden problems, while excessive testing is wasteful. A moderate testing level is recommended.

20. Measurement

Throughput is easy to measure; response time is harder. Relying on surrogate metrics (e.g., call counts) can lead to false positives or negatives.

21. Performance as a Feature

Performance is a functional feature that must be designed and built, not an afterthought. Measuring performance in production is essential for ongoing improvement.

Conclusion: Public Debate on Turning Points

The article recounts a 20‑year‑old debate about the usefulness of defining a turning point, citing differing opinions from Stephen Samson, Neil Gunther, and others.

References

CMG (Computer Measurement Group)…

Eight‑second rule…

Garvin, D. 1993…

General Electric Company…

Gunther, N. 1993…

Knuth, D. 1974…

Kyte, T. 2009…

Millsap, C. 2009…

Millsap, C. 2009…

Millsap, C. 2009…

Millsap, C., Holt, J. 2003…

Oak Table Network…

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.