This is a repost from Medium and the third part of the "13 Principles of Next Generation Enterprise Software Technology" series.

In this write-up I'm going to focus primarily on building the fundamental knowledge: the context, definitions, and dynamics between all of the concepts that are necessary to start talking and reasoning about the speed and performance of software. So let's build up our performance engineering acumen :).

Performance — the characteristic of performing a system operation. Its main measure is Processing Time — how long it took the system to perform a given operation. (Please do not confuse this with the technical term Response Time, which also includes time spent waiting, e.g. in queues or in transit.)

One can't talk about performance without introducing another concept — Throughput. This is the number of requests to perform a system operation within a defined timeframe. For most use cases, requests per second is a close-to-ideal measure.
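To make the two measures concrete, here is a minimal sketch (my own illustration, not from the original post) that times a batch of operations and derives both Throughput and average Processing Time; `handle_request` is a hypothetical stand-in for real work:

```python
import time

# Minimal illustration (not from the post): time a batch of operations and
# derive Throughput (requests/second) and average Processing Time per request.
def handle_request():
    sum(range(1000))  # hypothetical stand-in for real work

n_requests = 500
start = time.perf_counter()
for _ in range(n_requests):
    handle_request()
elapsed = time.perf_counter() - start

throughput_rps = n_requests / elapsed       # requests per second
avg_processing_time = elapsed / n_requests  # seconds per request
print(f"{throughput_rps:.0f} req/s, {avg_processing_time * 1000:.4f} ms/request")
```

Note that the two are just reciprocal views of the same measurement here; they diverge once requests overlap or queue.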

Resources — everything required to perform a system operation: CPU, RAM, I/O, and HDD/SSD. There is actually more to it than that, but for now it's enough to acknowledge that Resources play a very important role in all of this.

So, with fixed Resources and a constant, defined Throughput, Processing Time should be constant. Ideally.

Now, with fixed Resources, increased Throughput results in increased Processing Time (the system operations take longer). Keep increasing the Throughput while Resources stay fixed, and Processing Time will grow to the point where the system stops responding at all.
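One standard way to see this dynamic is the M/M/1 queueing model (my addition for illustration; the post itself doesn't reference queueing theory). With fixed Resources the system has a fixed capacity, and the mean time a request spends in the system is 1 / (capacity - throughput), which blows up as Throughput approaches capacity:

```python
# Illustration using the M/M/1 queueing model (my addition, not the post's):
# with fixed Resources the system has a fixed capacity, and the mean time a
# request spends in the system is 1 / (capacity - throughput).
def mean_time_in_system(throughput_rps: float, capacity_rps: float) -> float:
    """Mean time in system (seconds); valid while throughput < capacity."""
    if throughput_rps >= capacity_rps:
        return float("inf")  # saturated: the system stops responding
    return 1.0 / (capacity_rps - throughput_rps)

capacity = 100.0  # fixed Resources imply a fixed capacity (requests/second)
for rps in (10, 50, 90, 99, 100):
    print(f"{rps:>3} req/s -> {mean_time_in_system(rps, capacity)} s")
```

At 10% load the time in system is barely noticeable; at 99% it is two orders of magnitude worse, and at 100% the model predicts exactly the "stops responding" behavior described above.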

Now it's really all about how quickly Processing Time increases. Or rather, what matters to us is how slowly it increases. Systems whose Processing Time grows linearly as Throughput increases are considered much better than systems with more dynamic, exponential growth of Processing Time.
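The difference between the two growth shapes can be sketched with toy curves (the numbers here are purely illustrative, not measurements of any real system):

```python
import math

# Toy curves (numbers are illustrative, not measurements): one system whose
# Processing Time grows linearly with Throughput, and one that grows
# exponentially. The exponential one looks fine at low load, then explodes.
def linear_pt(rps: float) -> float:
    return 0.01 * rps                  # grows in proportion to load

def exponential_pt(rps: float) -> float:
    return 0.01 * math.exp(rps / 100)  # doubles roughly every 70 req/s

for rps in (100, 400, 700, 1000):
    print(rps, round(linear_pt(rps), 2), round(exponential_pt(rps), 2))
```

The exponential system actually looks better at low Throughput, which is exactly why this failure mode is easy to miss in testing: the curve only overtakes the linear one at higher load, and from there it dominates quickly.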

In reality, this is a bit of a simplified model, as we don't take Data Volume and its growth into consideration. Similar to increasing Throughput, increasing Data Volume while keeping Resources fixed results in increased Processing Time. Increasing it further will eventually lead to a system that does not respond.
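How strongly Data Volume affects Processing Time depends on the algorithms and data structures involved. A tiny sketch of my own (counting operations rather than timing them): a full scan does work proportional to the number of records, while a hash index does roughly constant work per lookup regardless of Data Volume:

```python
# Sketch (my illustration): with fixed Resources, how Processing Time scales
# with Data Volume is set by the algorithm. A full scan does ~n comparisons
# per lookup, while a hash index does ~1 regardless of Data Volume.
def scan_lookup_ops(n_records: int) -> int:
    return n_records   # worst case: compare against every record

def hash_lookup_ops(n_records: int) -> int:
    return 1           # one probe, independent of Data Volume

for n in (1_000, 1_000_000):
    print(f"{n:>9} records: scan={scan_lookup_ops(n)}, hash={hash_lookup_ops(n)}")
```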


Now let me briefly introduce another definition, which represents a goal, a number that’s believed to be good for the customers. This is our Processing Time SLA (Service Level Agreement).

For the sake of argument, let's take the most pessimistic scenario: Throughput grows and our system stops responding. How does one handle such a thing?

Well, we handle it by Scaling Out. That means adding more Resources (CPU, RAM, I/O, or HDD/SSD) to handle more commissioned work. That, by definition, assumes that the infrastructure or implementation of the system is designed to Scale Out.
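Here is a hypothetical capacity-planning sketch (the function name and numbers are mine) of what Scaling Out looks like when it works: assuming the system scales near-linearly, handling more Throughput means adding identical nodes so that per-node load stays under a known capacity:

```python
import math

# Hypothetical capacity-planning sketch (numbers and names are mine): if the
# system Scales Out near-linearly, handling more Throughput means adding
# identical nodes so that per-node load stays under a known capacity.
def nodes_needed(throughput_rps: float, per_node_capacity_rps: float) -> int:
    return max(1, math.ceil(throughput_rps / per_node_capacity_rps))

print(nodes_needed(450, 100))  # 450 req/s at 100 req/s per node -> 5 nodes
print(nodes_needed(50, 100))   # light load still needs at least 1 node
```

The near-linear assumption is doing a lot of work here; real systems lose some capacity per node to coordination overhead as they grow.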

So, if the whole system is designed to Scale Out, one is in a relatively good position.

Scaling Out means adding more Resources, and those Resources cost $$$. If the whole system can Scale Out, the remaining question is Cost Effectiveness.

Cost Effectiveness would be a function of how fast Processing Time grows with the growth of Throughput or Data Volume.
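One possible way to make this concrete (my formulation, not the post's) is to measure useful work delivered per dollar of Resources, e.g. requests served per dollar. Then the effect described above falls out directly: if Throughput growth requires disproportionate Resource growth, the metric degrades even though the system "scales":

```python
# One possible Cost Effectiveness metric (my formulation, not the post's):
# useful work delivered per dollar of Resources, e.g. requests served per $.
def cost_effectiveness(throughput_rps: float, monthly_cost_usd: float) -> float:
    seconds_per_month = 30 * 24 * 3600
    return throughput_rps * seconds_per_month / monthly_cost_usd

# If doubling Throughput requires tripling the Resource bill, the system
# still "scales", but its Cost Effectiveness degrades:
before = cost_effectiveness(100, 1_000)  # requests per dollar, before growth
after = cost_effectiveness(200, 3_000)   # after scaling out
print(before, after, after < before)
```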

There's another way to handle growth — changing the implementation. It's not really considered Scaling Out, as it takes a lot of time. (Actually, I would call it an optimization.) So, by definition, Scaling Out needs to be something done quickly, by managing Resources.

While we are talking about Scaling Out, there are two ways to do it: horizontal scaling and vertical scaling. But let's not get into those details now.

Scalability (or Scaling Out), then, needs to be considered on three levels.

  • Can the (whole) system Scale Out in order to handle more commissioned work — meaning an increase in Throughput and Data Volume?
  • How Cost Effective is the system?
  • Given that the system can Scale Out, and given that there is a Processing Time SLA, what’s the Cost Effectiveness of the system?

Note that just as there’s an SLA for Processing Time, there should be an SLA or a goal for Cost Effectiveness.

What’s colloquially understood by Scalability is the assumption that the system can Scale Out (to a certain extent), and that it can meet the Processing Time SLA goal. In most cases, Cost Effectiveness is skipped.

To summarize, the concepts we know so far: Processing Time, Throughput, Data Volume, Resources, Processing Time SLA, Cost Effectiveness Goal (or SLA), and Cost Effectiveness, all in the context of Performance & Scalability.

There. With that, we should be ready to jump into more in-depth material, which will be covered in the next posts.

Posted by

Paweł Niżnik
