In Apple’s ecosystem, Performance Engineering is often associated with hours spent in Instruments. Time Profiler, Leaks, Allocations and Core Data are all tools that provide us with precious information: where bottlenecks lie, what uses up system memory, which query takes too much time, and so on.
Unfortunately, in my experience (and from bits of information I get from fellow devs), it seems that we reach for Instruments only when we already know that something needs to be fixed or improved. This kind of Performance Tuning should be preceded by proper Performance Testing – and that’s precisely the topic I would like to cover today.

Change blindness – what do we miss?

The problem with performance is that it often degrades slowly and gradually. Of course, a single incorrectly designed or implemented feature can render the whole app unusable, but such problems are usually discovered pretty quickly. If you are working on an application that is developed and maintained over many years, it is entirely possible for developers to miss small performance degradations that, over time, add up to a bigger issue.
There are at least a few ways to avoid such ‘change blindness’:

Periodic tests using Instruments

While some people prefer this method, I am not fond of the approach. Performing such tasks manually is error prone, as it may be difficult to define a proper baseline for the results we get (especially if the test environment can change over time). It is also time consuming, and we developers hate (and prefer to automate) boring, repeatable tasks.

Automated performance testing

This is definitely something to consider. Please bear in mind that the possibilities available to us now are far better than just a few years ago. While CACurrentMediaTime() or dispatch_benchmark can still be useful, Xcode 6 brought us a much more powerful tool: XCTestCase’s ‘measureMetrics’ functionality.
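
For comparison, manual timing with CACurrentMediaTime() usually boils down to something like the sketch below (the method name, loop body and iteration count are placeholders):

#import <QuartzCore/QuartzCore.h> // CACurrentMediaTime()

// Minimal manual-timing sketch: average wall-clock time over a fixed number of iterations.
- (void)benchmarkSomeCode
{
    const NSUInteger iterations = 1000; // placeholder iteration count
    CFTimeInterval start = CACurrentMediaTime();
    for (NSUInteger i = 0; i < iterations; i++) {
        // ... code under test goes here ...
    }
    CFTimeInterval elapsed = CACurrentMediaTime() - start;
    NSLog(@"Average: %.6f s per iteration", elapsed / iterations);
}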

Using analytics

It is a good idea to track how our application actually performs for our customers. Are screens being shown quickly? How long does the user have to wait for the app to start up? Is the DB migration painfully slow? All of these questions can be answered by properly tracking and sending events to <insert your favorite tool’s name here> analytics.
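
As a rough illustration – the Analytics class, its trackEvent:value: method and the migrator object are hypothetical placeholders, not part of any particular analytics SDK – reporting the duration of a DB migration could look like this:

#import <QuartzCore/QuartzCore.h> // CACurrentMediaTime()

// Rough sketch: time a production operation and send the duration to your
// analytics tool of choice (all names below are illustrative placeholders).
- (void)migrateDatabaseAndReport
{
    CFTimeInterval start = CACurrentMediaTime();
    [self.migrator migrateDatabaseIfNeeded]; // the operation we care about in production
    CFTimeInterval duration = CACurrentMediaTime() - start;
    [Analytics trackEvent:@"db_migration_duration" value:@(duration)];
}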

Since both the first and the last point are pretty self-explanatory, let’s focus a little on the framework provided by Apple.

Benchmarks automated

On the Web, you can find various blog posts and forum answers explaining how to use the XCTestCase API introduced in Xcode 6. In case you haven’t stumbled upon it yet, here’s the ‘Measure Metrics 101’.

Performance tests in Xcode are just normal unit tests. Benchmarking comes down to one of two methods that you call inside a typical XCTestCase test method:

- (void)measureBlock:(void (^)(void))block;
- (void)measureMetrics:(NSArray <NSString *> *)metrics automaticallyStartMeasuring:(BOOL)automaticallyStartMeasuring forBlock:(void (^)(void))block;

Each of these methods takes a given block of code and executes it N times, with N determined by Xcode. For me, on both Xcode 6 and 7, this corresponds to 10 invocations. While the former method is really simple (there’s a short example of it after the listing below), it’s the latter that gives the most flexibility:

- (void)testSomeCode
{
    // Set up the test case here - this runs only once.

    [self measureMetrics:[[self class] defaultPerformanceMetrics] automaticallyStartMeasuring:NO forBlock:^{
        // Set up each *invocation* of the test here - this block is invoked once per test run.
        [self startMeasuring];
        // Code to test
        [self stopMeasuring];
        // Clean up before the next invocation.
    }];
}
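
For completeness, here is how the simpler measureBlock: variant looks – everything inside the block is measured, including any per-invocation setup (a minimal sketch):

- (void)testSomeCodeSimple
{
    [self measureBlock:^{
        // Code to test - the whole block is measured, setup and cleanup included.
    }];
}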

It’s pretty straightforward; if you have a project with proper unit tests set up, go ahead and try this!

Analyzing results

Results are presented both graphically in Xcode and printed to the console.

Performance test results
Test Case '-[Tests testExample]' measured [Time, seconds] average: 0.501, relative standard deviation: 0.034%, 
values: [0.501134, 0.500601, 0.501139, 0.501124, 0.501046, 0.501121, 0.501210, 0.501180, 0.501208, 0.501174], 
performanceMetricID:com.apple.XCTPerformanceMetric_WallClockTime, baselineName: "Local Baseline", baselineAverage: 0.600, 
maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.100, maxStandardDeviation: 0.100
Test Case '-[Tests testExample]' passed (5.315 seconds).

The most interesting parts are ‘average’, ‘values’ and ‘relative standard deviation’. The relative standard deviation is simply the standard deviation expressed as a percentage of the average, so the 0.034% above means the individual runs differ from the 0.501 s average by only about 0.00017 s. The baseline is the value used as a reference point; while working on improvements we should typically set it to the “last known average” so we can easily measure the results of our work.

Avoiding mistakes

The hard part about writing tests is ensuring that the tests themselves work properly. There are, however, some pitfalls typical of performance tests that we need to be aware of when doing this kind of testing.

Ensure that you compare only relevant data sets

Obviously, different hardware will return different results. Xcode helps us here by matching baseline values with the device type. Thanks to that, we can focus on our work without worrying about whether we compared our latest results against a valid data set or not.

Ensure that test input is valid

Performance depends on the input data set. Using random data (generated fresh for each test run) as an input is usually not a good idea. Remember that valid tests should return roughly the same average value every time you re-run them over the same, unchanged code base.
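
If you do want data that looks random, one option is to generate it deterministically with a fixed seed, so every run measures exactly the same work (a minimal sketch; the helper name, seed and size are illustrative):

#include <stdlib.h> // srand48() / drand48()

// Builds the same "random-looking" input on every run thanks to the fixed seed.
- (NSArray<NSNumber *> *)deterministicInput
{
    srand48(42); // fixed seed - identical sequence of values on every test run
    NSMutableArray<NSNumber *> *input = [NSMutableArray arrayWithCapacity:10000];
    for (NSUInteger i = 0; i < 10000; i++) {
        [input addObject:@(drand48())];
    }
    return [input copy];
}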

Ensure that consecutive invocations do not affect each other

Each invocation of the test should be independent of the others; if consecutive iterations show a high standard deviation, there is a good chance that the test itself is incorrect. If the deviation is particularly high, Xcode will inform us of this and the test will fail:

[Tests testSomething] : failed: Time standard deviation is 14% (max allowed: 10%).
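
One way to keep invocations independent is to recreate any shared state inside the block, before startMeasuring, and tear it down afterwards. Here is a sketch – CacheUnderTest and its methods are just illustrative names, not from any real API:

- (void)testLookupPerformance
{
    [self measureMetrics:[[self class] defaultPerformanceMetrics] automaticallyStartMeasuring:NO forBlock:^{
        CacheUnderTest *cache = [[CacheUnderTest alloc] initWithFixtureData]; // fresh state per invocation
        [self startMeasuring];
        [cache lookupAllEntries]; // code under test
        [self stopMeasuring];
        [cache removeAllEntries]; // cleanup so the next invocation starts from scratch
    }];
}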

Executing

Once you have a working set of tests, the next step is to use them as part of the development process. It is usually pretty easy to set up automation (for instance using OS X Server) so that tests are executed regularly. The only problem may be the amount of time required to run them: as your set of performance tests grows, you will notice that it takes longer and longer to run all of them. The solution is to separate functional unit tests and performance tests into independent schemes, which allows you to schedule them separately.

Final thoughts

Performance engineering is not a trivial task. It often requires in-depth knowledge specific to the domain of a particular problem. But no matter how good you are, there is no real performance engineering without proper performance testing. Understanding ‘where we are’ should always be the first step of any optimization work. With the (not so) recent Xcode additions it has become much easier to do so, and I encourage everyone to give it a try. After all, fine-tuning your code base can be exceptionally fun and rewarding.

 

Posted by Bartek Waresiak
