
Xcode’s Secret Performance Tests

I was inspired today, by a question from another developer, to dig into Xcode’s performance testing. This developer had observed that XCTestCase exposes a method, defaultPerformanceMetrics, whose documentation strongly suggests it can be used to add additional performance metrics:

This method returns XCTPerformanceMetric_WallClockTime by default. Subclasses of XCTestCase can override this method to change the behavior of measureBlock:.

If you’re not already familiar, the basic approach to using Xcode’s performance testing infrastructure is that you add unit tests to your project that wrap code with instructions to measure performance. From the default unit test template:

func testPerformanceExample() {
	// This is an example of a performance test case.
	self.measure {
		// Put the code you want to measure the time of here.
	}
}
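To make that concrete, a real performance test just wraps whatever code you actually care about. Here’s a hypothetical example that measures sorting a large array; the workload is a made-up stand-in for your app’s critical code:

func testSortingPerformance() {
	// A purposely expensive workload, standing in for real app code.
	let unsorted = Array((1...10_000).reversed())
	self.measure {
		let _ = unsorted.sorted()
	}
}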

Depending on the application under test, one can imagine all manner of interesting things that might be useful to tabulate during the course of a critical length of code. As mentioned in the documentation, “Wall Clock Time” is the default performance metric. But what else can be measured?

Nothing.

At least, according to any header files, documentation, WWDC presentations, or blunt Googling that I have encountered. There is exactly one publicly documented Xcode performance testing metric, and it’s XCTPerformanceMetric_WallClockTime.

I was curious whether supporting additional, custom performance metrics might be possible but under-documented. To test this theory, I added “beansCounted” to the list of performance metrics returned from my XCTestCase subclass. For some reason I couldn’t get Swift to accept the XCTPerformanceMetric pseudo-type, but it allowed me to override the method as returning an array of String:

override static func defaultPerformanceMetrics() -> [String] {
	return ["beansCounted"]
}

When I build and test, this fails with a runtime exception: “Unknown metric: beansCounted”. The location of an exception like this is a great clue about where to go hunting for information about whether an unknown metric can be made into a known one! If there’s a trick to implementing support for my custom “beansCounted” metric, the answer lies in XCTestCase’s “measureMetrics(_:automaticallyStartMeasuring:forBlock:)” method, which is where the exception was thrown.

By setting a breakpoint on this method and stepping through the assembly in Xcode, I can watch as the logic unfolds. To simplify what happens: first, a list of allowable metrics is computed, and then the list of desired metrics is iterated. If any metric is not in the list? Bzzt! Throw an exception.
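If you’d like to follow along, a symbolic breakpoint will land you in the same spot. From lldb’s console, something like this should do it, using the Objective-C selector for the method named above:

(lldb) breakpoint set -n "-[XCTestCase measureMetrics:automaticallyStartMeasuring:forBlock:]"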

I determined that things are relatively hardcoded, such that it’s not trivial to add support for a new metric. I was hoping I could implement some magic methods in my test case, like “startMeasuring_beansCounted” and “stopMeasuring_beansCounted”, but that doesn’t appear to be the case. The performance metrics are supported internally by a private Apple class called XCTPerformanceMetric, and the list of allowable metrics is derived from a few metrics hardcoded in the “measureMetrics…” method:

  • “com.apple.XCTPerformanceMetric_WallClockTime”
  • “com.apple.XCTPerformanceMetric_UserTime”
  • “com.apple.XCTPerformanceMetric_RunTime”
  • “com.apple.XCTPerformanceMetric_SystemTime”

As well as a bunch of others exposed by a private “knownMemoryMetrics” method:

  • “com.apple.XCTPerformanceMetric_TransientVMAllocationsKilobytes”
  • “com.apple.XCTPerformanceMetric_TemporaryHeapAllocationsKilobytes”
  • “com.apple.XCTPerformanceMetric_HighWaterMarkForVMAllocations”
  • “com.apple.XCTPerformanceMetric_TotalHeapAllocationsKilobytes”
  • “com.apple.XCTPerformanceMetric_PersistentVMAllocations”
  • “com.apple.XCTPerformanceMetric_PersistentHeapAllocations”
  • “com.apple.XCTPerformanceMetric_TransientHeapAllocationsKilobytes”
  • “com.apple.XCTPerformanceMetric_PersistentHeapAllocationsNodes”
  • “com.apple.XCTPerformanceMetric_HighWaterMarkForHeapAllocations”
  • “com.apple.XCTPerformanceMetric_TransientHeapAllocationsNodes”

How interesting! There are a lot more metrics defined than the single “wall clock time” exposed by Apple. So, should we use them? Official answer: no way! This is private, unsupported stuff, and can’t be relied upon. Punkass Daniel Jalkut answer? Why not? They’re your tests, and you’re the only one who will get hurt if they suddenly stop working. In my opinion, taking advantage of private, undocumented system behavior for private, internal gain is much different from shipping public software that relies upon such undocumented behaviors.

I modified my unit test subclass to return a custom array of metrics based on the discoveries above, just to test a few:

override static func defaultPerformanceMetrics() -> [String] {
	return [XCTPerformanceMetric_WallClockTime,
	        "com.apple.XCTPerformanceMetric_TransientHeapAllocationsKilobytes",
	        "com.apple.XCTPerformanceMetric_PersistentVMAllocations",
	        "com.apple.XCTPerformanceMetric_UserTime"]
}

The tests build and run with no exception. That’s a good sign! But these “secret performance metrics” are only useful if they can be observed and tracked the way the wall clock time can be. How does Xcode hold up? I made my demonstration test purposefully impactful on some metrics:

func testPerformanceExample() {
	self.measure {
		for _ in 1..<100 {
			print("wasting time")
		}
		let _ = malloc(3000)
	}
}

Now when I build and test, look what shows up in the Test navigator’s editor pane:

[Screenshot: performance metrics in the Test navigator’s editor pane, after reducing the size of allocations and length of run.]

Look at all those extra columns! And if I click the “Set Baselines…” button, then tweak my function to make it substantially less performant:

func testPerformanceExample() {
	self.measure {
		for _ in 1..<10000 {
			print("wasting time")
		}
		let _ = malloc(300000)
	}
}

Now the columns have noticeably larger numbers:

[Screenshot: Xcode’s test results, with noticeably larger numbers in the metric columns.]

But more importantly, the test fails:

[Screenshot: test errors generated by failing to meet performance baselines.]

I already mentioned that by any official standard, you should not take advantage of these secret metrics. They are clearly not supported by Apple, may be inaccurate or have bugs, and could outright stop working at any time. I also said that, in my humble opinion, you should feel free to use them if you can take advantage of them. The fact that they are supported so well in Xcode probably implies that groups internal to Apple are using them and benefiting from them. Your mileage may vary.

The only rule is this: if Apple does do anything to change their behavior, or you otherwise ruin your day by deciding to play with them, you shouldn’t blame Apple, and you can’t blame me!

Enjoy.

Unbuffering Jenkins Output

For years I have used Jenkins to manage all my automated Mac and iOS builds. I have jobs set up to build and release apps “on demand,” as well as integration builds that kick off whenever a change is made to a pertinent source code repository.

For years I have also put up with a small but vexing nuisance that when something goes wrong with a build, I have to search in the most unlikely place for pertinent informational messages printed by my build scripts: at the very bottom of the log.

I had chalked it up for so long to being an obscure Jenkins bug that I never took the time to get down to basics and figure out what was really going on. Today, I did that, and also came up with an easy solution.

I use a custom script to drive the whole process for each of my automated builds. This script is responsible for printing diagnostic information and for invoking other tools (such as xcodebuild) to get the actual work of building, testing, and uploading done.

The problem with the misplaced log lines in my console output has to do with a feature of the scripting language I use to drive the process: standard output buffering. I use Python, but the problem could also affect you if you use a language such as Ruby to drive your automation.

Here’s a contrived example of a build script that will exhibit the problem. If you create a new, empty Jenkins build job, and configure it with a single “Execute Shell” build task:

#!/usr/bin/python

import os

# Python buffers the output of these print statements when stdout
# is not a terminal, as is the case when running under Jenkins.
print "start"

# The subprocess writes directly to the log, bypassing Python's buffer.
os.system("ls -ld /Applications/TextEdit.app")

print "end"

You will find the generated console output upon “building” this job is demonstrably out of order:

[workspace] $ /usr/bin/python /var/folders/wh/f_vmqvxx34d69c1pm191t0tr0000gq/T/hudson321577740740596707.sh
drwxr-xr-x@ 3 root  wheel  102 Dec 13 14:00 /Applications/TextEdit.app
start
end

The message from the “ls” subprocess prints right away, but Python buffers its own output until the script is done running, leading to the nonsensical printing of the “start” and “end” messages after the job is completely done.

Now I will make a tiny change, passing the “-u” parameter to the Python invocation in the script. This option is documented to “Force stdin, stdout and stderr to be totally unbuffered”:

#!/usr/bin/python -u

Running the job again produces the expected output in Jenkins:

[workspace] $ /usr/bin/python -u /var/folders/wh/f_vmqvxx34d69c1pm191t0tr0000gq/T/hudson3255283214356026397.sh
start
drwxr-xr-x@ 3 root  wheel  102 Dec 13 14:00 /Applications/TextEdit.app
end

This is a small annoyance in this example, but in a real-world scenario it means that the pertinent xcodebuild invocation, which I print to the console before calling, will now show up where it belongs in the console log, making it much easier for me to reason about the behavior of my automated builds when they aren’t working as they should.
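Incidentally, the pitfall isn’t unique to Python. As a sketch: if you drove your builds with a Swift script instead, print also writes to a buffered stdout, so a Jenkins log would show the same misordering. Disabling stdio buffering up front is the equivalent fix; the ls invocation here just mirrors the Python example:

#!/usr/bin/swift

import Foundation

// Disable stdio buffering on stdout so print() output interleaves
// correctly with output written directly by subprocesses.
setbuf(stdout, nil)

print("start")

// Run the same subprocess as in the Python example above.
let task = Process()
task.launchPath = "/bin/ls"
task.arguments = ["-ld", "/Applications/TextEdit.app"]
task.launch()
task.waitUntilExit()

print("end")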

Test With Swift

I have recently passed a sort of tipping point where I’m indulging more and more in Swift for new code that I add to my projects. There are some instances where I will still create a new class in Objective-C, primarily where I anticipate the need for dynamic runtime hijinx that might be more complicated in Swift. In general though, I’m opting for Swift. Finally.

There are many reasons to remain gun-shy about Swift, and I don’t fault anybody too much for choosing to continue forestalling the transition. I’ve spoken with many people who are as tentative as I was, or more so. Some of our collective reasons for waiting may sound familiar to you:

Swift …

  • … is not mature.
  • … requires adding bloated libraries to the app.
  • … presents an impedance mismatch with existing Cocoa design patterns.
  • … is still too risky for production code.

I don’t agree with all of these rationales, especially now that I’ve decided to dive in myself. However, they make a good basis for the argument I’d like you to consider: you should write all new unit tests in Swift.

For many of us who spent years developing a vast collection of Objective-C based classes, it does seem daunting to transition to a new language. But unit tests are different from “regular code” in a number of ways that make them a suitable place to start delving into Swift (see the sketch after this list):

Unit tests …

  • Don’t ship to customers.
  • Can be as bloated as you like.
  • Test the exposed interfaces of classes more than the internal design.
  • Are not technically production code.
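To illustrate, here’s a minimal sketch of what a first Swift-based test might look like. BeanCounter is a hypothetical Objective-C class, and MyApp a hypothetical app module; substitute your own:

import XCTest
@testable import MyApp	// hypothetical module exposing your Objective-C classes

class BeanCounterTests: XCTestCase {
	func testCountingBeans() {
		// BeanCounter is a hypothetical Objective-C class. The test only
		// exercises its public interface, which bridges naturally to Swift.
		let counter = BeanCounter()
		counter.addBeans(3)
		XCTAssertEqual(counter.beanCount, 3)
	}
}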

I’m sure somebody will argue that tests are so vital to the development process that they are the last place one should invest in risky technology. I guess what I’m urging you to believe is that Swift is no longer risky technology. It’s no longer coming; it’s here. We will serve ourselves well to adopt it as quickly as practical. And those of us who are daunted by the challenge of incorporating it into our existing, Objective-C-heavy source bases have a perfect opportunity in unit testing to get our feet wet while establishing a Swift source base that will live on well into the future. After all, your unit tests should, in theory, outlive any specific implementations of your shipping code.