It’s the hot development platform, and anything that speeds the process of getting your iPhone app finished and tested is welcome news. So Ian Dees shows how to drive an iPhone GUI from a Ruby test script.

It’s been a busy year since the first official iPhone SDK arrived. In the land rush to the new platform, developers have found creative ways to test their apps. The SDK ships with debuggers and profilers, of course. But there’s also been a groundswell of home-brewed recipes, from cross-language unit tests to the beginnings of GUI automation.

In this article, we’ll sketch out one path to driving an iPhone GUI from a Ruby test script. If you’d like to try the examples out on your own Leopard machine, the full source code is available. You’ll need the iPhone SDK and simulator, of course, plus a few Ruby libraries:

 $ sudo gem install cucumber rspec tagz

When you have the luxury of a full desktop OS, you can often simulate user interactions like button presses with just a few API calls. With the iPhone, there are some extra steps. It’s worth taking a peek at these low-level tasks, before seeing how to assemble them into a whole test.

Simulating Events

The centerpiece of the iPhone user interface is the touch screen, so we’ll start our exploration there. The iPhone OS passes your app a mix of UIEvent and UITouch objects to represent a tap on the screen, but doesn’t offer direct access to those objects’ details. To simulate a touch event, you need to set up a bunch of private, undocumented fields in just the right way.

Matt Gallagher, a software developer and prolific Cocoa blogger, has blazed this trail for us. He’s added handy setup methods to the relevant Objective-C classes using a category.[1] This took a lot of careful observation and experimentation on his part, so all your automation project needs to do is call one method on his ScriptRunner object:

 [self performTouchInView:someUIObject];

The other half of the equation is interrogating the user interface to find out what happened after that screen tap. Again, Objective-C categories come in handy here. Matt’s fullDescription method, added to every UIView object, returns an XML string representing a user interface element and all its children. (See this article) Here’s what the description of a simple button might look like:

 <UIButton>
  <address>17252816</address>
  <tag>0</tag>
  <currentTitle>Home</currentTitle>
  <frame>
    <x>0.000000</x>
    <y>0.000000</y>
    <width>65.000000</width>
    <height>31.000000</height>
  </frame>
  <subviews>
    <!-- elements inside this button... -->
  </subviews>
 </UIButton>

Matt’s original SelfTesting iPhone app included these two methods (performTouchInView: and fullDescription) and a few convenience methods. Felipe Barreto’s Bromine project takes these, adds a few more, and wraps it all up in a package designed to be dropped easily into an existing iPhone project.

Driving the App from a Script

How does Bromine know what to do and when? At app launch time, the ScriptRunner object runs a series of instructions from a property list file (parsing this format is a simple library call). These instructions use XPath to describe a GUI object’s location in the hierarchy of user interface elements.

Here’s a simple TestScript.plist that searches the current screen for a UIButton with the title Home, and clicks on it.

 <plist version="1.0">
  <dict>
    <key>command</key>
    <string>simulateTouch</string>
    <key>viewXPath</key>
    <string>//UIButton[currentTitle="Home"]</string>
  </dict>
 </plist>
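To see what that XPath expression selects, here is a minimal sketch that runs it against a description like the one fullDescription returns, using Ruby's REXML library. The surrounding UIView, the second button, and its address are made up for illustration, and find_view is a hypothetical helper name; Bromine does its own lookup inside the app.

```ruby
require 'rexml/document'

# A trimmed description in the style of fullDescription. The UIView wrapper
# and the second button (including its address) are invented for this sketch.
DESCRIPTION = <<~XML
  <UIView>
    <subviews>
      <UIButton>
        <address>17252816</address>
        <currentTitle>Home</currentTitle>
        <frame>
          <x>0.000000</x>
          <y>0.000000</y>
          <width>65.000000</width>
          <height>31.000000</height>
        </frame>
      </UIButton>
      <UIButton>
        <address>17252900</address>
        <currentTitle>Back</currentTitle>
      </UIButton>
    </subviews>
  </UIView>
XML

# Hypothetical helper: return the first element matching an XPath, or nil.
def find_view(xml, xpath)
  REXML::XPath.first(REXML::Document.new(xml), xpath)
end

# The same expression the plist uses: any UIButton whose currentTitle is "Home".
home = find_view(DESCRIPTION, '//UIButton[currentTitle="Home"]')
puts "address=#{home.elements['address'].text}"
puts "width=#{home.elements['frame/width'].text.to_f}"
```

Trying expressions this way against a saved description is a cheap way to debug a viewXPath before sending it to the app.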

It’s easy enough to get going with a simple plist file and a few test steps like this one. But let’s consider a twist. We can teach Bromine to listen for instructions while the program is running, rather than reading a static file once.

The sample code for this article includes a modified version of Bromine that accepts test steps handed to it by an embedded web server.[2] This approach frees us from having to stop and restart the app while we’re writing tests. It also lets us drive the app from the command line, using tools like curl:

 $ curl -d @TestScript.plist http://localhost:50000/

Now the doors are open to connect your app to any test harness you like.

Writing Tests in Cucumber

Speaking of test harnesses, it’s worth taking a second to introduce Cucumber, the framework that encourages writing tests in plainspoken language. Imagine we’re building a blogging app. Here’s how we might express a simple task in Cucumber:

 Feature: blog posting
 
  As a blogger
  I want to post from my iPhone
  So that I don't need to drag my laptop everywhere
 
  Scenario: short posts
 
    Given the blog "example" with user "me" and password "secret"
 
    When I add a post entitled "First post!"
    And I add a post entitled "Shark jump"
 
    Then the blog should have the following posts:
      | title       |
      | Shark jump  |
      | First post! |

It’s plain English, but it’s also runnable code. So you can use scenarios like these to describe how an application will behave, and then later use them as the automated portion of your acceptance tests. Of course, a real project would need lots more test cases. But this one will keep us plenty busy for now.

Trying it Out for Real

Cucumber can serve as a design tool, helping coders and customers agree on features before their product even exists. But for the sake of trying the techniques in this article, it turns out there’s already a real blogging app we can bounce our script off of: the open-source WordPress iPhone client. It’s easy enough to take the stock source code, drop in Bromine to drive the GUI, and add a web server to listen for instructions. You’ll find all the pieces assembled for you in the source code for this article.

So how do we drive the WordPress app from the top-level script? Cucumber uses regular expressions to match each step of the plain-language test to a chunk of Ruby code implementing that step. Here’s what the first of those steps looks like; it clears out the list of blogs, and then adds a new one. Keep in mind, the Blog class doesn’t exist yet; we’ll get to that in a second.

 Given /^the blog "(.*)" with user "(.*)" and password "(.*)"$/ do |blog, user, password|
   Blog.empty!
 
   Blog.add \
     :host => "#{blog}.wordpress.com",
     :user => user,
     :pass => password
 end
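The matching idea itself is easy to try in plain Ruby. The sketch below is a toy dispatcher, not Cucumber's implementation: define_step, run_step, and STEPS are hypothetical names, and the block simply returns the settings it would pass to Blog.add.

```ruby
# Toy registry of step patterns and the blocks that implement them.
STEPS = {}

# Register a block under a regular expression, much as Given/When/Then do.
def define_step(pattern, &block)
  STEPS[pattern] = block
end

# Find the first pattern matching the step text and call its block with
# the regex captures as arguments.
def run_step(text)
  STEPS.each do |pattern, block|
    match = pattern.match(text)
    return block.call(*match.captures) if match
  end
  raise ArgumentError, "no step matches: #{text}"
end

define_step(/^the blog "(.*)" with user "(.*)" and password "(.*)"$/) do |blog, user, password|
  { :host => "#{blog}.wordpress.com", :user => user, :pass => password }
end

p run_step('the blog "example" with user "me" and password "secret"')
```

Each quoted value in the plain-language step lands in one capture group, which is how the words in the scenario become Ruby arguments.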

The Blog object is going to construct pieces of XML and use HTTP POST requests to send them to the iPhone. We’ll use the Tagz library to write Ruby code that mirrors the structure of the XML. Here’s how we’d push the Home button from Ruby.

 xml = Tagz.tagz do
   plist_(:version => 1.0) do
     dict_ do
       key_ 'command'
       string_ 'simulateTouch'
       key_ 'viewXPath'
       string_ '//UIButton[currentTitle="Home"]'
     end
   end
 end
 
 # Like Ruby's built-in post_form, but with text
 # instead of form fields.
 Net::HTTP.post_quick 'http://localhost:50000/', xml
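Ruby's standard Net::HTTP has no post_quick method, so a helper along those lines has to be written by hand. Here is one guess at how such a helper might look, written as standalone functions with hypothetical names (build_text_post, post_text) and an assumed text/xml content type; the version in the article's source code may differ.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build a POST request whose body is raw text.
# The Content-Type is an assumption; the embedded server may not care.
def build_text_post(url, text)
  uri = URI.parse(url)
  request = Net::HTTP::Post.new(uri.request_uri)
  request.body = text
  request['Content-Type'] = 'text/xml'
  [uri, request]
end

# Hypothetical helper: send the text and return the server's response.
def post_text(url, text)
  uri, request = build_text_post(url, text)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end

# With the simulator running the modified app, a call might look like:
#   post_text 'http://localhost:50000/', xml
```

Splitting the request construction from the network call keeps the first half checkable without a running simulator.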

Our Blog class members won’t be building XML and sending HTTP directly. Instead, they’ll lean on wrappers that do the grunt work of pushing a button or filling in a text field:

 BlogSettings = '/./descendant::UIButton[1]'
 RemoveBlog = '/./descendant::UIRoundedRectButton'
 ConfirmRemove = '//UIThreePartButton[title="Remove"]'
 
 def Blog.empty!
   count.times do
     press BlogSettings,
           RemoveBlog,
           ConfirmRemove
   end
 end
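The article doesn't show press itself, so the following is only a sketch of what such a wrapper could do: build one simulateTouch command per XPath and hand each to something that delivers it. The touch_command name and the sender: parameter are assumptions made so the example runs on its own, and the plist is built with REXML rather than Tagz to stay within the standard distribution.

```ruby
require 'rexml/document'

# Hypothetical: build the simulateTouch plist for a single XPath.
def touch_command(xpath)
  doc  = REXML::Document.new
  dict = doc.add_element('plist', { 'version' => '1.0' }).add_element('dict')
  { 'command' => 'simulateTouch', 'viewXPath' => xpath }.each do |key, value|
    dict.add_element('key').text = key
    dict.add_element('string').text = value
  end
  doc.to_s
end

# Hypothetical: tap each view in order. The sender is any callable that
# delivers one XML command, such as an HTTP POST to the embedded server.
def press(*xpaths, sender:)
  xpaths.each { |xpath| sender.call(touch_command(xpath)) }
end

# Collect the commands in an array instead of sending them anywhere.
sent = []
press '/./descendant::UIButton[1]',
      '//UIThreePartButton[title="Remove"]',
      sender: ->(xml) { sent << xml }
puts sent
```

Passing the sender in makes it easy to swap a real HTTP call for a recorder while developing the test steps.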

The rest of Blog’s methods are similar combinations of screen taps and text entry. To keep the pace quick, let’s gloss over those and move on to the final test step:

 Then the blog should have the following posts:
   | title       |
   | Shark jump  |
   | First post! |

How does Cucumber handle that ASCII-art table of blog posts we’re expecting to see? Here’s the accompanying step definition in Ruby:

 Then /^the blog should have the following posts:$/ do |posts_table|
   Blog.first.posts.should == posts_table.hashes
 end

For this style of step definition, Cucumber passes the entire table in the posts_table variable. Calling hashes on it yields an array with one hash per data row. As with a spreadsheet program, the first row is a guide to the remaining rows: its cells become the keys. Here’s what the resulting Ruby array looks like for this example:

 [{'title' => 'Shark jump'},
  {'title' => 'First post!'}]

So the Blog#posts method just needs to make sure it returns all the blog posts in the same format, so that the should == expectation can compare them and report a passed or failed result.
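To make the header-row convention concrete, here is a small plain-Ruby sketch of the conversion. It illustrates the idea only and is not Cucumber's code; rows_to_hashes and RAW_TABLE are hypothetical names.

```ruby
# The table as raw rows: the first row holds the column names.
RAW_TABLE = [['title'], ['Shark jump'], ['First post!']]

# Pair each data row's cells with the header cells to build one hash per row.
def rows_to_hashes(rows)
  header, *body = rows
  body.map { |row| header.zip(row).to_h }
end

p rows_to_hashes(RAW_TABLE)
```

A Blog#posts implementation that returns hashes keyed by the same column names can then be compared to the table directly.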

Looking Ahead

Where do we go from here? There are tons of possibilities.

You’ve probably noticed that the examples here have had a lot of typical real-world robustness measures left out for the sake of brevity. Adding a few checks to make sure we’re on the right screen when the test begins, and switching from fixed delays to smarter waits, are logical next steps.
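As one possible shape for a smarter wait, the sketch below polls a condition until it holds or a deadline passes. The wait_until name, its defaults, and the commented usage are assumptions for illustration rather than part of the article's source.

```ruby
# Hypothetical helper: call the block repeatedly until it returns a truthy
# value, then return that value. Raise if the timeout elapses first.
def wait_until(timeout: 5.0, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Imagined usage, where on_screen? is a made-up check against the
# current UI description:
#   wait_until(timeout: 10) { on_screen?('//UIButton[currentTitle="Home"]') }
```

Compared with a fixed sleep, this returns as soon as the app is ready and fails with a clear message when it never gets there.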

Another promising direction would be to increase the vocabulary of events we can simulate. There’s a world of input methods out there for the taking, from some of the more automation-resistant text controls to things like the accelerometer.

Finally, it would be nice to expand this demo beyond the simulator and into real hardware. Imagine your build server doing an automated smoke test on a live iPhone after every source code check-in.

I hope that what you’ve seen so far has whetted your appetite to explore further. Please download the source code for this article, try the examples, and maybe even add a test scenario or two. Drop in on the discussion forums if you have any questions or comments. Happy coding!

By day, Ian Dees slings code, tests, and puns at a Portland-area test equipment manufacturer. By night, he dons a cape and keeps watch as Sidekick Man, protecting the city from closet monsters. Ian is the author of Scripted GUI Testing With Ruby, published by the Pragmatic Programmers.

Footnotes

[1]

Objective-C’s civilized take on a monkey patch—a modification to an existing class.

[2]

This project uses cocoahttpserver.