
Automating Screencasts

There’s no longer any reason for screencasts to be hard to create or boring to watch.

by Jason Huggins

  Are you ready for your closeup?  

One simple question underlies all software testing: “Does the app work?” Answer that question, and you’re done. Simple, right?

Five years ago, I created Selenium as an answer to that “simple” question. Selenium is a toolkit for testing the functionality of web applications. I use it to write scripts that click and type their way through an online process just as a user would. Selenium lets me watch all the tests run in real time, in a real web browser, giving me the confidence that everything truly is working as a user would expect it to. Automating all these tests lets me push new code to production with reasonable assurance that I won’t be embarrassed by breakages in features that I thought were working yesterday.

Scaling Up

For the past three years, I’ve worked on the scaling problems that appear when a project has many Selenium tests to run. The common complaint is that a large test suite is too slow, although the definition of “too slow” varies widely. Some cry foul after two minutes, others after two hours. The pragmatic solution is to throw more hardware at the problem. If you have 60 tests that each take about a minute, instead of running them all in sequence on one machine, find 60 machines and run them all in parallel, one test per machine. With this new parallelism, you get the results in one minute instead of one hour, a huge boost in productivity.
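
As a rough sketch of that arithmetic, the Python snippet below runs a batch of stand-in tests first in sequence and then in parallel, using the standard library's concurrent.futures module. The fake_test function and its short sleep are placeholders made up for illustration, and a thread pool on one machine stands in for the 60 separate machines.

```python
import time
from concurrent.futures import ThreadPoolExecutor

TEST_COUNT = 60
TEST_SECONDS = 0.01  # stand-in for "about a minute per test"


def fake_test(test_id):
    """Hypothetical test: sleeps to simulate work, then reports a pass."""
    time.sleep(TEST_SECONDS)
    return (test_id, True)


def run_in_sequence(test_ids):
    """Run every test one after another, as on a single machine."""
    return [fake_test(test_id) for test_id in test_ids]


def run_in_parallel(test_ids, workers):
    """Run tests concurrently; each worker plays the role of one machine."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_test, test_ids))


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    started = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - started


if __name__ == "__main__":
    ids = range(TEST_COUNT)
    _, serial = timed(run_in_sequence, ids)
    results, parallel = timed(run_in_parallel, ids, TEST_COUNT)
    passed = sum(1 for _, ok in results if ok)
    print("%d/%d tests passed" % (passed, len(results)))
    print("sequence: %.2fs, parallel: %.2fs" % (serial, parallel))
```

The exact timings will vary from run to run, but with one worker per test the parallel run should finish in roughly the time of a single test.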

Well, maybe. But parallelism comes with its own problem: finding and maintaining those 60 machines is not easy. This is a problem that cloud computing services, like Amazon EC2, elegantly solve. As a test engineer at Google, I worked on a similar solution, the “Selenium Farm,” which gave the Google Apps and Gmail teams access to a large number of machines so they could run their tests as fast as possible. And I’m now applying those same techniques in my own startup, Sauce Labs, with the goal of freeing all developers from having to scale or maintain their own cloud of test machines.

So let’s see, you’ve got that confidence-boosting assurance that comes with automated functional testing, and you are getting your test results fast. You now have no more software testing problems to solve, right?

Well, there is a problem in software testing that not enough people are consciously aware of. Software tests, especially the log files and reports they generate, are boring. How lame is it to run a suite of 60 tests that ultimately creates a plain-text log report saying “60/60 tests passed. OK.” Yawn. Instead of that, do something cool with those tests, and get people excited about your software.

The Cure for Boredom

Screencasting is one such technique for getting people excited about software. Screencasting was popularized by David Heinemeier Hansson in 2004, when he created a screencast [1] for his new web framework, Ruby on Rails. Scads of developers have watched this video, where he creates a complete database-backed blog in less than 15 minutes. There were many innovations revealed in Rails at the time, but using a screencast to market software was something no one had done so effectively before. In the wake of Rails’ success, screencasting has gone from a novel nice-to-have to a de rigueur software practice.

Screencasting as a software practice is following an adoption path similar to that for test automation. People are creating screencasts to show off their apps today, but screencasting is still mostly a manual activity done with expensive tools by a few people late in the development cycle. Just as test automation became more common as the tools got more capable, less expensive, and easier to use, I expect the same thing to happen with screencasting.

I hope to see software testing move toward the way Apple markets the iPhone. An iPhone commercial is a short, 30-second movie showing off the phone just doing its thing. [2] Open an app, type, click, do something, then exit. That’s it. And that’s all test automation is, too. But as developers, we don’t dream of transforming our tests into miniature Hollywood screenplays for producing snazzy 30-second commercials showing off our cool applications.

I think we should.

Automating Screencasts

If you were to combine user interface test automation tools (like Selenium) with screencasting software, you could automate your screencast videos. Each time your code changed, so could the video demos you create to show off each feature. As each screencast is combined with a test framework, screencasting can become an everyday activity, available to the entire development team.

Let’s take a look at how they do it in Hollywood.

On a typical Hollywood film set, the day’s raw film footage, aptly called “dailies,” is gathered up for viewing. Dailies primarily function as an aid to directors, so they can make sure they got the right shot. But they also have other benefits, as Wikipedia explains:

“Dailies are also often viewed separately by producers or movie studio executives who are not directly involved in day-to-day production but seek assurance that the film being produced meets the expectations they had when they invested in the project.” [3]

Similarly, with automated video clips showing off all the features in your app, your software project’s sponsors (e.g. your boss, your investors, and your customers), like those movie studio executives, will be happy to see your progress even when they’re not “on the set.”

Great Minds Think Alike

I’m not the only one to think about the implications of combining automation and screencasting. Jon Udell, who coined the word “screencast,” [4] wrote in a 2007 blog post titled “From screencasting to automation” that he sees a huge benefit in having a screenplay script to go along with screencasts:

“Today, I can share software-related task knowledge in a social manner by creating and posting screencasts. But you can only watch a screencast. If I could instead share that task knowledge in the form of standardized high-level scripts, you wouldn’t need to watch the screencast. Of course, you might want to, for other reasons, but not simply to get the procedural knowledge transferred from my brain and fingers to yours.

“Given how popular screencasts have become in three years, I’ve got a hunch that taking things to that next level would be huge.” [5]

I’ve found several independent implementations of the automated screencast idea. In 2006, Manfred Stienstra, a web developer at Fingertips in the Netherlands, used his “Screenager” [6] to show off some features in Rails’ ActiveSupport::Multibyte library in a simulated interactive Ruby session. The most complete implementation of the idea was created in 2008 by Joseph Pearson at Inventive Labs in Australia. Called “Castanaut” [7], it combined Ruby scripting, the screencasting software iShowU, and Mac OS X’s built-in Text-to-Speech engine to add voice-over narration. A sample movie [8] and the screenplay [9] used to generate the movie are available online.

From my own research, I concluded that an easy-to-use, cross-platform, open source screen recording library was one of the key missing pieces for easy screencast automation. So earlier this year, I released “Castro” [10], a small fork of another library called pyvnc2swf [11] by Yusuke Shinyama at New York University. Pyvnc2swf acts as a remote desktop client, and works by connecting to a running VNC server. Instead of showing the remote desktop on your screen, pyvnc2swf records the screen to a Flash SWF file or the more popular FLV Flash video format. The specific improvement Castro brings to pyvnc2swf is the ability to start and stop recording programmatically via a simple Python API. Castro also includes cleanup routines needed for posting videos online on your own site in Flash video players like FlowPlayer or JW Player. In the future, I’d like to add more APIs for mixing in music, narration using text-to-speech engines, subtitles, and other demo-specific features.

Here’s How It Works

Armed with Castro, you have everything you need to create your own automated screencast. It can be as simple as:

  1. Install and launch a vncserver. (Hint: Google it.)

  2. $ [sudo] easy_install castro

  3. Write a simple Python script:

 from castro import Castro
 c = Castro()
 c.start()  # begin recording the screen
 # Do something awesome!
 c.stop()   # finish recording

Of course, you would replace “# Do something awesome!” with your own code, which could be a Selenium test you want to record.
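
One way to slot a test into that placeholder is to wrap it so the recording is stopped even when the test fails. The sketch below assumes only that the recorder object exposes start() and stop() methods. The record helper, the FakeRecorder stand-in, and sample_test are hypothetical names made up for illustration, so the example runs without a VNC server or Castro installed.

```python
class FakeRecorder(object):
    """Hypothetical stand-in for a screen recorder; logs calls instead of recording."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


def record(recorder, test):
    """Run test() between recorder.start() and recorder.stop().

    stop() is called even if the test raises, so a failing run
    does not leave the recorder running.
    """
    recorder.start()
    try:
        return test()
    finally:
        recorder.stop()


def sample_test():
    """Hypothetical test body; a Selenium script would go here."""
    return "passed"


if __name__ == "__main__":
    recorder = FakeRecorder()
    print(record(recorder, sample_test))
    print(recorder.events)
```

Swapping FakeRecorder() for a real recorder instance should work the same way, provided that object really does offer those two methods.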

Take a Look

For this article, I wrote a test script that shows how you can convert a simple Google search into an automated screencast. The test script is written using the Selenium 2.0 API. In Selenium 2.0, the WebDriver project is merging with the Selenium project.

The video can be viewed online, as can the screenplay that generated the video. I still had to manually upload the video to Vimeo. Perhaps you can help me improve Castro to automatically upload to video services like YouTube and Vimeo?

At Sauce Labs, we use Castro to record a video of every Selenium test run in our cloud. (And we’ve now recorded hundreds of thousands of tests in production.) We’re just at the beginning of what we can do in this space, but already, customers have embedded these video results into their continuous integration systems. When the tests fail, the videos serve as debugging aids. When the tests pass, the videos give visual confirmation the features work as designed.

With tools like Selenium and Castro, plus a little imagination, you now have the power to transform “plain old tests” into something far more exciting and even more useful. And from now on, remember, you’re not a “tester” anymore, you’re a “screenwriter.” If screencast automation catches on, they’ll need to create a new Emmy Award for test automation.


Jason Huggins co-founded Sauce Labs and currently leads product direction. Prior to Sauce Labs, Jason was a Test Engineer at Google where he supported the grid-scale “Selenium Farm” for testing Google applications such as Gmail and Google Docs. Jason’s experience also includes time at ThoughtWorks in Chicago as a software developer. While at ThoughtWorks, Jason created the Selenium testing framework out of the need to cross-browser test a new in-house time and expense system. When not programming in Python or JavaScript, Jason enjoys hacking on Arduino-based electronics projects. Jason has spent time in New York City, LA, and the Bay Area, but Chicago is his kind of town.