Looking in the rear-view mirror to see how fast you're moving.
So how long is this going to take? This is one of the most common questions to ask, or be asked, in a traditional software development environment, and one of the most difficult to answer. The agile methodologies, and Scrum in particular, address this problem through the notion of velocity. Whatever your team accomplished in the last time period (sprint) is a good predictor of what they'll be able to accomplish in the next one. Scrum's use of burndown charts, standard definitions of "done," and retrospectives allows for the discovery of a team's average velocity. One can argue about details like hours versus story points, but almost any implementation of Scrum allows for at least tracking how much work was done per unit time.
If you've read The Black Swan or The Age of the Unthinkable you may be troubled by our assertion that the past can be used as a predictor of the future. Both books strongly make the case that this is not in fact true. One of them (I forget just now which) cites the statistic that something like 80% of the total volatility of the US stock market can be accounted for by five individual trading days. They also make the case that grand Five Year Plans are almost always doomed to fail due to unforeseen developments. While all of that is true, and I am a fan of both books, it actually strengthens the case for Scrum's use of historical velocity. Scrum measures velocity sprint by sprint, and each measurement is a data point that can be used to correct assumptions that may be fading in relevance. Scrum assumes the existence of change and embraces it rather than trying to defend against it. And yet even Scrum acknowledges that most of the time today will be very similar to yesterday, so we can use that data to make some assumptions about tomorrow.
Sadly, many projects are still using "traditional" methods (which seems to be one of the new names for waterfall). One drawback of such an approach is that the time scales are typically large compared to Scrum, which means there are fewer time periods to measure and thus fewer opportunities for correction. There is simply less data available if you have two six-month buckets versus twelve one-month buckets. This is one reason that "traditional" projects do not typically end up producing a team velocity, which in turn makes it substantially more challenging to estimate how long the next project will take.
I recently attended the No Fluff, Just Stuff conference and one of the speakers said something striking about velocity (there were so many good talks that I cannot recall who made this particular statement).
He said "I've been developing software for 30 years. I wish that I had written down all of my estimates of how long something would take and then how long it had taken. If I had done that, imagine how accurate my estimates would be by now." I think it's safe to say that almost none of us have done what this speaker regrets not doing. Does that mean we just give up? Of course not. But...
One of the key take-away messages from one of the experts in the field of estimation is not to estimate at all. To quote Steve McConnell: "If you can't count the answer directly, you should count something else and then compute the answer by using some sort of calibration data." As it happens, for our problem of assessing velocity, we do have lots of data that might be helpful. The data in question live in your requirement tracking system, bug database, and source code control system. While these systems are often, shall we say, imprecise, they do contain useful data.
So, if we can't count our velocity directly we can count the number of bugs in our last project, how long they stayed open, how many requirements were present, and so on. Our calibration data is that all of those requirements and bugs took place in the time interval of the project. It's not especially fine-grained but in a waterfall model we are not looking for fine-grained data.
Some may argue that bug reports and requirements are not precisely defined and standardized. What passes as a single requirement for one team might be 3-5 separate requirements for another team. Some teams have bugs like "it doesn't work" while others might enter a dozen or more particular bugs to cover the same underlying defect. Here is the strange thing, though: it doesn't matter.
Just as story points are not standardized, all that matters is that within your team you tend to be consistent. If you're doing Scrum, whatever your team's definition of a story point is, it is likely to represent about the same quantum of work next month that it represented last month. In the waterfall world, the level of granularity you bring to your bug reports is likely to be fairly constant. The point is that we're not looking to equate points or bug counts against anything but other points and other bug counts in the same work environment. So, if your last several projects each had 6 requirements, generated 60 bugs, and took 6 months, you have a velocity of 1 requirement and/or 10 bugs per month. If your next project arrives with 15 requirements and a deadline of three months from now, we can safely conclude that you are in trouble!
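The arithmetic behind that conclusion can be sketched in a few lines. The numbers are the made-up ones from the example above, not real project data:

```python
# Hypothetical history from the example: 6 requirements, 60 bugs, 6 months.
months_past = 6
requirements_past = 6
bugs_past = 60

req_velocity = requirements_past / months_past  # 1 requirement per month
bug_velocity = bugs_past / months_past          # 10 bugs per month

# Hypothetical new project: 15 requirements with a 3-month deadline.
required_velocity = 15 / 3                      # 5 requirements per month

in_trouble = required_velocity > req_velocity
print(in_trouble)  # → True
```

Five requirements per month demanded versus one per month delivered historically: the data, crude as it is, speaks clearly.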
Keep in mind the distinction between accuracy and precision. In the preceding case we can say with high confidence that you are probably hosed.
All of this will just work better in Scrum than in waterfall. So if Scrum is so much better, why not just use it instead of waterfall? The problem, of course, is the little matter of changing your company's culture. But while it may take a huge effort to get your organization to move to Scrum, you, as an individual, can go compute your personal velocity right now. Any bug tracking system out there will allow you to gather data about the number of bugs assigned to you and/or fixed by you per unit time. If it were a database query, you'd do something like select month(assignedDate), count(*) from bugs where assignee = 'me' group by month(assignedDate). Now, you may object, saying "but I do lots of other things besides fix bugs." You would likely be correct, and it still doesn't matter. If a third of your time historically gets absorbed by mind-numbing meetings, what makes you think next month will be any different?
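If your tracker doesn't offer direct database access, most will at least export a CSV, and the same grouping takes only a few lines of script. This is a minimal sketch; the column names (assignee, resolvedDate) are assumptions for illustration, not any particular tool's schema:

```python
import csv
from collections import Counter
from io import StringIO

# Hypothetical CSV export from a bug tracker; real exports would be
# read from a file rather than an inline string.
export = StringIO("""assignee,resolvedDate
me,2011-01-05
me,2011-01-20
other,2011-01-21
me,2011-02-03
""")

fixed_per_month = Counter()
for row in csv.DictReader(export):
    if row["assignee"] == "me":
        month = row["resolvedDate"][:7]  # "YYYY-MM" prefix of the date
        fixed_per_month[month] += 1

print(dict(fixed_per_month))  # → {'2011-01': 2, '2011-02': 1}
```

The resulting per-month counts are your raw personal velocity, ready to graph or average.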
As an individual you might pick from several data sources to establish a velocity. The selection may depend on your job function (sustaining engineer, development engineer) and/or on the systems you are using for bug tracking, source code management, and the like.
You might track the number of bugs fixed by you, the number of source code check-ins you do, the number of code inspections you participated in, or some combination of these and others. One interesting strategy is to gather the data from as many relevant sources as possible and then graph the results for the longest time period for which you have data. If we assume that your actual underlying velocity has been relatively constant, you can select the data source or sources that result in a flat graph. Of course, if you have switched languages or processes or problem domains, your velocity likely has not been constant, so your mileage may vary.
Another point to keep in mind if using the traditional approach is that quite often the project goes through distinct lifecycle phases. I think of these as the EPM, MS Word, Visio, Eclipse, and Jira phases of the project. Many of us are familiar with the pain of spending months in Word and Visio before we get "approval" to go back to the happier world of Eclipse (and then Jira). In these cases it's important to know how to combine your various data streams to get an overall velocity.
Figure 1 shows a somewhat fictionalized analysis of my work in a preceding year. The counts for requirements, defects, and check-ins are raw counts, while the functional specification word count is scaled by a factor of 250. Although the graph does bounce around a bit, it also shows an average of about four and a half work units per month. This is even clearer in Figure 2, which shows each month's velocity with standard deviation markers. From this chart it is easy to see that the velocity generally hovers between 4 and 6 work units per month.
Figure 1 Sample Retrospective Velocity
Figure 2 Average Velocity Range
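The roll-up behind such a chart can be sketched as follows. The monthly counts here are invented for illustration; only the 250-words-per-work-unit scaling factor comes from the description of Figure 1:

```python
from statistics import mean, stdev

# Made-up monthly counts in the spirit of Figure 1.
requirements = [1, 0, 2, 1, 1, 0]
defects      = [2, 3, 1, 2, 2, 3]
checkins     = [1, 1, 2, 1, 2, 1]
spec_words   = [250, 500, 0, 250, 0, 500]

# Raw counts plus the spec word count scaled to work units.
work_units = [
    r + d + c + w / 250
    for r, d, c, w in zip(requirements, defects, checkins, spec_words)
]

print(work_units)                        # monthly velocity in work units
print(mean(work_units), stdev(work_units))
```

Plotting the mean with standard deviation markers, as in Figure 2, is then a one-liner in any charting tool.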
Of course, a side benefit of measuring velocity is that you can document changes in it. In my own organization we have been increasing the mandate for code inspections of all code. While we selected a very lightweight inspection tool (Code Collaborator, highly recommended), the transition to an inspection regime was not without cost. Naturally, management would like to see data showing some positive effects caused by this change. If you are measuring velocity, either at the individual or team level, you are well positioned to observe changes in the metrics. We expect to see a decline in field-reported bugs for code that has been through the inspection process.
This approach can also be used as support for a stealth project. My coworkers are naturally wary of my new-found fascination with Scala. "There he goes again" has been muttered more than once. This is in fact a natural and reasonable reaction, until I start producing evidence of the benefit of Scala (or any other technology I might introduce to the team). If I present brand new data of a type the team is unfamiliar with, they can be forgiven some skepticism. If however I present the "standard" velocity measurements that we have become accustomed to collecting and analyzing, and if that data shows a delta for Scala-based stealth projects, then my arguments are on much firmer ground.
All discussion of black swans aside, whatever last month looked like is a good starting point for guessing what next month might look like. By all means measure each month and look for trends. Perhaps you and/or your organization is getting better and your velocity is increasing; now you have confirmation of that fact. Or perhaps, in response to missed deadlines, you're attending still more mind-numbing meetings to discuss such gems as why things take too long, and so your velocity is decreasing. Cold comfort perhaps, but now you have data to bring to those discussions.
Retrospective Bugs and Code Inspections
While we're talking about looking behind us, let's explore another often overlooked opportunity for learning from the past. Even after instituting a policy of near-total code inspection we still, of course, have field-reported defects. Hopefully fewer, but they still exist. This reminds me of the standard interview question "So, candidate, what is your biggest weakness?" This is a boring question, but it can have a powerful follow-up: "And what are you doing to address this weakness?" In the context of bugs and inspections, the follow-up question is "So if this code was inspected, why didn't the inspection find the defect?" This is not meant to be a rhetorical question!
As the inspection process progresses, it becomes more likely that the code containing any given bug will be code that passed through an inspection. Rather than viewing this as a failure of the inspection process, we can view it as an opportunity to enhance that process. After a bug has been resolved, which by definition means we have identified what was wrong, we can and should take the additional step of identifying why the inspection did not catch the defect.
Perhaps the inspectors lacked the subject matter expertise to see a semantic flaw in the code. Perhaps the inspection checklist (be it formal or informal) did not emphasize looking for synchronization issues. Perhaps the inspectors had too much faith in an "expert" coder and so performed just a cursory inspection. Whatever the reason, it should be identifiable and then actionable.
While you might see this as yet another process to slow down development, it can actually provide enormous benefit. When you test or inspect code, you are improving that one section of code. When you perform meta-analysis of your inspection process, you are improving all code that will pass through the inspection process.
We've seen that you don't have to be doing Scrum to get the benefits of knowing your Velocity and tracking how it is changing. And we've seen that looking backwards at your inspections can lower your defect rate. And you can do both without requiring changes or approvals from management. Perhaps it's time to give these ideas a try.
Brian Tarbox is a Distinguished Member of Technical Staff at Motorola in the Systems Engineering group designing next generation video products. He writes a blog on the intersection of software design, cognition, music, and creativity at briantarbox.blogspot.com. His primary contact point is about.me/BrianTarbox .
Heather Mardis is a Senior Engineer at Motorola Mobility in the On Demand Video group, providing release and build infrastructure, tools, and processes. She has a long background in configuration and release management, most recently succeeding in bringing continuous integration and build-automation to the ODV team at Motorola to support their three-continent development needs. Her credits include conference presentations at Atlassian Summit including the "you've got issues" award in 2010. She strives to be the go-to person in ODV, keeping tabs on all the many projects in progress at once and staying one step ahead of what the team might need next.