Brian suggests a way to improve static analysis and argues that software teams could learn something from Denny’s when it comes to on-the-job training.
I’m currently helping another group within my company develop a set of coding standards and introducing them to some of the static analysis tools that my group uses, such as FindBugs, PMD, and Checkstyle (and learning about the tools that they use as well).
Along the way a conclusion finally crystallized for me: our current set of static analysis tools is working on the wrong level.
All of the tools work from a parsed form of the program: FindBugs analyzes compiled bytecode, while PMD and Checkstyle build an abstract syntax tree from the source and then analyze that tree. That’s fine for finding a certain class of problems. Where it fails, however, is that it does not operate on the code at the same level that we engineers operate on, namely the text level.
Even the tools and rules that do operate at the text level are very simplistic. Yes, we should limit the length of the lines and length of the methods, and yes, we should limit the cyclomatic complexity of a module. Yet none of that moves us closer to beautiful code—i.e., code that sings. It may seem a ridiculous goal at times but we really can write elegant code that people enjoy reading. Code like that tends to be simple and bug-free.
Learning from PowerPoint
Let’s take an example from another domain. There are lots of style guides available for creating PowerPoint presentations. They suggest rules such as: a) use a readable and standard font, b) limit yourself to 5 or so bullets per slide, c) think about the use of color if you want people to print your slides.
You can follow all of these guidelines and still create a terrible presentation. We’ve all sat through these presentations. The problem is that the style guides are a necessary but not a sufficient set of rules for presentation “goodness.” They do not ensure that the slides say something, that the bullet points at least occasionally include a verb, and that there is some internal consistency and flow to the slides. Grammar checkers can help to some degree, except that we expect PowerPoint bullets to be sentence fragments, so the signal-to-noise ratio gets pretty low.
There is a piece, however, that we can address, both for PowerPoint presentations and for software code. I believe that poor naming contributes far more bugs to code than most of the items specified in most “rules.”
There is a famous quote about code: “you can write code so that it is obvious there are no bugs in it, or you can write code so that there are no obvious bugs in it.” If you see a line of code like windSpeed = todaysWeather.getWindSpeed(), I’m quite confident that you’ll be able to read it and that it is unlikely to have bugs in it. Yet no tool that I’ve found even seems to look at code at the surface level.
Imagine if we built such a tool.
Analyzing Code for Readability
I’m picturing a Spring/Groovy program that would scan a specified directory for source files, and then use reflection to extract the names of the classes and methods. These identifiers would then be analyzed for readability.
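As a rough sketch of the extraction step, the snippet below uses standard Java reflection to list a class name and its declared method names. The class IdentifierExtractor and its method are hypothetical names made up for illustration. Note that reflection works on loaded classes rather than on source text, so a real tool would first have to compile or load whatever it finds in the directory.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any existing tool.
class IdentifierExtractor {

    /** Returns the simple class name followed by the names of its declared methods. */
    static List<String> identifiersOf(Class<?> type) {
        List<String> names = new ArrayList<>();
        names.add(type.getSimpleName());
        for (Method method : type.getDeclaredMethods()) {
            // Skip compiler-generated methods; nobody chose those names.
            if (!method.isSynthetic()) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The order of the method names is not guaranteed by the reflection API.
        System.out.println(identifiersOf(java.util.ArrayList.class));
    }
}
```

Parameter names are only available through reflection when the code was compiled with the javac -parameters option, so analyzing them would need that flag or a source-level parser.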
This will be the admittedly tricky part of the system. We certainly don’t expect to find things like “updateCacheInterval” in a standard dictionary. On the other hand we do have a simple lexical rule to split that identifier into “update,” “Cache,” and “Interval,” all of which are in a standard dictionary. This works on the assumption that camel casing is the standard these days for dealing with longer names. I hardly ever see names like update_cache_interval anymore. This gives us hope that the approach might have some traction behind it, so let’s look at some other cases.
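The camel case rule can be sketched in a few lines. This is only a minimal illustration: CamelCaseSplitter is a hypothetical name, and the tiny DICTIONARY set stands in for whatever real word list or dictionary service the tool would use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only.
class CamelCaseSplitter {

    // Assumption: a stand-in for a real dictionary, which would hold far more words.
    static final Set<String> DICTIONARY =
            Set.of("update", "cache", "interval", "get", "wind", "speed");

    /** Splits wherever a lower-case letter or digit is followed by an upper-case letter. */
    static List<String> split(String identifier) {
        return Arrays.asList(identifier.split("(?<=[a-z0-9])(?=[A-Z])"));
    }

    /** Case-insensitive dictionary lookup. */
    static boolean isWord(String part) {
        return DICTIONARY.contains(part.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String part : split("updateCacheInterval")) {
            System.out.println(part + " -> " + isWord(part));
        }
    }
}
```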
Methods with names like parseXYZParameters are fairly common but can also be handled by rules that look for camel case boundaries plus acronyms. This method might get a score of “two out of three plus one acronym.” A system such as this one will by necessity output a grayscale of goodness, and that’s OK. It can also be enhanced to look for word category. If we follow Knuth’s Literate Programming model, method names should include a verb and objects should be nouns.
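Here is one way the acronym rule and the graded score might look, as a sketch under stated assumptions. IdentifierScorer, Score, and the small DICTIONARY set are all hypothetical, and treating any run of two or more capital letters as an acronym is a simplification.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical scorer for illustration only.
class IdentifierScorer {

    // Assumption: a stand-in for a real dictionary lookup.
    static final Set<String> DICTIONARY =
            Set.of("parse", "parameters", "update", "cache", "interval");

    /** Simple value holder for the graded result. */
    static final class Score {
        final int dictionaryWords;
        final int acronyms;
        final int parts;

        Score(int dictionaryWords, int acronyms, int parts) {
            this.dictionaryWords = dictionaryWords;
            this.acronyms = acronyms;
            this.parts = parts;
        }

        @Override
        public String toString() {
            return dictionaryWords + " of " + parts + " dictionary words, "
                    + acronyms + " acronym(s)";
        }
    }

    /** Splits at lower-to-upper boundaries and where a capital run meets a capitalized word. */
    static String[] split(String identifier) {
        return identifier.split("(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");
    }

    /** Simplification: two or more upper-case letters count as an acronym. */
    static boolean isAcronym(String part) {
        return part.length() >= 2 && part.chars().allMatch(Character::isUpperCase);
    }

    static Score score(String identifier) {
        String[] parts = split(identifier);
        int words = 0;
        int acronyms = 0;
        for (String part : parts) {
            if (isAcronym(part)) {
                acronyms++;
            } else if (DICTIONARY.contains(part.toLowerCase(Locale.ROOT))) {
                words++;
            }
        }
        return new Score(words, acronyms, parts.length);
    }

    public static void main(String[] args) {
        System.out.println(score("parseXYZParameters"));
    }
}
```

With this stand-in dictionary, parseXYZParameters splits into parse, XYZ, and Parameters, which gives two dictionary words out of three parts plus one acronym.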
We could score a method higher if at least one of the words in the name was a recognized verb. We could go a step further and analyze the parameters to the methods, hopefully finding mostly nouns in their names.
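A verb check could start out as simply as the sketch below. VerbCheck and its short VERBS list are hypothetical placeholders; a real implementation would more likely rely on a part-of-speech dictionary, and would have to decide what to do with words such as “cache” that can be either a noun or a verb.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only.
class VerbCheck {

    // Assumption: a tiny hand-picked verb list standing in for a part-of-speech lookup.
    static final Set<String> VERBS =
            Set.of("get", "set", "update", "parse", "load", "save", "find", "create", "delete");

    /** Returns true if any camel-case part of the method name is a recognized verb. */
    static boolean containsVerb(String methodName) {
        for (String part : methodName.split("(?<=[a-z0-9])(?=[A-Z])")) {
            if (VERBS.contains(part.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("updateCacheInterval: " + containsVerb("updateCacheInterval"));
        System.out.println("cacheInterval: " + containsVerb("cacheInterval"));
    }
}
```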
I mentioned that this would be a Spring-based application because of the clear need to be able to inject new algorithms. It would not surprise me to find that there are existing open source libraries for parsing computer identifiers into “words.” My camel casing idea is definitely just a first step. There is also a variety of open source dictionary systems available.
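The pluggable design can be illustrated without any framework at all. In the sketch below, every type name (IdentifierTokenizer, CamelCaseTokenizer, SnakeCaseTokenizer, ReadabilityAnalyzer) is hypothetical. The analyzer receives its splitting algorithm through the constructor, which is the kind of dependency a container such as Spring could supply from configuration.

```java
import java.util.Arrays;
import java.util.List;

// All types here are hypothetical, for illustration only.
interface IdentifierTokenizer {
    List<String> tokenize(String identifier);
}

/** Splits names like updateCacheInterval. */
class CamelCaseTokenizer implements IdentifierTokenizer {
    @Override
    public List<String> tokenize(String identifier) {
        return Arrays.asList(identifier.split("(?<=[a-z0-9])(?=[A-Z])"));
    }
}

/** Splits names like update_cache_interval. */
class SnakeCaseTokenizer implements IdentifierTokenizer {
    @Override
    public List<String> tokenize(String identifier) {
        return Arrays.asList(identifier.split("_+"));
    }
}

class ReadabilityAnalyzer {
    private final IdentifierTokenizer tokenizer;

    // The algorithm is injected, so a new one can be swapped in without touching this class.
    ReadabilityAnalyzer(IdentifierTokenizer tokenizer) {
        this.tokenizer = tokenizer;
    }

    int countParts(String identifier) {
        return tokenizer.tokenize(identifier).size();
    }

    public static void main(String[] args) {
        System.out.println(new ReadabilityAnalyzer(new CamelCaseTokenizer())
                .countParts("updateCacheInterval"));
        System.out.println(new ReadabilityAnalyzer(new SnakeCaseTokenizer())
                .countParts("update_cache_interval"));
    }
}
```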
Mike Taylor quoted Bjarne Stroustrup in a recent PragPub article, and I’m going to repeat the quote as it is very applicable: “Design and programming are human activities; forget that and all is lost... There are no ‘cookbook’ methods that can replace intelligence, experience and good taste.”
Interns Need Exposure to Great Code
This shifts us into the question of how we teach young programmers this “intelligence, experience and good taste.” Most professions have some sort of apprentice program where students learn the trade. Doctors have internships; lawyers have clerks and paralegals; architects and electricians have apprenticeships of some sort. What all of these have in common is that the student observes the expert being an expert. They are exposed to the best practices of the profession and hopefully absorb them.
Compare that with entry-level jobs or internships in our field. Summer interns at most companies are given little to no training and often work on low-priority tasks like cleaning up a bug backlog. Most companies I’ve worked at have viewed getting an intern as a mixed blessing. You hope to find something discrete for them to work on without slowing down your “real” engineers.
The bottom line is that our interns get exposed to the worst code with minimal or zero supervision from the experts. Try to think of a time when a young programmer would ever get a chance to read great code. There isn’t any such time. And yet, we are then surprised when our new engineers are not imbued with the spirit of creating beautiful code.
We tend to have meetings to discuss problems, but we should really have meetings to walk through excellent code so that people get exposed to examples of what we hope to be producing. Even better would be to adapt the notion of pair programming and let the interns shadow an expert. Way back when I worked a summer job at Denny’s (we’ve all had that job, right?), the newbies spent a week or so shadowing an experienced worker. They didn’t interact with the customers or cook the food, but they watched as the experienced people did.
How many engineers out there ever got the chance to sit with a top programmer and watch them work? Watched them use their tools with precision and dexterity? Saw that they used Google as their exo-cortex, and saw them think, then write a test, then write the “real” code?
We want our coworkers to have all of these great habits, but chances are none of them got even the level of on-the-job training that your average waiter at Denny’s got. Perhaps we ought to change that.
Brian Tarbox is a Principal Staff Engineer at Motorola, where he designs server-side solutions in the Video On Demand space. He writes a blog on the intersection of software design, cognition, and Tai Chi at briantarbox.blogspot.com.