My favorite characterization of good writing is from Rust Hills, former fiction editor of Esquire magazine: “clarity of perception and expression.” Hew Wolff is talking about something similar here when he characterizes good code as truthful.
What is good code, anyway? I recently came up with a new take on this.
I took over some code from a former colleague. It was a Java Web application, written with minimal supervision, and it had a distinct and meticulous style. Tricky places, such as a workaround for a library bug, were carefully documented. Whitespace followed strict conventions. He used a standard set of abbreviations in naming database tables—but these names were long and hard to read. There was a homegrown database persistence layer, which was impressive, but it was also probably unnecessary and certainly confusing. Many methods looked suspiciously similar to each other. In short, it was both strong and flawed.
The code was so good in some ways, and so bad in others, that I started debating with my colleague in my mind over the problems in his approach. And I wondered, were these merely pet peeves of mine, or something more universal?
I’ve read many descriptions of good code, which I will boil down to these:
It’s powerful, elegant, and compact.
It’s readable: it’s a pleasure to walk through it, its main names map closely to the problem domain, and it doesn’t have a distracting burden of terminology or comments.
It’s testable: it’s divided up nicely into modules with unit tests, and the production code and tests are responsive, so you can see it working.
Obviously there’s some overlap between these criteria. For example, pretty much everyone hates flagrantly repetitive copy-and-paste code... or worse, big chunks of code that are almost the same but subtly different. Such code violates all three of these criteria. Repetition reduces power by adding bloat. It hurts readability by doubling the burden on the reader. And it undermines testing, although indirectly, when the maintainer fixes a bug in one location but misses the other one.
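To make the "almost the same but subtly different" case concrete, here is a minimal sketch in Python. Every name in it is hypothetical, invented for illustration rather than taken from my colleague's code. It assumes a whitespace bug was fixed in one copy of a lookup function but not in its near-twin.

```python
# Hypothetical example: all names here are invented for illustration.

# Before: two near-copies. A whitespace bug was fixed in the first
# one, but the fix never reached the second.
def find_customer(customers, email):
    wanted = email.strip().lower()
    return [c for c in customers if c["email"].strip().lower() == wanted]


def find_supplier(suppliers, email):
    wanted = email.lower()  # missing strip(): the copies have drifted
    return [s for s in suppliers if s["email"].lower() == wanted]


# After: the matching rule is stated once, so a fix lands everywhere.
def normalize_email(email):
    return email.strip().lower()


def find_by_email(records, email):
    wanted = normalize_email(email)
    return [r for r in records if normalize_email(r["email"]) == wanted]
```

A reader of the "before" version has to compare the two functions line by line to learn that they were meant to agree. The "after" version says so directly.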
But why, on a deeper level, is repetitive code bad? Here’s an idea I haven’t heard expressed: it’s bad because it’s dishonest. It violates a principle of good communication, that whatever you say should be salient. Adding more text that doesn’t add more content is misleading the reader. I think it’s precisely this misleading quality that prevents such code from being powerful, readable, and testable. Good code tells the truth.
I found this idea illuminating. It’s a different angle on the familiar Don’t Repeat Yourself principle. Here are some more examples of this idea of truth in coding.
We tend to agree that a function should be short. Maybe one or two pages long, but you don’t want to keep paging up and down to read it. Why is that? Here’s one possible answer: Imagine you’re listening to a speech, an interesting speech, but it’s going on too long. You start to feel that the speaker is not respecting your time. More than that, you feel that if they knew their point, they would get to it. As they wander farther from a resolution, it seems they must be hiding something. So it is with a long function: it only pretends to know what it’s doing. It should tell you what it’s going to do, and then show you that it’s doing it. It should do one thing and do it well. That means brevity. As Seneca said, “Truth hates delay.”
And there’s a larger point here: all the pieces of a program should be small enough to grasp whole. Not only functions but parameter lists, classes, files, directories, and even names. Kent Beck and Martin Fowler, in their discussion of refactoring and “code smells,” describe a large class as “prime breeding ground for duplicated code, chaos, and death.” Fortunately the basic remedy is simple: look for natural fault lines in the big pieces, and break them up. In a classic refactoring example, there’s a chunk of code with the comment “Configure connection,” which is crying out to be a separate function called ConfigureConnection.
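The "Configure connection" refactoring might look something like the following sketch. It is written in Python, so the extracted function is spelled configure_connection. The Connection class, its fields, and the timeout and retry values are all assumptions made up for this illustration.

```python
# Hypothetical sketch: Connection and its settings are invented for illustration.
from dataclasses import dataclass


@dataclass
class Connection:
    host: str
    timeout_seconds: int = 0
    retries: int = 0


# Before: a comment marks a natural fault line inside a longer function.
def fetch_report_before(host):
    connection = Connection(host)
    # Configure connection
    connection.timeout_seconds = 30
    connection.retries = 3
    return f"report from {connection.host}"


# After: the comment has become a function name.
def configure_connection(connection):
    connection.timeout_seconds = 30
    connection.retries = 3
    return connection


def fetch_report(host):
    connection = configure_connection(Connection(host))
    return f"report from {connection.host}"
```

The behavior is unchanged. What changes is that fetch_report now announces its steps by name instead of asking the reader to infer them.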
For another example of truth in coding, consider namespace pollution. An unnecessary global variable is not just an invitation to bugs; if it’s overexposed, it’s lying. Its scope should be reduced to match its importance.
A more down-to-earth example is the length of variable names. Most of us like to see variables like i, root_node, user_name, and dirty_invoices. But you’ve probably seen code where these variables would be i0, rn, u1, and invdty. You may also have seen code where they would be integer_request_index, the_root_xml_node, string_name_of_user, and array_of_all_dirty_invoices_in_system, which is painful in a different way. Part of telling the truth is gauging how much information to provide; too much or too little description will distract us from what the code is actually doing. We want the whole truth and nothing but the truth.
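Here is the same one-line function written three ways, as a rough illustration in Python. The function itself, and the "owner" field on each invoice, are assumptions invented for the example. Only the variable names come from the lists above.

```python
# Hypothetical sketch: the function and the "owner" field are invented.

# Too little description: the reader has to guess.
def f(invdty, u1):
    return [i0 for i0 in invdty if i0["owner"] == u1]


# Too much description: the logic is buried under the names.
def get_array_of_dirty_invoices_for_user(
    array_of_all_dirty_invoices_in_system, string_name_of_user
):
    return [
        dictionary_for_one_dirty_invoice
        for dictionary_for_one_dirty_invoice in array_of_all_dirty_invoices_in_system
        if dictionary_for_one_dirty_invoice["owner"] == string_name_of_user
    ]


# About right: the names say what the values are, and no more.
def invoices_for(dirty_invoices, user_name):
    return [invoice for invoice in dirty_invoices if invoice["owner"] == user_name]
```

All three do exactly the same thing, which is the point: the machine cannot tell them apart, but a reader can.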
Finally, we should not forget the problem domain. Maybe you’ve been on a project where the client is always talking about “playlists,” but the programmers call them “playlists,” “song lists,” “sets,” and occasionally “CD tracks.” In this situation we are literally not sure what we’re talking about. To be accurate, we need consistent terminology. If it’s a playlist, call it a playlist. Don’t call it a set. Tell the truth.
I could multiply examples here, but I’d rather let you try out the principle. Think of something that gets on your nerves, and see how the “truthful code” idea fits. Big switch statements? Deep nesting? Magic constants? Premature optimization? Frivolous comments? And think, on the other hand, of things you like to see. The MVC pattern? The const keyword in C++? Table-driven solutions? Unit tests? Class diagrams?
To be clear, I’m not talking here about processes for writing good code, although the discussion above leads naturally into refactoring. I’m not talking about how much time to spend on code quality. I’m not talking about a good user experience, either. My focus is the everyday experience of looking at your code and saying, “It runs, but is it good enough?”
We’re not telling the truth to the machine. The machine doesn’t care what we say, and if we get it wrong, the machine will let us know soon enough. No, we’re talking to programmers: our colleagues and our future selves. Just in case you were tempted to forget: coding style is about talking to people. There’s an obvious analogy with the style of expository writing. For example, just think of Strunk and White’s admonition to “omit needless words.”
A sticky point here is that code quality is subjective. You can optimize for short code, or fast execution, or buzzword compatibility; your elegant solution might strike me as an awkward hack. The safest thing is to stick to a style that everyone on the team can live with. Try to see this not as a corruption of your true vision, but as a correction of your excesses.
Another way to look at this is that our colleagues keep our code honest. When we work in isolation it’s easy to fool ourselves into thinking that our weird idea is great. It’s harder to fool the person next to you. Pair programming tends to produce, not ugly compromises, but code that everyone likes. A democracy of professionals helps get our code closer to the truth.
I mentioned that style in code is like style in prose, but a better analogy is with the style of mathematical proofs, which we can think of as programs running in the collective mind of the mathematical community. There are many valid proofs for the same result, and which one to use is a matter of taste. Many of the choices are familiar to programmers: how to name variables, when to pull out a chunk of logic into its own subtheorem, whether to go for a narrow focus on the current requirements or for “gruesome generality.” Paul Erdős spoke of the “book proof,” the proof you would see if you could peek into the notebook of the Supreme Being.
I think that’s a good attitude for programmers as well. What drives better code is not a laundry list of dos and don’ts but a conviction that somewhere within your grasp is a beautiful solution. There might even be several of them, all so bold and dazzling that you will only achieve a pale imitation. But you keep trying. Telling the truth is an attitude.
Telling the truth may sound boring. But you still get to be clever and creative. It’s just that you do it with discipline, and stay faithful to the requirements. You remember that cleverness to no purpose, or cleverness that no one else can understand, is empty. You are never simply translating the customer’s requests into code. The customer will not ask you to create multiple threads or to use the Factory pattern; that part is your job. If it’s good, it will be (as someone said about good writing) simple but not obvious. It will have personality, but no ego. It will be true.
Personally, I would not propose myself as a superb stylist. I’ve been lazy about cleaning up my code sometimes. And sometimes I’ve spent way too much time polishing. But on a good day, I feel I’m getting close to a true design. And on a bad day, that goal helps keep me going.
Hew Wolff has worked in software since the Reagan administration. Important distractions include math, garlic, and the work of Joss Whedon. He lives with his wife in Oakland, California. See also his very interesting home page.