Jim shows how to write methods with built-in defenses that allow you to quickly find the fault underlying a failure.
Chess grandmaster Vladimir Kramnik famously used the Berlin Defense in his victory over Garry Kasparov in 2000. Such standard chess openings employ a combination of moves that strengthen a player’s defensive posture and provide a sound foundation for the attack. Novice players often neglect such defensive techniques, a failing immediately apparent to more experienced opponents. As programmers, we too can neglect our defenses, allowing defects to gain the upper hand. And defects wreak havoc: effort is devoted to eliminating errors rather than creating new functionality, schedules slip, projects are canceled. No opponent shows less mercy.
However, several mutually supportive software development techniques, employed systematically, much like a standard opening in chess, will shore up our defenses.
You’re likely familiar with many of these techniques already; they include coding idioms, guidelines and processes:
Boolean Return Values for Methods
Single Exit Points for Methods
Exception Containment (where exceptions are contained within method boundaries and provide useful point-of-failure diagnostics)
Logging (with descriptive error messages and suggestions for remediation)
Design by Contract
Method Preambles—source code comments specifying method interfaces that allow automatic generation of documentation
Unit Testing and Test Driven Development
I’ll illustrate how you can weave these techniques together to form a defense-in-depth against defects that I call the Pragmatic Defense (apropos, as several techniques are drawn from The Pragmatic Programmer). Your defenses will be in place as you write your code, helping you keep defects at bay, even in the most stressful of circumstances.
Some time ago, during the last leg of an all-day job interview, my potential employer assigned a programming problem in C++: parse an English-language text file and report various statistical information on its content. With the clock ticking, I attacked the problem and, under the watchful eye of the examiner, ignored my defenses, making a subtle, and fatal, coding error. I could see the failure in the program’s output when running initial tests and began to hunt for the fault. I backtracked, stepping through the code. Nothing. The examiner interrupted to deliver the 15-minute warning. Still nothing. The distance from failure to fault was too great; I couldn’t isolate and remove the defect in time, ending an excellent opportunity. Could employing the Pragmatic Defense have prevented this? More importantly, can it help your software development efforts? Yes, by assisting you in quickly isolating the defects that underlie observed failures.
Let’s look at each of the constituent techniques of the Pragmatic Defense and how they work together.
Boolean Return Value and Single Exit Point
We’ll start with the simplest techniques—Boolean Return Value and Single Exit Point—and apply them to a C++ method. Here’s an outline of our C++ method.
With the Boolean Return Value idiom, a function either works or it doesn’t. You may remember watching the tense scenes in NASA’s Mission Control (or perhaps have watched the movie Apollo 13) when Gene Kranz queries the flight controllers for a “Go, No Go for Launch” [Kranz, Gene. Failure is Not an Option, Simon & Schuster, 2009, p. 180]. Either all subsystems are “Go,” or the launch does not take place (essentially a logical AND over the states of all subsystems). No equivocation, no integer error codes, no complex decisions to make. Each subsystem is either ready or it isn’t. Each controller reports either “Go” or “No Go.” Similarly, you know what’s working and what isn’t within your method, every step of the way. And you know unequivocally whether your method worked or not. Very simple.
While the Boolean Return Value idiom simplifies determining the status of your method, the Single Exit Point idiom simplifies its control flow; the Boolean value is returned from only one point in your method, both in the nominal case and in all error cases. In the nominal case, control flows to the end of the try block, where the return code is set to true to indicate success, then to the return statement. In case of errors, the method still exits from the same point but only after reporting the error in a catch block.
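A minimal sketch of how the two idioms might combine in a method such as goLaunch. The two Boolean parameters are hypothetical stand-ins for polling real subsystems, and std::cerr stands in for whatever logging facility a project actually uses.

```cpp
#include <iostream>
#include <stdexcept>

// Sketch only: the parameters are hypothetical stand-ins for polling real
// subsystems; each one is either Go (true) or No Go (false).
bool goLaunch(bool bGuidanceGo, bool bPropulsionGo)
{
    bool bReturn = false;               // No Go until every step has succeeded

    try
    {
        if (!bGuidanceGo)
        {
            throw std::runtime_error("guidance reports No Go");
        }
        if (!bPropulsionGo)
        {
            throw std::runtime_error("propulsion reports No Go");
        }

        bReturn = true;                 // nominal case: end of the try block
    }
    catch (const std::exception& e)
    {
        std::cerr << "goLaunch failed: " << e.what() << '\n';
    }
    catch (...)
    {
        std::cerr << "goLaunch failed: unknown exception\n";
    }

    return bReturn;                     // the single exit point
}
```

Because bReturn starts out false and is set to true in exactly one place, every error path reports "No Go" without any extra bookkeeping.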
With the Boolean Return Value and Single Exit Point idioms, you’ve laid the groundwork for point-of-failure diagnostics. Now it’s time to start emplacing defenses using Contained Exception Handling.
The Contained Exception Handling idiom catches all exceptions within the scope of the method, as indicated by the throw() suffix in our C++ method declaration example. A try/catch block encapsulates the body of the method; the last catch handler, catch(...), catches any exception not caught by the preceding handlers. Any exception thrown within method goLaunch is caught, and reported or logged, within that method. The method exits from the single return statement at the bottom; if an exception was thrown, the method returns false.
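The containment described here might look like the following sketch. It uses noexcept in place of the throw() suffix, since throw() was deprecated in C++11 and removed in C++20. The fuelTanks helper and its iMode parameter are hypothetical, present only to provoke each handler.

```cpp
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical step used only to exercise the handlers below.
void fuelTanks(int iMode)
{
    if (iMode == 1)
    {
        throw std::runtime_error("fuel valve stuck");
    }
    if (iMode == 2)
    {
        throw std::string("thrown object not derived from std::exception");
    }
}

// noexcept promises callers that no exception leaves this method.
bool goLaunch(int iMode) noexcept
{
    bool bReturn = false;

    try
    {
        fuelTanks(iMode);
        bReturn = true;
    }
    catch (const std::exception& e)     // specific handlers come first
    {
        std::cerr << "goLaunch: " << e.what() << '\n';
    }
    catch (...)                         // last handler: everything else
    {
        std::cerr << "goLaunch: unknown exception\n";
    }

    return bReturn;
}
```

Calling goLaunch(1) or goLaunch(2) returns false after logging; in neither case does the exception escape to the caller.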
Exceptions have the undesirable characteristics of goto statements [Dijkstra, Edsger. “Go To Statement Considered Harmful,” Communications of the ACM 11 (3), March 1968, pp. 147-148.], and exceptions propagated outside of method boundaries are like long jumps—a type of goto statement considered particularly harmful. Disaster may ensue if exceptions propagated beyond the scope of a method remain uncaught [Sutter, Herb and Andrei Alexandrescu. C++ Coding Standards: 101 Rules, Guidelines and Best Practices, Upper Saddle River: Pearson Education, Inc., 2005, pp. 114-115.]. In the Pragmatic Defense, exceptions are always caught within the method where they are thrown. No exceptions.
Logging exceptions within the functions where they occur reduces the software “distance to fault”—a phrase borrowed from transmission line service that indicates the distance to a physical discontinuity in a cable from a diagnostic measuring device. (Such devices are based on Frequency Domain Reflectometry.) In software, the “distance to fault” represents the intellectual effort required to find the underlying fault or “bug” from an observed failure. The closer the reported failure is to the actual fault, the faster the fault can be isolated. To minimize this distance, exceptions include descriptions of the failure as well as failure location information, such as file name, line number, thread identifier, process identifier, time, and a stack trace. (Some exception handlers even obtain a screen capture at the time the exception was thrown and a mini-core dump.)
The custom exception class, CExcept, provides the time, process ID, stack trace, and other diagnostic information in the exception description returned by its what() member function. The exception’s what() method is used by the error logging code within the catch handler. Here, errors are logged using the Apache log4cxx logging framework. (Similar frameworks are available for other languages, such as log4j for Java.) For basic console applications, you may wish to emit the error messages to the standard error using std::cerr.
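As a rough illustration of the idea, here is a much-reduced stand-in for such an exception class. It records only the file, line, and time; the process ID and stack trace mentioned above are platform-specific and omitted. The THROW_CEXCEPT macro and the openValve method are hypothetical names introduced for this sketch, and std::cerr is used instead of log4cxx so the example stands alone.

```cpp
#include <ctime>
#include <exception>
#include <iostream>
#include <sstream>
#include <string>

// Reduced stand-in for a diagnostic exception class (assumption: the real
// class carries more, such as process ID and stack trace).
class CExcept : public std::exception
{
public:
    CExcept(const std::string& strWhat, const char* pszFile, int iLine)
    {
        std::ostringstream oss;
        oss << pszFile << ':' << iLine
            << " [time " << std::time(nullptr) << "] " << strWhat;
        m_strWhat = oss.str();
    }

    const char* what() const noexcept override { return m_strWhat.c_str(); }

private:
    std::string m_strWhat;
};

// Hypothetical helper that captures the throw site automatically.
#define THROW_CEXCEPT(msg) throw CExcept((msg), __FILE__, __LINE__)

bool openValve(bool bValveStuck) noexcept
{
    bool bReturn = false;

    try
    {
        if (bValveStuck)
        {
            // Say what failed, a likely cause, and a possible remedy.
            THROW_CEXCEPT("valve did not open; hydraulic pressure may be low; "
                          "check the pump and retry");
        }
        bReturn = true;
    }
    catch (const CExcept& e)
    {
        std::cerr << "ERROR " << e.what() << '\n';
    }
    catch (...)
    {
        std::cerr << "ERROR openValve: unknown exception\n";
    }

    return bReturn;
}
```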
Error messages should be as descriptive as possible, indicating not only the source of the failure but its possible cause and any corrective actions feasible.
Boolean Return Value, Single Exit Point, Contained Exception Handling, and Logging let you know that a failure has occurred and that the fault is present in the execution path in or before the current method. Design by Contract will allow you to further narrow down the location of the fault.
Design by Contract
A legal contract defines your rights and responsibilities, as well as those of the other party to the contract. Design by Contract [Meyer, Bertrand. Object-oriented Software Construction, 2nd ed., Upper Saddle River: Prentice Hall, 1997, pp. 331-438.] applies this concept to software modules. Rights and responsibilities are expressed as preconditions and postconditions (and also invariant conditions, which we won’t cover here). Preconditions specify entry criteria that must be true to start a procedure. Postconditions specify criteria that must be true upon successful completion of a procedure.
Often, preconditions are limits on the valid ranges of input values, but not necessarily. For example, preconditions may include conditions such as:
A particular process must be running
A socket connection must have been established
A file must exist
An environment variable must be set
These could all be postconditions as well. For example, if a function is supposed to create a file, then the existence of that file is a postcondition of the function.
Some languages, such as Eiffel, provide native support for expressing pre- and postconditions. C++ does not, but you can check preconditions before entering the method body. For example, let’s add an input value and impose a precondition for invoking the launch method. There are many criteria that must be met prior to launching a rocket; one is the distance to lightning sources.
If the precondition is not satisfied, there is no point in continuing; it makes no sense to call the remainder of the method (it’s not safe to launch the rocket). An exception is thrown, the failure is logged and the goLaunch method returns false. (A C++ Standard Template Library output string stream is used to construct the exception description.) Reporting the precondition failure helps to isolate the fault. The failure of a precondition may not tell you where the fault is, but it does tell you where it isn’t. There’s no need to spend time debugging the goLaunch method; the problem occurred prior to invoking goLaunch, perhaps in the software’s weather subsystem.
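A sketch of such a precondition check follows. The parameter name, the nautical-mile unit, and the threshold value are assumptions chosen for illustration, not real launch criteria; std::cerr again stands in for a logging framework.

```cpp
#include <iostream>
#include <sstream>
#include <stdexcept>

// Assumed threshold, for illustration only; not a real launch criterion.
const double kMinLightningDistanceNm = 10.0;

bool goLaunch(double dLightningDistanceNm) noexcept
{
    bool bReturn = false;

    try
    {
        // Precondition: lightning must be far enough away. Written as a
        // negated >= so that a NaN input is rejected as well.
        if (!(dLightningDistanceNm >= kMinLightningDistanceNm))
        {
            std::ostringstream oss;
            oss << "Precondition failed: lightning at "
                << dLightningDistanceNm << " nm, minimum is "
                << kMinLightningDistanceNm
                << " nm; check the weather subsystem";
            throw std::invalid_argument(oss.str());
        }

        // ... the body of the launch method would go here ...

        bReturn = true;
    }
    catch (const std::exception& e)
    {
        std::cerr << "goLaunch: " << e.what() << '\n';
    }
    catch (...)
    {
        std::cerr << "goLaunch: unknown exception\n";
    }

    return bReturn;
}
```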
Postconditions, on the other hand, tell you where the problem is when a failure occurs. A postcondition is a condition on a method that will become true when the method completes; a postcondition is what the method is guaranteed to do. If all of the method’s preconditions are satisfied, and the method’s postconditions are not met, the fault lies in this method, nowhere else.
Let’s examine a simple C++ method, takeNote, which adds a note string to a collection of notes. It is supposed to add exactly one note string, but due to a copy and paste error (you’re under pressure), a second push_back method was invoked on the vector of strings in the method body, appending two identical notes. Not the required behavior. The postcondition check will report a failure. The size of the vector on entry to the method is recorded in the integral value ulSizeOld. After the body of the method completes the postcondition is checked; the size of the vector is compared with its original size plus one. Due to the fault, this postcondition will fail; the logging system reports the failure. The fault is easy to find—it’s in this method; the “distance to fault” is short.
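The faulty method and its postcondition check might be sketched as below. The class layout and the count() accessor are assumptions; the duplicated push_back is the deliberate copy-and-paste fault.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

class CNotes
{
public:
    bool takeNote(const std::string& strNote) noexcept
    {
        bool bReturn = false;

        try
        {
            // Record the size on entry for the postcondition check.
            const std::size_t ulSizeOld = m_notes.size();

            m_notes.push_back(strNote);
            m_notes.push_back(strNote);     // the copy-and-paste fault

            // Postcondition: exactly one note was added.
            if (m_notes.size() != ulSizeOld + 1)
            {
                throw std::logic_error(
                    "Postcondition failed: takeNote must add exactly one note");
            }

            bReturn = true;
        }
        catch (const std::exception& e)
        {
            std::cerr << "takeNote: " << e.what() << '\n';
        }
        catch (...)
        {
            std::cerr << "takeNote: unknown exception\n";
        }

        return bReturn;
    }

    // Assumed accessor, added so the collection size can be observed.
    std::size_t count() const noexcept { return m_notes.size(); }

private:
    std::vector<std::string> m_notes;
};
```

With the fault in place, every call returns false and the log names takeNote as the offender. Deleting the second push_back makes the postcondition hold.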
(You can download the source code for this and other example programs.)
Pre- and postcondition expressions must be free of side effects. If performance is an issue, consider placing the pre- and postconditions within conditional compilation directives, and use them only during testing.
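One possible way to compile the checks in only for test builds is sketched here. The PRAGMATIC_CHECKS switch, the CHECK_CONTRACT macro, the free-function form of takeNote, and the non-empty-note precondition are all hypothetical.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical switch: define PRAGMATIC_CHECKS in test builds,
// for example by passing -DPRAGMATIC_CHECKS to the compiler.
#ifdef PRAGMATIC_CHECKS
#define CHECK_CONTRACT(cond, msg) \
    do { if (!(cond)) throw std::logic_error(msg); } while (false)
#else
// sizeof keeps the expression compiled but never evaluates it.
#define CHECK_CONTRACT(cond, msg) \
    do { (void)sizeof(cond); } while (false)
#endif

bool takeNote(std::vector<std::string>& notes,
              const std::string& strNote) noexcept
{
    bool bReturn = false;

    try
    {
        CHECK_CONTRACT(!strNote.empty(),
                       "Precondition failed: note must not be empty");

        const std::size_t ulSizeOld = notes.size();
        notes.push_back(strNote);

        CHECK_CONTRACT(notes.size() == ulSizeOld + 1,
                       "Postcondition failed: exactly one note must be added");

        bReturn = true;
    }
    catch (const std::exception& e)
    {
        std::cerr << "takeNote: " << e.what() << '\n';
    }
    catch (...)
    {
        std::cerr << "takeNote: unknown exception\n";
    }

    return bReturn;
}
```

This arrangement is also why the check expressions must be free of side effects: when the switch is off they are never evaluated, so any work hidden inside them would silently disappear from the release build.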
Also, if the pre- and postcondition checks distract you from the essentials of the method, an editor supporting code folding will help to minimize their visual footprint. Alternatively, you can place the checks in separate methods; for example, in C++, precondition checks may be placed within a pure virtual method within an abstract base class that defines an interface.
You would then call the base class method in your implementation to check preconditions, as shown in the following code fragments.
The CNotes class implements the INotes interface; it overrides the takeNote method and calls the INotes::takeNote method to check preconditions.
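A sketch of that arrangement, under a few assumptions: the precondition is that the note is not empty, the base-class check reports its result as a Boolean rather than by letting an exception escape, and count() is an added accessor. C++ allows a pure virtual method to have a definition, provided it is written outside the class body.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

class INotes
{
public:
    virtual ~INotes() = default;
    // Pure virtual, yet defined below: the definition checks preconditions.
    virtual bool takeNote(const std::string& strNote) noexcept = 0;
};

inline bool INotes::takeNote(const std::string& strNote) noexcept
{
    bool bReturn = false;
    try
    {
        if (strNote.empty())    // assumed precondition, for illustration
        {
            throw std::invalid_argument(
                "Precondition failed: note must not be empty");
        }
        bReturn = true;
    }
    catch (const std::exception& e)
    {
        std::cerr << "INotes::takeNote: " << e.what() << '\n';
    }
    catch (...)
    {
        std::cerr << "INotes::takeNote: unknown exception\n";
    }
    return bReturn;
}

class CNotes : public INotes
{
public:
    bool takeNote(const std::string& strNote) noexcept override
    {
        bool bReturn = false;
        try
        {
            // Delegate the precondition checks to the interface.
            if (!INotes::takeNote(strNote))
            {
                throw std::invalid_argument(
                    "takeNote: precondition check failed");
            }
            m_notes.push_back(strNote);
            bReturn = true;
        }
        catch (const std::exception& e)
        {
            std::cerr << "CNotes::takeNote: " << e.what() << '\n';
        }
        catch (...)
        {
            std::cerr << "CNotes::takeNote: unknown exception\n";
        }
        return bReturn;
    }

    std::size_t count() const noexcept { return m_notes.size(); }

private:
    std::vector<std::string> m_notes;
};
```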
But how do you know what the preconditions and postconditions for your method should be? You first need to specify what the method is supposed to do and what inputs it needs, and that information is placed in the method’s preamble.
Interface Documentation—Method Preambles
Every method has a signature, which defines its interface. But the signature alone is not enough. A method must be documented sufficiently so that there is no need for its users to examine its internal implementation in order to use it correctly. When you write the interface documentation for the method, you’re specifying what the method must do, in effect defining its requirements. Such specifications are called preambles and are placed in the C++ header file in which the method is declared. As its author, you’re the first user of this method and the preamble provides the specification for your implementation, so write the preamble before you implement the method body. The preamble for method takeNote is shown below, written to support the Doxygen documentation generation tool.
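A preamble in this style might read as follows. The @brief, @param, @pre, @post, and @return commands are standard Doxygen; the specific precondition and the wording of the contract are assumptions made for this sketch. A small implementation is included only so the example compiles on its own.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

class CNotes
{
public:
    /**
     * @brief Appends one note to the collection of notes.
     *
     * @param[in] strNote  Text of the note to add.
     *
     * @pre  strNote is not empty (assumed precondition, for illustration).
     * @post The collection holds exactly one more note than on entry.
     *
     * @return true if the note was added and the postcondition holds;
     *         false otherwise, in which case the failure has been logged.
     */
    bool takeNote(const std::string& strNote) noexcept;

    /** @brief Returns the number of notes currently held. */
    std::size_t count() const noexcept { return m_notes.size(); }

private:
    std::vector<std::string> m_notes;
};

// Implementation written against the preamble above.
inline bool CNotes::takeNote(const std::string& strNote) noexcept
{
    bool bReturn = false;
    try
    {
        if (strNote.empty())
        {
            throw std::invalid_argument("Precondition failed: empty note");
        }
        const std::size_t ulSizeOld = m_notes.size();
        m_notes.push_back(strNote);
        if (m_notes.size() != ulSizeOld + 1)
        {
            throw std::logic_error("Postcondition failed: size mismatch");
        }
        bReturn = true;
    }
    catch (const std::exception& e)
    {
        std::cerr << "takeNote: " << e.what() << '\n';
    }
    catch (...)
    {
        std::cerr << "takeNote: unknown exception\n";
    }
    return bReturn;
}
```

Each @pre and @post line maps directly onto one check in the method body, which is what makes the preamble usable as a specification.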
Doxygen supports a number of languages, including C++, Java, and Python; it generates browsable HTML documentation, a great convenience for future users of your method.
As you implement your method in accordance with its preamble, you’d probably like to conveniently exercise your precondition and postcondition checks. How can you do this? Through unit testing.
Your pre- and postcondition checks can be exercised using unit tests. A wide variety of unit test suites conforming to the xUnit architecture exists for major languages, including JUnit for Java, NUnit for C#, and CppUnit for C++. In the example below, I use the Google C++ Testing Framework.
The Test Driven Development (TDD) agile process requires that automated tests be written before the code under test; however, many developers find this difficult in practice. Fortunately, the pre- and postconditions, as well as the preamble, make writing unit tests straightforward. You can write an initial unit test exercising a nominal test case as soon as the preamble to the method is done and the basic outline of the method has been implemented. Your initial test can simply invoke the method and check the return code as shown below.
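Here is a framework-free sketch of such a nominal test, so that it compiles on its own; testTakeNoteNominal is a hypothetical name. With Google Test, the same check would sit inside a TEST body as EXPECT_TRUE(notes.takeNote(...)). The class under test is the faulty takeNote described earlier, so this test is expected to fail.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Class under test: the faulty takeNote with its postcondition check.
class CNotes
{
public:
    bool takeNote(const std::string& strNote) noexcept
    {
        bool bReturn = false;
        try
        {
            const std::size_t ulSizeOld = m_notes.size();
            m_notes.push_back(strNote);
            m_notes.push_back(strNote);     // the copy-and-paste fault
            if (m_notes.size() != ulSizeOld + 1)
            {
                throw std::logic_error(
                    "Postcondition failed: takeNote must add exactly one note");
            }
            bReturn = true;
        }
        catch (const std::exception& e)
        {
            std::cerr << "takeNote: " << e.what() << '\n';
        }
        catch (...)
        {
            std::cerr << "takeNote: unknown exception\n";
        }
        return bReturn;
    }

private:
    std::vector<std::string> m_notes;
};

// Nominal test case: invoke the method and check the Boolean return value.
bool testTakeNoteNominal()
{
    CNotes notes;
    const bool bPassed = notes.takeNote("Check the weather subsystem");
    std::cout << "takeNote nominal test: "
              << (bPassed ? "PASSED" : "FAILED") << '\n';
    return bPassed;
}
```

The test contains no knowledge of the vector's size; the postcondition inside takeNote does that work, and the test only has to look at the return value.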
The output of the test indicates that the method failed its postcondition check and did not return the expected value (true).
The nominal test case can be used to exercise the code as you fill in the body of the method. Unit tests and the method body can be written iteratively; for example, preconditions can be implemented in the method and then unit tests can be written to ensure that the preconditions detect invalid input; execute the unit test and make sure the expected exception is thrown. The postconditions essentially provide a built-in unit test on the output of the method; there is no need to duplicate the postcondition checks in the unit test code.
Once the body of your method is complete, you’ve coded the postconditions, and you have tests for the nominal test case as well as tests for checking the preconditions, write boundary condition tests, worst case tests, and special case tests [Jorgensen, Paul C. Software Testing: a Craftsman’s Approach, 2nd ed., Boca Raton: CRC Press, 2002.]. This diverges from TDD orthodoxy, but many special test cases do not become apparent until the body of the method is well along.
You now know all the individual techniques in the Pragmatic Defense. But how do you employ them systematically, especially when the pressure is on?
Damage Control Procedures
The crew of the USS Cole fought throughout the day to control flooding resulting from a massive explosive detonated by terrorists against the ship’s port side as it refueled in the Yemeni harbor of Aden in October of 2000. Despite the forty-foot long gash in the hull, by evening the crew had the flooding and other damage under control, saving the ship. How did they succeed against such odds? Damage control is practiced often in the U.S. Navy, using sophisticated simulators; fires are extinguished, masks donned, holes plugged, structures shored up. Standard processes are followed, familiar equipment is used; and that equipment is immediately at hand and always ready for use. All this so that when a vessel is in extremis, the crew will know exactly what to do, automatically. It works.
As a programmer you must often implement software when in extremis. The circumstances won’t be as dire as those faced by the Cole’s crew, of course, but fatigue, lack of sleep, and even burnout can diminish your ability to write correct code. In these circumstances, your defensive tools need to be at the ready. You also need a tried and true process to follow.
First, ensure that your project’s unit test framework, logging framework, documentation tools, and exception handling are in place. Then, follow these steps when writing a method:
Write the method signature
Write the preamble to the method (in a form compatible with Doxygen or similar tools) in the C++ header file
Implement the outline of the method (outer try/catch block, single exit point with Boolean return value, logging) or paste a standard skeleton of the method body
Do not assign the return value to true at the end of the try block as yet
Write a nominal TDD unit test case to exercise the method
Begin writing the method—iteratively.
Insert tests for preconditions
Write the method body; check return codes of invoked methods and throw exceptions when errors are encountered
When the method is far enough along, write the postconditions. (The method body need not be complete when you write the postconditions but the preconditions must be.)
Execute the nominal test case during development, review any failures in the code and correct them
When a first cut of the method is ready, assign the return code to true at the end of the outermost try/catch block
Add other test cases—invalid preconditions, worst case, boundary value, and special case tests
Shore up the method until all tests pass
The Pragmatic Defense won’t prevent you from making errors, just as the U.S. Navy’s damage control procedures can’t prevent ships from being hit. However, the Pragmatic Defense will help you to find those errors and remove them; it will reduce the “distance to fault.” You’ll still inject defects, but you won’t deliver them. Had I used the Pragmatic Defense while taking the programming test during my job interview, I might still have made that subtle coding error, but I would have been able to find it. Quickly. I implemented a solution to the programming test again, this time using the Pragmatic Defense. That subtle error? Checkmated.
The Berlin Defense gained a great deal of popularity after Kramnik used it in his victory over Kasparov. Perhaps the Pragmatic Defense will gain notoriety on your projects; it’s a great opening move.
Jim Bonang has spent over twenty years developing document management systems, medical device software, command and control systems, mission-planning systems, satellite terminals, compilers and other software systems that mustn’t fail. He has also found himself in U.S. Navy damage control simulators on more than one occasion. Jim can be reached at email@example.com.