Understand the basics of design verification. Gain exposure to some of the more straightforward verification techniques. Comprehend issues in program verification.
Read chapter 12 of the textbook
Write report R5, final report
Chapter 12 of the text, on design verification, is frankly rather dense. It covers a variety of design verification methods, including loop verification, state machine verification, symbolic execution, proof by induction, trace and execution tables, and formal verification. With all these tools (most of them seem very labor-intensive!), it's possible to spend lots of time verifying a design, and I was glad to see Humphrey's advice to base verification methods on defect experience ("If you do not have problems with repeat-until loops, for example, there would be little point in spending additional time verifying them" [Humphrey95]).
This business of design verification speaks to me-- but I wish it would speak more loudly and plainly; I'm having trouble internalizing the practical use of these methods. I understand that "whenever you design complex logic, you are likely to make errors"-- what's less clear to me is how cost-effective these methods are. I also feel that Humphrey spends a great deal of time on the mechanics of each method (trace tables, execution tables, and so on), but not so much on how to design a system, which (in my opinion) is extremely important as programming projects scale up.
The problem of scale is addressed very briefly by Humphrey, who delivers the following bad news:
"...all current verification methods... are time consuming. As programs get larger and more complex, verification methods also get more complex. Thus, as your need to use these methods increases they are progressively less usable." | ||
--[Humphrey95] |
A brief rundown of Humphrey's listed methods follows:
Symbolic execution is given a few short paragraphs in the text, and amounts to a stepwise execution of a routine's logic, with symbolic placeholders standing in for the value of each variable. It can be used to test the basic logic of fairly simple routines; Humphrey shows an example of a routine which shuffles the contents of three variables, together with a table listing the instructions on the left and the contents of each variable as each instruction is executed. This could indeed be handy for checking what's happening to variables over time. Unfortunately, it's not clear how to use symbolic execution with looping/branching structures.
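To make this concrete, here is a toy sketch of the idea (the three-variable rotation and the table format are my own illustration, not Humphrey's exact example): each instruction is applied to symbolic values, and the state is printed after every step, one row per instruction, just as in his table.

```python
# A minimal sketch of symbolic execution, assuming a toy routine that
# rotates three variables. The values "x", "y", "z" are symbols, not
# concrete data; each row shows the state after one instruction.
state = {"a": "x", "b": "y", "c": "z"}

instructions = [
    ("t = a", lambda s: dict(s, t=s["a"])),
    ("a = b", lambda s: dict(s, a=s["b"])),
    ("b = c", lambda s: dict(s, b=s["c"])),
    ("c = t", lambda s: dict(s, c=s["t"])),
]

print(f"{'instruction':<12} {'a':<3} {'b':<3} {'c':<3} {'t':<3}")
for text, step in instructions:
    state = step(state)
    print(f"{text:<12} {state['a']:<3} {state['b']:<3} "
          f"{state['c']:<3} {state.get('t', '-'):<3}")
# The final row reads a=y, b=z, c=x: the rotation is verified for
# *any* initial contents, which is the point of using symbols.
```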
If symbolic execution is most useful for monitoring the contents of variables, it is interesting to wonder how much stateless, variable-free languages (such as Haskell) simplify this kind of verification-- there are no variables whose contents can be lost. Of course, the logic is generally still there, but it's an interesting thought.
Proof by induction extends a single case into a general rule. It is used often in mathematics and, according to Humphrey, can be applied
"...to logical expressions with integer parameters. It can be stated as follows: 1. If an expression f(n) is true for some integer value n0 2. and if when n > n0, f(n) true implies f( n + 1 ) is also true, 3. then f(k) is true for all k >= n0 | ||
--[Humphrey95] |
I can see how this could be a big help in some areas, proving an algorithm correct for all numbers. But again, applying it to everyday routines is considerably less straightforward.
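For reference, here is the schema applied to the standard textbook example (my own illustration, not one of Humphrey's): the closed form for the sum of the first n integers, f(n): 1 + 2 + ... + n = n(n+1)/2.

```latex
Base case ($n_0 = 1$): $f(1)$ holds, since $1 = \frac{1(1+1)}{2}$.

Inductive step: assume $f(n)$ for some $n \ge 1$. Then
\[
  1 + 2 + \cdots + n + (n+1) \;=\; \frac{n(n+1)}{2} + (n+1)
  \;=\; \frac{(n+1)(n+2)}{2},
\]
which is exactly $f(n+1)$. By the rule above, $f(k)$ is true for all $k \ge 1$.
```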
Object state machine verification is the check that (see the code sketch after this list):
The state machine has no hidden traps or loops (it cannot get stuck in an endless loop and never reach a return state).
It is complete (all possible states have been identified, for every combination of the attribute values).
It is orthogonal (for every set of conditions there is one and only one possible state).
The transitions are complete and orthogonal (from every state, a unique next state is defined for every possible combination of input values).
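These checks are mechanical enough to sketch in code. The following is my own illustration (the stack-like machine, its states, and its inputs are invented for the example, not taken from the text):

```python
# A minimal sketch of checking a state machine for the properties
# above: completeness, orthogonality, and freedom from trap states.
states = {"Empty", "Partial", "Full"}
inputs = {"push", "pop"}

# transitions[(state, input)] -> next state
transitions = {
    ("Empty", "push"): "Partial",
    ("Empty", "pop"): "Empty",
    ("Partial", "push"): "Full",
    ("Partial", "pop"): "Empty",
    ("Full", "push"): "Full",
    ("Full", "pop"): "Partial",
}

# Complete and orthogonal transitions: exactly one defined next state
# for every (state, input) pair, and every next state is a known state.
for s in states:
    for i in inputs:
        assert (s, i) in transitions, f"missing transition: ({s}, {i})"
assert all(t in states for t in transitions.values()), "unknown next state"

# No hidden traps or loops: every state can reach the return state.
def reachable(start):
    seen, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        for i in inputs:
            nxt = transitions[(s, i)]
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

assert all("Empty" in reachable(s) for s in states), "trap state found"
print("state machine checks passed")
```

Note that the code can only verify the machine you wrote down; checking that all possible states have actually been identified still takes human judgment.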
Unfortunately, the text, while thorough in its worked examples, is more distant when it comes to state machine design. When should you list explicit states? When is a state machine a good choice? Admittedly, this section is primarily concerned with verifying an existing state machine design rather than the theory of using state machines, but it could use some improvement.
Execution tables check a run through a routine with a given set of parameters, checking the behaviour of the logic as well as variable values throughout. I'm not at all clear on the difference between execution tables and symbolic execution, except that execution tables are run with sample values for each variable instead of symbols as placeholders. This could be useful for conditional logic: while symbolic execution will trace transformational logic (variable swaps, calculations, etc.), it has a hard time dealing with conditionals or loops. Execution tables seem to be basically the same as stepping through code with a debugger-- only without the debugger. This is very labor-intensive, although I recognize it as the same process I use when staring at code and running through it "in my head"-- a process which usually skims over errors because it's not meticulous. Doing this on paper would help, although it takes a great deal of time.
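Here is the same table idea with concrete values (again my own toy routine, not the text's): each row records the step, the value examined, which branch fired, and the variables afterward.

```python
# A minimal sketch of an execution table: concrete sample values are
# pushed through a small routine (summing only the even numbers in a
# list), and every step is recorded, including the branch taken.
def traced_even_sum(values):
    rows = []
    total = 0
    for i, v in enumerate(values):
        took_branch = (v % 2 == 0)  # the conditional under test
        if took_branch:
            total += v
        rows.append((i, v, took_branch, total))
    return total, rows

total, rows = traced_even_sum([3, 4, 7, 10])
print(f"{'step':<5} {'v':<4} {'even?':<6} {'total':<6}")
for i, v, took, t in rows:
    print(f"{i:<5} {v:<4} {str(took):<6} {t:<6}")
# Unlike the symbolic trace, this run covers only one input; verifying
# the conditional in general takes more cases (or a trace table).
```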
"Trace tables," writes Humphrey, "often are more efficient for verifying logic correctness than are execution tables." I'm not sure I understand that. Trace tables seem to be a combination of symbolic execution, examination of all possible cases, and proof by induction.
Unfortunately, Humphrey's presentation is confusing here, and trace tables seem to be almost identical to execution tables (which seemed almost identical to symbolic execution!). What I glean from his discussion is that you identify all possible paths through the logic, build an abstract test case from each, and combine them. His example uses a whitespace stripper, which is passed a string containing 0, 1, or j leading spaces; 0, 1, or m message characters; and 0, 1, or n trailing spaces, where j, m, and n are any positive integer values. This leads to four possible values for each variable (you must test for 0, 1, an arbitrary value q, and q+1, the latter two so that you can use proof by induction). That's a fair number of test cases for a fairly simple function-- Humphrey does 27 (ignoring the q+1 cases, which are handled within each case involving j, m, or n, this is 3^3, or 27, cases). Of course, some of the logic could be modularized, reducing the combinatorial test cases.
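The enumeration itself is simple to sketch (the variable names j, m, and n follow Humphrey's example; the code is my own):

```python
# A minimal sketch of enumerating trace-table cases for the
# whitespace-stripper example: j (leading spaces), m (message
# characters), and n (trailing spaces) are each tested at 0, 1, and
# an arbitrary symbolic value q, giving 3**3 = 27 combinations.
from itertools import product

cases = list(product([0, 1, "q"], repeat=3))
assert len(cases) == 27

for j, m, n in cases[:5]:  # show the first few
    print(f"leading={j}, message={m}, trailing={n}")
print(f"... {len(cases)} cases in total")
```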
You then run each of the possible test cases through an execution table, applying proof by induction where possible to extend a case to all possible values. It's a torturous path, but it does provide good, solid verification of a function. This is a technique I could have used quite a bit in program 10a (as it turns out, this chapter is usually read before completion of program 10a; my fault for doing things out of order).
These methods take time-- rather a lot of it. I do think that this sort of thought would have helped me reduce my defect count in program 10a, for example. But by how much? It's difficult to tell, and even more difficult to tell if the results would be cost-effective. In any case, they are good tools to add to the designer's toolbox, and I'm glad to have them.