The Personal Software Process: an Independent Study | ||
---|---|---|
Prev | Chapter 1. Lesson 1: Introduction to the Personal Software Process | Next |
Using a linked list, write a program to calculate the mean and standard deviation of a set of data.
From [Humphrey95]:
Write a program to calculate the mean and standard deviation of a series of n real numbers. The mean is the average of the numbers. The formula for the standard deviation is given in [Humphrey95], and is described on page 508 of the text.
Use a linked list to hold the n numbers for the calculations.
Program 1A Testing: Thoroughly test the program. At least three of the tests should use the data in each of the three columns on the right of the following table:
Table 1-5. Sample data for program 1A: Pascal Object Size and Development Hours
Program Number Object LOC New and Changed LOC Development Hours 1 160 186 15.0 2 591 699 69.9 3 114 132 6.5 4 229 272 22.4 5 230 291 28.4 6 270 331 65.9 7 128 199 19.4 8 1657 1890 198.7 9 624 788 38.8 10 1503 1601 138.2 Standard Dev 572.03 625.63 62.26
The basic problem description does not give us a great deal of direction, which means we're free to work as we see fit. With that in mind, I'll iron out a few requirements ambiguities.
The problem statement does not specify how we get numbers into the program; with that in mind, I'll decide that the program by default accepts a list of ascii-text real numbers from standard input, terminated by an EOF. This will allow both interactive input for testing as well as redirection via standard unix pipes, etc. We will also allow the input to be retrieved from a file using the -f flag on the command line, i.e. psp_1a -f numbers.txt.
The problem statement does not specify what to do in the case of invalid input; since we will be using the program primarily for automated i/o on files, we will treat invalid input as unrecoverable; the program should display an error message with the offending line number and filename, if applicable. Since I am developing under Linux with EMACS, the error message format will be in the standard emacs-style error output of file:line:message.
Without any other guidance than a guess, I'll estimate 1.5 hours (90 minutes) for this program.
The basic design is fairly simple, with a single class (number_list), which inherits from the STL template class list< double >. After subclassing list< double >, we add two features for reading the list from an input stream (read_from_stream and read_entry_from_stream), one feature (add_entry) for simply adding a number to the list (to encapsulate some of the STL functionality), and several read-only constant attributes: sum (the sum of all elements), mean, standard_deviation, and entry_count (to count the number of entries, again just encapsulating some STL functionality)
The code was straightforward, although it went through several iterations in the test phase before turning into anything useful. The completed code is below:
/* */ #ifndef NUMBER_LIST_H #define NUMBER_LIST_H #include <list> #include <iostream> //a class which encapsulates a list of double values, adding the features of //mean and standard deviation class number_list : public list< double > { public: void read_from_stream( istream& input ); void read_entry_from_stream( istream& input, int line_number ); double sum( void ) const; double mean( void ) const; double standard_deviation( void ) const; int entry_count( void ) const; void add_entry( double new_entry ); }; #endif /* */ |
/* */ #include "number_list.h" #include <assert.h> #include <stdlib.h> #include <math.h> void number_list::read_from_stream( istream& input ) { int line_number = 1; while ( !(input.eof()) ) { read_entry_from_stream( input, line_number ); } } void number_list::read_entry_from_stream( istream& input, int line_number ) { double next_entry; const int input_buffer_size = 255; char input_buffer[ input_buffer_size ]; input.getline( input_buffer, input_buffer_size ); //now convert to floating point using strtod char* conversion_end = NULL; next_entry = strtod( input_buffer, &conversion_end ); if ( conversion_end != input_buffer ) { add_entry( next_entry ); } else if ( !(input.eof()) ) { cerr << "(unknown file):" << line_number << ":Error in conversion to double\n"; } } double number_list::sum( void ) const { double Result = 0; for ( list<double>::const_iterator iter = begin(); iter != end(); ++iter ) { Result += *iter; } return Result; } double number_list::mean( void ) const { assert( entry_count() > 0 ); return sum() / entry_count(); } double number_list::standard_deviation( void ) const { assert( entry_count() > 1 ); double sum_of_square_differences = 0; for ( list<double>::const_iterator iter = begin(); iter != end(); ++iter ) { const double this_square_difference = *iter - mean(); sum_of_square_differences += this_square_difference * this_square_difference; } return sqrt( sum_of_square_differences / ( entry_count() - 1 )); } int number_list::entry_count( void ) const { return size(); } void number_list::add_entry( double new_entry ) { push_back( new_entry ); } /* */ |
/* */ #include <fstream> #include <iostream> #include "string.h" #ifndef NUMBER_LIST_H #include "number_list.h" #endif istream * input_stream_from_args( int arg_count, const char** arg_vector ) { istream* Result = NULL; if ( arg_count == 1 ) { Result = &cin; } else { const char* help_text = "\ PSP exercise 1A: Get the mean and standard deviation of a list\n of ASCII real numbers, from either standard input or a text\n file.\n\n\ Usage: \n \ \tpsp_1a\n\n"; cout << help_text; } return Result; } int main( int arg_count, const char** arg_vector ) { //get the input stream, or print the help text as appropriate istream* input_stream = input_stream_from_args( arg_count, arg_vector ); if ( input_stream != NULL ) { //read the entries from the input stream number_list a_list; a_list.read_from_stream( *input_stream ); cout << a_list.entry_count() << " numbers processed.\n"; //output the mean, as appropriate if ( a_list.entry_count() > 0 ) { cout << "Mean: " << a_list.mean() << "\n"; } else { cout << "Too few entries to calculate mean\n"; } //output the standard deviation, as appropriate if ( a_list.entry_count() > 1 ) { cout << "Standard Deviation: " << a_list.standard_deviation() << "\n"; } else { cout << "Too few entries to calculate standard deviation.\n"; } } } /* */ |
The compilation did reveal a few errors, mostly mistyped names and developer omissions (forgetting to include needed header files, etc). Otherwise, no significant problems here.
The design and code sections for this went so easily that I confess I was disappointed by the test results. I made several simple mistakes, but these mistakes formed the bulk of my time spent developing what should have been a very simple program. Some defects were minor and easy to fix-- for example, the number_list::read_entry_from_stream feature added an entry even for an invalid line, even processing the EOF signal as a number. More significantly, the original design called for a naive use of the C++ iostream capability, in other words, usage like this:
while ( !a_stream.eof() ) { a_stream >> new_entry; } |
The only problem with this is that apparently it will never generate an EOF signal at all, a fact of which I wasn't aware. I converted that section to use getline, which worked acceptably.
The most significant problem, though, resulted in some deviations from the original design spec: I was unable to use the standard C++ library stream classes to open a file from disk! In fact, even creating the input stream object resulted in a segment fault interrupt:
ifstream* a_stream = new ifstream() |
In fairness to myself, it has been some time since I've used the standard C++ stream library, but this behaviour did not mesh with my previous experience. A brief experiment with statically allocated streams had similar results when I attempted to check for EOF conditions.
I'll pursue this in the mean time, to check whether or not my unfamiliarity with the library resulted in these problems or if my compiler is simply acting peculiar. C++ is not always well supported in Linux, and I have had problems with their standard library implementation in the past. For the present, though, I will stick with standard input.
The program was tested with three files which contained the sample data from [Humphrey95], and one sample file with errors to check the error-reporting capability.
class NUMBER_LIST inherit LINKED_LIST[DOUBLE]; creation {ANY} make feature {ANY} line_counter: INTEGER; read_from_stream(input: INPUT_STREAM) is do line_counter := 0; from until input.end_of_input loop read_entry_from_stream(input); end; end -- read_from_stream report_error(error_type: STRING) is do std_error.put_string("(unknown file):"); std_error.put_integer(line_counter); std_error.put_string(error_type); std_error.put_new_line; end -- report_error read_entry_from_stream(input: INPUT_STREAM) is do input.read_line; if not input.end_of_input then if input.last_string.is_double then add_last(input.last_string.to_double); elseif input.last_string.is_integer then add_last(input.last_string.to_integer.to_double); else report_error("Not a double"); end; end; end -- read_entry_from_stream sum: DOUBLE is local i: INTEGER; do from i := lower; until not valid_index(i) loop Result := Result + item(i); i := i + 1; end; end -- sum mean: DOUBLE is require count > 0; do Result := sum / count.to_double; end -- mean standard_deviation: DOUBLE is require count > 1; local i: INTEGER; this_term_difference: DOUBLE; top_term: DOUBLE; do from i := lower; until not valid_index(i) loop this_term_difference := item(i) - mean; top_term := top_term + this_term_difference ^ 2; i := i + 1; end; Result := (top_term / (count - 1)).sqrt; end -- standard_deviation end -- NUMBER_LIST |
class MAIN creation make feature { ANY } make is local number_list : NUMBER_LIST do !!number_list.make number_list.read_from_stream( io ) --now output results if ( number_list.count > 0 ) then std_output.put_string( "Mean: " ) std_output.put_double( number_list.mean ) else std_output.put_string( "Not enough entries for mean" ) end --if std_output.put_new_line if ( number_list.count > 1 ) then std_output.put_string( "Standard Deviation: " ) std_output.put_double( number_list.standard_deviation ) else std_output.put_string( "Not enough entries for standard deviation" ) end --if std_output.put_new_line end end-- MAIN |
A fairly simple tallying of time spent in phases and defects injected and removed. The forms used are listed in the following sections
Table 1-6. Project Plan Summary
Student: | Victor B. Putz | Date: | 991213 |
Program: | Standard Deviation | Program# | 1A |
Instructor: | Wells | Language: | C++ |
Time in Phase (min): | Plan | Actual | To Date | To Date% |
Planning | 14 | 14 | 100 | |
Design | 17 | 17 | 100 | |
Code | 24 | 24 | 100 | |
Compile | 5 | 5 | 100 | |
Test | 51 | 51 | 100 | |
Postmortem | 3 | 3 | 100 | |
Total | 90 | 114 | 114 | 100 |
Defects Injected | Actual | To Date | To Date % | |
Design | 0 | 0 | 100 | |
Code | 8 | 8 | 100 | |
Compile | 1 | 1 | 100 | |
Test | 3 | 3 | 100 | |
Total development | 12 | 12 | 100 | |
Defects Removed | Actual | To Date | To Date % | |
Planning | 0 | 0 | 100 | |
Design | 0 | 0 | 100 | |
Code | 0 | 0 | 100 | |
Compile | 7 | 7 | 100 | |
Test | 5 | 5 | 100 | |
Total development | 12 | 12 | 100 | |
After Development | 12 | 12 |
Eiffel code/compile/test |
Time in Phase (min) | Actual | To Date | To Date % |
Code | 15 | 45 | 100 |
Compile | 10 | 30 | 100 |
Test | 8 | 25 | 100 |
Total | 33 | 33 | 100 |
Defects Injected | Actual | To Date | To Date % |
Code | 6 | 6 | 100 |
Compile | 0 | 0 | 0 |
Test | 0 | 0 | 0 |
Total | 6 | 6 | 100 |
Defects Removed | Actual | To Date | To Date % |
Code | 0 | 0 | 0 |
Compile | 3 | 3 | 50 |
Test | 3 | 3 | 50 |
Total | 6 | 6 | 100 |
Table 1-7. Time Recording Log
Student: | Victor B. Putz | Date: | 991219 |
Instructor: | Wells | Program# | 1A |
Date | Start | Stop | Interruption time | Delta time | Phase | Comments |
991213 | 1558 | 1612 | 14 | plan | ||
1612 | 1622 | 10 | design | Setting up environment (automake etc) | ||
1623 | 1630 | 7 | design | Class design, etc. | ||
1630 | 1654 | 24 | code | |||
1655 | 1701 | 1 | 5 | compile | 1 interruption (phone call) | |
1702 | 1753 | 51 | test | problems with "new ifstream" construct | ||
1828 | 1831 | 3 | postmortem | does not include sgml writeups |
Table 1-8. Defect Reporting Log
Student: | Victor B. Putz | Date: | 991219 |
Instructor: | Wells | Program# | 1A |
Date | Number | Type | Inject | Remove | Fix time | Fix defect | Description |
991219 | 1 | 20 | code | compile | 1 | Mistyped "input_stream" instead of "istream" | |
991219 | 2 | 210 | code | compile | 1 | forgot to include "fstream.h" | |
991219 | 3 | 20 | code | compile | 1 | attempted non-const operation on const reference | |
991219 | 4 | 20 | code | compile | 1 | attempted use of ^ operator on doubles | |
991219 | 5 | 20 | code | compile | 1 | mistyped input_stream instead of istream (again) | |
991219 | 6 | 20 | compile | compile | 1 | ? (forgot to put a description on the form) | |
991219 | 7 | 50 | code | compile | 1 | changed return type of main() unacceptably | |
991219 | 8 | 50 | test | test | 36 | Extreme difficulty dealing with ifstream-- eventually terminated | |
991219 | 9 | 50 | test | test | 1 | Processed EOF as a number (misuse of atof) | |
991219 | 10 | 80 | code | test | 1 | forgot to take sqrt of partial result in standard deviation | |
991219 | 11 | 50 | code | test | 3 | forgot to add error message in number_list::read_entry_from_stream | |
991219 | 12 | 20 | test | test | 1 | 11 | Forgot to update call to number_list::read_entry_from_stream in number_list::read_from_stream after adding a parameter. |
Eiffel |
991220 | 1 | 20 | code | compile | 1 | Forgot "then" on if-then block | |
991220 | 2 | 20 | code | compile | 1 | Numerous-- used "=" instead of ":=" for assignment | |
991220 | 3 | 20 | code | compile | 3 | couldn't figure out how to redefine make and call precursor | |
991220 | 4 | 80 | code | test | 1 | wasn't allowing for integer-style (no decimal) entries in data | |
991220 | 5 | 80 | code | test | 1 | wasn't incrementing indices in loops | |
991220 | 6 | 80 | code | test | 1 | Erroneously typed "-" instead of "/" for standard dev calc. |