My goodness! PSP1 comes complete with a very elaborate
script for calculating linear regression parameters and evolving a complex size and time
estimate. Program 3A is obviously intended to help, but it comes too late: I wasted
a great deal of time calculating the linear regression parameters by hand, and simply
made an educated guess on the time estimate rather than running through the process again.
This portion badly needs tools, and I recommend that anyone learning the PSP
get these tools ahead of time. I commend Humphrey on making tool construction
part of the syllabus, but each tool arrives after it has already been
needed at least once. Perhaps this is done to impress the student with the necessity of
automation, but it has the unfortunate effect of artificially inflating the time spent in
the planning phase. If the planning phase is very long on one assignment because
of hand calculations and dramatically shorter on the next thanks to automation, the
to-date time-in-phase figures will be skewed.
I'm planning on reusing the number_list and
simple_input_parser classes from program 1A. Through the PROBE estimation
script (which I think is inflating the estimate, because programs 2A and 3A
ended up significantly larger than expected), I'm estimating
143 new LOC and, through analogy and guessing,
about 2 hours of development time.
The design will differ very slightly from the
conceptual design used in planning. I will indeed reuse the number_list
class from lesson 1, but will delete the I/O functionality from it and create a
paired_number_list class which holds two
number_lists. I'm removing the I/O in order to
refactor it into a new class. Holding two lists of single numbers
should work better than adapting
the original class to hold pairs of numbers, because the original mean() functionality
stays easily encapsulated (and the linear regression calculations use the mean of
each list).
The i/o will be brought in by reusing the simple_input_parser
class from programs 2A and 3A. A third class will mix the two using multiple inheritance
(paired_number_list_reader), interpreting an i/o stream as a
set of comma-separated value lines in the form of x,y. The main program will require very minor
changes (new instantiation, new printing instructions).
Not too bad; this was a fairly straightforward adaptation of the design. I did
discover that I'd been doing the beta calculations for my estimate incorrectly; now, of
course, I have a tool to automate this for next time.
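For reference, the beta calculations as the listings in this section implement them can be written as follows, with n the number of (x, y) pairs and the barred symbols the means of the x and y lists. The summation covers only the first term of the numerator and of the denominator; the correction terms are subtracted once, afterwards.

```latex
\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i \;-\; n\,\bar{x}\,\bar{y}}
               {\sum_{i=1}^{n} x_i^2 \;-\; n\,\bar{x}^2},
\qquad
\beta_0 = \bar{y} - \beta_1\,\bar{x}
```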
The simple_input_parser class was used
unchanged from programs 2A and 3A.
number_list was used almost whole, but with the removal of the i/o features.
/*
*/
#ifndef NUMBER_LIST_H
#define NUMBER_LIST_H
#include <list>
#include <iostream>
//a class which encapsulates a list of double values, adding the features of
//mean and standard deviation
class number_list : public std::list< double >
{
public:
double sum( void ) const;
double mean( void ) const;
double standard_deviation( void ) const;
int entry_count( void ) const;
void add_entry( double new_entry );
};
#endif
/*
*/
/*
*/
#include "number_list.h"
#include <assert.h>
#include <stdlib.h>
#include <math.h>
double
number_list::sum( void ) const
{
double Result = 0;
for ( list<double>::const_iterator iter = begin();
iter != end();
++iter )
{
Result += *iter;
}
return Result;
}
double
number_list::mean( void ) const
{
assert( entry_count() > 0 );
return sum() / entry_count();
}
double
number_list::standard_deviation( void ) const
{
assert( entry_count() > 1 );
//compute the mean once rather than on every pass through the loop
const double list_mean = mean();
double sum_of_square_differences = 0;
for ( list<double>::const_iterator iter = begin();
iter != end();
++iter )
{
const double this_difference = *iter - list_mean;
sum_of_square_differences += this_difference * this_difference;
}
return sqrt( sum_of_square_differences / ( entry_count() - 1 ));
}
int
number_list::entry_count( void ) const
{
return size();
}
void
number_list::add_entry( double new_entry )
{
push_back( new_entry );
}
/*
*/
/*
*/
#ifndef PAIRED_NUMBER_LIST_H
#define PAIRED_NUMBER_LIST_H
#ifndef NUMBER_LIST_H
#include "number_list.h"
#endif
class paired_number_list : public number_list
{
public:
//modifiers
void add_entry( double x, double y );
void reset( void );
paired_number_list( void );
//number of entries
int entry_count( void ) const;
double x_sum( void ) const;
double x_mean( void ) const;
double x_standard_deviation( void ) const;
double y_sum( void ) const;
double y_mean( void ) const;
double y_standard_deviation( void ) const;
double beta_1_numerator( void ) const;
double beta_1_denominator( void ) const;
double beta_1( void ) const;
double beta_0( void ) const;
protected:
number_list m_xs;
number_list m_ys;
};
#endif
/*
*/
/*
*/
#include "paired_number_list.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
void
paired_number_list::reset( void )
{
m_xs.clear();
m_ys.clear();
}
paired_number_list::paired_number_list( void )
{
reset();
}
int
paired_number_list::entry_count( void ) const
{
REQUIRE( m_xs.entry_count() == m_ys.entry_count() );
return m_xs.entry_count();
}
void
paired_number_list::add_entry( double x, double y )
{
m_xs.add_entry( x );
m_ys.add_entry( y );
}
double
paired_number_list::x_sum( void ) const
{
return m_xs.sum();
}
double
paired_number_list::x_mean( void ) const
{
return m_xs.mean();
}
double
paired_number_list::x_standard_deviation( void ) const
{
return m_xs.standard_deviation();
}
double
paired_number_list::y_sum( void ) const
{
return m_ys.sum();
}
double
paired_number_list::y_mean( void ) const
{
return m_ys.mean();
}
double
paired_number_list::y_standard_deviation( void ) const
{
return m_ys.standard_deviation();
}
double
paired_number_list::beta_1_numerator( void ) const
{
double Result = 0;
list<double>::const_iterator x_iter;
list<double>::const_iterator y_iter;
for( x_iter = m_xs.begin(), y_iter = m_ys.begin();
(x_iter != m_xs.end()) && (y_iter != m_ys.end());
++x_iter, ++y_iter )
{
Result += (*x_iter)*(*y_iter);
}
Result -= static_cast<double>(entry_count())*x_mean()*y_mean();
return Result;
}
double
paired_number_list::beta_1_denominator( void ) const
{
double Result = 0;
list<double>::const_iterator x_iter;
list<double>::const_iterator y_iter;
for( x_iter = m_xs.begin(), y_iter = m_ys.begin();
(x_iter != m_xs.end()) && (y_iter != m_ys.end());
++x_iter, ++y_iter )
{
Result += (*x_iter)*(*x_iter);
}
Result -= static_cast<double>(entry_count())*x_mean()*x_mean();
return Result;
}
double
paired_number_list::beta_1( void ) const
{
return beta_1_numerator() / beta_1_denominator();
}
double
paired_number_list::beta_0( void ) const
{
return y_mean() - beta_1() * x_mean();
}
/*
*/
/*
*/
#ifndef READABLE_PAIRED_NUMBER_LIST_H
#define READABLE_PAIRED_NUMBER_LIST_H
#ifndef PAIRED_NUMBER_LIST_H
#include "paired_number_list.h"
#endif
#ifndef SIMPLE_INPUT_PARSER_H
#include "simple_input_parser.h"
#endif
class readable_paired_number_list : public paired_number_list, public simple_input_parser
{
public:
virtual void parse_last_line( void );
virtual void reset( void );
void clear_error_flag( void );
void check_error( bool condition, const std::string& message );
void report_error( const std::string& message );
readable_paired_number_list( void );
protected:
int line_number;
bool error_found;
};
#endif
/*
*/
/*
*/
#include "readable_paired_number_list.h"
#include <stdlib.h>
#include <iostream>
void
readable_paired_number_list::parse_last_line( void )
{
//count the line so that error reports can refer to it
++line_number;
clear_error_flag();
//split the string around the comma
std::string::size_type comma_index = last_line().find( ',' );
check_error( comma_index == std::string::npos, "No comma" );
std::string x_string = last_line().substr( 0, comma_index );
std::string y_string = last_line().substr( comma_index + 1, last_line().length() );
//get values for each double and ensure they're valid; strtod leaves
//conversion_end at the start of its input when nothing could be converted
char* conversion_end = NULL;
const char* x_start = x_string.c_str();
double new_x = strtod( x_start, &conversion_end );
check_error( conversion_end == x_start, "X invalid" );
const char* y_start = y_string.c_str();
double new_y = strtod( y_start, &conversion_end );
check_error( conversion_end == y_start, "Y invalid" );
//add the entry
if ( ! error_found )
{
std::cout << "added: " << new_x << ", " << new_y << "\n";
add_entry( new_x, new_y );
}
}
void
readable_paired_number_list::clear_error_flag( void )
{
error_found = false;
}
void
readable_paired_number_list::reset( void )
{
line_number = 0;
clear_error_flag();
}
readable_paired_number_list::readable_paired_number_list( void ) :
paired_number_list(),
simple_input_parser()
{
reset();
}
void
readable_paired_number_list::check_error( bool condition, const std::string& message )
{
if ( condition )
{
error_found = true;
report_error( message );
}
}
void
readable_paired_number_list::report_error( const std::string& message )
{
std::cerr << "input:" << line_number << ":" << message << ":" << last_line() << "\n";
}
/*
*/
/*
*/
#include <fstream>
#include <iostream>
#include <string.h>
#ifndef READABLE_PAIRED_NUMBER_LIST_H
#include "readable_paired_number_list.h"
#endif
//returns standard input when no arguments are given; otherwise prints the
//help text and returns NULL
std::istream *
input_stream_from_args( int arg_count, const char** arg_vector )
{
std::istream* Result = NULL;
if ( arg_count == 1 )
{
Result = &std::cin;
}
else
{
const char* help_text =
"PSP exercise 4A: Calculate the linear regression parameters from\n"
"a set of comma-separated values, from standard input.\n"
"Usage:\n"
"\tpsp_1a\n\n";
std::cout << help_text;
}
return Result;
}
int main( int arg_count, const char** arg_vector )
{
//get the input stream, or print the help text as appropriate
std::istream* input_stream = input_stream_from_args( arg_count, arg_vector );
if ( input_stream != NULL )
{
//read the entries from the input stream
readable_paired_number_list a_list;
a_list.set_input_stream( input_stream );
a_list.parse_until_eof();
std::cout << a_list.entry_count() << " entries processed.\n";
//output the regression parameters; at least two entries are needed,
//since a single point leaves the beta 1 denominator at zero
if ( a_list.entry_count() > 1 )
{
std::cout << "Beta 0: " << a_list.beta_0() << "\n";
std::cout << "Beta 1: " << a_list.beta_1() << "\n";
}
else
{
std::cout << "Too few entries to calculate\n";
}
}
}
/*
*/
class NUMBER_LIST
inherit
LINKED_LIST[DOUBLE];
creation {ANY}
make
feature {ANY}
line_counter: INTEGER;
sum: DOUBLE is
local
i: INTEGER;
do
from
i := lower;
until
not valid_index(i)
loop
Result := Result + item(i);
i := i + 1;
end;
end -- sum
mean: DOUBLE is
require
count > 0;
do
Result := sum / count.to_double;
end -- mean
standard_deviation: DOUBLE is
require
count > 1;
local
i: INTEGER;
this_term_difference: DOUBLE;
top_term: DOUBLE;
do
from
i := lower;
until
not valid_index(i)
loop
this_term_difference := item(i) - mean;
top_term := top_term + this_term_difference ^ 2;
i := i + 1;
end;
Result := (top_term / (count - 1)).sqrt;
end -- standard_deviation
end -- NUMBER_LIST
class PAIRED_NUMBER_LIST
creation
make
feature { NONE }
xs : NUMBER_LIST
ys : NUMBER_LIST
feature {ANY}
add_entry( new_x, new_y : DOUBLE ) is
do
xs.add_last( new_x )
ys.add_last( new_y )
std_output.put_double( new_x )
std_output.put_string( ", " )
std_output.put_double( new_y )
std_output.put_string( "%N" )
end
count : INTEGER is
do
Result := xs.count
end
x_sum : DOUBLE is
do
Result := xs.sum
end
x_mean : DOUBLE is
do
Result := xs.mean
end
x_standard_deviation : DOUBLE is
do
Result := xs.standard_deviation
end
y_sum : DOUBLE is
do
Result := ys.sum
end
y_mean : DOUBLE is
do
Result := ys.mean
end
y_standard_deviation : DOUBLE is
do
Result := ys.standard_deviation
end
entry_count : INTEGER is
do
Result := xs.count
end
beta_1_numerator : DOUBLE is
local
index : INTEGER
do
Result := 0
from
index := xs.lower
until
not ( xs.valid_index( index ) and ys.valid_index( index ) )
loop
Result := Result + xs.item(index) * ys.item( index )
index := index + 1
end
Result := Result - entry_count.to_double * x_mean * y_mean
end
beta_1_denominator : DOUBLE is
local
index : INTEGER
do
Result := 0
from
index := xs.lower
until
not ( xs.valid_index( index ) and ys.valid_index( index ) )
loop
Result := Result + xs.item(index) * xs.item( index )
index := index + 1
end
Result := Result - entry_count.to_double * x_mean * x_mean
end
beta_1 : DOUBLE is
do
Result := beta_1_numerator / beta_1_denominator
end
beta_0 : DOUBLE is
do
Result := y_mean - beta_1 * x_mean
end
make is
do
!!xs.make
!!ys.make
end
reset is
do
xs.clear
ys.clear
end
invariant
xs.count = ys.count
end -- PAIRED_NUMBER_LIST
class READABLE_PAIRED_NUMBER_LIST
inherit
SIMPLE_INPUT_PARSER
redefine
parse_last_line
end
PAIRED_NUMBER_LIST
redefine
make
end
creation
make
feature {ANY}
make is
do
Precursor
end
parse_last_line is
--read comma-separated numbers
local
comma_index : INTEGER
x_string : STRING
y_string : STRING
new_x : INTEGER
new_y : INTEGER
do
clear_error_flag
comma_index := last_line.index_of( ',' )
check_for_error( comma_index = last_line.count + 1, "No comma" )
x_string := last_line.substring( 1, comma_index - 1 )
y_string := last_line.substring( comma_index + 1, last_line.count )
check_for_error( not ( x_string.is_double or x_string.is_integer), "invalid X" )
check_for_error( not (y_string.is_double or y_string.is_integer), "invalid Y" )
if not error_flag then
add_entry( double_from_string( x_string ), double_from_string( y_string ) )
end
end
double_from_string ( string : STRING ) : DOUBLE is
require
string.is_double or string.is_integer
do
if string.is_double then
Result := string.to_double
elseif string.is_integer then
Result := string.to_integer.to_double
end
end
error_flag : BOOLEAN
clear_error_flag is
do
error_flag := false
end
check_for_error( condition : BOOLEAN; message : STRING ) is
do
if condition then
error_flag := true
report_error( message )
end
end
report_error( message : STRING ) is
do
std_error.put_string( message + ":" + last_line + "%N" )
end
end
class MAIN
creation
make
feature { ANY }
make is
local
number_list : READABLE_PAIRED_NUMBER_LIST
do
!!number_list.make
number_list.set_input( io )
number_list.parse_until_eof
--now output results
if ( number_list.count > 1 ) then
std_output.put_string( "Beta 0: " )
std_output.put_double( number_list.beta_0 )
std_output.put_string( "%NBeta 1: " )
std_output.put_double( number_list.beta_1 )
std_output.put_string( "%N" )
else
std_output.put_string( "Not enough entries for calculations" )
end --if
end
end -- MAIN
Once again, I learned my lessons on syntax errors and proper use of
components. After some time, and some searching through the header files,
I finally found the c_str() function for C++ standard library strings;
I had erroneously been using data(), which is not necessarily null-terminated.
A good book probably would have shown this to me (the header files are not
very helpful, and C++ templates are rather difficult to understand).
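A minimal sketch of the c_str() and strtod pairing described above. The helper name parse_double is hypothetical and is not part of the program listings; it only illustrates passing a terminated buffer to strtod and checking whether anything was converted.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of the 4A listings): converts text to a
// double and reports whether any characters were actually consumed.
bool parse_double( const std::string& text, double& value )
{
    // c_str() is guaranteed to return a null-terminated buffer
    const char* start = text.c_str();
    char* conversion_end = NULL;
    value = std::strtod( start, &conversion_end );
    // strtod leaves conversion_end equal to start when nothing was converted
    return conversion_end != start;
}
```

Under these assumptions, parse_double( "3.5", v ) returns true and sets v to 3.5, while parse_double( "abc", v ) returns false.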
While I did have some minor problems with the Eiffel standard
library, the HTML documentation generated by the
short utility has been very readable and extremely
useful. Some standard Eiffel library features could be more useful
(for example, class STRING comes with a split
feature, but it only splits on a fixed set of separators that does not include the
comma), but overall the library is very solid and readable.
Few problems here. The few which existed came from either poor
use of libraries or my own misunderstanding of the linear regression
algorithm.
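One quick way to catch a misreading of the algorithm is to feed the calculation points that lie on a known line and check that the slope and intercept come back. The sketch below is a stand-alone version written for that purpose; the names betas and regression_betas and the use of std::vector are illustrative assumptions, not the classes used in the listings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone regression check (not the 4A classes).
struct betas
{
    double beta_0;
    double beta_1;
};

betas regression_betas( const std::vector<double>& xs,
                        const std::vector<double>& ys )
{
    // the lists must pair up, and one point cannot define a line
    assert( xs.size() == ys.size() && xs.size() > 1 );
    const double n = static_cast<double>( xs.size() );
    double sum_x = 0, sum_y = 0, sum_xy = 0, sum_xx = 0;
    for ( std::size_t i = 0; i < xs.size(); ++i )
    {
        sum_x += xs[i];
        sum_y += ys[i];
        sum_xy += xs[i] * ys[i];
        sum_xx += xs[i] * xs[i];
    }
    const double x_mean = sum_x / n;
    const double y_mean = sum_y / n;
    // the n * mean terms are subtracted once, after the sums are complete
    const double beta_1 = ( sum_xy - n * x_mean * y_mean )
                        / ( sum_xx - n * x_mean * x_mean );
    betas result = { y_mean - beta_1 * x_mean, beta_1 };
    return result;
}
```

For the points (1,3), (2,5), (3,7), which lie on y = 2x + 1, this gives a beta 1 of 2 and a beta 0 of 1.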
Table 4-2. Test Results Format-- Program 4A (results essentially identical for Eiffel and C++ versions)
Test | Expected B0 | Expected B1 | Actual B0 | Actual B1 |
Table D8: Estimated Class vs Actual New and Changed LOC | -22.55 | 1.7279 | -22.5525 | 1.72793 |
Table D8: Estimated New and Changed LOC vs Actual New and Changed LOC | -23.92 | 1.4310 | -23.9239 | 1.42097 |
Programs 2A, 3A, 4A: Estimated New and Changed LOC vs Actual New and Changed LOC | | | 850.905 | -5.32154 |
Incidentally, the correct beta values for my own code should have
shifted my new/changed estimate from 143 LOC to a
ludicrous 569 LOC. I think that lessons 2 and 3 probably
warped things quite a bit, and I now wish that I were not including them in my data.
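The projection itself is a single line of arithmetic: the estimate is run through the regression line y = beta 0 + beta 1 * x. The sketch below shows that step; the function name projected_loc is hypothetical, and the input size used in the usage note is an assumption, since the text does not state which input produced the 569 figure.

```cpp
// Hypothetical helper: projects a size estimate through the regression
// line y = beta_0 + beta_1 * x.
double projected_loc( double beta_0, double beta_1, double estimated_loc )
{
    return beta_0 + beta_1 * estimated_loc;
}
```

Assuming the betas in the last row of Table 4-2 (850.905 and -5.32154) are the ones meant, a hypothetical input of 53 LOC gives roughly 568.9, in line with the 569 LOC figure.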
Table 4-3. Project Plan Summary
Student: | Victor B. Putz | Date: | 000106 |
Program: | Regression Beta Analyzer | Program# | 4A |
Instructor: | Wells | Language: | C++ |
Summary | Plan | Actual | To date |
LOC/Hour | ? | 41 | 41 |
Program Size | Plan | Actual | To date |
Base | 28 | 28 | |
Deleted | 0 | 6 | |
Modified | 4 | 4 | |
Added | 139 | 129 | |
Reused | 106 | 87 | 87 |
Total New and Changed | 143 | 129 | 686 |
Total LOC | 273 | 236 | 966 |
Total new/reused | 0 | 0 | 0 |
Time in Phase (min): | Plan | Actual | To Date | To Date% |
Planning | 7 | 79 | 116 | 14 |
Design | 13 | 10 | 79 | 10 |
Code | 35 | 41 | 226 | 27 |
Compile | 8 | 9 | 52 | 6 |
Test | 47 | 33 | 287 | 35 |
Postmortem | 10 | 16 | 66 | 8 |
Total | 120 | 188 | 826 | 100 |
Defects Injected | | Actual | To Date | To Date % |
Plan | | 0 | 0 | 0 |
Design | | 6 | 21 | 26 |
Code | | 9 | 55 | 68 |
Compile | | 0 | 3 | 3 |
Test | | 0 | 3 | 3 |
Total development | | 15 | 82 | 100 |
Defects Removed | | Actual | To Date | To Date % |
Planning | | 0 | 0 | 0 |
Design | | 0 | 0 | 0 |
Code | | 4 | 19 | 23 |
Compile | | 9 | 35 | 43 |
Test | | 2 | 28 | 34 |
Total development | | 15 | 82 | 100 |
After Development | | 0 | 0 | |
Eiffel version
Time in Phase (min) | Actual | To Date | To Date % |
Code | 28 | 115 | 43 |
Compile | 7 | 63 | 24 |
Test | 15 | 89 | 33 |
Total | 50 | 267 | 100 |
Defects Injected | Actual | To Date | To Date % |
Design | 0 | 4 | 8 |
Code | 7 | 45 | 90 |
Compile | 0 | 0 | 0 |
Test | 1 | 1 | 2 |
Total | 8 | 50 | 100 |
Defects Removed | Actual | To Date | To Date % |
Code | 0 | 1 | 2 |
Compile | 4 | 28 | 56 |
Test | 4 | 21 | 42 |
Total | 8 | 50 | 100 |
Table 4-4. Time Recording Log
Student: | Victor B. Putz | Date: | 000106 |
| | Program: | 4A |
Start | Stop | Interruption Time | Delta time | Phase | Comments |
000106 17:18:21 | 000106 18:38:07 | 0 | 79 | plan | |
000107 10:12:58 | 000107 10:23:24 | 0 | 10 | design | |
000107 10:29:27 | 000107 11:10:40 | 0 | 41 | code | |
000107 11:12:34 | 000107 11:21:44 | 0 | 9 | compile | |
000107 11:21:50 | 000107 11:55:36 | 0 | 33 | test | |
000107 11:56:10 | 000107 12:12:38 | 0 | 16 | postmortem | |
Table 4-5. Time Recording Log
Student: | Victor Putz | Date: | 000107 |
| | Program: | 4A |
Start | Stop | Interruption Time | Delta time | Phase | Comments |
000107 12:17:13 | 000107 12:44:52 | 0 | 27 | code | |
000107 12:44:56 | 000107 12:51:35 | 0 | 6 | compile | |
000107 12:51:48 | 000107 13:06:41 | 0 | 14 | test | |
Table 4-6. Defect Recording Log
Student: | Victor B. Putz | Date: | 000106 |
| | Program: | 4A |
Defect found | Type | Reason | Phase Injected | Phase Removed | Fix time | Comments |
000107 10:34:11 | ic | om | design | code | 1 | forgot to design a constructor |
000107 10:58:03 | ic | om | design | code | 3 | no line counter in new i/o section; added |
000107 11:01:22 | ic | om | design | code | 1 | Added error reporting |
000107 11:04:12 | ic | om | design | code | 2 | added error flag for more beneficial error reporting |
000107 11:12:37 | wn | ty | code | compile | 0 | mistyped "readable_paired_number_list" |
000107 11:14:00 | sy | ty | code | compile | 0 | missed semicolon at end of class declaration |
000107 11:14:45 | wn | ty | code | compile | 0 | typed a_liste instead of a_list |
000107 11:15:24 | sy | om | code | compile | 0 | forgot const on end of method implementation |
000107 11:16:11 | sy | om | code | compile | 0 | forgot to #include required file |
000107 11:17:18 | sy | ig | code | compile | 1 | could not declare two iterators in for clause; moved declarations out |
000107 11:19:43 | sy | ty | code | compile | 0 | forgot parentheses around no-argument feature |
000107 11:20:14 | wn | om | code | compile | 0 | Forgot to prepend class name to clear_error method |
000107 11:21:17 | sy | om | code | compile | 0 | forgot to declare constructor in class declaration (only added body in cpp file) |
000107 11:25:49 | wn | ig | design | test | 5 | was using data() instead of c_str() to get c-readable string from the std::string class. |
000107 11:35:00 | wa | ig | design | test | 15 | Incorrectly assumed that the summation symbol applied to the whole equation, not just the first part. |
Table 4-7. Defect Recording Log
Student: | Victor Putz | Date: | 000107 |
| | Program: | 4A |
Defect found | Type | Reason | Phase Injected | Phase Removed | Fix time | Comments |
000107 12:45:00 | ic | om | code | compile | 1 | forgot to add count feature |
000107 12:47:02 | sy | om | code | compile | 0 | forgot to add "Redefine make" clause |
000107 12:48:00 | wn | ig | code | compile | 0 | used wipe_out instead of clear to empty linked list |
000107 12:50:38 | mc | ig | code | compile | 0 | forgot to call precursor in make |
000107 12:53:05 | iv | ig | code | test | 0 | forgot-- Eiffel strings start at 1, not 0 |
000107 12:54:58 | wa | ig | code | test | 3 | forgot that to_double doesn't work with straight integers; added double_from_string feature |
000107 12:58:43 | wn | ty | code | test | 4 | accidentally typed ys instead of xs |
000107 13:04:37 | sy | ty | test | test | 0 | test data did not have a terminating newline after the last data entry |