The Personal Software Process: an Independent Study

Chapter 10. Lesson 10: Software Design, part II
Using a linked list, write a program to do a chi-squared test for a normal distribution.
Requirements: Write program 9a to calculate the degree to which a string of n real numbers is normally distributed. The methods and formulas used in this calculation are explained on page 529 [of the text]. Use program 5a to calculate the values of the normal and chi-squared distributions. Assume that n is > 20 and an even multiple of 5. Use program 8a to sort the numbers in ascending order.

Testing: Thoroughly test the program. As one test, use the LOC/Method data in table D14 as a test case [note: this is the same data from program 8a]. Here, the result should be Q=34.4 with a probability value < 0.005 that the data are normally distributed. The solution to this case is shown [in the text]. Submit a test report that includes the test results and uses the format in table D15.

Table 10-1. D15: Test Results Format -- program 9a [format table not reproduced; table 10-2 below shows the results in this format]

--[Humphrey95]
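For reference (and this is exactly what the number_array_list code later on this page implements), the calculation runs as follows: the n data points are converted to standard scores, (x - mean)/standard deviation; the real line is divided into S = n/5 segments of equal probability under the standard normal distribution; and the test statistic is

Q = sum over i = 1..S of ( n_i - n/S )^2 / ( n/S )

where n_i is the number of standardized points that fall in segment i. Q is then compared against the chi-squared distribution with S-1 degrees of freedom: 1-p is one minus the integral of that density from 0 to Q, so a small 1-p means the data are unlikely to be normally distributed.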
I'll make heavy reuse of the parsing and number array list classes used in program 8a, which means that the data input format will remain the same (note-- in program 8a, I suggested that the first usable line of the input file be a single number, the number of fields in the file; since the number of fields can be extrapolated from the first historical data entry, this requirement has been removed).
So much for getting the numbers into the program. Where we run into a problem is the next bit-- which series are we going to check for normality? If this is going to be reused, how shall we decide which series to use?
The simplest response would be to check all the series, but I'll do something slightly different-- after the historical data terminator, the input will consist of commands and numbers; in other words, something like this:
...
58, 19.33
305, 25.42
stop                           -- end of data
normality-check-on-series 1    -- request a normality check on the first series
This will allow us to add further commands in the future, paving the way for some minor expandability (considering that there is only one more program in this series, that may be unnecessary).
When the normality-check-on-series command is encountered, the program will perform a normality check on the given series (series 1 being the first set of numbers, and so on), printing the Q and 1-p results, thusly:
Normality check on series: 1
Q: 34.4
(1-p): 7.6e-5
To do the size estimate-- and to make it worthwhile-- I'm going to do a bit more of a conceptual design than I have earlier, listing my appropriate classes and the methods I think I'll need. The algorithm in use in program 9a is considerably more complex than those used in other programs (or at least it seems to me), so I need to do more preparation here.
A quick conceptual design using Dia gives us the following:

[diagram: preliminary class design, drawn in Dia]
Note the heavy use of single_variable_function subclasses (it's not depicted well, but each class in the preliminary design except for number_array_list_2 and number_array_list_parser_2 is a subclass of single_variable_function). The normal_distribution_inverse class will be used to calculate the normal distribution segment boundaries for S segments. The text lists these for certain values of S (table A24 in [Humphrey95]), and we could simply use that table (since the requirements guarantee that n is a multiple of 5), but computing the inverse directly will be more general.
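As a quick sanity check on those boundaries (this snippet is not part of program 9a; erf() here is just the C library's error function), the equal-probability segment limits can be reproduced independently with a simple bisection on the standard normal CDF; the values printed should agree with the text's segment table:

#include <math.h>
#include <stdio.h>

//standard normal cumulative distribution, via the C99 erf() function
double
normal_cdf( double x )
{
  return 0.5 * ( 1.0 + erf( x / sqrt( 2.0 ) ) );
}

//inverse of the standard normal CDF, found by bisection
double
normal_inverse( double p )
{
  double low = -10.0;
  double high = 10.0;
  for ( int i = 0; i < 100; ++i )
  {
    const double mid = ( low + high ) / 2.0;
    if ( normal_cdf( mid ) < p )
    {
      low = mid;
    }
    else
    {
      high = mid;
    }
  }
  return ( low + high ) / 2.0;
}

int
main( void )
{
  const int s = 4;  //number of equal-probability segments (s = n/5)
  for ( int i = 1; i < s; ++i )
  {
    printf( "boundary %d of %d: %f\n", i, s, normal_inverse( static_cast< double >( i ) / s ) );
  }
  return 0;  //prints roughly -0.674, 0.000, 0.674
}

The first and last segments are open-ended, which is why the program itself substitutes large sentinel values for their outer limits.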
With that in mind, our size estimating template gives us a PROBE-generated estimate of 239 new/changed LOC.
Historical data gives us a PROBE-generated estimate of 285 minutes for this project.
Once again, I'm using a cluster of tiny single_variable_function classes to do much of the work. I also reconstruct the "by-hand" tables as per the algorithm in [Humphrey95]-- not because they are necessary for the computation, but because they will simplify testing by showing results similar to those in the text. Our preliminary design is fleshed out as follows:

[diagram: fleshed-out class design]
Actually, not all that many defects were caught by the design review this time around, and few of those were show-stoppers. The basic design seems reasonable.
The error_log, gamma_function, is_double_equal, normal_function_base, simple_input_parser, simpson_integrator, square, and whitespace_stripper modules were used in their entirety. Much code from number_array_list and number_array_list_parser was reused, but those classes are reprinted here since much was changed in number_array_list.
Part of the design had to be changed because I had originally planned to inherit from number_array_list to create number_array_list_2-- unfortunately, many of the functions in number_array_list (of course) returned values of type number_array_list rather than number_array_list_2, meaning that I couldn't chain functions together as easily as I wanted, which was unacceptable; as a result, I just expanded the original class. I did not have a chance to verify whether "like Current" in Eiffel would have fixed these problems (it may well have done so!)
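One more note before the listings: the single_variable_function base class is among the reused modules and is not reprinted below. All the subclasses here assume of it is a single virtual at() method; a minimal sketch of that interface (an illustration, not the actual reused header) looks like this:

#ifndef SINGLE_VARIABLE_FUNCTION_H
#define SINGLE_VARIABLE_FUNCTION_H
//sketch only-- the real single_variable_function header is reused from the
//earlier programs and is not reprinted on this page
class single_variable_function
{
public:
  //value of the function at x; subclasses such as adder, square, normalizer,
  //and the distribution classes below redefine this
  virtual double at( double x ) const = 0;
  virtual ~single_variable_function() {}
};
#endif

This is what lets number_array_list::mapped_to take any of these little function objects and apply it to one field of every entry.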
#ifndef ADDER_H
#define ADDER_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
class adder : public single_variable_function
{
protected:
double addend;
public:
adder( double new_addend );
virtual double at( double x ) const;
};
#endif
#include "adder.h"
adder::adder( double new_addend ) :
addend( new_addend )
{
}
double
adder::at( double x ) const
{
return x + addend;
}
/*
*/
#ifndef CHI_SQUARED_BASE_H
#define CHI_SQUARED_BASE_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
class chi_squared_base : public single_variable_function
{
protected:
int n;
public:
virtual double at( double x ) const;
double base( void ) const;
chi_squared_base( int new_n );
};
#endif
/*
*/
/*
*/
#include "chi_squared_base.h"
#ifndef GAMMA_FUNCTION_H
#include "gamma_function.h"
#endif
#include <math.h>
#include <iostream>
double
chi_squared_base::at( double x ) const
{
const double Result = base()
* pow( x, ( static_cast<double>(n) / 2.0 - 1.0 ) )
* exp( -x / 2.0 );
// if ( x == 0 || x == 10 )
// {
// gamma_function gamma;
// cout << "Chibase at " << x << "\n"
// << "n: " << n << "\n"
// << "2^(n/2) " << pow(2.0,static_cast<double>(n)/2.0) << "\n"
// << "gamma(n/2)" << gamma.at( static_cast<double>(n)/2.0 ) << "\n"
// << "Base: " << base() << "\n"
// << "x^(n/2 - 1): " << pow( x, ( static_cast< double >(n) / 2.0 - 1.0 ) ) << "\n"
// << "e^(-x/2) :" << exp( -x / 2.0 ) << "\n";
// }
return Result;
}
chi_squared_base::chi_squared_base( int new_n ) :
n ( new_n )
{
}
double
chi_squared_base::base( void ) const
{
gamma_function gamma;
const double Result = 1 / ( pow( 2.0, static_cast<double>(n)/2.0 ) * gamma.at( static_cast<double>(n)/2.0 ) );
return Result;
}
/*
*/
#ifndef CHI_SQUARED_INTEGRAL_H
#define CHI_SQUARED_INTEGRAL_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
class chi_squared_integral : public single_variable_function
{
protected:
int n;
public:
virtual double at( double x ) const;
chi_squared_integral( int new_n );
};
#endif
/*
*/
#include "chi_squared_integral.h"
#ifndef CHI_SQUARED_BASE_H
#include "chi_squared_base.h"
#endif
#ifndef CONTRACT_H
#include "contract.h"
#endif
#ifndef SIMPSON_INTEGRATOR_H
#include "simpson_integrator.h"
#endif
#include <math.h>
double
chi_squared_integral::at( double x ) const
{
REQUIRE( x >= 0 );
simpson_integrator homer;
chi_squared_base chi_base( n );
double Result = homer.integral( chi_base, 0, x );
return Result;
}
chi_squared_integral::chi_squared_integral( int new_n ) :
n( new_n )
{
REQUIRE( new_n > 0 );
}
/*
*/
#ifndef NORMAL_DISTRIBUTION_INVERSE_H
#define NORMAL_DISTRIBUTION_INVERSE_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
class normal_distribution_inverse : public single_variable_function
{
public:
virtual double at( double x ) const;
double next_guess( double arg, double last_result, double target ) const;
};
#endif
/*
*/
#include "normal_distribution_inverse.h"
#ifndef NORMAL_DISTRIBUTION_INTEGRAL_H
#include "normal_distribution_integral.h"
#endif
#include <math.h>
double
normal_distribution_inverse::at( double x ) const
{
const double target = x;
const double error_margin = 0.0000001;
double last_error = 0;
double Result = 0;
double this_result = 0;
bool has_tried_once = false;
normal_distribution_integral norm;
while ( !has_tried_once || ( last_error > error_margin ) )
{
const double last_result = this_result;
this_result = norm.at( Result );
last_error = fabs( this_result - target );
if ( last_error > error_margin )
{
Result = next_guess( Result, last_result, target );
}
has_tried_once = true;
}
return Result;
}
double
normal_distribution_inverse::next_guess( double arg, double last_result, double target ) const
{
double Result = arg + ( target - last_result );
return Result;
}
/*
*/
#ifndef NORMALIZER_H
#define NORMALIZER_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
class normalizer : public single_variable_function
{
protected:
double data_mean;
double standard_deviation;
public:
normalizer( double new_data_mean, double new_standard_deviation );
virtual double at( double x ) const;
};
#endif
#include "normalizer.h"
normalizer::normalizer( double new_data_mean, double new_standard_deviation ) :
data_mean( new_data_mean ),
standard_deviation( new_standard_deviation )
{
}
double
normalizer::at( double x ) const
{
return ( x - data_mean ) / standard_deviation;
}
/*
*/
#ifndef NUMBER_ARRAY_LIST_H
#define NUMBER_ARRAY_LIST_H
#include <list>
#include <vector>
#include <string>
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
class number_array_list : public std::list< std::vector< double > >
{
protected:
int m_field_count;
public:
int field_count( void ) const;
std::vector< double > head( void ) const;
number_array_list tail( void ) const;
number_array_list suffixed_by_item( const std::vector< double >& entry ) const;
number_array_list suffixed_by_list( const number_array_list& rhs ) const;
number_array_list mapped_to( int field_index, const single_variable_function& f ) const;
number_array_list multiplied_by_list( int field_index, const number_array_list& rhs ) const;
number_array_list sorted_by( int field_index ) const;
number_array_list items_less_than( int field_index, double value ) const;
number_array_list items_greater_than_or_equal_to( int field_index, double value ) const;
bool is_valid_entry( const std::vector< double >& entry ) const;
bool is_valid_field_index( int field_index ) const;
bool is_sorted_by( int field_index ) const;
double sum_by_field( int field_index ) const;
double mean_by_field( int field_index ) const;
int entry_count( void ) const;
void set_field_count( int new_field_count );
void add_entry( const std::vector< double >& entry );
void make_from_entry( const std::vector< double >& entry );
void make( void );
void make_from_list( const number_array_list& rhs );
number_array_list( void );
double standard_deviation( int field_index, bool is_full_population ) const;
double variance( int field_index, bool is_full_population ) const;
number_array_list normalized_series( int field_index ) const;
int normalization_s_value( void ) const;
number_array_list normalized_series_table( int field_index ) const;
number_array_list normal_distribution_segments( int s ) const;
number_array_list chi_squared_table( int field_index ) const;
double chi_squared_q( int field_index ) const;
double chi_squared_1_minus_p( int field_index ) const;
number_array_list select_series( int field_index ) const;
int items_in_range( int field_index, double lower_limit, double upper_limit ) const;
std::string entry_as_string( const std::vector< double >& entry ) const;
std::string table_as_string( void ) const;
};
#endif
/*
*/
/*
*/
#include "number_array_list.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
#ifndef ADDER_H
#include "adder.h"
#endif
#ifndef SQUARE_H
#include "square.h"
#endif
#ifndef NORMAL_DISTRIBUTION_INVERSE_H
#include "normal_distribution_inverse.h"
#endif
#ifndef CHI_SQUARED_INTEGRAL_H
#include "chi_squared_integral.h"
#endif
#ifndef NORMALIZER_H
#include "normalizer.h"
#endif
#include <math.h>
#include <stdio.h>
int
number_array_list::field_count( void ) const
{
return m_field_count;
}
std::vector< double >
number_array_list::head( void ) const
{
REQUIRE( entry_count() >= 1 );
return *(begin());
}
number_array_list
number_array_list::tail( void ) const
{
number_array_list Result;
if ( entry_count() > 1 )
{
number_array_list::const_iterator tail_head = begin();
++tail_head;
Result.insert( Result.begin(), tail_head, end() );
Result.m_field_count = field_count();
ENSURE( Result.entry_count() == entry_count() - 1 );
}
return Result;
}
number_array_list
number_array_list::suffixed_by_item( const std::vector< double >& entry ) const
{
number_array_list Result;
Result.make_from_list( *this );
Result.add_entry( entry );
ENSURE( Result.entry_count() == entry_count() + 1 );
return Result;
}
number_array_list
number_array_list::suffixed_by_list( const number_array_list& rhs ) const
{
number_array_list Result;
Result.make_from_list( *this );
for ( number_array_list::const_iterator iter = rhs.begin(); iter != rhs.end(); ++iter )
{
Result.add_entry( *iter );
}
ENSURE( Result.entry_count() == entry_count() + rhs.entry_count() );
return Result;
}
number_array_list
number_array_list::mapped_to( int field_index, const single_variable_function& f ) const
{
REQUIRE( is_valid_field_index( field_index ) );
number_array_list Result;
if ( entry_count() > 0 )
{
std::vector<double> new_entry = head();
new_entry[ field_index ] = f.at( head()[ field_index ] );
Result.make_from_list( Result.suffixed_by_item( new_entry ).suffixed_by_list( tail().mapped_to( field_index, f )));
}
ENSURE( Result.entry_count() == entry_count() );
return Result;
}
number_array_list
number_array_list::multiplied_by_list( int field_index, const number_array_list& rhs ) const
{
REQUIRE( entry_count() == rhs.entry_count() );
REQUIRE( field_count() == rhs.field_count() );
REQUIRE( is_valid_field_index( field_index ) );
number_array_list Result;
if ( entry_count() > 0 )
{
std::vector< double > new_entry = head();
new_entry[ field_index ] = head()[ field_index ] * rhs.head()[field_index];
Result.make_from_list( Result.suffixed_by_item( new_entry ).
suffixed_by_list( tail().multiplied_by_list( field_index, rhs ) ) );
}
ENSURE( Result.entry_count() == entry_count() );
return Result;
}
number_array_list
number_array_list::sorted_by( int field_index ) const
{
REQUIRE( is_valid_field_index( field_index ) );
number_array_list Result;
if ( is_sorted_by( field_index ) )
{
Result.make_from_list( *this );
}
else
{
Result.make_from_list( tail().items_less_than( field_index, head()[field_index] ).sorted_by( field_index ).
suffixed_by_item( head() ).
suffixed_by_list( tail().items_greater_than_or_equal_to( field_index, head()[field_index] ).sorted_by( field_index )));
}
ENSURE( Result.entry_count() == entry_count() );
return Result;
}
number_array_list
number_array_list::items_less_than( int field_index, double value ) const
{
REQUIRE( is_valid_field_index( field_index ) );
number_array_list Result;
for ( number_array_list::const_iterator iter = begin(); iter != end(); ++iter )
{
if ( (*iter)[field_index] < value )
{
Result.add_entry( *iter );
}
}
ENSURE( Result.entry_count() <= entry_count() );
return Result;
}
number_array_list
number_array_list::items_greater_than_or_equal_to( int field_index, double value ) const
{
REQUIRE( is_valid_field_index( field_index ) );
number_array_list Result;
for ( number_array_list::const_iterator iter = begin(); iter != end(); ++iter )
{
if ( (*iter)[field_index] >= value )
{
Result.add_entry(*iter);
}
}
ENSURE( Result.entry_count() <= entry_count() );
return Result;
}
bool
number_array_list::is_valid_entry( const std::vector<double>& entry ) const
{
bool Result = false;
if ( ( entry_count() == 0 )
|| ( ( entry_count() > 0 ) && ( entry.size() == field_count() ) ) )
{
Result = true;
}
return Result;
}
bool
number_array_list::is_valid_field_index( int field_index ) const
{
bool Result = true;
if ( entry_count() > 0 )
{
Result = ( 0 <= field_index ) && ( field_index < m_field_count );
}
return Result;
}
bool
number_array_list::is_sorted_by( int field_index ) const
{
REQUIRE( is_valid_field_index( field_index ) );
bool Result = true;
if ( entry_count() > 1 )
{
Result = ( head()[field_index] < tail().head()[field_index] ) && tail().is_sorted_by( field_index );
}
return Result;
}
double
number_array_list::sum_by_field( int field_index ) const
{
REQUIRE( is_valid_field_index( field_index ) )
double Result = 0;
if ( entry_count() > 0 )
{
Result = head()[field_index] + tail().sum_by_field( field_index );
}
return Result;
}
double
number_array_list::mean_by_field( int field_index ) const
{
REQUIRE( is_valid_field_index( field_index ) );
REQUIRE( entry_count() > 0 );
double Result = sum_by_field( field_index ) / static_cast<double>( entry_count() );
return Result;
}
int
number_array_list::entry_count( void ) const
{
return size();
}
void
number_array_list::set_field_count( int new_field_count )
{
REQUIRE( entry_count() == 0 );
m_field_count = new_field_count;
ENSURE( field_count() == new_field_count );
}
void
number_array_list::add_entry( const std::vector< double >& entry )
{
if ( entry_count() == 0 )
{
set_field_count( entry.size() );
}
REQUIRE( is_valid_entry( entry ) );
const int old_entry_count = entry_count();
push_back( entry );
ENSURE( entry_count() == old_entry_count + 1 );
}
void
number_array_list::make_from_entry( const std::vector<double>& entry )
{
make();
add_entry( entry );
ENSURE( entry_count() == 1 );
ENSURE( head() == entry );
}
void
number_array_list::make( void )
{
clear();
m_field_count = -1;
ENSURE( m_field_count == -1 );
ENSURE( entry_count() == 0 );
}
void
number_array_list::make_from_list( const number_array_list& rhs )
{
make();
insert( begin(), rhs.begin(), rhs.end() );
m_field_count = rhs.field_count();
ENSURE( entry_count() == rhs.entry_count() );
}
number_array_list::number_array_list( void )
{
make();
}
double
number_array_list::standard_deviation( int field_index, bool is_full_population ) const
{
double Result = sqrt( variance( field_index, is_full_population ) );
return Result;
}
double
number_array_list::variance( int field_index, bool is_full_population ) const
{
adder an_adder( -mean_by_field( field_index ) );
square a_square;
if ( is_full_population )
{
CHECK( entry_count() > 0 );
}
else
{
CHECK( entry_count() > 1 );
}
const double divisor = static_cast< double >( is_full_population ? entry_count() : entry_count() - 1 );
const double Result = mapped_to( field_index, an_adder ).
mapped_to( field_index, a_square ).sum_by_field( field_index ) / divisor;
return Result;
}
number_array_list
number_array_list::normalized_series( int field_index ) const
{
//cout << "Standard dev: " << standard_deviation( field_index, false ) << "\n";
normalizer a_normalizer( mean_by_field( field_index ), standard_deviation( field_index, false ) );
number_array_list Result = mapped_to( field_index, a_normalizer ).select_series( field_index );
//cout << "**in normalized_series**\n" << mapped_to( field_index, a_normalizer ).table_as_string() << "\n";
ENSURE( Result.entry_count() == entry_count() );
//cout << "**normalized_series result**\n" << Result.table_as_string() << "\n";
return Result;
}
int
number_array_list::normalization_s_value( void ) const
{
REQUIRE( entry_count() % 5 == 0 );
int Result = entry_count() / 5;
return Result;
}
number_array_list
number_array_list::normalized_series_table( int field_index ) const
{
REQUIRE( entry_count() > 0 );
number_array_list sorted_list;
sorted_list.make_from_list( sorted_by( field_index ) );
number_array_list sorted_normalized_list = sorted_list.normalized_series( field_index );
//cout << "Sorted list: \n" << sorted_list.table_as_string() << "\n";
//cout << "sorted normalized_list: \n" << sorted_normalized_list.table_as_string() << "\n";
CHECK( sorted_list.entry_count() == entry_count() );
CHECK( sorted_normalized_list.entry_count() == entry_count() );
number_array_list::const_iterator sorted_list_iter;
number_array_list::const_iterator sorted_normalized_list_iter;
int i = 0;
number_array_list Result;
for( i = 0, sorted_list_iter = sorted_list.begin(), sorted_normalized_list_iter = sorted_normalized_list.begin();
sorted_list_iter != sorted_list.end() && sorted_normalized_list_iter != sorted_normalized_list.end();
)//++i, ++sorted_list_iter, ++sorted_normalized_list_iter )
{
std::vector<double>new_entry;
new_entry.push_back( static_cast<double>(i + 1) );
new_entry.push_back( static_cast<double>(i + 1) / static_cast<double>(entry_count()));
new_entry.push_back( (*sorted_list_iter)[ field_index ] );
new_entry.push_back( (*sorted_normalized_list_iter)[ 0 ] );
Result.add_entry( new_entry );
++i;
++sorted_list_iter;
++sorted_normalized_list_iter;
}
ENSURE( Result.entry_count() == entry_count() );
return Result;
}
number_array_list
number_array_list::normal_distribution_segments( int s ) const
{
REQUIRE( s > 0 );
const double norm_floor = -10000;
const double norm_ceiling = 10000;
normal_distribution_inverse norm_inverse;
number_array_list Result;
for ( int i = 1; i <= s; ++i )
{
std::vector< double > new_entry;
if ( i == 1 )
{
new_entry.push_back( norm_floor );
}
else
{
new_entry.push_back( norm_inverse.at( ( static_cast<double>(i - 1) )
* ( 1.0 / static_cast< double >( s ) )));
}
if ( i == s )
{
new_entry.push_back( norm_ceiling );
}
else
{
new_entry.push_back( norm_inverse.at( ( static_cast< double >( i ) )
* ( 1.0 / static_cast< double >( s ) )));
}
Result.add_entry( new_entry );
}
return Result;
}
number_array_list
number_array_list::chi_squared_table( int field_index ) const
{
number_array_list normal_segments = normal_distribution_segments( normalization_s_value() );
number_array_list norm_table = normalized_series_table( field_index );
number_array_list Result;
square a_square;
//cout << "norm table\n" << norm_table.table_as_string() << "\n\n";
int i = 0;
number_array_list::const_iterator iter;
for ( i = 0, iter = normal_segments.begin(); iter != normal_segments.end(); ++iter, ++i )
{
std::vector< double > new_entry;
new_entry.push_back( i );
new_entry.push_back( (*iter)[0] );
new_entry.push_back( (*iter)[1] );
new_entry.push_back( static_cast< double >(norm_table.entry_count()) / norm_table.normalization_s_value() );
new_entry.push_back( static_cast< double >(norm_table.items_in_range( 3, new_entry[1], new_entry[2] ) ) );
new_entry.push_back( a_square.at( new_entry[3] - new_entry[4] ) );
new_entry.push_back( new_entry[5] / new_entry[ 3 ] );
Result.add_entry( new_entry );
}
return Result;
}
double
number_array_list::chi_squared_q( int field_index ) const
{
//cout << chi_squared_table( field_index ).table_as_string();
double Result = chi_squared_table( field_index ).sum_by_field( 6 );
return Result;
}
#ifndef CHI_SQUARED_BASE_H
#include "chi_squared_base.h"
#endif
double
number_array_list::chi_squared_1_minus_p( int field_index ) const
{
chi_squared_integral chi( normalization_s_value() - 1 );
double Result = 1 - chi.at( chi_squared_q( field_index ) );
return Result;
}
number_array_list
number_array_list::select_series( int field_index ) const
{
number_array_list Result;
if ( entry_count() > 0 )
{
std::vector< double >new_entry;
new_entry.push_back( head()[field_index] );
Result.add_entry( new_entry );
}
if ( entry_count() > 1 )
{
Result.make_from_list( Result.suffixed_by_list( tail().select_series( field_index ) ));
}
ENSURE( Result.entry_count() == entry_count() );
ENSURE( Result.field_count() == 1 );
return Result;
}
int
number_array_list::items_in_range( int field_index, double lower_limit, double upper_limit ) const
{
int Result = 0;
if (( entry_count() > 0 )
&& ( head()[field_index] > lower_limit )
&& ( head()[field_index] <= upper_limit ) )
{
Result = 1;
}
if ( entry_count() > 1 )
{
Result = Result + tail().items_in_range( field_index, lower_limit, upper_limit );
}
return Result;
}
std::string double_as_string( double x )
{
char buffer[ 1000 ];
sprintf( buffer, "%f", x );
return std::string( buffer );
}
std::string
number_array_list::entry_as_string( const std::vector< double >& entry ) const
{
std::string Result = "( ";
for ( std::vector< double >::const_iterator iter = entry.begin(); iter != entry.end(); ++iter )
{
Result = Result + double_as_string( *iter );
if ( ( iter + 1 ) != entry.end() )
{
Result = Result + ", ";
}
}
Result = Result + " )";
return Result;
}
std::string
number_array_list::table_as_string( void ) const
{
std::string Result = "";
if ( entry_count() > 0 )
{
Result = Result + entry_as_string( head() ) + "\n";
}
if ( entry_count() > 1 )
{
Result = Result + tail().table_as_string();
}
return Result;
}
/*
*/
#ifndef NUMBER_ARRAY_LIST_PARSER_H
#define NUMBER_ARRAY_LIST_PARSER_H
#ifndef SIMPLE_INPUT_PARSER_H
#include "simple_input_parser.h"
#endif
#ifndef NUMBER_ARRAY_LIST_H
#include "number_array_list.h"
#endif
class number_array_list_parser_2: public simple_input_parser
{
protected:
number_array_list number_list;
enum state_t { Parsing_historical_data, Parsing_commands };
state_t state;
public:
virtual std::string transformed_line (const std::string & line) const;
std::string string_stripped_of_comments (const std::string & str) const;
static bool is_double (const std::string & str);
static double double_from_string (const std::string & str);
static const std::string & historical_data_terminator;
static const std::string & inline_comment_begin;
bool last_line_is_blank (void);
virtual void parse_last_line (void);
void parse_last_line_as_historical_data (void);
void parse_last_line_as_end_of_historical_data (void);
void parse_last_line_as_command( void );
void print_normalization_check( int field_index );
void make( void );
number_array_list_parser_2( void );
bool last_line_is_valid_historical_data( void ) const;
static std::vector< std::string > split_string( const std::string& string_to_split, const std::string& separator );
static std::string string_before_separator( const std::string& string_to_split, const std::string& separator );
static std::string string_after_separator( const std::string& string_to_split, const std::string& separator );
static std::vector< double > numbers_from_string( const std::string& string_to_split );
};
#endif
/*
*/
#include "number_array_list_parser_2.h"
#ifndef WHITESPACE_STRIPPER_H
#include "whitespace_stripper.h"
#endif
#ifndef ERROR_LOG_H
#include "error_log.h"
#endif
#ifndef CONTRACT_H
#include "contract.h"
#endif
void
number_array_list_parser_2::make (void)
{
simple_input_parser::reset ();
state = Parsing_historical_data;
number_list.make ();
}
std::string number_array_list_parser_2::transformed_line (const std::string & str) const
{
return whitespace_stripper::string_stripped_of_whitespace (string_stripped_of_comments (str));
}
std::string
number_array_list_parser_2::string_stripped_of_comments (const std::string & str) const
{
const std::string::size_type comment_index = str.find (inline_comment_begin);
return str.substr (0, comment_index);
}
bool
number_array_list_parser_2::is_double (const std::string & str)
{
bool
Result = true;
char *
conversion_end = NULL;
strtod (str.c_str (), &conversion_end);
if (conversion_end == str.data ())
{
Result = false;
}
return Result;
}
double
number_array_list_parser_2::double_from_string (const std::string & str)
{
REQUIRE (is_double (str));
return strtod (str.c_str (), NULL);
}
const
std::string & number_array_list_parser_2::historical_data_terminator = "stop";
const
std::string & number_array_list_parser_2::inline_comment_begin = "--";
bool number_array_list_parser_2::last_line_is_blank (void)
{
if (last_line ().length () == 0)
{
return true;
}
else
{
return false;
}
}
void
number_array_list_parser_2::parse_last_line (void)
{
if (last_line_is_blank ())
{
return;
}
if ( state == Parsing_historical_data )
{
if ( last_line () == historical_data_terminator)
{
parse_last_line_as_end_of_historical_data ();
}
else
{
parse_last_line_as_historical_data ();
}
}
else
{
parse_last_line_as_command();
}
}
void
number_array_list_parser_2::parse_last_line_as_historical_data (void)
{
if ( last_line_is_valid_historical_data() )
{
const std::vector< double >this_entry = numbers_from_string( last_line() );
number_list.add_entry( this_entry );
}
else
{
error_log errlog;
errlog.log_error( std::string( "Invalid data entry: " ) + last_line() );
}
}
void
number_array_list_parser_2::parse_last_line_as_end_of_historical_data (void)
{
REQUIRE (last_line () == historical_data_terminator);
cout << "Historical data read; "
<< number_list.entry_count() << " entries, "
<< number_list.field_count() << " fields.\n";
state = Parsing_commands;
}
number_array_list_parser_2::number_array_list_parser_2 (void)
{
make();
}
bool
number_array_list_parser_2::last_line_is_valid_historical_data( void ) const
{
const std::vector< std::string > substrings = split_string( last_line(), "," );
bool Result = true;
//make sure we have a valid field count for the number_array_list
if ( ( number_list.entry_count() > 0 ) && ( substrings.size() != number_list.field_count() ) )
{
Result = false;
}
//...and that each substring can be interpreted as a double
for ( int i = 0; i < substrings.size(); ++i )
{
if ( !is_double( substrings[ i ] ) )
{
Result = false;
}
}
return Result;
}
std::vector< std::string >
number_array_list_parser_2::split_string( const std::string& string_to_split, const std::string& separator )
{
std::vector< std::string > Result;
const std::string prefix = string_before_separator( string_to_split, separator );
const std::string remainder = string_after_separator( string_to_split, separator );
Result.push_back( prefix );
if ( remainder.size() > 0 )
{
const std::vector< std::string > split_remainder = split_string( remainder, separator );
Result.insert( Result.end(), split_remainder.begin(), split_remainder.end() );
}
return Result;
}
std::string
number_array_list_parser_2::string_before_separator( const std::string& string_to_split, const std::string& separator )
{
const std::string::size_type separator_position = string_to_split.find( separator );
std::string Result;
if ( separator_position == string_to_split.npos )
{
//not found; result is entire string
Result = string_to_split;
}
else
{
Result = string_to_split.substr( 0, separator_position );
}
return Result;
}
std::string
number_array_list_parser_2::string_after_separator( const std::string& string_to_split, const std::string& separator )
{
const std::string::size_type separator_position = string_to_split.find( separator );
std::string Result;
if ( separator_position == string_to_split.npos )
{
//not found; result is empty
Result = "";
}
else
{
Result = string_to_split.substr( separator_position + separator.size(), string_to_split.size() );
}
return Result;
}
std::vector< double >
number_array_list_parser_2::numbers_from_string( const std::string& string_to_split )
{
const std::vector< std::string > number_strings = split_string( string_to_split, "," );
std::vector< double > Result;
for ( std::vector< std::string >::const_iterator iter = number_strings.begin(); iter != number_strings.end(); ++iter )
{
CHECK( is_double( *iter ) );
const double new_value = double_from_string( *iter );
Result.push_back( new_value );
}
return Result;
}
void
number_array_list_parser_2::parse_last_line_as_command( void )
{
std::string command = string_before_separator( last_line(), " " );
std::vector< std::string > arguments = split_string( string_after_separator( last_line(), " " ), " " );
if ( command == "normality_check_on_series" )
{
print_normalization_check( static_cast< int >( double_from_string( arguments[0] ) ) );
}
else
{
cout << "unknown command: " << last_line() << "\n";
}
}
void
number_array_list_parser_2::print_normalization_check( int field_index )
{
cout << "Normalization check on series: " << field_index << "\n";
cout << "Q: " << number_list.chi_squared_q( field_index ) << "\n";
cout << "(1-p): " << number_list.chi_squared_1_minus_p( field_index ) << "\n";
}
/*
*/
main.cpp
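(The main.cpp listing is not reproduced here. Presumably it does nothing more than its Eiffel counterpart, the MAIN class at the end of this page-- construct the parser and let it consume standard input. A hypothetical sketch, in which set_input() and parse_until_eof() are assumed names for whatever entry points simple_input_parser actually provides:)

//hypothetical sketch only; the real main.cpp is not shown on this page, and
//set_input()/parse_until_eof() are assumed names for the parser's entry points
#include "number_array_list_parser_2.h"
#include <iostream>

int
main( void )
{
  number_array_list_parser_2 parser;
  parser.set_input( std::cin );  //assumed interface
  parser.parse_until_eof();      //assumed interface
  return 0;
}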
class ADDER
inherit
SINGLE_VARIABLE_FUNCTION
redefine
at
creation
make
feature {NONE}
addend : DOUBLE
--number added to each argument in at
make( new_addend : DOUBLE ) is
--create with given addend
do
addend := new_addend
end
feature {ANY}
at( x : DOUBLE ) : DOUBLE is
--x + addend
do
Result := x + addend
end
end
class CHI_SQUARED_BASE
--base calculation for the chi-squared distribution
inherit
SINGLE_VARIABLE_FUNCTION
redefine
at;
creation
make
feature {ANY}
n : INTEGER
--sample size of the distribution
make( new_n : INTEGER ) is
--creation, setting the sample size
require
new_n > 0
do
n := new_n;
end
base : DOUBLE is
--base of the equation
local
gamma : GAMMA_FUNCTION
do
!!gamma
Result := 1 / ( ( 2 ^ n ).sqrt * gamma.at( n.to_double / 2.0 ) )
end
at( x : DOUBLE ) : DOUBLE is
--value of the function at x
do
Result := base * ( ( x ^ (n-2) ).sqrt ) * ( -x/2.0 ).exp
end
end
class CHI_SQUARED_INTEGRAL
inherit
SINGLE_VARIABLE_FUNCTION
redefine
at;
creation
make
feature {ANY}
n : INTEGER
--size of the distribution, in samples
make( new_n : INTEGER ) is
--creation
require
new_n > 0
do
n := new_n
end
at( x : DOUBLE ) : DOUBLE is
--integral evaluated from zero to the given number
require
x > 0.0
local
homer : SIMPSON_INTEGRATOR
chi_base : CHI_SQUARED_BASE
do
!!homer.make
!!chi_base.make(n)
Result := homer.integral( chi_base, 0, x )
end
end
class NORMAL_DISTRIBUTION_INVERSE
--inverse of the normal distribution, used to make segment tables
--for normalization fit
inherit
SINGLE_VARIABLE_FUNCTION
redefine
at;
feature { ANY }
at( x : DOUBLE ) : DOUBLE is
--inverse of the normal distribution
local
target : DOUBLE
last_error : DOUBLE
last_result : DOUBLE
this_result : DOUBLE
has_tried_once : BOOLEAN
normal : NORMAL_DISTRIBUTION_INTEGRAL
error_margin : DOUBLE
do
from
target := x;
last_error := 0;
last_result := 0;
this_result := 0;
Result := 0
has_tried_once := false
error_margin := 0.0000001
!!normal.make
until
has_tried_once and error_margin > last_error
loop
last_result := this_result
this_result := normal.at( Result );
last_error := (this_result - target).abs
if ( last_error > error_margin ) then
Result := next_guess( Result, last_result, target )
end
has_tried_once := true;
end
end
next_guess( arg, last_result, target : DOUBLE ) : DOUBLE is
--next guess in the iterative scheme of things
do
Result := arg + ( target - last_result )
end
end
class NORMALIZER
inherit
SINGLE_VARIABLE_FUNCTION
redefine
at
creation
make
feature { NONE }
data_mean : DOUBLE
standard_deviation : DOUBLE
make( new_data_mean, new_standard_deviation : DOUBLE ) is
--create with given feature values
do
data_mean := new_data_mean
standard_deviation := new_standard_deviation
end
feature {ANY}
at( x : DOUBLE ) : DOUBLE is
--(x-mean)/standard_dev
do
Result := ( x - data_mean ) / standard_deviation
end
end
class NUMBER_ARRAY_LIST
inherit
LINKED_LIST[ ARRAY[ DOUBLE ] ]
redefine
make;
creation
make, make_from_entry, make_from_list
feature { ANY }
field_count : INTEGER
--number of fields allowed in an entry
head : like item is
--first item
require
entry_count >= 1
do
Result := first;
end
tail : like Current is
--all items after the first
do
!!Result.make
if ( entry_count > 1 ) then
Result := slice( lower + 1, upper );
Result.set_field_count( field_count )
end --if
ensure
( entry_count > 1 ) implies Result.entry_count = entry_count - 1 and Result.field_count = field_count
end
suffixed_by_list( rhs: like Current ) : like Current is
--the list, suffixed by another list
local
i : INTEGER
do
!!Result.make_from_list( Current );
from
i := rhs.lower
until
not rhs.valid_index ( i )
loop
Result.add_entry( rhs.item( i ) );
i := i + 1
end --from
ensure
Result.entry_count = entry_count + rhs.entry_count
end
make_from_entry( entry : like item ) is
--clear the list, then add the entry
do
make
add_entry( entry )
ensure
head.is_equal( entry )
field_count = entry.count
end
make is
--clear the list
do
Precursor
field_count := -1
ensure
entry_count = 0
field_count = -1
end
make_from_list( rhs: like Current ) is
--clear the list, setting it equal to another list
do
from_collection( rhs )
field_count := rhs.field_count
end
sum_by_field( field_index : INTEGER ) : DOUBLE is
--the sum of a given field over all entries
require
is_valid_field_index( field_index )
do
if entry_count = 0 then
Result := 0
else
Result := head.item( field_index ) + tail.sum_by_field( field_index )
end
end
mean_by_field( field_index : INTEGER ) : DOUBLE is
--the mean of a given field over all entries
require
is_valid_field_index( field_index )
entry_count >= 1
do
Result := sum_by_field( field_index ) / entry_count.to_double
end
entry_count : INTEGER is
--the number of entries
do
Result := count
end
add_entry( new_entry : like item ) is
-- adds an entry to the end of the list
require
is_valid_entry( new_entry )
do
if entry_count = 0 then
set_field_count( new_entry.count )
end
add_last( new_entry )
ensure
entry_count = old entry_count + 1
end
mapped_to( field_index : INTEGER; f: SINGLE_VARIABLE_FUNCTION ) : like Current is
--the elements, with the given field mapped to f
require
is_valid_field_index( field_index )
local
new_entry : like item
do
!!Result.make
if entry_count > 0 then
new_entry := head.twin
new_entry.put( f.at( head.item(field_index) ), field_index )
Result.add_entry( new_entry );
Result := Result.suffixed_by_list( tail.mapped_to( field_index, f ) )
end -- if
ensure
Result.entry_count = entry_count
end
multiplied_by_list( field_index : INTEGER; rhs : like Current ) : like Current is
--the elements, with the given field multiplied by the
--corresponding field in rhs
require
entry_count = rhs.entry_count
field_count = rhs.field_count
is_valid_field_index( field_index )
local
new_entry : like item
do
!!Result.make
if entry_count > 0 then
new_entry := head
new_entry.put( head.item( field_index ) * rhs.head.item( field_index ), field_index )
Result.add_entry( new_entry )
Result := Result.suffixed_by_list( tail.multiplied_by_list( field_index, rhs.tail ) )
end
ensure
Result.entry_count = entry_count
end
sorted_by( field_index : INTEGER ) : like Current is
--the list, sorted by the given field
require
is_valid_field_index( field_index )
do
if is_sorted_by( field_index ) then
!!Result.make_from_list( Current )
else
!!Result.make_from_list( tail.items_less_than( field_index, head.item( field_index ) ).sorted_by( field_index ).
suffixed_by_item( head ).
suffixed_by_list( tail.items_greater_than_or_equal_to( field_index, head.item( field_index ) ).sorted_by( field_index ) ) )
end -- if
ensure
Result.entry_count = entry_count
end
is_valid_entry( entry : like item ) : BOOLEAN is
--whether entry is a valid entry
do
if ( entry_count = 0 and entry.count > 0 ) or ( entry.count = field_count and entry.lower = 1 ) then
Result := true
else
Result := false
end
end
items_less_than( field_index : INTEGER; value : DOUBLE ) : like Current is
--list of items less than the given value in the given field
require
is_valid_field_index( field_index )
local
i : INTEGER
do
!!Result.make
from
i := lower
until
not valid_index( i )
loop
if not( item( i ).item( field_index ) >= value ) then
Result.add_entry( item( i ) )
end
i := i + 1
end
ensure
Result.entry_count <= entry_count
end
items_greater_than_or_equal_to( field_index : INTEGER; value : DOUBLE ) : like Current is
--list of items greater than or equal to the given value in the
--given field
require
is_valid_field_index( field_index )
local
i : INTEGER
do
!!Result.make
from
i := lower
until
not valid_index( i )
loop
if item(i).item( field_index ) >= value then
Result.add_entry( item( i ) )
end
i := i + 1
end
ensure
Result.entry_count <= entry_count
end
suffixed_by_item( entry : like item ) : like Current is
--the list, suffixed by a single item
require
is_valid_entry( entry )
do
!!Result.make_from_list( Current )
Result.add_entry( entry )
ensure
Result.entry_count = entry_count + 1
end
set_field_count( new_field_count : INTEGER ) is
--sets the field count
require
entry_count = 0 or ( entry_count > 0 implies head.count = new_field_count )
do
field_count := new_field_count
ensure
field_count = new_field_count
end
is_valid_field_index( field_index : INTEGER ) : BOOLEAN is
--whether the given field index is valid
do
if entry_count = 0 then
Result := true
else
Result := ( ( 1 <= field_index ) and ( field_index <= field_count ) )
end
end
is_sorted_by( field_index : INTEGER ) : BOOLEAN is
--whether the list is sorted by the given field index
require
is_valid_field_index( field_index )
do
Result := true
if entry_count > 1 then
Result := ( head.item( field_index ) < tail.head.item( field_index ) )
and tail.is_sorted_by( field_index )
end
end
feature {ANY}
--chi-squared distribution calcs
variance( field_index : INTEGER; is_full_population : BOOLEAN ) : DOUBLE is
require
is_valid_field_index( field_index )
is_full_population implies entry_count > 0
( not is_full_population ) implies entry_count > 1
local
divisor : DOUBLE
adder : ADDER
square : SQUARE
do
if is_full_population then
divisor := entry_count.to_double
else
divisor := (entry_count - 1).to_double
end
!!adder.make( - mean_by_field( field_index ) )
!!square
Result := ( mapped_to( field_index, adder ).
mapped_to( field_index, square ).
sum_by_field( field_index ) ) / divisor
end
standard_deviation( field_index : INTEGER; is_full_population : BOOLEAN ) : DOUBLE is
require
is_valid_field_index( field_index )
do
Result := variance( field_index, is_full_population ).sqrt
end
normalized_series( field_index : INTEGER ) : like Current is
--one series of the table, "normalized" into standard deviations
require
is_valid_field_index( field_index )
local
normalizer : NORMALIZER
do
!!normalizer.make( mean_by_field( field_index ), standard_deviation( field_index, false ) )
Result := mapped_to( field_index, normalizer ).select_series( field_index )
ensure
Result.entry_count = entry_count
end
normalization_s_value : INTEGER is
--number of segments in a normalization table
require
entry_count.divisible( 5 )
do
Result := entry_count // 5
end
normalized_series_table( field_index : INTEGER ) : like Current is
--a normalized series table, used for error-checking and used
--in the chi-squared normalization test
require
entry_count > 0
is_valid_field_index( field_index )
local
sorted_list : like Current
sorted_normalized_list : like Current
i : INTEGER
new_entry : ARRAY[ DOUBLE ]
do
sorted_list := sorted_by( field_index )
sorted_normalized_list := sorted_list.normalized_series( field_index )
check
sorted_list.entry_count = entry_count
sorted_normalized_list.entry_count = entry_count
end
from
i := sorted_list.lower
!!Result.make
until
not sorted_list.valid_index( i )
loop
!!new_entry.make( 1, 0 )
new_entry.add_last( i.to_double )
new_entry.add_last( i.to_double / entry_count.to_double )
new_entry.add_last( sorted_list.item( i ).item( field_index ) )
new_entry.add_last( sorted_normalized_list.item(i).first )
Result.add_entry( new_entry )
i := i + 1
end
end
normal_distribution_segments( s : INTEGER ) : like Current is
--segments of the normal distribution; see [Humphrey95] for use
require
s > 0
local
i : INTEGER
new_entry : ARRAY[ DOUBLE ]
norm_inverse : NORMAL_DISTRIBUTION_INVERSE
do
from
i := 1
!!Result.make
!!norm_inverse
until
i > s
loop
!!new_entry.make( 1, 0 )
if i = 1 then
new_entry.add_last( -1000.0 )
else
new_entry.add_last( norm_inverse.at( ( i - 1 ).to_double * ( 1.0 / s.to_double ) ) )
end
if i = s then
new_entry.add_last( 1000.0 )
else
new_entry.add_last( norm_inverse.at( ( i.to_double * 1.0 / s.to_double ) ) )
end
Result.add_entry( new_entry )
i := i + 1
end
end
chi_squared_table( field_index : integer ) : like Current is
-- chi-squared table, used to calculate the chi-squared distribution
require
is_valid_field_index( field_index )
local
norm_segments : like Current
norm_table : like Current
i : INTEGER
new_entry : ARRAY[ DOUBLE ]
do
from
norm_segments := normal_distribution_segments( normalization_s_value )
norm_table := normalized_series_table( field_index )
!!Result.make
i := norm_segments.lower
until
not norm_segments.valid_index( i )
loop
!!new_entry.make( 1, 0 )
new_entry.add_last( i.to_double )
new_entry.add_last( norm_segments.item( i ).item( 1 ) )
new_entry.add_last( norm_segments.item( i ).item( 2 ) )
new_entry.add_last( norm_table.entry_count.to_double / norm_table.normalization_s_value.to_double )
new_entry.add_last( norm_table.items_in_range( 4, new_entry.item(2), new_entry.item(3) ))
new_entry.add_last( ( new_entry.item(4) - new_entry.item(5) ) ^ 2 )
new_entry.add_last( new_entry.item(6) / new_entry.item(4) )
Result.add_entry( new_entry )
i := i + 1
end
end
chi_squared_q( field_index : INTEGER ) : DOUBLE is
--"Q" value of chi-squared normalization test
do
Result := chi_squared_table( field_index ).sum_by_field( 7 )
end
chi_squared_1_minus_p( field_index : INTEGER ) : DOUBLE is
--1-p value of chi-squared normalization test
local
chi_squared_integral : CHI_SQUARED_INTEGRAL
do
!!chi_squared_integral.make( normalization_s_value - 1 )
Result := 1 - chi_squared_integral.at( chi_squared_q( field_index ) )
end
select_series( field_index : INTEGER ) : like Current is
--the given series, extracted as a separate list
require
is_valid_field_index( field_index )
local
new_entry : ARRAY[ DOUBLE ]
do
if ( entry_count = 0 ) then
!!Result.make
else
!!new_entry.make( 1, 0 )
new_entry.add_last( head.item( field_index ) )
!!Result.make_from_entry( new_entry )
if entry_count > 1 then
Result := Result.suffixed_by_list( tail.select_series( field_index ) )
end
end
end
items_in_range( field_index : INTEGER; lower_limit, upper_limit : DOUBLE ) : INTEGER is
--the items from the given field which fit in (lower limit < item <= upper_limit)
require
is_valid_field_index( field_index )
do
if entry_count > 0 and head.item(field_index) > lower_limit and upper_limit >= head.item( field_index ) then
Result := 1
end
if entry_count > 1 then
Result := Result + tail.items_in_range( field_index, lower_limit, upper_limit )
end
end
entry_as_string( entry : like item ) : STRING is
--entry as a string, ie "( 1, 2, 3 )"
local
i : INTEGER
do
!!Result.make_from_string( "( " )
from
i := entry.lower
until
not entry.valid_index( i )
loop
entry.item( i ).append_in( Result )
if entry.valid_index( i + 1 ) then
Result.append_string( ", " )
end
i := i + 1
end
Result.append_string( " )" )
end
table_as_string : STRING is
--table, as a set of entry_as_string lines
do
if entry_count > 0 then
Result := entry_as_string( head ) + "%N"
if entry_count > 1 then
Result.append_string( tail.table_as_string )
end
end
end
end
class NUMBER_ARRAY_LIST_PARSER_2
--reads a list of number series and performs a chi-squared normality check on request
inherit
SIMPLE_INPUT_PARSER
redefine parse_last_line, transformed_line
end;
creation {ANY}
make
feature {ANY}
inline_comment_begin: STRING is "--";
string_stripped_of_comment(string: STRING): STRING is
--strip the string of any comment
local
comment_index: INTEGER;
do
if string.has_string(inline_comment_begin) then
comment_index := string.index_of_string(inline_comment_begin);
if comment_index = 1 then
!!Result.make_from_string( "" );
else
Result := string.substring(1,comment_index - 1);
end;
else
Result := string.twin;
end;
end -- string_stripped_of_comment
string_stripped_of_whitespace(string: STRING): STRING is
--strip string of whitespace
do
Result := string.twin;
Result.left_adjust;
Result.right_adjust;
end -- string_stripped_of_whitespace
transformed_line(string: STRING): STRING is
--strip comments and whitespace from parseable line
do
Result := string_stripped_of_whitespace(string_stripped_of_comment(string));
end -- transformed_line
number_list: NUMBER_ARRAY_LIST;
state : INTEGER
Parsing_historical_data : INTEGER is unique
Parsing_commands : INTEGER is unique
Command_normality_check_on_series : STRING is
once
Result := "normality_check_on_series"
end
historical_data_terminator: STRING is "stop";
double_from_string(string: STRING): DOUBLE is
require
string.is_double or string.is_integer;
do
if string.is_double then
Result := string.to_double;
elseif string.is_integer then
Result := string.to_integer.to_double;
end;
end -- double_from_string
feature {ANY} --parsing
reset is
--resets the parser and makes it ready to go again
do
state := Parsing_historical_data;
number_list.make;
end -- reset
make is
do
!!number_list.make;
reset;
end -- make
parse_last_line_as_historical_data is
--interpret last_line as a pair of comma-separated values
local
error_log: ERROR_LOG;
this_line_numbers : ARRAY[ DOUBLE ]
do
!!error_log.make
if last_line_is_valid_historical_data then
this_line_numbers := numbers_from_string( last_line )
number_list.add_entry( this_line_numbers )
else
error_log.log_error( "Invalid historical data: " + last_line + "%N" )
end
end
parse_last_line_as_end_of_historical_data is
--interpret last line as the end of historical data
require
last_line.compare(historical_data_terminator) = 0;
local
i : INTEGER
do
state := Parsing_commands
std_output.put_string( "Historical data read; " + number_list.entry_count.to_string +
"items read%N" )
end -- parse_last_line_as_end_of_historical_data
parse_last_line_as_command is
--interpret the last line as a command
local
command : STRING
arguments : ARRAY[ STRING ]
do
command := string_before_separator( last_line, " " )
arguments := split_string( string_after_separator( last_line, " ", ), " " )
if command.is_equal( Command_normality_check_on_series ) then
print_normalization_check_on_series( arguments.first.to_integer + 1)
else
std_output.put_string( "Unrecognized command string: " + last_line )
end
end
parse_last_line is
--parse the last line according to state
do
if not last_line.empty then
if state = Parsing_historical_data then
if last_line.compare(historical_data_terminator) = 0 then
parse_last_line_as_end_of_historical_data;
else
parse_last_line_as_historical_data;
end;
else
parse_last_line_as_command
end
end;
end -- parse_last_line
last_line_is_valid_historical_data : BOOLEAN is
--whether last line is valid historical data
local
substrings : ARRAY[ STRING ]
i : INTEGER
do
substrings := split_string( last_line, "," )
Result := true
if ( number_list.entry_count > 0 and substrings.count /= number_list.field_count ) then
Result := false;
end
--check and see if each substring is convertible to a double
from
i := substrings.lower
until
not substrings.valid_index( i )
loop
if not ( substrings.item(i).is_double or substrings.item(i).is_integer ) then
Result := false;
end
i := i + 1
end
end
split_string( string_to_split, separator : STRING ) : ARRAY[ STRING ] is
--a list of components of a string, separated by the given
--separator, ie split_string( "1,2,3", "," ) = [ "1", "2", "3" ]
local
prior_to_separator : STRING
remainder : STRING
split_remainder : ARRAY[ STRING ]
i : INTEGER
do
prior_to_separator := string_before_separator( string_to_split, separator )
remainder := string_after_separator( string_to_split, separator )
!!Result.make( 1, 0 )
Result.add_last( prior_to_separator )
if remainder.count > 0 then
split_remainder := split_string( remainder, separator )
from
i := split_remainder.lower
until
not split_remainder.valid_index( i )
loop
Result.add_last( split_remainder.item( i ) )
i := i + 1
end
end
end
string_before_separator( string_to_split, separator : STRING ) : STRING is
--the part of a string which comes before the separator, or
--the whole string if it's not found
local
separator_index : INTEGER
do
separator_index := string_to_split.substring_index( separator, 1 )
if ( separator_index = 0 ) then
--not found; copy whole string
Result := string_to_split.twin
else
Result := string_to_split.substring( 1, separator_index - 1 )
end
end
string_after_separator( string_to_split, separator : STRING ) : STRING is
--the part of the string after the separator,
local
separator_index : INTEGER
do
separator_index := string_to_split.substring_index( separator, 1 )
if ( separator_index = 0 ) then
--not found; result is empty
!!Result.make_from_string( "" )
else
Result := string_to_split.substring( separator_index + separator.count, string_to_split.count )
end
end
numbers_from_string( string_to_split : STRING ) : ARRAY[ DOUBLE ] is
--an array of numbers, from a string of numbers separated by commas
local
number_strings : ARRAY[ STRING ]
i : INTEGER
do
!!Result.make( 1, 0 )
number_strings := split_string( string_to_split, "," )
from
i := number_strings.lower
until
not number_strings.valid_index( i )
loop
check
number_strings.item(i).is_double or number_strings.item(i).is_integer
end
Result.add_last( double_from_string( number_strings.item( i ) ) )
i := i + 1
end
end
print_normalization_check_on_series( field_index : INTEGER ) is
do
std_output.put_string( "Normalization check on series " )
std_output.put_integer( field_index )
--std_error.put_string( number_list.sorted_by( field_index ).table_as_string + "%N" )
--std_error.put_string( number_list.normalized_series_table( field_index ).table_as_string )
std_output.put_string( "%NQ: " )
std_output.put_double( number_list.chi_squared_q( field_index ) )
std_output.put_string( "%N(1-p): ")
std_output.put_double( number_list.chi_squared_1_minus_p( field_index ) )
std_output.put_new_line
end
end -- class NUMBER_ARRAY_LIST_PARSER_2
class MAIN
creation {ANY}
make
feature {ANY}
make is
local
parser: NUMBER_ARRAY_LIST_PARSER_2;
do
!!parser.make;
parser.set_input(io);
parser.parse_until_eof;
end -- make
end -- MAIN
I need to get better at code reviews. On both the Eiffel and C++ sides, I missed several obvious problems. On the Eiffel side, one of them caused a great deal of delay: I had forgotten that Eiffel assignment copies object references rather than objects, so when I "copied" a structure and changed it, I changed the original as well-- which baffled me completely for some time. That is something to add to my code review checklist!
Once again, the compiler caught many sneaky items. I'm impressed by the error-checking in the Eiffel compiler, particularly with regard to type compliance-- it caught several small things which I just didn't notice in the code review (including the fact that one of my comments was mismatched!). Very nice.
My strategy of reproducing the calculation tables as in the text paid good dividends here, as I was able to print out the tables and search for problems. It worked very well, and though I had several problems with the programs themselves, they were relatively easy to find and fix due to the availability of interim values, etc.
Table 10-2. D15: Test Results Format -- program 9a
| Test | Expected Result | Actual Result - C++ | Actual Result - Eiffel |
| Table D14 | |||
| Q | 34.4 | 34.4 | 34.400000 |
| 1-p | 7.60 x 10^-5 | 7.61098e-05 | 0.000079 |
I'm curious about the difference in calculation between the C++ and Eiffel programs with regard to the 1-p entry; the values are extremely close, and I can't find significant differences in the calculations, but something seems a touch off-kilter. I can't figure out if it's the level of precision involved, a difference in standard libraries, or what.
Table 10-3. Project Plan Summary
| Student: | Victor B. Putz | Date: | 000201 |
| Program: | Normalization | Program# | 9A |
| Instructor: | Wells | Language: | C++ |
| Summary | Plan | Actual | To date |
| LOC/Hour | 46 | 45 | 46 |
| Planned time | 285 | | 943 |
| Actual time | | 342 | 1248 |
| CPI (cost/performance index) | 0.76 | ||
| %reused | 47 | 69 | 46 |
| Test Defects/KLOC | 31 | 27 | 30.6 |
| Total Defects/KLOC | 141 | 135 | 140.4 |
| Yield (defects before test/total defects) | 78 | 80 | 78 |
| % Appraisal COQ | 5 | 8.5 | 5.58 |
| % Failure COQ | 29.8 | 19.6 | 28.32 |
| COQ A/F Ratio | 0.16 | 0.43 | 0.20 |
| Program Size | Plan | Actual | To date |
| Base | 20 | 20 | |
| Deleted | 0 | 18 | |
| Modified | 1 | 1 | |
| Added | 237 | 257 | |
| Reused | 231 | 565 | 1698 |
| Total New and Changed | 239 | 258 | 1731 |
| Total LOC | 489 | 824 | 3672 |
| Upper Prediction Interval (70%) | 326 | ||
| Lower Prediction Interval (70%) | 152 |
| Time in Phase (min): | Plan | Actual | To Date | To Date% |
| Planning | 54 | 77 | 449 | 20 |
| Design | 37 | 67 | 314 | 14 |
| Design Review | 6 | 13 | 53 | 2 |
| Code | 74 | 78 | 568 | 25 |
| Code Review | 9 | 16 | 73 | 3 |
| Compile | 17 | 18 | 132 | 6 |
| Test | 68 | 49 | 507 | 22 |
| Postmortem | 20 | 24 | 160 | 7 |
| Total | 285 | 342 | 2256 | 100 |
| Total Time UPI (70%) | 322 | |||
| Total Time LPI (70%) | 249 | |||
| Defects Injected | Plan | Actual | To Date | To Date % |
| Plan | 0 | 0 | 0 | |
| Design | 10 | 12 | 77 | 32 |
| Design Review | 0 | 0 | 0 | 0 |
| Code | 22 | 23 | 160 | 66 |
| Code Review | 0 | 0 | 0 | 0 |
| Compile | 1 | 0 | 3 | 1 |
| Test | 1 | 0 | 3 | 1 |
| Total development | 34 | 35 | 243 | 100 |
| Defects Removed | Plan | Actual | To Date | To Date % |
| Planning | 0 | 0 | 0 | |
| Design | 0 | 0 | 0 | |
| Design Review | 1 | 6 | 15 | 6 |
| Code | 7 | 1 | 45 | 19 |
| Code Review | 3 | 8 | 23 | 9 |
| Compile | 15 | 13 | 107 | 44 |
| Test | 8 | 7 | 53 | 22 |
| Total development | 34 | 35 | 243 | 100 |
| After Development | 0 | 0 | 0 | |
| Defect Removal Efficiency | Plan | Actual | To Date | |
| Defects/Hour - Design Review | 13.5 | 27.6 | 22.5 | |
| Defects/Hour - Code Review | 15.8 | 30 | 24.2 | |
| Defects/Hour - Compile | 49.5 | 43.3 | 56.3 | |
| Defects/Hour - Test | 6 | 8.57 | 6.9 | |
| DRL (design review/test) | 2.25 | 3.22 | 2.26 | |
| DRL (code review/test) | 2.63 | 3.5 | 3.5 | |
| DRL (compile/test) | 8.25 | 5.05 | 8.16 |
| Eiffel code/compile/test |
| Time in Phase (min) | Actual | To Date | To Date % |
| Code | 62 | 369 | 50 |
| Code Review | 10 | 40 | 6 |
| Compile | 15 | 133 | 18 |
| Test | 41 | 192 | 26 |
| Total | 128 | 734 | 100 |
| Defects Injected | Actual | To Date | To Date % |
| Design | 1 | 6 | 4 |
| Code | 22 | 138 | 95 |
| Compile | 0 | 0 | 0 |
| Test | 0 | 1 | 1 |
| Total | 23 | 145 | 100 |
| Defects Removed | Actual | To Date | To Date % |
| Code | 0 | 1 | 1 |
| Code Review | 9 | 14 | 11 |
| Compile | 16 | 88 | 60 |
| Test | 5 | 40 | 28 |
| Total | 23 | 145 | 100 |
| Defect Removal Efficiency | Actual | To Date | |
| Defects/Hour - Code Review | 12 | 24 | |
| Defects/Hour - Compile | 64 | 40 | |
| Defects/Hour - Test | 7.3 | 12.5 | |
| DRL (code review/test) | 1.6 | 1.9 | |
| DRL (compile/test) | 8.8 | 3.2 |
Table 10-4. Time Recording Log
| Student: | Victor B. Putz | Date: | 000130 |
| Program: | 9a |
| Start | Stop | Interruption Time | Delta time | Phase | Comments |
| 000130 10:50:04 | 000130 12:07:06 | 0 | 77 | plan | |
| 000130 12:14:58 | 000130 13:25:57 | 4 | 66 | design | |
| 000130 13:59:50 | 000130 14:12:30 | 0 | 12 | design review | |
| 000130 14:24:05 | 000130 15:50:58 | 9 | 77 | code | |
| 000130 16:00:00 | 000130 16:21:52 | 5 | 16 | code review | |
| 000130 16:22:00 | 000130 16:39:45 | 0 | 17 | compile | |
| 000130 16:42:18 | 000130 17:31:05 | 0 | 48 | test | |
| 000130 17:31:29 | 000130 17:55:28 | 0 | 23 | postmortem | |
Table 10-5. Time Recording Log
| Student: | Date: | 000201 | |
| Program: |
| Start | Stop | Interruption Time | Delta time | Phase | Comments |
| 000201 07:56:14 | 000201 09:00:44 | 2 | 62 | code | |
| 000201 09:01:18 | 000201 09:11:07 | 0 | 9 | code review | |
| 000201 09:11:20 | 000201 09:26:12 | 0 | 14 | compile | |
| 000201 09:26:36 | 000201 10:07:25 | 0 | 40 | test | |
Table 10-6. Defect Recording Log
| Student: | Victor B. Putz | Date: | 000130 |
| Program: | 9a |
| Defect found | Type | Reason | Phase Injected | Phase Removed | Fix time | Comments |
| 000130 13:59:52 | ct | ig | design | design review | 1 | missed minor contracts |
| 000130 14:03:53 | ma | ig | design | design review | 0 | didn't increment loop indices |
| 000130 14:05:35 | ct | ig | design | design review | 0 | missed contract for variance |
| 000130 14:07:06 | ct | ig | design | design review | 0 | require standard_deviation /= 0 |
| 000130 14:07:58 | ct | ig | design | design review | 0 | require s > 0 in normal_distribution_segments |
| 000130 14:09:41 | ct | ig | design | design review | 0 | minor contracts |
| 000130 15:44:53 | md | ig | design | code | 0 | forgot to add parse_last_line_as_command |
| 000130 16:02:40 | ct | ig | code | code review | 0 | forgot to put in the require contract for make/constructor |
| 000130 16:05:07 | sy | om | code | code review | 0 | forgot to #include normal_distribution_integral |
| 000130 16:06:00 | sy | om | code | code review | 0 | forgot to make last_guess const in implementation |
| 000130 16:13:33 | sy | om | code | code review | 0 | forgot to return return value! |
| 000130 16:17:23 | wc | om | code | code review | 0 | was testing the integral of the normal distribution as 100, rather than 1 |
| 000130 16:19:06 | sy | ty | code | code review | 0 | forgot parentheses around no-argument operation |
| 000130 16:19:53 | sy | ty | code | code review | 0 | forgot parentheses around no-argument operation |
| 000130 16:20:27 | sy | om | code | code review | 0 | forgot to return return value! |
| 000130 16:23:13 | sy | ig | design | compile | 6 | er... forgot that inheriting and mixing types gets very gross; combined number_array_list and number_array_list_2 into one |
| 000130 16:30:48 | wn | ty | code | compile | 0 | misspelled normal_distribution_inverse |
| 000130 16:31:54 | wn | cm | code | compile | 0 | used normalized_series_table instead of norm_table |
| 000130 16:33:25 | sy | cm | code | compile | 0 | didn't make the at function const as required |
| 000130 16:34:07 | sy | cm | code | compile | 0 | missed parentheses on no-arg function |
| 000130 16:34:40 | sy | cm | code | compile | 0 | forgot return type for select_series |
| 000130 16:35:05 | sy | ty | code | compile | 0 | declared Result with type::Result |
| 000130 16:35:34 | sy | om | code | compile | 0 | forgot const in implementation |
| 000130 16:35:57 | wn | cm | code | compile | 0 | er.. didn't type the correct name for the argument |
| 000130 16:36:17 | sy | om | code | compile | 0 | forgot parentheses for no-arg function |
| 000130 16:36:58 | sy | cm | code | compile | 0 | mistyped integer instead of int |
| 000130 16:37:57 | sy | om | code | compile | 0 | forgot to #include gamma_function.h |
| 000130 16:38:22 | sy | om | code | compile | 0 | forgot to #include simpson_integrator.h |
| 000130 16:45:22 | wc | om | code | test | 0 | had to add an exception that if the first guess was in, don't try a second guess! |
| 000130 16:46:46 | ct | om | design | test | 0 | forgot contract in normalized_series and select_series |
| 000130 16:52:18 | ct | om | design | test | 0 | forgot ensure contract in normalized_series_table |
| 000130 16:53:01 | wn | ty | code | test | 1 | was setting an iterator to end instead of begin for startup |
| 000130 16:56:12 | we | ig | design | test | 7 | was using field_index as index of normalized series, instead of zero index |
| 000130 17:07:17 | sy | ig | code | test | 14 | dangit-- C++ was doing silent conversions from int to double without thinking |
| 000130 17:21:45 | wa | ig | design | test | 8 | was not setting the correct degrees of freedom for chi2 integral |
Table 10-7. Defect Recording Log
| Student: | Date: | 000201 | |
| Program: |
| Defect found | Type | Reason | Phase Injected | Phase Removed | Fix time | Comments |
| 000201 09:04:25 | ct | om | code | code review | 0 | forgot to put in comment |
| 000201 09:07:29 | wn | cm | code | code review | 0 | forgot to change "Parsing_prediction_data" to "Parsing_commands" |
| 000201 09:11:30 | sy | ig | code | compile | 0 | forgot that comments which end a class must be correct! |
| 000201 09:12:08 | sy | cm | code | compile | 0 | separated arguments of different type with comma rather than semicolon |
| 000201 09:12:44 | sy | om | code | compile | 0 | accidentally put contract under the local block |
| 000201 09:13:20 | wn | om | code | compile | 0 | forgot to qualify item with feature call |
| 000201 09:14:05 | sy | ty | code | compile | 0 | forgot colon before return value |
| 000201 09:15:14 | sy | ty | code | compile | 0 | forgot "is" after feature declaration |
| 000201 09:15:39 | wa | cm | code | compile | 1 | forgot to take out program 8a code to print the sorted lists |
| 000201 09:17:11 | wn | ig | code | compile | 0 | used "equals" instead of "is_equal" for string equality test |
| 000201 09:18:29 | wn | cm | code | compile | 0 | used "lower" instead of "first" for first argument of an array |
| 000201 09:19:12 | wn | cm | code | compile | 0 | typed "entry" instead of "new_entry" |
| 000201 09:19:49 | wn | cm | code | compile | 0 | typed "i" rather than "field_index" |
| 000201 09:20:12 | wn | cm | code | compile | 0 | typed "n" rather than "entry_count" |
| 000201 09:21:16 | wn | ig | code | compile | 0 | apparently "/" is used to produce a double from two integers; // is what I wanted |
| 000201 09:22:04 | sy | ig | code | compile | 0 | used "is" in local block |
| 000201 09:23:11 | wt | ig | code | compile | 0 | urg... returned a double rather than an integer |
| 000201 09:25:52 | mc | cm | code | compile | 0 | was not instantiating my normal_distribution_inverse object |
| 000201 09:27:00 | wc | cm | design | test | 1 | wrong value (100 instead of 1) in normal_distribution_segments |
| 000201 09:29:20 | ct | ty | code | test | 2 | was doing test on old value of n, not new value! |
| 000201 09:33:55 | ma | om | code | test | 0 | forgot to update i in loop! |
| 000201 09:35:00 | wa | ig | code | test | 7 | |
| 000201 09:45:46 | wa | ig | code | test | 20 | Okay, this hurt-- forgot that Eiffel doesn't copy things by default, so I was modifying the original list! |