Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!rutgers!mcnc!thorin!coggins!coggins
From: coggins@coggins.cs.unc.edu (Dr. James Coggins)
Newsgroups: comp.software-eng
Subject: Re: Using COCOMO to estimate development schedules
Message-ID: <7468@thorin.cs.unc.edu>
Date: 28 Mar 89 17:51:29 GMT
References: <351@tahoma.UUCP> <1702@spp2.UUCP> <18252@gatech.edu>
Sender: news@thorin.cs.unc.edu
Reply-To: coggins@cs.unc.edu (Dr. James Coggins)
Organization: University Of North Carolina, Chapel Hill
Lines: 61

In article <18252@gatech.edu> shilling@pinto.UUCP (John Shilling) writes:
>I have two questions about COCOMO
>
>1.  Isn't the accuracy of the model limited by an initial guess of
>    DELIVERED lines of source code?  Is there any reason to believe that
>    a guess of delivered lines of source code is any more accurate than
>    simply guessing at the resources required directly?  And just what is a
>    line of source code anyway?  Seriously.  Is it delineated by a newline
>    character?  Is it the number of statements?  Does it include comments?

Yes, you first need to estimate a figure called KDSI, for "thousands
of delivered source instructions (excluding comments)".  By
assumption, size in KDSI is the principal factor driving software
cost.  Now, true, you still have to come up with this number somehow,
usually from experience on similar projects.  But at least you know
WHAT you need to estimate.  It guides your study of other similar
systems to the critical factor you need to learn about in order to
estimate accurately in the future. 
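
A minimal sketch of how the KDSI figure drives the estimate, assuming
the Basic COCOMO equations and coefficients as published in Boehm's
book.  The names basic_cocomo and MODES are illustrative only.

```python
# Basic COCOMO (Boehm, 1981), assumed here:
#   effort   = a * KDSI**b    (person-months)
#   schedule = c * effort**d  (calendar months)
MODES = {
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kdsi, mode="organic"):
    """Return (effort in person-months, schedule in months)."""
    a, b, c, d = MODES[mode]
    effort = a * kdsi ** b
    schedule = c * effort ** d
    return effort, schedule


if __name__ == "__main__":
    # Show how the single size estimate propagates to effort and schedule.
    for kdsi in (8, 32, 128):
        effort, months = basic_cocomo(kdsi)
        print(f"{kdsi:4d} KDSI: {effort:7.1f} person-months, "
              f"{months:5.1f} months")
```

For a 32-KDSI organic-mode project this gives roughly 91 person-months
and about 14 months of schedule, so an error in the size guess carries
through almost proportionally to the effort figure.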

It has been shown that counting "tokens" does not provide a
significant improvement over counting "lines", and counting tokens
requires a parse.  In counting such lines, several language statements
on one line count as 1 and a data declaration stretching over 8 lines
counts as 8.
You can also automate these things by counting semicolons or using
similar tricks.  Some students of mine wrote a script to count 
C statements, # directives, and macros correctly.  Since the unit
used is THOUSANDS of lines, these noise factors don't matter much.
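
A rough sketch of both counting rules described above, assuming C
source with only /* ... */ comments.  This is not the students' script;
the function names are made up for illustration, and comment markers
inside string literals are not handled.

```python
import re

# Non-greedy match of a C block comment, possibly spanning several lines.
_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_comments(source):
    """Remove block comments but keep their newlines so lines stay aligned."""
    return _COMMENT.sub(lambda m: "\n" * m.group(0).count("\n"), source)


def count_lines(source):
    """Count non-blank physical lines, excluding comments."""
    return sum(1 for line in strip_comments(source).splitlines()
               if line.strip())


def count_semicolons(source):
    """Crude statement count: semicolons outside comments."""
    return strip_comments(source).count(";")


SAMPLE = """/* demo */
int a; int b; int c;
int table[] = {
    1, 2,
    3, 4
};
"""

if __name__ == "__main__":
    print("lines:", count_lines(SAMPLE))
    print("semicolons:", count_semicolons(SAMPLE))
```

On the sample, the three statements sharing one line count as 1 line
and the four-line declaration counts as 4, for 5 lines in total, while
the semicolon count is 4.  The two measures disagree in detail, but at
the scale of thousands of lines either is usable.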

>2.  In what I have seen on the COCOMO model the weights on the factors 
>    are given to two significant digits past the decimal point.  This raises 
>    all sorts of flags with me.  Back in numerical analysis I learned not to 
>    represent digits that were below the level of accuracy of the
>    computation.  Given the level of uncertainty in the evaluation of
>    the factors, I have a hard time believing that the computations are
>    accurate to two decimal digits (it has to be more than crunching numbers, 
>    it has to mean something).  Does anyone out there have evidence to the
>    contrary?
>
>John J. Shilling
>Internet:	shilling@gatech.edu

You have been trained well.  But note: those figures are used only in
the Detailed COCOMO level where they have some chance of being valid.
In Intermediate COCOMO you just use numbers like .8, 1.2, etc. to
represent qualifiers like "low", "medium", or "high".  The precision
issue you raise was indeed factored into the modelling, and I'm not
going to quibble over whether 1 digit or 2 is "Right".  Now if
detailed COCOMO were presented with 5 digits after the decimal, I
would complain.  As it is, 2 digits after the point is not
unreasonable. 
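
One way to get a feel for the 1-digit versus 2-digit question is to
multiply a set of weights together at each precision and compare the
products.  The sketch below does that; the multiplier values are
hypothetical, made up for illustration and not taken from Boehm's
tables, and the function names are likewise assumptions.

```python
import math

# Hypothetical effort multipliers (illustration only, not Boehm's values).
MULTIPLIERS = [1.16, 0.88, 1.06, 1.21, 0.91]


def adjustment_factor(multipliers, digits):
    """Product of the multipliers after rounding each to `digits` decimals."""
    return math.prod(round(m, digits) for m in multipliers)


def relative_change(multipliers):
    """Fractional change in the product when going from 2 digits to 1."""
    two = adjustment_factor(multipliers, 2)
    one = adjustment_factor(multipliers, 1)
    return (one - two) / two


if __name__ == "__main__":
    print(f"2 digits: {adjustment_factor(MULTIPLIERS, 2):.4f}")
    print(f"1 digit : {adjustment_factor(MULTIPLIERS, 1):.4f}")
    print(f"change  : {relative_change(MULTIPLIERS):+.1%}")
```

With these made-up values the product moves by a little under 8
percent when each weight is rounded to one digit.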

See Boehm's book, Software Engineering Economics, for the full (quite
readable) discussion.  Visit your library and Read More About It.
---------------------------------------------------------------------
Dr. James M. Coggins          coggins@cs.unc.edu
Computer Science Department   Old: Algorithms+Data Structures = Programs
UNC-Chapel Hill               New: Objects+Objects=Objects
Chapel Hill, NC 27599-3175    
and NASA Center of Excellence in Space Data and Information Science
---------------------------------------------------------------------