Path: utzoo!attcan!uunet!snorkelwacker!bu.edu!orc!inews!iwarp.intel.com!psueea!eecs!warren
From: warren@eecs.cs.pdx.edu (Warren Harrison)
Newsgroups: comp.software-eng
Subject: Re: How do you measure code quality?
Message-ID: <3097@psueea.UUCP>
Date: 5 Jul 90 20:16:54 GMT
References: <DON.ALLINGHAM.90Jun22081624@hendrix.FtCollins.NCR.COM> <10865@netcom.UUCP> <CIMSHOP!DAVIDM.90Jun25103758@uunet.UU.NET> <11113@netcom.UUCP> <926@gistdev.gist.com> <3182@stl.stc.co.uk>
Sender: news@psueea.UUCP
Reply-To: warren@eecs.UUCP (Warren Harrison)
Distribution: comp
Organization: Portland State University, Portland, OR
Lines: 71

In article <3182@stl.stc.co.uk> "Tom Thomson" <tom@stl.stc.co.uk> writes:
>In article <926@gistdev.gist.com> flint@gistdev.gist.com (Flint Pellett) writes:
>Some published work indicates no correlation at all between McCabe complexity
>measure and rate of change of code after release;  while there's a very strong
>correlation between rate of change after release and source code size. 
>[Barbara Kitchenham published some stuff on this years ago, I can't remember
> the exact reference].
This isn't surprising. Rate of change after code release has very little
to do with (sorry, but it seems like the most appropriate term) "code quality".
It is well known that under 20% of maintenance (ie, code changes after
release) is due to bugs (see Lientz & Swanson's research for the exact
percentages); the rest is due to other things - primarily new functionality
and adaptations to new environments (aka "porting"). We recently looked at
about 250,000 lines
of embedded avionics software from several families of Navy attack aircraft.
Our percentages agreed with the L & S study. Not surprisingly, we had minimal
correlations between metrics and change.
>For an economic measure of quality, post-release rate-of-change is pretty good:
>change arises because there were bugs; debugging after release is expensive.
Not true for most software (see above)!
>[Obviously, for something that is going through a planned series of releases 
>with planned facility enhancements, not all change is to do with bugs, but the
>change that is is bad news.]
Even isolating the changes due to bugs does not give you a suitable basis for
evaluating code metrics, since large percentages of bugs are typically due to
specification and/or design errors (in one of our studies we found about 25%
of the recorded bugs during testing were due to coding - the rest were put
in at spec or design stage). Obviously you can have the best code in the
world but if the design or spec is wrong, it's still a bug.
>So code size is a better (very much better) measure of code quality than
>McCabe Complexity - - the smaller the source code to do a job, the better the 
>quality. The McCabe measure tells us something about the control flow graph
>of the program: so it is completely useless for code not written in a control
>flow language (ML, pure Lisp, Prolog, ......); it favours (assigns lower
>complexity to) programs written in a style which use jump tables in data areas
>instead of case statements in code and compute program addresses by arithmetic
>instead of using labels or procedure names, so it's going to tell you that
>really awful programs are pretty good.
> 
All metrics should be assumed to be useful only within their specific
domain - it's asking a little much for a universal property to be applied
to all programming paradigms - consider that people measure expert system
performance using LIPS instead of MIPS. In fact, a number of studies have
shown that as programs get smaller, their bug rate (ie, bugs per KLOC)
increases (Basili's work stands out most in my mind, but others have done
this too), so while larger modules have more bugs, they often have fewer
bugs per thousand lines of code.
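
To make the arithmetic concrete, here is a small Python sketch. The module
names, sizes and bug counts are invented for illustration only; they are not
data from Basili's studies or from ours. It just shows how a larger module
can have more bugs in total and still have fewer bugs per KLOC.

```python
# Hypothetical modules: (name, lines of code, recorded bugs).
# These numbers are made up for illustration, not measured data.
modules = [
    ("small", 200, 3),
    ("medium", 1000, 8),
    ("large", 5000, 20),
]

def bugs_per_kloc(loc, bugs):
    """Defect density: recorded bugs per thousand lines of code."""
    return bugs * 1000.0 / loc

for name, loc, bugs in modules:
    density = bugs_per_kloc(loc, bugs)
    print(f"{name:6s} loc={loc:5d} bugs={bugs:3d} bugs/KLOC={density:.1f}")
```

With these made-up figures the total bug count rises with size while the
density falls, which is the pattern described above.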

>Just as the quote above is not claiming that not every program with a high 
>McCabe number is a bad one, I'm not claiming that every program with a low 
>McCabe number is a bad one either; just pointing that the McCabe number is
>of very little use for anything.
It *can* identify source code that is hard to
follow due to explicit flow of control (to evaluate the McCabe metric,
only consider code errors that were due to control flow problems). This
won't give you the whole picture, but it will give you one part of it.
Most metricians recommend that you use a set of metrics to get a handle
on the different aspects of the code (sorry again) "quality", just like a
physician will tell you about blood pressure, height, weight, cholesterol, etc.
The mistake is evaluating the code using *one* number.
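
As a rough illustration of reporting more than one number, here is a Python
sketch that approximates the McCabe number by counting decision points and
reports it next to a simple line count. The counting rule (decisions + 1,
with each extra boolean operand counted) is one common convention and an
assumption here, not a definitive implementation; the names cyclomatic,
profile, CHAIN and TABLE are hypothetical. It also reproduces the point made
in the quoted text: the table-driven version scores lower than the
if/elif version.

```python
import ast

# Node types counted as decision points. This is an assumed convention;
# real tools differ, especially on boolean operators and exceptions.
_DECISIONS = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic(source):
    """Approximate McCabe number for a whole source text: decisions + 1."""
    count = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISIONS):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands, so two extra branches
            count += len(node.values) - 1
    return count

def profile(source):
    """Report several measures side by side rather than a single number."""
    nonblank = [ln for ln in source.splitlines() if ln.strip()]
    return {"loc": len(nonblank), "mccabe": cyclomatic(source)}

# The same job written with explicit control flow ...
CHAIN = """
def price(kind):
    if kind == "a":
        return 1
    elif kind == "b":
        return 2
    else:
        return 3
"""

# ... and with a lookup table held in data.
TABLE = """
PRICES = {"a": 1, "b": 2}
def price(kind):
    return PRICES.get(kind, 3)
"""

print(profile(CHAIN))
print(profile(TABLE))
```

Neither number alone says which version is "better"; the profile only
shows which aspect each measure is sensitive to.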
> 
I would be happy to send copies of tech reports or reprints of our papers
to anyone who is interested. Just send me your US Mail address (we don't
have most of them on-line).

Warren


==========================================================================
Warren Harrison                                          warren@cs.pdx.edu
Department of Computer Science                                503/725-3108
Portland State University