Path: utzoo!attcan!uunet!snorkelwacker!bu.edu!orc!inews!iwarp.intel.com!psueea!eecs!warren
From: warren@eecs.cs.pdx.edu (Warren Harrison)
Newsgroups: comp.software-eng
Subject: Re: How do you measure code quality?
Message-ID: <3097@psueea.UUCP>
Date: 5 Jul 90 20:16:54 GMT
References: <10865@netcom.UUCP> <11113@netcom.UUCP> <926@gistdev.gist.com> <3182@stl.stc.co.uk>
Sender: news@psueea.UUCP
Reply-To: warren@eecs.UUCP (Warren Harrison)
Distribution: comp
Organization: Portland State University, Portland, OR
Lines: 71

In article <3182@stl.stc.co.uk> "Tom Thomson" writes:
>In article <926@gistdev.gist.com> flint@gistdev.gist.com (Flint Pellett) writes:
>Some published work indicates no correlation at all between McCabe complexity
>measure and rate of change of code after release; while there's a very strong
>correlation between rate of change after release and source code size.
>[Barbara Kitchenham published some stuff on this years ago, I can't remember
> the exact reference].

This isn't surprising. Rate of change after code release has very little
to do with (sorry, but it seems like the most appropriate term) "code
quality". It is well known that under 20% of maintenance (i.e., code changes
after release) is due to bugs; the rest is due to other things (see Lientz
& Swanson's research for the exact percentages), primarily new functionality
and adaptations to new environments (aka "porting"). We recently looked at
about 250,000 lines of embedded avionics software from several families of
Navy attack aircraft. Our percentages agreed with the L & S study. Not
surprisingly, we had minimal correlations between metrics and change.

>For an economic measure of quality, post-release rate-of-change is pretty good:
>change arises because there were bugs; debugging after release is expensive.

Not true for most software (see above)!

>[Obviously, for something that is going through a planned series of releases
>with planned facility enhancements, not all change is to do with bugs, but the
>change that is is bad news.]

Even isolating the changes due to bugs does not give you a suitable basis
for evaluating code metrics, since large percentages of bugs are typically
due to specification and/or design errors (in one of our studies we found
that only about 25% of the bugs recorded during testing were due to coding -
the rest were put in at the spec or design stage). Obviously you can have
the best code in the world, but if the design or spec is wrong, it's still
a bug.

>So code size is a better (very much better) measure of code quality than
>McCabe Complexity - - the smaller the source code to do a job, the better the
>quality. The McCabe measure tells us something about the control flow graph
>of the program: so it is completely useless for code not written in a control
>flow language (ML, pure Lisp, Prolog, ......); it favours (assigns lower
>complexity to) programs written in a style which use jump tables in data areas
>instead of case statements in code and compute program addresses by arithmetic
>instead of using labels or procedure names, so it's going to tell you that
>really awful programs are pretty good.
>

All metrics should be assumed to be useful only within their specific
domain - it's asking a little much for a universal property to be applied
to all programming paradigms - consider that people measure expert system
performance using LIPS instead of MIPS. In fact, a number of studies have
shown that as programs get smaller, their bug rate (i.e., bugs per KLOC)
increases (Basili's work stands out most in my mind, but others have done
this too), so while larger modules have more bugs, they often have fewer
bugs per thousand lines of code.
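
As a rough illustration of the bugs-per-KLOC point above, here is a minimal
Python sketch. The module names and counts are invented for the example;
they are not data from Basili's studies or from ours, and the helper names
are hypothetical.

```python
# Hypothetical module data: (name, lines of code, recorded bugs).
# These figures are invented purely for illustration.
MODULES = [
    ("small_a", 200, 3),
    ("small_b", 500, 6),
    ("large_a", 4000, 20),
    ("large_b", 8000, 32),
]


def bugs_per_kloc(lines_of_code, bugs):
    """Return defect density in bugs per thousand lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return bugs * 1000 / lines_of_code


def density_table(modules):
    """Return (name, bugs, bugs per KLOC) tuples, one per module."""
    return [(name, bugs, bugs_per_kloc(loc, bugs)) for name, loc, bugs in modules]


if __name__ == "__main__":
    for name, bugs, density in density_table(MODULES):
        print(f"{name:8s} bugs={bugs:3d} bugs/KLOC={density:5.1f}")
```

With these made-up numbers the largest module has the most bugs (32) but
the lowest density (4.0 per KLOC), while the smallest module has only 3
bugs at 15.0 per KLOC. Ranking modules by raw bug count and by density can
therefore give opposite answers.
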
>Just as the quote above is not claiming that not every program with a high
>McCabe number is a bad one, I'm not claiming that every program with a low
>McCabe number is a bad one either; just pointing that the McCabe number is
>of very little use for anything.

It *can* identify source code that is hard to follow due to explicit flow
of control (to evaluate the McCabe metric, only consider code errors that
were due to control flow problems). This won't give you the whole picture,
but it will give you one part of it. Most metricians recommend that you use
a set of metrics to get a handle on the different aspects of the code's
(sorry again) "quality", just as a physician will tell you about blood
pressure, height, weight, cholesterol, etc. The mistake is evaluating the
code using *one* number.

>

I would be happy to send copies of tech reports or reprints of our papers
to anyone who is interested. Just send me your US Mail address (we don't
have most of them on-line).

Warren

==========================================================================
Warren Harrison                                          warren@cs.pdx.edu
Department of Computer Science                           503/725-3108
Portland State University