Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uunet!motcid!schultz
From: schultz@cell.mot.COM (Rob Schultz)
Newsgroups: comp.software-eng
Subject: Re^2: Programmer productivity
Message-ID: <459@carmine9.UUCP>
Date: 30 Nov 89 20:33:56 GMT
References: <16170@duke.cs.duke.edu> <34819@regenmeister.uucp> <16186@duke.cs.duke.edu> <31986@watmath.waterloo.edu> <16231@duke.cs.duke.edu> <9986@june.cs.washington.edu>
Distribution: na
Organization: Motorola Inc. - Cellular Infrastructure Div., Arlington Heights, IL 60004
Lines: 102

peterd@cs.washington.edu (Peter C. Damron) writes:
> how does one use source lines of code (SLOC)
>in any predictive way?  Presumably, we are talking about estimating
>how long a particular project will take.  I understand that once you
>know how many SLOC it will take for the project then you can predict
>how long the project will take.  But, how does one translate a
>specification into some number of SLOC?  This seems difficult to me.

Maurice Halstead (_Elements_of_Software_Science_, Elsevier North-Holland, NY, 1977) worked on a set of textual software metrics. Now I don't claim to fully understand all of his work, but here's a shot at some of what he worked on. (Don't be fazed by the math - once you understand it, it's not that bad!)

We can measure the number of distinct operators (n1) we will use in the implementation of an algorithm. Similarly, we can measure the number of distinct operands (n2) we will use in that implementation. Given a correct design of the software component, we can also determine the total number of operators (N1) and operands (N2) that we will use in this implementation. Halstead went on to define the vocabulary of a software component to be n = n1 + n2 (that is, the number of distinct operators and operands used in the implementation of an algorithm), and its length to be N = N1 + N2 (the total number of operators and operands used).
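These counts are easy to obtain mechanically. Here's a minimal Python sketch - not Halstead's own tooling, and the operator/operand classification for the toy statement `x = x + y * 2` is my own assumption - that computes the four basic counts plus vocabulary and length:

```python
# Toy token classification for the statement:  x = x + y * 2
# (Which tokens count as operators vs. operands is an assumption here;
#  a real counter would tokenize source with a parser for the language.)
operators = ["=", "+", "*"]        # every operator occurrence
operands  = ["x", "x", "y", "2"]   # every operand occurrence

n1 = len(set(operators))   # distinct operators
n2 = len(set(operands))    # distinct operands
N1 = len(operators)        # total operator occurrences
N2 = len(operands)         # total operand occurrences

n = n1 + n2   # vocabulary
N = N1 + N2   # length

print(n1, n2, N1, N2)   # 3 3 3 4
print(n, N)             # 6 7
```

The hard part in practice is the classification step, not the counting: different counting rules for the same language can give different n1/n2 values, which is one reason results in the literature vary.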
He further postulated that N can be estimated by N' = n1 log2 n1 + n2 log2 n2. (The accuracy of this equation has been independently validated with confidence factors of .95+.) (Bear with me, this starts to get more interesting soon . . .)

The volume of the program is then defined as V = N log2 n. This means that for each of the N elements of a program, log2 n bits are required to choose one of the operators or operands for that element. So V is the number of bits required to specify a program. As you might suspect, V increases as we move from a higher-level language to a lower-level language. In other words, V is inversely proportional to the level of abstraction L of the program, and Halstead proposed a conservation law between the two (LV = k).

L is defined as the ratio of potential to actual volume, L = V* / V, where V* is the volume of the most compact, or highest-level, implementation of the algorithm. It follows that L will increase with n2, and decrease with both n1 and N2. So Halstead proposed a level estimator, L' = (2 / n1)(n2 / N2). Again, further research has indicated a correlation coefficient of .90, suggesting that L' / L is very nearly constant.

Any given language will limit the level of even the tightest programs to some L < 1. So we now come up with a language level, lambda, where lambda = LV*. Lambda measures the inherent limitation imposed by the language on the volume of a program. Since V* = LV, lambda = L^2 * V.

Continuing on, we can easily see that the difficulty of programming increases as the volume of the program increases. Halstead proposed E = V / L as a measure of the mental effort required to create a program. This number is actually the number of mental discriminations, or decisions, that a fluent, concentrating programmer must make in implementing an algorithm. But where does all this get us in terms of estimating the amount of time required to implement this algorithm?
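As a worked sketch of the formulas so far - using made-up counts, not measurements from any real program - this Python computes N', V, the level estimate L', lambda, and E, and then applies the time and error estimates discussed below (T = E / S with S = 18, and B = E^(2/3) / 3000):

```python
import math

# Hypothetical counts for a small module (assumed values, for illustration)
n1, n2 = 10, 20    # distinct operators, distinct operands
N1, N2 = 60, 80    # total operators, total operands

n = n1 + n2                                       # vocabulary
N = N1 + N2                                       # length
N_est = n1 * math.log2(n1) + n2 * math.log2(n2)   # length estimator N'
V = N * math.log2(n)                              # volume, in bits
L_est = (2 / n1) * (n2 / N2)                      # level estimator L'
lam = L_est ** 2 * V                              # language level, lambda = L^2 * V
E = V / L_est                                     # effort: mental discriminations

S = 18                                            # Stroud rate, decisions/second
T = E / S                                         # estimated coding time, seconds
B = E ** (2 / 3) / 3000                           # estimated number of errors

print(f"N'={N_est:.0f}  V={V:.0f}  L'={L_est}  lambda={lam:.2f}")
print(f"E={E:.0f}  T={T / 60:.1f} min  B={B:.2f}")
```

For these particular counts, E comes out to roughly fourteen thousand discriminations and T to under a quarter of an hour - plausible-looking numbers, but everything downstream of the first four counts is only as good as those counts.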
Well, psychologist J. M. Stroud did some research into the speed with which we make decisions. His experiments led to the conclusion that a concentrating person is able to make between 5 and 20 mental discriminations per second, depending on the individual. Halstead's research showed that concentrating programmers tend to work near the upper end of this range, at about 18 decisions per second. He called this number S, the Stroud rate.

So it follows that the time T in seconds for a fluent, concentrating programmer to implement an algorithm is T = E / S - the number of decisions to be made divided by the speed at which those decisions can be made.

Now I realise that this entire article has absolutely nothing to do with SLOC, but we do now have a way to determine how long it should take a programmer to implement an algorithm, based on data that is easily obtainable from a good design. There is much additional research to be done here: how long does it take to design an algorithm? How does all of this relate to SLOC? And many other unanswered questions.

Incidentally, additional research has shown that the total number of errors B in a program is directly related to E, the effort required to specify the program. Halstead proposed the following formula for the estimation of B: B = E^(2/3) / 3000. If correct, then for S = 18, a programmer commits one error every three minutes!! (Makes me glad I'm in testing and research, not in software development :-)

>Just curious,
>Peter.

Curiosity is the only way to learn. I wonder why cats are so dumb :-) (No offense to cat lovers)

>---------------
>Peter C. Damron
>Dept.
of Computer Science, FR-35
>University of Washington
>Seattle, WA 98195
>peterd@cs.washington.edu
>{ucbvax,decvax,etc.}!uw-beaver!uw-june!peterd
--
Thanks - Rob Schultz, Motorola General Systems Group
rms    1501 W Shure Dr, Arlington Heights, IL 60004
708 / 632 - 2757    schultz@mot.cell.COM    !uunet!motcid!schultz
"There is no innocence - only pain."    (Usual disclaimers)