Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!elroy.jpl.nasa.gov!decwrl!netcomsv!jls
From: jls@netcom.COM (Jim Showalter)
Newsgroups: comp.software-eng
Subject: Re: use of metrics
Message-ID: <1991Jun15.001212.10271@netcom.COM>
Date: 15 Jun 91 00:12:12 GMT
References: <1991Jun12.003809.24084@netcom.COM> <35626@mimsy.umd.edu>
Organization: Netcom - Online Communication Services UNIX System {408 241-9760 guest}
Lines: 136

cml@cs.umd.edu (Christopher Lott) writes:
>In article <1991Jun13.235937.24165@netcom.COM> jls@netcom.COM (Jim Showalter) writes:
>>I came up with automatic ways to measure the
>>number of changes to a particular unit and to tie those changes to a particular
>>bug fix, ways to track the error-proneness of particular units, ways to
>>compute both static and dynamic metrics of various kinds, and so forth,
>>and all of it possible with off-the-shelf technology I've been working
>>with for the past four years

>I for one am extremely interested in hearing more about your successes,
>especially if this is as easy as you make it seem to be.

Without its preceding context, the phrase "I came up with" in the first
sentence above sounds as if I'm claiming to have designed and implemented
all of the stuff listed thereafter. This is NOT how it read in the
original post--in the original post the phrase "I came up with" meant
"in response to a challenge to produce a list of useful metrics that
could be computed automatically, I CAME UP WITH a list that contained
the following".

That cleared up, allow me to now respond to your specific questions. The
tools I'm about to describe were invented and implemented by very clever
people working at a company called Rational, with which I was formerly
but am no longer associated. The tools all work, they are a positive
boon to anybody trying to engineer large complex systems, and they're
commercially available...

>Let me pose some questions, hopefully more meat than gristle ;-)

>1.
how you define a change? does fixing two bugs in a single module
> constitute 1 or 2 changes? Is it every word? Or an editing session?
> if this is collected automatically, I suspect you're using editing
> sessions, because keystrokes would be very difficult to interpret.
> so what if I run 17 editing sessions of the same document, and wind
> up changing only comments 16 of 17 times? how do you measure granularity
> of the changes?

The system available from Rational stores units as a time-dependent
series of line differentials. Each check-out/check-in cycle constitutes
a generation: all changes (stored as line differentials) during a
generation can be reverted or advanced to any designated generation.
Reports of all such changes can be easily obtained from the database,
and make for a dandy audit trail for people into that sort of thing
and/or a basis for computing metrics. Check-out/check-in cycles are
bound to work orders, so that the set of changes to the set of units
required to fix a particular bug can be kept together, automating
bookkeeping chores and putting an army of clerks out on the street (good
for the bottom line, bad for the clerks--such is the march of progress).
If desired, multiple work orders can be tied to changes to the same
unit, for cases in which a change fixes (or contributes to the fixing
of) more than one bug.

This greatly increases productivity (by getting rid of all those clerks)
while greatly improving accuracy. It provides direct, quantitative
visibility into the software development process. It is non-invasive
from the standpoint of the programmer, since it is simply all integrated
into the editors and library managers (to check something out you
basically hit the check-out key).

>2. how do you count errors? are errors == changes? are you familiar with
> IEEE standard notation for bugs?
It is:
> error == human misconception
> fault == error as manifested in a document (possibly code)
> failure == visible manifestation of a fault (runtime, etc)
> (note that faults don't always cause failures)
> so is an error really a fault? Do change forms (do you collect forms?)
> distinguish faults from enhancements? Do you count failures?

The system just described is flexible enough that you could support any
classification scheme you desire. Work orders can be tailored to contain
user-defined fields--there is no reason such fields could not capture
the IEEE scheme you describe. Or, alternatively, one could group work
orders relating to errors on one list of work orders (work orders belong
to work order lists), those for faults on another list, etc. The system
was deliberately designed to accommodate an arbitrary methodology, so
that any site could tailor it to its particular view of how things
should be done.

>3. what sort of static and dynamic metrics do you compute? what is a dynamic
> metric, runtime performance? profile data? did you write tools to compute
> these metrics?

Static metrics can be computed quite easily, since Rational represents
programs (specifically, Ada programs) using an underlying representation
called DIANA: basically a tree of sufficient richness to completely and
unambiguously capture all of the static syntactic and semantic
information of the program. This is not a particularly radical
notion--the p-code system was a similar idea--and yet, for some
mysterious reason, Rational is IT when it comes to doing things this
way. Everybody else, for reasons that completely elude me, chooses to
rebuild this sort of information as in-memory data structures during the
run-time execution of the compiler and then THROW IT AWAY. Rational not
only keeps it around so the system can do things like intelligent
recompilation (e.g.
allow the incremental addition/modification/deletion of portions of code
from both specs and bodies without requiring batch "big bang"
recompilation of all clients [transitively] as would be required on
other systems), but, furthermore, provides programmatic access to this
information so that a toolsmith wanting to analyze static metrics can
traverse the DIANA tree and extract whatever information is desired. In
short, whereas on most systems doing code analysis essentially involves
writing the front-end for a compiler, on Rational's equipment it
involves writing some basically trivial applications atop a significant,
preexisting abstraction.

Consider a VERY simple example: someone wants to write a SLOC counter.
How does one usually do this? By searching for semi-colons. But then, of
course, this isn't really accurate because semi-colons could be embedded
in strings, so a little more complexity is needed in the parser, etc,
and pretty soon you've had to write a brain-damaged subset of a
scanner/tokenizer/parser. On Rational's system, you simply traverse the
DIANA tree and count up all the nodes that are of kind Statement. It
takes about 10 lines. Considerably more complicated analysis is
possible--it is possible, for example, to flag all exceptions that will
be propagated out of a named scope (yes, this can be done statically).
Try doing THAT without DIANA support!

Note that there is absolutely no reason why this same approach could not
be used for any other language, particularly the more modern software
engineering oriented ones that have a well-thought-out structure (e.g.
C++, Eiffel, Modula-X), and I am aware of at least one effort to do this
very thing for C++, but, in general, the state of practice is to default
to the lowest common denominator--parsing ASCII source files.

As for dynamic metrics, one uses, of course, performance and coverage
analyzers.
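
To make the SLOC-counter point above concrete, here is a minimal sketch
of the same idea in Python rather than Ada. It is only an analogy: it
assumes Python's standard ast module as a stand-in for a DIANA-like
tree, it is not Rational's interface, and the names count_statements,
count_semicolons, and SAMPLE are hypothetical ones chosen for
illustration.

```python
import ast


def count_statements(source: str) -> int:
    """Count statement nodes by walking the parsed syntax tree."""
    tree = ast.parse(source)
    # ast.stmt is the base class of every statement node kind.
    return sum(isinstance(node, ast.stmt) for node in ast.walk(tree))


def count_semicolons(source: str) -> int:
    """Naive text-based count, shown only for contrast."""
    return source.count(";")


# Hypothetical sample: every semicolon sits inside a string literal.
SAMPLE = 'x = 1\ny = "a;b;c"\nif x:\n    y = y + ";"\n'

if __name__ == "__main__":
    print("statements:", count_statements(SAMPLE))
    print("semicolons:", count_semicolons(SAMPLE))
```

On this sample the text search reports 3 semicolons, all of them inside
string literals, while the tree walk finds the 4 actual statements. The
counting logic is a couple of lines because the parser has already done
the hard part, which is the point made above about building on an
existing tree instead of re-scanning source text.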
There are also tools available that (using DIANA again) can analyze a
compilation unit and automatically generate a unit test for it.

>4. do you see any interesting correlations between data from static analysis
> and error(fault)(?)-proneness of modules?

Depends considerably on which metrics one chooses to use. It has been my
experience that most static analysis performed by most tools is largely
syntactical, since without something like DIANA support that is the only
kind of analysis that is easy enough to do to consider doing. With full
access to all syntactic and semantic information, one can perform static
analysis of things that have a much stronger correlation with the
robustness of the code (one can, for example, identify unhandled
exceptions, exits from functions without return values, etc).
--
*** LIMITLESS SOFTWARE, Inc: Jim Showalter, jls@netcom.com, (408) 243-0630 ****
*Proven solutions to software problems. Consulting and training on all aspects*
*of software development. Management/process/methodology. Architecture/design/*
*reuse. Quality/productivity. Risk reduction. EFFECTIVE OO usage. Ada/C++.    *