Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!elroy.jpl.nasa.gov!decwrl!netcomsv!jls
From: jls@netcom.COM (Jim Showalter)
Newsgroups: comp.software-eng
Subject: Re: use of metrics
Message-ID: <1991Jun15.001212.10271@netcom.COM>
Date: 15 Jun 91 00:12:12 GMT
References: <1991Jun12.003809.24084@netcom.COM> <35626@mimsy.umd.edu>
Organization: Netcom - Online Communication Services  UNIX System {408 241-9760 guest}
Lines: 136

cml@cs.umd.edu (Christopher Lott) writes:
>In article <1991Jun13.235937.24165@netcom.COM> jls@netcom.COM (Jim Showalter) writes:
>>I came up with automatic ways to measure the
>>number of changes to a particular unit and to tie those changes to a particular
>>bug fix, ways to track the error-proneness of particular units, ways to
>>compute both static and dynamic metrics of various kinds, and so forth,
>>and all of it possible with off-the-shelf technology I've been working
>>with for the past four years

>I for one am extremely interested in hearing more about your successes,
>especially if this is as easy as you make it seem to be.

Without its preceding context, the phrase "I came up with" in the first
sentence above sounds as if I'm claiming to have designed and implemented
all of the stuff listed thereafter. This is NOT how it read in the original
post--in the original post the phrase "I came up with" meant "in response
to a challenge to produce a list of useful metrics that could be computed
automatically, I CAME UP WITH a list that contained the following".

That cleared up, allow me to now respond to your specific questions.
The tools I'm about to describe were invented and implemented by very
clever people working at a company called Rational, with which I was
formerly but am no longer associated. The tools all work, they are a
positive boon to anybody trying to engineer large complex systems,
and they're commercially available...

>Let me pose some questions, hopefully more meat than gristle ;-)

>1.  how you define a change?  does fixing two bugs in a single module
>    constitute 1 or 2 changes?  Is it every word?  Or an editing session?
>    if this is collected automatically, I suspect you're using editing
>    sessions, because keystrokes would be very difficult to interpret.
>    so what if I run 17 editing sessions of the same document, and wind
>    up changing only comments 16 of 17 times?  how do you measure granularity
>    of the changes?

The system available from Rational stores units as a time-dependent series
of line differentials. Each check-out/check-in cycle constitutes a generation:
all changes (stored as line differentials) during a generation can be reverted
or advanced to any designated generation. Reports of all such changes can be
easily obtained from the database, and make for a dandy audit trail for people
into that sort of thing and/or for the basis of computing metrics. Check-out/
check-in cycles are bound to work orders, so that the set of changes to the
set of units required to fix a particular bug can be kept together, automating
bookkeeping chores and putting an army of clerks out on the street (good for
the bottom line, bad for the clerks--such is the march of progress). If desired,
multiple work orders can be tied to changes to the same unit, for cases in which
a change fixes (or contributes to the fixing of) more than one bug.

This greatly increases productivity (by getting rid of all those clerks) while
greatly improving accuracy. It provides direct, quantitative visibility into
the software development process. It is non-invasive from the standpoint of
the programmer, since it is simply all integrated into the editors and
library managers (to check something out you basically hit the check-out
key).
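
To make the generation idea concrete, here is a minimal sketch in Python. It is not Rational's implementation or API: the names Unit, Generation, check_in, lines_at, and lines_touched, and the work-order IDs, are all hypothetical, and difflib stands in for whatever differencing the real system uses. It shows a unit stored as a series of line differentials, each tied to one or more work orders, with any generation recoverable on demand.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Generation:
    work_orders: tuple  # hypothetical work-order IDs bound to this check-in
    ops: list           # (start, end, replacement_lines) against the previous generation


@dataclass
class Unit:
    generations: list = field(default_factory=list)

    def lines_at(self, generation):
        """Rebuild the unit's text as of the given generation (0 = empty)."""
        lines = []
        for gen in self.generations[:generation]:
            # Apply back to front so earlier indices stay valid.
            for start, end, repl in reversed(gen.ops):
                lines[start:end] = repl
        return lines

    def check_in(self, new_lines, work_orders):
        """Store only the line differential against the latest generation."""
        new_lines = list(new_lines)
        old = self.lines_at(len(self.generations))
        matcher = SequenceMatcher(a=old, b=new_lines, autojunk=False)
        ops = [
            (i1, i2, new_lines[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"
        ]
        self.generations.append(Generation(tuple(work_orders), ops))

    def lines_touched(self, work_order):
        """A crude change metric: lines removed plus lines added per work order."""
        total = 0
        for gen in self.generations:
            if work_order in gen.work_orders:
                total += sum((end - start) + len(repl) for start, end, repl in gen.ops)
        return total


unit = Unit()
unit.check_in(["a;", "b;"], ["WO-1"])
unit.check_in(["a;", "c;", "d;"], ["WO-2", "WO-3"])  # one change, two work orders
```

Because every check-in carries its work orders, a report such as lines_touched falls out of the stored data with no extra bookkeeping by the programmer.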

>2.  how do you count errors?  are errors == changes?  are you familiar with
>    IEEE standard notation for bugs?  It is:
>	error == human misconception
>	fault == error as manifested in a document (possibly code)
>	failure == visible manifestation of a fault (runtime, etc)
>	(note that faults don't always cause failures)

>    so is an error really a fault?  Do change forms (do you collect forms?)
>    distinguish faults from enhancements?  Do you count failures?

The system just described is flexible enough that you could support any
classification scheme you desire. Work orders can be tailored to contain
user-defined fields--there is no reason such fields could not be the IEEE
scheme you describe. Or, alternatively, one could group work orders relating
to errors on one list of work orders (work orders belong to work order lists),
those for faults on another list, etc. The system was deliberately designed
to accommodate an arbitrary methodology, so that any site could tailor it
to their particular view of how things should be done.
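
As a rough illustration of the user-defined-field approach, the following Python sketch tags each work order with a classification and tallies them. The WorkOrder class, the "kind" field, and the IDs are assumptions made up for this example, not the actual work-order schema.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class WorkOrder:
    order_id: str
    # Site-defined fields; a site could store the IEEE terms here
    # ("error", "fault", "failure") or anything else, e.g. "enhancement".
    fields: dict = field(default_factory=dict)


def count_by(orders, field_name):
    """Tally work orders by the value of one user-defined field."""
    return Counter(o.fields.get(field_name, "unclassified") for o in orders)


orders = [
    WorkOrder("WO-1", {"kind": "fault"}),
    WorkOrder("WO-2", {"kind": "failure"}),
    WorkOrder("WO-3", {"kind": "fault"}),
    WorkOrder("WO-4"),  # no classification recorded
]
tally = count_by(orders, "kind")
```

The alternative mentioned above, separate work order lists per category, would amount to keeping one such collection per classification instead of one field per order.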

>3.  what sort of static and dynamic metrics do you compute?  what is a dynamic
>    metric, runtime performance?  profile data?  did you write tools to compute
>    these metrics?  

Static metrics can be computed quite easily, since Rational represents programs
(specifically, Ada programs) using an underlying representation called DIANA:
basically a tree of sufficient richness to completely and unambiguously capture
all of the static syntactic and semantic information of the program. This is
not a particularly radical notion--the p-code system was a similar idea--and
yet, for some mysterious reason, Rational is IT when it comes to doing things
this way. Everybody else, for reasons that completely elude me, chooses to
rebuild this sort of information as in-memory data structures during the run-
time execution of the compiler and then THROW IT AWAY. Rational not only keeps
it around so the system can do things like intelligent recompilation (e.g.
allow the incremental addition/modification/deletion of portions of code
from both specs and bodies without requiring batch "big bang" recompilation
of all clients [transitively] as would be required on other systems),
but, furthermore, provides programmatic access to this information
so that a toolsmith wanting to analyze static metrics can traverse the DIANA
tree and extract whatever information is desired. In short, whereas on most
systems doing code analysis essentially involves writing the front-end for
a compiler, on Rational's equipment it involves writing some basically trivial
applications atop a significant, preexisting abstraction.

Consider a VERY
simple example: someone wants to write a SLOC counter. How does one usually
do this? By searching for semi-colons. But then, of course, this isn't really
accurate because semi-colons could be embedded in strings, so a little more
complexity is needed in the parser, etc, and pretty soon you've had to write
a brain-damaged subset of a scanner/tokenizer/parser. On Rational's system,
you simply traverse the DIANA tree and count up all the nodes that are of
kind Statement. It takes about 10 lines. Considerably more complicated analysis
is possible--it is possible, for example, to flag all exceptions that will
be propagated out of a named scope (yes, this can be done statically). Try
doing THAT without DIANA support! Note that there is absolutely no reason
why this same approach could not be used for any other language, particularly
the more modern software engineering oriented ones that have a well-thought-out
structure (e.g. C++, Eiffel, Modula-X), and I am aware of at least one effort
to do this very thing for C++, but, in general, the state of practice is to
default to the lowest common denominator--parsing ASCII source files.
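
The contrast can be sketched in Python. I cannot reproduce the DIANA interface here, so this uses Python's own standard ast module as a stand-in: it is an analogy for counting Statement nodes in a persistent tree, not Rational's API. The function names and sample snippets are made up for the illustration.

```python
import ast


def naive_sloc(source):
    """The ASCII approach: count semi-colons, including ones inside strings."""
    return source.count(";")


def statement_count(source):
    """The tree approach: count every node whose kind is 'statement'."""
    tree = ast.parse(source)
    return sum(isinstance(node, ast.stmt) for node in ast.walk(tree))


# Two Ada statements, but four semi-colons, so the naive counter is fooled.
ADA_SNIPPET = 'Put_Line ("a; b; c");\nX := X + 1;\n'

# A Python snippet for the tree-based counter (two assignments, an if,
# and one assignment nested inside the if).
PY_SNIPPET = 'x = 1\nmsg = "a; b; c"\nif x:\n    x = x + 1\n'

naive = naive_sloc(ADA_SNIPPET)
by_tree = statement_count(PY_SNIPPET)
```

The tree-based counter needs no special handling for strings or comments, since the parser has already settled what is and is not a statement.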

As for dynamic metrics, one uses, of course, performance and coverage analyzers.
There are also tools available that (using DIANA again) can analyze
a compilation unit and automatically generate for it a unit test.

>4.  do you see any interesting correlations between data from static analysis
>    and error(fault)(?)-proneness of modules?

Depends considerably on which metrics one chooses to use. It has been my
experience that most static analysis performed by most tools is largely
syntactical, since without something like DIANA support that is the only
kind of analysis that is easy enough to do to consider doing. With full
access to all syntactic and semantic information, one can perform static
analysis of things that have a much stronger correlation with the robustness
of the code (one can, for example, identify unhandled exceptions, exits
from functions without return values, etc).
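
One such check can be sketched as follows. Again this uses Python's ast module as an analogue, not DIANA, and the helper names are hypothetical. It flags functions that return a value on some paths but exit with a bare return on others, which is a deliberately simplified version of the "exit without a return value" check (it does not detect falling off the end of a function).

```python
import ast

_SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda, ast.ClassDef)


def own_returns(func):
    """Yield Return nodes belonging to func itself, skipping nested scopes."""
    stack = list(func.body)
    while stack:
        node = stack.pop()
        if isinstance(node, _SCOPES):
            continue  # returns in nested definitions belong to those definitions
        if isinstance(node, ast.Return):
            yield node
        stack.extend(ast.iter_child_nodes(node))


def mixed_return_functions(source):
    """Names of functions mixing 'return <value>' with a bare 'return'."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            returns = list(own_returns(node))
            has_value = any(r.value is not None for r in returns)
            has_bare = any(r.value is None for r in returns)
            if has_value and has_bare:
                flagged.append(node.name)
    return sorted(flagged)


SAMPLE = '''
def ok(x):
    return x + 1

def suspicious(x):
    if x > 0:
        return x
    return
'''
flagged = mixed_return_functions(SAMPLE)
```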
-- 
*** LIMITLESS SOFTWARE, Inc: Jim Showalter, jls@netcom.com, (408) 243-0630 ****
*Proven solutions to software problems. Consulting and training on all aspects*
*of software development. Management/process/methodology. Architecture/design/*
*reuse. Quality/productivity. Risk reduction. EFFECTIVE OO usage. Ada/C++.    *