Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!mips!pacbell.com!att!cbnewsm!lfd
From: lfd@cbnewsm.att.com (Lee Derbenwick)
Newsgroups: comp.software-eng
Subject: Re: Personal growth and software engineering!
Summary: What do you do when you can't measure what you want (need) to?
Message-ID: <1991Apr15.221119.6242@cbnewsm.att.com>
Date: 15 Apr 91 22:11:19 GMT
References: <9233@castle.ed.ac.uk> <1991Mar25.164133.29674@unislc.uucp> <581@tivoli.UUCP>
Organization: AT&T Bell Laboratories
Lines: 80

In article <581@tivoli.UUCP>, alan@tivoli.UUCP (Alan R. Weiss) writes:
> In article <1991Apr8.163111.3968@cbnewsm.att.com> lfd@cbnewsm.att.com (Lee Derbenwick) writes:
> >In article <549@tivoli.UUCP>, alan@tivoli.UUCP (Alan R. Weiss) writes:
> >
> >> If the metrics are bogus, then fix them by including the workers
> >> in the process.  In this case, "process" can mean calling a short
> >> meeting, identifying dumb metrics, and coming up with meaningful ones.
> >
> >And how are the workers to know how to measure the process?  We may be
> >able to _reject_ certain measures as bogus, but that is much easier
> >than creating good measures, which is an open research area.
> 
> While I believe that this *IS* an open area of research, as you say,
> I TRULY believe that people-who-are-doing-the-work ("workers")
> MUST be involved in setting their own standards.  How are they to
> know?  Management outlines the broad parameters of the problem
> and/or improvement goals, and then provides leadership.  [ ... ]

This begs the question.  Of course the workers must be involved in
setting the standards.  But, right now, we know of _no_ metrics that
measure some of what we really want to measure.  No amount of
management leadership is going to allow anyone to create those
metrics.  But enough management pressure will force people to create
bogus ones.

> >Here are a couple of quality metrics I would _like_: they seem to capture
> >two key areas of software quality -- faults and maintainability.  Both,
> >unfortunately, violate causality:
> >
> >1. Total number, severity, and time-to-discovery of remaining faults that
> >   will be experienced by customers as software failures.
> >
> >2. Cost of introducing the next several enhancements that will be required
> >   of this code.
> 
> The first one should be "mom and apple pie" for all software orgs.
> The second one is basic risk analysis, and it's good, too.

I'm afraid Alan missed the tense in #1.  It is (reasonably) easy to
measure what _did_ happen, but that tells you what your software
process was doing one or two years ago.  And if you haven't changed
your process at all in that time, you haven't been paying attention
even to qualitative information.  To improve your process _now_, you
need to know what faults your customers will experience as future
failures.  So it's far from "Mom and Apple Pie."

(In cases where you can accurately characterize your customers' usage
patterns, John Musa's work on software reliability gives you ways of
estimating MTBF for customers, which is close to what I want.  But
with new or significantly changed software, you often have no better
than guesses about the usage.)
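
To make the parenthetical concrete, here is a minimal sketch of one
commonly cited form of Musa's basic execution-time model, in which
failure intensity decays exponentially with execution time.  The
function names and all parameter values are illustrative assumptions,
not figures from any real project; in practice lambda0 and nu0 are
estimated from failure data gathered under a realistic usage profile,
which is exactly what is missing for new software.

```python
import math

def failure_intensity(lambda0, nu0, tau):
    """Failures per unit of execution time after tau units of execution.

    lambda0: initial failure intensity (assumed value)
    nu0:     total failures expected over unlimited execution (assumed value)
    """
    return lambda0 * math.exp(-lambda0 * tau / nu0)

def expected_failures(lambda0, nu0, tau):
    """Expected cumulative failures experienced by execution time tau."""
    return nu0 * (1.0 - math.exp(-lambda0 * tau / nu0))

def mtbf(lambda0, nu0, tau):
    """Reciprocal of the current failure intensity, a rough MTBF estimate."""
    return 1.0 / failure_intensity(lambda0, nu0, tau)

if __name__ == "__main__":
    # Hypothetical numbers: 10 failures/CPU-hour at the start, 100 failures total.
    for tau in (0, 10, 20, 40):
        print(tau, round(mtbf(10.0, 100.0, tau), 3))
```

The estimate is only as good as the usage profile behind lambda0 and
nu0; with a guessed profile, the output is a guess with decimals.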

The second metric also requires prediction of the future.  Measures
of structure, etc., can be useful at a crude level, but they can't
capture "What is the chance that the customer will decide they
desperately need something that violates a fundamental assumption of
this code, so that it will have to be totally rewritten?"  And I don't
know of any good way to measure fundamental assumptions.  Again, you
can find out later whether you had assumed too much -- but that tells
you about your process in the past.  "Basic risk analysis" isn't even
close, though it _is_ a first step in that direction.

There are metrics that tell you about your process right now, but in
most cases, they are very crude approximations, or they are based on
assumptions that are little more than guesswork.  If you change your
process based on them, you may be as likely to worsen as to improve it.

On the other hand, there are metrics to tell you very accurately about
your process at some time in the past.  Since it takes customers time
to find errors, these tend to be at least 6 months and easily two years
or more out of date.  But you can make significant process changes
within a few months.  In terms of control theory, you have a process
with a certain time constant, but your observations of that process have
a significantly longer time constant.  By attempting to use those
observations to control the process, you are likely to introduce
oscillatory or chaotic behavior.
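
The control-theory point can be illustrated with a toy simulation.
This is a sketch under stated assumptions, not a model of any real
software process: the process state is a single number (deviation from
a quality target), the correction applied each month is proportional
to the deviation observed `delay` months earlier, and the gain of 0.5
is an arbitrary choice.  The function name `simulate` is hypothetical.

```python
def simulate(gain, delay, steps, initial=1.0):
    """Return monthly deviations from target, months 0..steps.

    Each month the process is corrected by gain * (deviation observed
    `delay` months ago): x[t+1] = x[t] - gain * x[t - delay].
    The deviation is assumed constant at `initial` before month 0.
    """
    xs = [initial] * (delay + 1)        # months -delay .. 0
    for _ in range(steps):
        observed = xs[-1 - delay]       # stale measurement
        xs.append(xs[-1] - gain * observed)
    return xs[delay:]                   # drop the pre-history

if __name__ == "__main__":
    prompt = simulate(gain=0.5, delay=0, steps=36)
    stale = simulate(gain=0.5, delay=12, steps=36)
    for month in range(0, 37, 6):
        print(month, round(prompt[month], 4), round(stale[month], 4))
```

With no delay the deviation shrinks steadily toward zero.  With the
same gain and a 12-month delay, the correction keeps being applied
long after the deviation has crossed zero, so the process overshoots
and swings with growing amplitude.  In this toy model the only cure
for a long delay is a much smaller gain, i.e. changing the process far
more cautiously than the process itself would permit.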

 -- Speaking strictly for myself,
 --   Lee Derbenwick, AT&T Bell Laboratories, Warren, NJ
 --   lfd@cbnewsm.ATT.COM  or  <wherever>!att!cbnewsm!lfd