Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!brutus.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!m.cs.uiuc.edu!marick
From: marick@m.cs.uiuc.edu
Newsgroups: comp.software-eng
Subject: Errors aren't that simple
Message-ID: <39400075@m.cs.uiuc.edu>
Date: 28 Feb 90 01:39:40 GMT
Lines: 59
Nf-ID: #N:m.cs.uiuc.edu:39400075:000:2730
Nf-From: m.cs.uiuc.edu!marick    Feb 27 13:29:00 1990


The flame fest about languages and errors has come up again.  Usually,
it gets nowhere.  Part of the reason is that people don't think about
what they mean by "error".  
======================

There are two assertions in Bill Wolfe's message:
1.  The C community releases an unacceptable number of errors.
2.  The C language is at least partly the *cause* of those errors.

The second point is disputed by the counterflamers, usually with
"Languages don't kill; people kill" arguments.

Both sides are off the track.

"Errors" is too broad a notion for this discussion.  Errors come in
different categories -- off-by-one errors, storage leaks, use of the
wrong identifier, and so on.  Different software products will have
different distributions of errors.  In at least some cases, the
language used will be *correlated* with those distributions.  For
example, you don't see storage leaks in Lisp, where storage is
reclaimed automatically.

A better question than "does C cause more errors than Ada?" is "what
are the typical distributions of errors in projects using C and Ada?"
It may be that they're the same.  In that case, we can drop the issue
-- other factors are so much more important that argument is
pointless.  More likely (I suspect), there will be some correlations.
C and Ada will be associated with different error distributions, and
enough "cross-cultural data" will bring these out, in the same way
that anthropologists *do* sometimes discover universal facts about
humankind.
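
To make "comparing error distributions" concrete, here is a minimal
sketch in Python.  It assumes each error report has already been
classified by hand into a (language, category) pair; the function
names are hypothetical, and the sample reports are invented
placeholders that say nothing about real C or Ada projects.  It
computes each language's share of errors per category, then one crude
distance between the two distributions.

```python
from collections import Counter


def category_shares(reports):
    """Map each language to {category: fraction of that language's errors}."""
    counts = {}
    for language, category in reports:
        counts.setdefault(language, Counter())[category] += 1
    shares = {}
    for language, counter in counts.items():
        total = sum(counter.values())
        shares[language] = {cat: n / total for cat, n in counter.items()}
    return shares


def total_variation(p, q):
    """Half the summed absolute difference: 0 = identical, 1 = no overlap."""
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Invented placeholder data, NOT measurements of any real project.
reports = [
    ("C", "off-by-one"), ("C", "storage leak"),
    ("C", "storage leak"), ("C", "wrong identifier"),
    ("Ada", "off-by-one"), ("Ada", "wrong identifier"),
    ("Ada", "wrong identifier"), ("Ada", "wrong identifier"),
]

shares = category_shares(reports)
distance = total_variation(shares["C"], shares["Ada"])
```

A distance near zero would correspond to the "they're the same" case
above; a large one would suggest a correlation worth looking into.
With real data you would also want enough reports per project to tell
a genuine difference from noise.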

This wouldn't end the C vs. Ada arguments, of course, since there are
still many other factors (productivity, cost of compilers, efficiency,
training and so on) to consider.  But it would add a tiny increment of
evidence to the argument.  Further, if we knew what errors were
correlated with a particular language, we could use extra-language
techniques to help prevent or detect those errors.

In a year and a half, I hope to have data on errors in Ada and C as a
part of my "error hackers" project.  I'll let you know.  References to
similar studies appreciated.


The first assertion (about communities and errors) may also be true.
Enough C/UNIX programmers have a similar enough way of doing things
that saying they're a community makes some sense.  But saying that
this community makes too many errors is shallow -- you have to ask
what else they make, and you have to ask what *kinds* of errors they
make.

I've been involved in enough process-improvement efforts to know that
it's not as simple as shoving a fire hose in one ear and pumping in
"software engineering concepts".  It's ever so easy to harm the good
while preventing the bad.  People will surprise you.

Brian Marick
Motorola @ University of Illinois
Email: marick@cs.uiuc.edu, uiucdcs!marick