Xref: utzoo comp.software-eng:2993 comp.lang.c:26457 comp.lang.misc:4305
Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!think!mintaka!mit-eddie!apollo!perry
From: perry@apollo.HP.COM (Jim Perry)
Newsgroups: comp.software-eng,comp.lang.c,comp.lang.misc
Subject: Re: problems/risks due to programming language, stories requested
Message-ID: <48f0d9c2.20b6d@apollo.HP.COM>
Date: 1 Mar 90 22:19:00 GMT
References: <6960@internal.Apple.COM> <1990Feb28.213543.21748@sun.soe.clarkson.edu> <31039@brunix.UUCP>
Sender: root@apollo.HP.COM
Reply-To: perry@apollo.HP.COM (Jim Perry)
Organization: Hewlett-Packard Company, Apollo Division; Chelmsford, MA
Lines: 128

Peter H. Golde writes:
>However, every programmer, no matter how good, makes stupid mistakes -- ones
>in which s/he knows better, but for some reason, s/he did anyway.  These
>might be simple syntax errors, or left out statements, etc.  The higher the
>percentage of these errors which the compiler catches, the more reliable
>the program will be and the less time it will take to be debugged.  This
>is why redundancy in a language can be a good thing.  

A good place to jump in with a tale from life --

On the subject of language and errors, it so happens that I just
completed initial development of a smallish (6000 line) program, of
some complexity, on an accelerated schedule.  Since it's fresh in my
mind and I have ready access to the source history, I thought I'd look
over the bugs I found and see how they broke down.  I tried to be
honest and count all bugs; I don't include changes that were the
result of interface semantics misunderstandings (correctly
implementing the wrong thing), or bugs/deficiencies that I happened to
turn up in existing code.  These are my bugs: mistakes, some stupid,
that I made.  I don't include compiler-detected errors or warnings.  I
used an ANSI compiler with full prototypes (argument mismatch is a
huge bug source in pre-ANSI C, and the compiler detected quite a few
such errors), and one that does a reasonable job of flagging
"warning" situations (nested comments, for instance -- it found
several of those).

I program in C for a living, and haven't written in another language
in 3 years.  C is not my native programming language, so I speak C
with a PL/I-ish accent (several of the bugs I turned up involved
pointer arithmetic/arrays).  I have done systems programming in XPL,
PL/I, Pascal, BASIC, various assemblers, and a smattering of others. 
I have never written in Ada.

In no particular order:

1. A function had an output parameter which was a numeric count, i.e.
a pointer to an integer.  I wrote the code to increment the count as

    *count++;

which of course does entirely the wrong thing: it parses as
"*(count++)", so it advances the pointer and discards the value it
pointed to, leaving the caller's count untouched (it should be
"(*count)++;" or, as I rewrote it, "*count += 1;").  Clearly this
particular mistake is strictly limited to C: in another language this
parameter would be a reference/out/var, not a pointer; the ++ and
thus the ambiguity of what's incremented is obviously unique to C; and
of course the stupid notion of unused-expression-as-statement is also
uncommon.  However, a better C compiler could have flagged the
unused expression, i.e. that while "count++" was presumably an
intended side effect, "*count" was unused.
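
To make the difference concrete, here is a minimal sketch.  The
function names bump_wrong and bump_right are made up for illustration,
and the (void) cast in the buggy version is there only to keep current
compilers from complaining about the discarded value (GCC, for one,
reports it under -Wunused-value).

```c
/* Buggy form: parsed as *(count++).  The local copy of the pointer is
   advanced, the int it pointed at is read and thrown away, and the
   caller's integer is never modified. */
void bump_wrong(int *count)
{
    (void)*count++;
}

/* Intended form: dereference first, then increment the int itself. */
void bump_right(int *count)
{
    (*count)++;     /* equivalently: *count += 1; */
}
```

Calling bump_wrong(&n) leaves n unchanged; calling bump_right(&n)
adds one to it.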

2. I wanted to fill in a record whose structure was something like:

    struct {
        struct a fixed_length_stuff;
        struct b variable_length_array[fixed_length_stuff.size];
        char     string[]; /* variable-length null-terminated */
        struct c more_stuff;
    } foo;

Clearly this isn't expressible in this format in C, so the code to
fill in the record used pointer arithmetic to develop pointers to the
start of the variable length array, to the string, and to the
following data.  In one case I made a simple omission of one addition:
[this is approximate; assume all the right casting]

    ptr = &foo + sizeof(a);                 /* pointer to b */ 
    strcpy(ptr + (size*sizeof(b)), name);   /* put name after b */
    ptr += strlen(name)+1;      /* step past name to more_stuff */
    fill_in_more_stuff(ptr);    /* and put values there */

The first two lines work, and were copied from an instance where
more_stuff wasn't of interest.  The error is that ptr is never moved
past b, so line 3 steps from the start of b rather than from the
string; the + in line 2 should logically be changed to a +=, or as I
prefer, the line should be expanded to

    ptr += (size*sizeof(b));    /* step past b to string */
    strcpy(ptr, name);          /* fill in name */ 

This was just an error, and would have been so in any language where I
was trying to represent such a structure extralingually.  Other
languages, however, would have allowed me to describe such a structure
within the language (PL/I, for one). Where possible, I let compilers
do my address computation for me.  There were a couple more bugs
along similar lines, in coming up with a pointer to entries in such
contrived records; I'll count them all together.
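
One way to make this kind of hand-built record harder to get wrong is
to advance a single cursor after every field, so no step depends on
remembering an offset.  The following is only a sketch under
assumptions: the member layouts of struct a, b, and c and the function
pack_record are invented for illustration, memcpy is used because the
data after the string may not be aligned, and overflow of the size
computation is not checked.

```c
#include <stddef.h>
#include <string.h>

struct a { size_t size; };  /* hypothetical fixed-length header */
struct b { int value; };    /* hypothetical array entry */
struct c { int tag; };      /* hypothetical trailing data */

/* Pack header, `size` entries of b, a NUL-terminated name, and c into
   buf, in that order.  Returns the number of bytes written, or 0 if
   buf (of `cap` bytes) is too small. */
size_t pack_record(unsigned char *buf, size_t cap,
                   const struct b *items, size_t size,
                   const char *name, const struct c *more)
{
    size_t name_len = strlen(name) + 1;          /* include the NUL */
    size_t need = sizeof(struct a) + size * sizeof(struct b)
                + name_len + sizeof(struct c);
    unsigned char *ptr = buf;                    /* the one cursor */
    struct a head;

    if (need > cap)
        return 0;

    head.size = size;
    memcpy(ptr, &head, sizeof head);
    ptr += sizeof head;                          /* step past a to b */
    memcpy(ptr, items, size * sizeof(struct b));
    ptr += size * sizeof(struct b);              /* step past b to string */
    memcpy(ptr, name, name_len);
    ptr += name_len;                             /* step past name */
    memcpy(ptr, more, sizeof *more);
    ptr += sizeof *more;

    return (size_t)(ptr - buf);
}
```

Every copy is followed immediately by the matching "ptr +=", which is
the same discipline as the two-line expansion above.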

3. Due to an interface confusion an off-by-one situation arose where a
function filled in one more entry of an array than the caller had
allocated, thus trashing the next thing on the stack.  In a system
with runtime array bounds checking, this would have been detected
quickly and painlessly.  As C doesn't really have arrays it's very
unlikely that a C runtime implementation could do this.  
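
Lacking runtime checks, a common convention is for the caller to pass
the capacity along with the pointer and for the callee to clamp to it.
This is a sketch of that convention, not the interface from my
program; fill_squares and its parameters are hypothetical.

```c
#include <stddef.h>

/* Writes at most `capacity` entries into out, even if more are wanted,
   and returns how many were actually written. */
size_t fill_squares(int *out, size_t capacity, size_t wanted)
{
    size_t n = wanted < capacity ? wanted : capacity;
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = (int)(i * i);
    return n;
}
```

The caller can compare the return value with what it asked for to
notice truncation, and nothing past the allocated entries is touched.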

4. A function to allocate, initialize, and return a new node to go in
a linked list had two bugs: I neglected to set the link field to NULL,
and there was no return statement.  In most languages the former would
have been a bug, although if I'd been using a language where I could
define initial values for newly created storage (C++, PL/I...) I would
probably have done it there.  The absence of a return statement could
and should have been caught by the compiler.
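
For reference, a minimal sketch of what such an allocator should look
like.  The struct layout and the name new_node are assumptions for
illustration, not my actual code.  Current compilers can diagnose the
missing return (GCC and Clang do so under -Wreturn-type).

```c
#include <stdlib.h>

struct node {
    int          value;
    struct node *next;
};

/* Allocate and initialize a node; returns NULL if allocation fails. */
struct node *new_node(int value)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return NULL;
    n->value = value;
    n->next  = NULL;    /* the link field that was left unset */
    return n;           /* the return statement that was missing */
}
```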

5. I neglected to maintain a node count field when nodes were added to
or removed from a list.  Just a dumb oversight.

6. In a few instances I didn't set function output parameters
correctly in cases of exceptions (errors) -- i.e. not setting the
"number of objects returned" variable to 0.  This matters because I'm
writing distributed code and such variables get used by the RPC
mechanism to determine how much stuff to send across the wire on
return.  Not a C issue.
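
A simple defence is to give every output parameter a defined value
before anything can fail.  The sketch below assumes a made-up function
get_ids and status codes; it only illustrates the ordering.

```c
#include <stddef.h>

enum status { STATUS_OK = 0, STATUS_BAD_ARG = 1 };

/* Fills ids[0..max-1] with consecutive values starting at `first`.
   *n_returned is valid on every return path, including errors. */
enum status get_ids(int first, size_t max, int *ids, size_t *n_returned)
{
    size_t i;

    *n_returned = 0;            /* defined before any early return */

    if (first < 0 || ids == NULL)
        return STATUS_BAD_ARG;

    for (i = 0; i < max; i++)
        ids[i] = first + (int)i;

    *n_returned = max;          /* updated only on success */
    return STATUS_OK;
}
```

With this ordering, a marshalling layer that reads *n_returned after a
failure sees 0 rather than whatever the caller's variable happened to
hold.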

7. I had one bug caused by omission of an item in an initializer list
for a struct (a vector of function pointers).  The compiler could have
caught that if the language didn't allow partial initializer lists.
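
Since C zero-fills whatever an initializer list leaves out, one
workaround is to check the table once at startup.  This sketch uses
designated initializers, which come from C99 and so postdate the
compiler I used; struct ops, its members, and ops_complete are all
hypothetical names.

```c
#include <stddef.h>

struct ops {
    int (*open)(void);
    int (*read)(void);
    int (*close)(void);
};

int do_open(void)  { return 1; }
int do_read(void)  { return 2; }
int do_close(void) { return 3; }

/* Naming each member makes an omission easier to spot when reading,
   but the language still silently sets the missing one to NULL. */
const struct ops complete_ops = {
    .open = do_open, .read = do_read, .close = do_close
};
const struct ops partial_ops = {
    .open = do_open, .read = do_read      /* .close forgotten */
};

/* Returns 1 if every entry in the vector is filled in, else 0. */
int ops_complete(const struct ops *o)
{
    return o->open != NULL && o->read != NULL && o->close != NULL;
}
```

Checking ops_complete() before the table is first used turns the
omission into an immediate failure instead of a call through a null
pointer later on.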

8. In one spot in one algorithm I used a break when I needed a
continue (I didn't confuse the two, I got the algorithm wrong).

Overall, that's 8 bugs or classes of bugs. 5 of the 8 could have been
avoided or detected by a smarter compiler or a different language. 
I've tried to cover everything (i.e. I've been through the audit trail
of edits).  These were the show stoppers; there may be subtler bugs
lurking.

-
Jim Perry   perry@apollo.hp.com    HP/Apollo, Chelmsford MA
This particularly rapid unintelligible patter 
isn't generally heard and if it is it doesn't matter.