Xref: utzoo comp.software-eng:2993 comp.lang.c:26457 comp.lang.misc:4305
Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!think!mintaka!mit-eddie!apollo!perry
From: perry@apollo.HP.COM (Jim Perry)
Newsgroups: comp.software-eng,comp.lang.c,comp.lang.misc
Subject: Re: problems/risks due to programming language, stories requested
Message-ID: <48f0d9c2.20b6d@apollo.HP.COM>
Date: 1 Mar 90 22:19:00 GMT
References: <6960@internal.Apple.COM> <1990Feb28.213543.21748@sun.soe.clarkson.edu> <31039@brunix.UUCP>
Sender: root@apollo.HP.COM
Reply-To: perry@apollo.HP.COM (Jim Perry)
Organization: Hewlett-Packard Company, Apollo Division; Chelmsford, MA
Lines: 128

Peter H. Golde writes:
>However, every programmer, no matter how good, makes stupid mistakes -- ones
>in which s/he knows better, but for some reason, s/he did anyway. These
>might be simple syntax errors, or left out statements, etc. The higher the
>percentage of these error which the compiler catches, the more reliable
>the program will be and the less time it will take to be debugged. This
>is why redundancy in a language can be a good thing.

A good place to jump in with a tale from life --

On the subject of language and errors, it so happens that I just
completed initial development of a smallish (6000 line) program, of
some complexity, on an accelerated schedule. Since it's fresh in my
mind and I have ready access to the source history, I thought I'd look
over the bugs I found and see how they broke down.

I tried to be honest and count all bugs; I don't include changes that
were the result of interface semantics misunderstandings (correctly
implementing the wrong thing), or bugs/deficiencies that I happened to
turn up in existing code. These are my bugs: mistakes, some stupid,
that I made.
I don't include compiler-detected errors/warnings, and I used an ANSI
compiler (with full prototypes; argument mismatch is a huge bug source
in pre-ANSI C, and the compiler detected quite a few such errors), and
a compiler that does a reasonable job of catching "warning" situations
(including nested comments, for instance -- it found several of those).

I program in C for a living, and haven't written in another language
in 3 years. C is not my native programming language, so I speak C with
a PL/I-ish accent (several of the bugs I turned up involved pointer
arithmetic/arrays). I have done systems programming in XPL, PL/I,
Pascal, BASIC, various assemblers, and a smattering of others. I have
never written in Ada.

In no particular order:

1. A function had an output parameter which was a numeric count, i.e.
a pointer to an integer. I wrote the code to increment the count as

    *count++;

which of course does entirely the wrong thing (it should be
"(*count)++;" or, as I rewrote it, "*count += 1;"). Clearly this
particular mistake is strictly limited to C: in another language this
parameter would be a reference/out/var, not a pointer; the ++ and thus
the ambiguity of what's incremented is obviously unique to C; and of
course the stupid notion of unused-expression-as-statement is also
uncommon. However, a better C compiler could have flagged the fact of
the unused expression, i.e. that while "count++" was presumably an
intended side effect, "*count" was unused.

2. I wanted to fill in a record whose structure was something like:

    struct {
        struct a fixed_length_stuff;
        struct b variable_length_array[fixed_length_stuff.size];
        char string[];    /* variable-length null-terminated */
        struct c more_stuff;
    } foo;

Clearly this isn't expressible in this format in C, so the code to
fill in the record used pointer arithmetic to develop pointers to the
start of the variable length array, to the string, and to the
following data.
In one case I made a simple omission of one addition:

[this is approximate; assume all the right casting]

    ptr = &foo + sizeof(a);                /* pointer to b */
    strcpy(ptr + (size*sizeof(b)), name);  /* put name after b */
    ptr += strlen(name)+1;                 /* step past name to more_stuff */
    fill_in_more_stuff(ptr);               /* and put values there */

The first two lines work, and were copied from an instance where
more_stuff wasn't of interest; the error is that the + in line 2
should logically be changed to a +=, or as I prefer, the line should
be expanded to

    ptr += (size*sizeof(b));               /* step past b to string */
    strcpy(ptr, name);                     /* fill in name */

This was just an error, and would have been so in any language where I
was trying to represent such a structure extralingually. Other
languages, however, would have allowed me to describe such a structure
within the language (PL/I, for one). Where possible, I let compilers
do my address computation for me.

There were a couple more bugs along similar lines, coming up with a
pointer to entries in such contrived records; I'll count them all
together.

3. Due to an interface confusion an off-by-one situation arose where a
function filled in one more entry of an array than the caller had
allocated, thus trashing the next thing on the stack. In a system with
runtime array bounds checking, this would have been detected quickly
and painlessly. As C doesn't really have arrays it's very unlikely
that a C runtime implementation could do this.

4. A function to allocate, initialize, and return a new node to go in
a linked list had two bugs: I neglected to set the link field to NULL,
and there was no return statement. In most languages the former would
have been a bug, although if I'd been using a language where I could
define initial values for newly created storage (C++, PL/I...) I would
probably have done it there. The absence of a return statement could
and should have been caught by the compiler.

5.
I neglected to maintain a node count field when nodes were added to or
removed from a list. Just a dumb oversight.

6. In a few instances I didn't set function output parameters
correctly in cases of exceptions (errors) -- i.e. not setting the
"number of objects returned" variable to 0. This matters because I'm
writing distributed code and such variables get used by the RPC
mechanism to determine how much stuff to send across the wire on
return. Not a C issue.

7. I had one bug caused by omission of an item in an initializer list
for a struct (a vector of function pointers). The compiler could have
caught that if the language didn't allow partial initializer lists.

8. In one spot in one algorithm I used a break when I needed a
continue (I didn't confuse the two; I got the algorithm wrong).

Overall, that's 8 bugs or classes of bugs. 5 of the 8 could have been
avoided or detected by a smarter compiler or a different language.
I've tried to cover everything (i.e. I've been through the audit trail
of edits). These were the show stoppers; there may be subtler bugs
lurking.

- Jim Perry   perry@apollo.hp.com   HP/Apollo, Chelmsford MA
This particularly rapid unintelligible patter
isn't generally heard and if it is it doesn't matter.
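
To make item 1 concrete, here is a minimal sketch of the precedence
trap. The function names bump_buggy and bump_fixed are hypothetical and
do not come from the original program. The (void) cast in the buggy
version is there only so that builds with warnings enabled stay quiet;
without it, compilers such as GCC and Clang can report the discarded
value, which is the kind of diagnostic the post asks for.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy form from item 1. Postfix ++ binds tighter than unary *, so the
 * expression parses as *(count++): the local copy of the pointer is
 * advanced, the old target is read, and the value is thrown away.
 * The caller's counter is never modified. */
void bump_buggy(int *count)
{
    assert(count != NULL);
    (void)*count++;   /* cast only silences the "value unused" warning */
}

/* Corrected form, written the way the post says it was rewritten. */
void bump_fixed(int *count)
{
    assert(count != NULL);
    *count += 1;      /* equivalent to (*count)++ */
}
```

Calling bump_buggy(&n) leaves n unchanged, while bump_fixed(&n) adds one
to it.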
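
For the variable-length record in item 2, one way to make the forgotten
addition harder to commit is to advance a single cursor after every
field, so each step has the same shape. The sketch below is not the
original code: struct header, struct entry, struct trailer and
pack_record are hypothetical stand-ins for the post's struct a, struct
b, struct c and its fill-in routine. It copies with memcpy on the
assumption that the data following the string may be unaligned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the post's struct a, struct b and struct c. */
struct header  { int size; };
struct entry   { int value; };
struct trailer { int flags; };

/* Pack header, n entries, a NUL-terminated name and a trailer back to
 * back into buf. Returns the number of bytes written, or 0 if the
 * record does not fit in cap bytes. */
size_t pack_record(unsigned char *buf, size_t cap,
                   const struct entry *entries, int n,
                   const char *name, const struct trailer *more)
{
    struct header hdr;
    size_t entries_len, name_len, need;
    unsigned char *ptr = buf;

    assert(buf != NULL && entries != NULL && name != NULL && more != NULL);
    assert(n >= 0);

    entries_len = (size_t)n * sizeof *entries;
    name_len = strlen(name) + 1;              /* include the terminator */
    need = sizeof hdr + entries_len + name_len + sizeof *more;
    if (need > cap)
        return 0;

    hdr.size = n;
    /* Every field is written at ptr, then ptr steps past it. */
    memcpy(ptr, &hdr, sizeof hdr);      ptr += sizeof hdr;
    memcpy(ptr, entries, entries_len);  ptr += entries_len;
    memcpy(ptr, name, name_len);        ptr += name_len;
    memcpy(ptr, more, sizeof *more);    ptr += sizeof *more;

    return (size_t)(ptr - buf);
}
```

Because no field is ever addressed as "ptr plus something", there is no
line where a + has to be remembered as a +=.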