Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!ucbvax!dog.ee.lbl.gov!elf.ee.lbl.gov!torek
From: torek@elf.ee.lbl.gov (Chris Torek)
Newsgroups: comp.lang.c
Subject: Re: One more point regarding = and == (more flamage)
Message-ID: <11563@dog.ee.lbl.gov>
Date: 28 Mar 91 17:45:38 GMT
References: <925@isgtec.UUCP> <1991Mar26.184245.3538@chinet.chi.il.us>
Reply-To: torek@elf.ee.lbl.gov (Chris Torek)
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 133
X-Local-Date: Thu, 28 Mar 91 09:45:38 PST

[`Hey Rocky, watch me pull just one more point out of my hat'
 `That trick NEVER works!'
 `This time for sure']

>>> a) while (*foo++ = *bar++)
>>> b) while (*foo ++ == *bar++)
>>> c) while ((*foo++ = *bar++) != 0)

>In article <925@isgtec.UUCP> robert@isgtec.UUCP writes:
>>Well the biggest argument has been if you use a) the maintainer can't tell
>>if you meant a) or b); if you use c) the maintainer KNOWS you meant a).
>>This isn't rubbish.

As we all know by now, I happen to agree with this sentiment, but much
more so when applied to `if'; `while' errors of this sort are less
common.  The following `just one more point' explains why.

In article <1991Mar26.184245.3538@chinet.chi.il.us> les@chinet.chi.il.us
(Leslie Mikesell) writes:
>If you assume that the programmer didn't make a mistake (i.e. typed
>what he was thinking), then a) is just as obvious as c). If you
>assume that he did make a mistake, then c) is probably more likely
>to be wrong that a). More characters = more chances to screw up.

This would be true but for the fact that coding is done by *people*.

Human error rate is a `jittery' function.  Although a number of studies
have shown remarkable consistency in the error rate measured as `number
of errors found divided by number of source lines', it is also the case
that people use more care with `complicated' constructs.
That is, people are more likely to leave an uncorrected error behind when
typing

	The quick brown fox jumps over the lazy dog

than when typing

	2.718281828459045235360

I spent more time checking the above expansion of `e' than I did typing
this entire sentence.

Note also that, in addition to the fact that error rate is not a
monotonic function of `number of characters typed', error studies
typically find different `kinds' of errors.  One important kind of error
is the `typo' (typographical error) (and this one really *is* a function
of the number and placement of characters typed).  Typographical errors
take three forms:

	transpositions (`The quick bronw fox jumps over hte lazy dog')
	insertions (`The quiick brown fox jumpsd over the lazy dog')
	deletions (`The quick brown fox umps over the lazy dog')

Typographical errors are, if not the most common form of error, certainly
among the top contenders.

Keeping these in mind, let us consider C code.  After one becomes familiar
with C, constructs like

	if ((c = getchar()) != EOF)

become `natural' and one does not think twice when writing them.  In many
languages (not just C, although C is rare in its particular spelling)
constructs like

	if (a == b)

are also `natural' and again one does not think twice.

Now, most errors can be caught before they happen, just by thinking
twice.  So if people found

	if (a == b)

unfamiliar, they would check again and possibly discover that they had,
by mistake, typed in

	if (a = b)

---but `if (a == b)' is too familiar to bother rechecking, and such typos
go unnoticed.  Thus, when I (as a software maintainer) find

	if (a = b)

I must consider this a `red flag' signifying a possible error, while

	if ((a = b) != 0)

is quite unlikely to be a typo.

On the other hand, while loops of the form

	while (*a++ == *b++)

are considerably rarer.  It is therefore more likely that whoever wrote

	while (*a++ = *b++)

really intended the assignment.
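
For concreteness, the two loop shapes can be sketched as complete
functions.  This is only an illustration: the names copy_str and
common_prefix are hypothetical and do not come from the thread, and the
extra NUL test in the comparison loop is my addition so that two equal
strings do not send it past their terminators.

```c
#include <assert.h>
#include <string.h>	/* size_t */

/*
 * Assign-and-test, written in form c): copy src to dst, including the
 * terminating NUL.  The explicit `!= 0' marks the `=' as intentional.
 * (Hypothetical helper name, for illustration only.)
 */
char *copy_str(char *dst, const char *src)
{
	char *ret = dst;

	assert(dst != NULL && src != NULL);
	while ((*dst++ = *src++) != 0)
		continue;
	return ret;
}

/*
 * Compare-and-advance, the shape of form b): count the leading
 * characters on which a and b agree.  The `*a != 0' test is an added
 * guard (an assumption of this sketch) that stops the loop at the end
 * of a, so equal strings are not read past their terminators.
 */
size_t common_prefix(const char *a, const char *b)
{
	size_t n = 0;

	while (*a != 0 && *a++ == *b++)
		n++;
	return n;
}
```

With these definitions, copy_str(buf, "hello") leaves "hello" in buf, and
common_prefix("hello", "help") returns 3.  Deleting one `=' from the
comparison loop silently turns it into a copy, which is the deletion typo
at issue here.
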
Still, deletions are a common form of typographical error; perhaps the
single `=' is a mistake anyway.  If the assignment is intended,

	while ((*a++ = *b++) != 0)

is a clear flag that `there is no deletion typo here'.  If the latter is
what was meant but

	while ((*a++ == *b++) != 0)

actually appears, this acts as another flag: it is unusual for people to
use the result of a comparison in anything but a `direct boolean' context
(if, while, &&, etc.).

In other words, it all comes down to these facts:

 * Embedded assign-and-test is common enough not to get rechecked.
 * Typographic errors of deletion and of doubling (`quiick') are
   very common.

Combining these leads to the two mistakes below:

	if (a = b)	/* oops */
		foo();

	while (n < lim)
		n == f(n);	/* oops */

both of which draw warnings from many compilers.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov