Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!world!burley
From: burley@world.std.com (James C Burley)
Newsgroups: comp.lang.c
Subject: Re: Assignment in test: OK?
Message-ID:
Date: 21 Sep 90 14:43:59 GMT
References: <1990Sep12.194753.9808@laguna.ccsf.caltech.edu> <4641:Sep1919:49:5990@kramden.acf.nyu.edu> <4089@rtifs1.UUCP>
Sender: burley@world.std.com (James C Burley)
Organization: The World
Lines: 90
In-Reply-To: trt@rti.rti.org's message of 20 Sep 90 14:25:11 GMT

trt@rti.rti.org (Thomas Truscott) writes:

   APPENDIX: A MORE SERIOUS SYNTAX PROBLEM THAN =/==

   Let's look at a serious "flaw" in C syntax that no one complains about.
   Why not? Because compiler diagnostics keep problems from occurring!

   In C, to index a two dimensional array one uses

           n = x[i][j];

   This is "gratuitously" different than other languages.
   Look at how horrible the following plausible C mistake is:

           n = x[i,j];

   Oh no, the dreaded comma operator! The effect is identical to:

           n = x[j];

   And C compilers will gladly generate executable code for it.
   What a mess it would be tracking down the bug.

Oooh, another good example, and one I ran into when I first started writing
C code. I was writing a text-to-Postscript converter for VMS and I had set up
an array to specify the XY offset of a particular graphic so I could fine-tune
it later. It looked something like this:

    int adjust[4] = [ 1 4 6 3 ];

I had just typed in these semi-random values based on my initial guesses for
the adjustments. After several compile/link/test passes, I had the code
working and decided it was time to focus on the actual adjustment needed, so
I pulled out a ruler, determined the actual number of points I wanted for the
adjustment rectangle, and modified the numbers accordingly:

    int adjust[4] = [ 1 -3 2 -1 ];

I tried this, and strangely, the graphic moved too far. Remeasure, readjust:

    int adjust[4] = [ 1 2 1 -1 ];

Hmmm, moved too far again.
Finally, after a couple more attempts, I decided I had to interactively debug
it because I assumed my offsetting code was mysteriously wrong. Had to learn
VMS debugger etc. Everything looks ok but STILL doesn't work -- even though
the code paths work fine. Finally I look at the actual values for the offset
rect as they get pulled out of the structure, and notice that the last one or
two are zero! Then I look at the earlier ones and they're not quite right --
though close.

Turns out, as most readers know by now I expect, that I had not used commas
to separate the items in the initialization list. When there were four
distinct positive numbers, the compiler just inserted commas gratis, with no
message. When a couple of the initializers were changed to negative values,
the four values turned into two expressions with two operands each! So
[ 1 -3 2 -1 ] would become [ -2 1 ] (with two zeros added to compensate, again
automatically, but I believe this is correct C and hence shouldn't be
complained about).

Ultimately, even a "complaining" compiler (such as one that complains about
seeing foo[i,j] because i is an unused value) isn't going to catch all such
errors in all situations -- for example, foo[func(i),j] probably shouldn't
produce an error -- perhaps it was intentional, or was it? But in my case I
think I finally checked the standard and found that it did supposedly require
commas between expressions in an initialization list. For a compiler to
insert them for one is -- well, arggghhh! (-:

In doing other language/little-language/whatever designs, I've often thought
back to this problem and avoided making the following general mistake
whenever possible:

A syntax that allows an item, call it X, in a particular context, should
always allow another item, Y, that is closely related and likely to directly
replace it during an edit session and yet is not syntactically identical to
X. (E.g.
4 is syntactically identical to 5 but not to -4; yet they are all closely
related in most users' minds, as might be 4 and the expression 2+2.) If
replacement of X with Y would change the meaning or validity of the overall
construct, strongly lean towards disallowing X in that context if there can
be a form that provides for both X and Y meaning the same thing -- in other
words, don't allow X in a supposedly "convenient" form for "most cases" if it
is at all likely someone will change X to a Y and get a different result.
(Lean more if the result is valid but different than would be expected by
direct replacement; lean less if the result is an invalid construct resulting
in a message.)

So the compiler I was using violated the rule by permitting one to omit
commas between initializers without considering what happens if one of those
values is directly changed to a negative value.

Other example: c=4;, change the 4 to -2, and you get c=-2;, which in older Cs
used to mean c-=2; (they took a better and more general approach to fixing it
than my limited suggestion -- given no alternative, my rule would have
suggested requiring at least one space following the =).

James Craig Burley, Software Craftsperson    burley@world.std.com
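
For anyone who wants to reproduce the arithmetic in the two examples above,
here is a minimal sketch in standard C. It is not the original VMS code: the
names adjust_intended, adjust_as_parsed, last_of_pair and check_examples are
made up for illustration, braces and commas are written out as the standard
requires, and a one-dimensional array stands in for x so that the comma
operator can be shown without relying on a pointer-to-int conversion.

```c
#include <assert.h>

/* The four values that were meant (commas written out). */
const int adjust_intended[4] = { 1, -3, 2, -1 };

/* What the post describes the lenient compiler as building from
   "1 -3 2 -1": two expressions, 1-3 and 2-1. Standard C zero-fills
   the elements that have no initializer. */
const int adjust_as_parsed[4] = { 1 -3, 2 -1 };

/* Comma operator inside a subscript: i is evaluated and discarded,
   so the index is just j. The (void) cast only silences the
   "unused value" warning that many compilers give here. */
int last_of_pair(const int *v, int i, int j)
{
    return v[((void)i, j)];
}

/* Hypothetical self-check; call it from a main() of your own. */
void check_examples(void)
{
    static const int v[3] = { 10, 20, 30 };

    assert(adjust_as_parsed[0] == -2);
    assert(adjust_as_parsed[1] == 1);
    assert(adjust_as_parsed[2] == 0);
    assert(adjust_as_parsed[3] == 0);
    assert(last_of_pair(v, 0, 2) == v[2]);
}
```

With the commas in place, the second array makes the failure mode visible:
the last two offsets come out zero and the first two are "close but not
quite right", which matches what the debugger showed.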