Path: utzoo!attcan!uunet!lll-winken!lll-tis!helios.ee.lbl.gov!pasteur!ames!killer!pollux!ti-csl!pf@csc.ti.com
From: pf@csc.ti.com (Paul Fuqua)
Newsgroups: comp.sources.d
Subject: Obscure Not-Quite-Bug in Compress
Message-ID: <51610@ti-csl.CSNET>
Date: 15 Jun 88 18:56:08 GMT
Sender: news@ti-csl.CSNET
Organization: TI Computer Science Center, Dallas, Texas
Lines: 32

     I recently translated compress to Common Lisp to run it on my Lisp
machine (I don't deal with versionless filesystems any more than I have
to).  Along the way I discovered a bit of code that is, strictly
speaking, a bug, but which C doesn't catch or seem to care about.

     In the function getcode, there is a local variable bp that is a
pointer into the array buf, which is 16 8-bit-bytes long (for 16-bit
compress).  There is some hairy code that is used when not on a Vax
(which does most of the work in one instruction).

     When decompressing, and the codes are 16 bits, and getcode is
grabbing the last code in buf, *bp starts off pointing at buf[14], is
bumped to buf[15] by a *bp++, then is bumped to buf[16] by another
*bp++.  At this point is the following fragment:

        /* high order bits. */
        code |= (*bp & rmask[bits]) << r_off;

*bp reads the word just off the end of buf, but rmask[bits] is always 0
in this case, so the word is unimportant and everything works.

     This bit caused me trouble because Common Lisp bounds-checks array
references (and lispms tend to crash when referencing unallocated
memory).  My correction was to check for rmask[bits] == 0 before doing
the above, so bp wouldn't reference off the end.  C, of course, doesn't
bounds-check, especially not when using pointers, and this bit of code
has been happily running on countless machines.

     Is this a bug or a feature?  I have my own opinion.

                              pf

Paul Fuqua
Texas Instruments Computer Science Center, Dallas, Texas
CSNet:  pf@csc.ti.com (ARPA too, sometimes)
UUCP:   {smu, texsun, im4u, rice}!ti-csl!pf
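
For anyone who wants to see the guarded read spelled out in C, here is a minimal sketch. It is a simplified reconstruction of the non-Vax path described above, not the compress source itself: the function name extract_code, the BUF_LEN constant, and the last_index_read counter are hypothetical and exist only for illustration. It assumes codes of 9 to 16 bits packed low-order bits first, and it tests bits > 0, which should be equivalent to testing rmask[bits] != 0.

```c
#include <assert.h>

#define BUF_LEN 16  /* 16 bytes, matching buf in 16-bit compress */

/* rmask[n] keeps the low n bits of a byte. */
static const unsigned char rmask[9] = {
    0x00, 0x01, 0x03, 0x07, 0x0f, 0x1f, 0x3f, 0x7f, 0xff
};

/* Highest buf index read by the most recent call.
 * Hypothetical instrumentation, only here to make the effect visible. */
static int last_index_read;

/* Extract an n_bits-wide code (assumed 9..16) starting at bit `offset`. */
static long extract_code(const unsigned char *buf, int offset, int n_bits)
{
    const unsigned char *bp = buf + (offset >> 3);  /* first byte */
    int r_off = offset & 7;                         /* bit offset in it */
    int bits = n_bits;
    long code;

    /* Low-order bits from the first byte. */
    code = *bp++ >> r_off;
    bits -= 8 - r_off;
    r_off = 8 - r_off;          /* now the offset into the code word */

    /* At most one whole byte in the middle for codes up to 16 bits. */
    if (bits >= 8) {
        code |= (long)*bp++ << r_off;
        r_off += 8;
        bits -= 8;
    }

    /* High-order bits.  The guard skips the read when no bits remain,
     * so bp is never dereferenced one past the end of buf. */
    if (bits > 0) {
        code |= (long)(*bp & rmask[bits]) << r_off;
        bp++;
    }

    last_index_read = (int)(bp - buf) - 1;
    return code;
}
```

With a 16-byte buffer, calling extract_code(buf, 112, 16) for the last code should touch only buf[14] and buf[15]. Without the guard, the final statement would also read buf[16] and then mask the value away.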