Path: utzoo!utgpu!utstat!jarvis.csri.toronto.edu!mailrus!ames!oliveb!Ozona!chase
From: chase@Ozona.orc.olivetti.com (David Chase)
Newsgroups: comp.arch
Subject: Re: String lengths
Summary: Firth is right, Ritchie misremembered, and this has nothing to do with computer architecture
Message-ID: <37529@oliveb.olivetti.com>
Date: 8 Feb 89 19:26:03 GMT
References: <8876@alice.uucp> <8442@aw.sei.cmu.edu> <88850@sun.uucp> <88853@sun.uucp>
Sender: news@oliveb.olivetti.com
Reply-To: chase@Ozona.UUCP (David Chase)
Organization: Olivetti Research Center, Menlo Park, CA
Lines: 85

In article <88853@sun.uucp> khb@sun.UUCP (Keith Bierman - Sun Tactical Engineering) writes:
In article <8876@alice.UUCP> dmr@alice.UUCP ["PI" below] writes:
>>$> >The history of this convention and of the general array scheme had little
>>$> >to do with the PDP-11; it was inherited from BCPL and B. ["bliff" below]
>>$In article <8442@aw.sei.cmu.edu>, firth@bd.sei.cmu.edu (Robert Firth) ["Mr Poster" below] writes:
>>$> A correction here: the C scheme was NOT inherited from BCPL. ["boff" below]
>The question is (to paraphrase) "What did the inventors of C think
>about?" The Principal Inventor sez "bliff". Mr Poster sez "no it was boff".
>
>I do not think it fair to characterize "boff" as a valid hypothesis,
>unless the PI had died, and left no notes, or ambigous ones. Since the
>PI is very much alive, and has spoken, contradicting him is a bit out
>of _my_ definition of scientific inquiry.

Sigh. Nonetheless, he (Dennis Ritchie) probably made a mistake in his
posting. Other Principal Inventors are still alive and also left
unambiguous notes. Also, I'd suggest that Robert Firth knows BCPL
better than Dennis Ritchie. (I'd suggest that *I* know BCPL better
than Dennis Ritchie, too -- I've used it within the last 4 years.)
I'll give references.

From _BCPL -- The language and its compiler_ by Martin Richards and
Colin Whitby-Strevens, 1979
----------------
[PACKSTRING and UNPACKSTRING]

"After unpacking your string, you will discover that the first word
contains a count of the number of characters in the string proper,
which starts at the second word. As an example, we give the library
routines WRITES, UNPACKSTRING, and PACKSTRING:

    LET PACKSTRING(V,S) = VALOF
    $( LET N = V!0 & #XFF  // extract least significant 8 bytes
       LET SIZE = N / BYTESPERWORD
       S!SIZE := 0         // pack out last word with zeroes
       FOR I = 0 TO N DO PUTBYTE(S,I, V!I)
       RESULTIS SIZE
    $)
----------------

Note, too, that the zeros in the last word will only appear in those
cases where the bytes packed do not fill out the words in the string
(that is, consider packing a string containing 3 characters).

From "The Portability of the BCPL Compiler" by Martin Richards in
_Software -- Practice and Experience_, volume 1, pp 135-146, 1971.
----------------
Strings are packed in BCPL and the packing is necessarily machine
dependent since it depends strongly on the word and byte sizes of the
object machine. The usual internal representation of a string value
is as a pointer to the first of a set of words holding the length and
packed characters of the string. The zeroth byte is usually justified
to the start of a word and holds the length of the string with
successive bytes holding the characters and padded with zeros (or
possibly spaces) at the end to fill the last word. In order to handle
strings in as machine independent way [sic] as possible packing,
unpacking and writing of strings is done using library routines which
are defined in the machine dependent interface with the operating
system.
----------------

I think it is fair to say that C did NOT inherit its string
representation from BCPL. I wish that some of you people would check
your facts before posting. Linguistic comparisons belong elsewhere, so
I won't make them.

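For readers who think more easily in C, the layout in the Richards
excerpt above (length in byte zero, characters after it, zero padding
out to a word boundary) can be sketched roughly as follows. This is
not the BCPL library code. The 4-byte word, the names BYTES_PER_WORD,
packed_words, pack_counted and counted_length, and the choice to
return 0 on failure are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 4 bytes per word. In BCPL this was machine dependent. */
enum { BYTES_PER_WORD = 4 };

/* Hypothetical helper: whole words needed to hold one length byte
   plus n character bytes. */
size_t packed_words(size_t n)
{
    return (n + 1 + BYTES_PER_WORD - 1) / BYTES_PER_WORD;
}

/* Pack a NUL-terminated C string into counted form:
     byte 0       length (so at most 255 characters)
     bytes 1..n   the characters
     remainder    zero padding up to the end of the last word
   Returns the number of bytes written (a multiple of BYTES_PER_WORD),
   or 0 if the string is too long or dst is too small. */
size_t pack_counted(const char *src, unsigned char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);

    size_t n = strlen(src);
    if (n > 255)
        return 0;               /* the count must fit in one byte */

    size_t total = packed_words(n) * BYTES_PER_WORD;
    if (total > dst_size)
        return 0;

    memset(dst, 0, total);      /* zero-fill, which pads the last word */
    dst[0] = (unsigned char)n;  /* length goes in the zeroth byte */
    memcpy(dst + 1, src, n);    /* characters follow the count */
    return total;
}

/* Length is a single byte fetch, with no scan for a terminator. */
size_t counted_length(const unsigned char *s)
{
    return s[0];
}
```

Under the 4-byte assumption, a 3-character string fills exactly one
word and gets no padding, a 4-character string spills into a second
word that is mostly zeros, and the empty string still occupies one
word holding the zero count.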
As far as implementation goes, I think it is a mixed bag. Many
operations are "faster" on strings with counts, but if your maximum
count is only 255 then everything is pretty fast whether it is counted
or terminated. You should also check out the string operations on the
360/370 sort of machines; BCPL was running there (rather well) a very
long time ago. I think that those operations worked on at most 256
characters (and, I should add, NOT on 0-length strings). It may well
be another case of architecture influencing language design (note that
a zero-length BCPL string actually contains one byte -- the zero
count).

David