Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!lll-tis!elxsi!beatnix!rw
From: rw@beatnix.UUCP (Russell Williams)
Newsgroups: comp.arch
Subject: Re: Null-terminated C strings
Message-ID: <650@elxsi.UUCP>
Date: 23 Dec 87 21:42:11 GMT
References: <261@ivory.SanDiego.NCR.COM> <164@sdeggo.UUCP> <174@quick.COM> <14116@think.UUCP> <178@imagine.PAWL.RPI.EDU>
Sender: nobody@elxsi.UUCP
Reply-To: rw@beatnix.UUCP (Russell Williams)
Organization: ELXSI Super Computers, San Jose
Lines: 38

In article <178@imagine.PAWL.RPI.EDU> You cant get here from there. writes:
>In article <.....> people write:
>Lots of stuff about strings with null terminators and lengths and dope vectors
>and....
>
>There are a lot of ways to implement strings in various ways and their
>benefits and costs.
>
>Another point that has been made, and should be made again is that C
>is powerful enough to construct the other methods where needed.
>
>One of the virtues of C is that much is made explicit and as simple as
>possible and there is very little behind-the-scenes magic.  The same
>may be said for Unix (tm), and (perhaps more germainly to this
>newsgroup) RISC machines.  I would like to urge designers of systems,

The only drawback is that with Unix, the tools have been built; with
RISC machines, the compilers do it for you.  With C, there is no
standard package written to handle arbitrary-content strings, so
everybody uses the built-in null-terminated strings, and thus most
programs fail on files or data with embedded nulls, and many fail on
files with lines longer than MAXLINE.  Sometimes this isn't a problem.
With Emacs, for example, it's a serious pain.  Further, the C convention
is to take advantage of knowledge of the representation, so there's
no easy way to change programs to a different string representation.
Perhaps if there were a package in the standard library, people would
use it when appropriate.  With C++ and its inline functions, you could
construct compatible libraries to handle things both ways.

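To make the embedded-null problem concrete, here is a minimal sketch of
the sort of arbitrary-content string package described above, written in
present-day standard C.  The type `cstr` and the functions
`cstr_from_bytes`, `cstr_append`, and `cstr_free` are hypothetical names
invented for this illustration; they are not part of any standard
library, and this is only one of many possible designs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: the length is stored explicitly instead of
   being inferred from a terminator, so any byte value, including '\0',
   may appear in the data. */
typedef struct {
    size_t len;   /* number of bytes in use */
    char  *data;  /* heap buffer holding len bytes; NOT null-terminated */
} cstr;

/* Copy n arbitrary bytes into a new counted string.
   Returns 0 on success, -1 if allocation fails. */
int cstr_from_bytes(cstr *s, const char *bytes, size_t n)
{
    assert(s != NULL && (bytes != NULL || n == 0));
    s->data = malloc(n ? n : 1);          /* avoid a zero-size request */
    if (s->data == NULL)
        return -1;
    if (n)
        memcpy(s->data, bytes, n);
    s->len = n;
    return 0;
}

/* Append b to a.  Returns 0 on success, -1 on overflow or allocation
   failure, in which case a is left unchanged. */
int cstr_append(cstr *a, const cstr *b)
{
    assert(a != NULL && b != NULL);
    size_t alen = a->len, blen = b->len;
    if (blen > SIZE_MAX - alen)
        return -1;                        /* combined length would overflow */
    size_t n = alen + blen;
    char *p = realloc(a->data, n ? n : 1);
    if (p == NULL)
        return -1;
    /* If a and b are the same object, b->data may be stale after realloc,
       so read from the new buffer instead. */
    const char *src = (a == b) ? p : b->data;
    if (blen)
        memcpy(p + alen, src, blen);
    a->data = p;
    a->len = n;
    return 0;
}

/* Release the buffer and reset the string to empty. */
void cstr_free(cstr *s)
{
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

Given the five bytes 'a', 'b', '\0', 'c', 'd', strlen() stops at the
embedded null and reports 2, while a `cstr` built from the same bytes
keeps all five.  The cost is the one raised in this thread: the
allocation happens inside the package rather than in plain sight.
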
Our O/S (EMBOS) is written in Pascal extended to include dope-vectored
strings.  Except for very low-level routines (such as the memory manager),
which can't tolerate unexpected heap allocations, there have been very
few drawbacks or cases where we had to use another method of character
handling; in high-level code (editors, batch schedulers, terminal
drivers), the fact that strange magic may happen behind your back has
proven irrelevant.

Russell Williams
..{ucbvax!sun,lll-lcc!lll-tis,altos86}!elxsi!rw