Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!columbia!rutgers!nike!ucbcad!ucbvax!sdcsvax!sdcc6!sdcc18!ss60f
From: ss60f@sdcc18.ucsd.EDU (ss60f)
Newsgroups: net.lang.mod2
Subject: M2 Working group
Message-ID: <562@sdcc18.ucsd.EDU>
Date: Sat, 1-Nov-86 03:13:33 EST
Article-I.D.: sdcc18.562
Posted: Sat Nov 1 03:13:33 1986
Date-Received: Mon, 3-Nov-86 21:08:48 EST
Reply-To: ss60f@sdcc18.UUCP (ss60f)
Organization: U.C. San Diego, Academic Computer Center
Lines: 88

As a long-time UCSD Pascal programmer and (more recently) a Modula-2
enthusiast, I have some comments about the extensions to Modula-2
proposed by the Modula-2 Working Group of the British Standards
Institution. I am especially concerned about the string-handling
issue. I strongly favor clearing up some of the ambiguous parts of
the language definition, but otherwise leaving it alone. The other
alternative, introducing a string type similar to UCSD Pascal's,
would be (to my mind) a great mistake.

UCSD Pascal has some good features, but string handling is not one of
them. Specifically, it has the following drawbacks (most of these
also apply to Turbo Pascal; I don't know if extended ISO Pascal is
similar):

1. String lengths are limited to 255 chars. (this could be
   ameliorated by allowing 2 bytes for the string length).

2. It is not possible to append to a string by simply writing
   characters past its current end (because the length value would
   not be updated). Instead, the CONCAT procedure must be used.

3. More generally, code that copies a portion of one string to
   another is difficult to write and inefficient. Example: suppose
   one wants to copy a single English word from STRING1 (starting at
   index I) to STRING2. There are two ways to do it in UCSD Pascal:

   a.
      search for the end of the word, record the number of
      characters away from I that it ends, and then pass this count
      as a parameter to the COPY procedure (awkward and error-prone
      because it emphasizes counting rather than copying, and
      relatively inefficient because the word is read twice, once to
      count, once to copy).

   b. read chars. one at a time from STRING1, and use CONCAT to
      progressively append them to STRING2 (very inefficient;
      involves a function call for every char).

   It is both easier and more efficient to use 0-terminated strings
   (i.e., ARRAYs OF CHAR). For example, in Modula-2 this might be
   done as follows:

      J := 0;
      LOOP
        (* stop at the end of STRING1 or at the end of the word *)
        IF (STRING1[I] = EOS) OR (STRING1[I] = " ") THEN
          EXIT;
        ELSE
          STRING2[J] := STRING1[I];
        END;
        INC(J); INC(I);
      END;
      STRING2[J] := EOS;   (* EOS = 0C *)

   This is straightforward to write, easy to understand, and involves
   no function call overhead.

4. Portability across languages is another reason for avoiding a
   special string type. The Modula-2 code fragment given above can
   be easily translated to or from C or ANSI standard Pascal. As I
   know from experience, it can be very difficult to translate
   programs using UCSD-like string types and functions to languages
   lacking those types and functions (including other dialects of
   Pascal). Most languages have arrays, though, so treating strings
   as much as possible like ordinary arrays is a good way to limit
   portability problems. Modula-2 should encourage this.

As a final point, Kernighan and Plauger's book 'Software Tools in
Pascal' details methods for string handling that are a far better
model for emulation than UCSD Pascal's. They were able to provide a
uniform set of functions that could be implemented on many different
machines and compilers, and were reasonably efficient and convenient
to use. I have used Pascal functions and macros similar to K & P's
for years, and recently created similar functions in Modula-2 for
string handling.
These do everything that I find necessary, and I was able to
implement them in standard Modula-2, without any extensions.

Disclaimer: the opinions expressed here are my own and are not
necessarily those of my employer. Also, for the record, while I do
work for UCSD, I have never been connected with the UCSD Pascal
project, except as a user of their products.

--
Jon Dart
Dept. of Anthropology
UCSD C-001
La Jolla, CA 92093
ss60f@sdcc18.UUCP
ss60f%sdcc18@sdcsvax.UCSD.EDU