Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!cs.utexas.edu!uunet!mcvax!ukc!icdoc!dcw From: dcw@doc.ic.ac.uk (Duncan C White) Newsgroups: comp.lang.modula2 Subject: Size of sets (was: Re: type conversion question) Summary: VERY LONG ARTICLE, Pascal stuff included Message-ID: <912@gould.doc.ic.ac.uk> Date: 23 Jun 89 18:44:02 GMT References: <14226@watdragon.waterloo.edu< <7465@xenna.Encore.COM> <6187@pdn.paradyne.com> <985@maestro.htsa.aha.nl> <6197@pdn.paradyne.com> Reply-To: dcw@doc.ic.ac.uk (Duncan C White) Organization: Dept. of Computing, Imperial College, London, UK. Lines: 162 In article <6197@pdn.paradyne.com> alan@oz.paradyne.com (Alan Lovejoy) writes: >Frans van Otten writes: ><= ><=A set type defined as SET OF T .. (where T has) at most N values, ><=where N is a small constant determined by the implementation, usually ><=the computer's wordsize or a small multiple thereof. ><= >.... So the fact that a >compiler supports SET OF CHAR does not make it nonstandard, because 256 >is a small multiple of 32, namely 8. And also because the maximum size >is ultimately TO BE DETERMINED BY THE IMPLEMENTATION. Compilers which >implement small maximum set sizes do so solely at the whim of the implementors. >They are not forced to do this in order to conform to the standard. What you say, Alan, is correct, but IMHO misses the point. As you say, PIM clearly leaves the choice up to the implementor. That is, set size in Modula-2 is strictly defined as being implementation- dependent. (I always like standards double-speak :-) However, this does not ONLY say "it's implementation defined", it says "the minimum (and only) guaranteed range is the machine wordlength". Furthermore, the whole emphasis of his book is on word-sized sets: In Chapter 18 "Set Types", he says: ".. implementations of Modula set a limit to the number of elements (in a set) That limit is often the wordlength of the computer or a small multiple of it. This is quite a small number, say 16 or 32. "Although these rules restrict the generality of the set concept, set types are a powerful tool and allow to express operations <> of an operand on a high level of abstraction based on a familiar and intuitively appealing mathematical concept." <<>>: My italics, not his :-) It seems very clear to me that he knows that ONLY word-sized sets "restrict the generality of the set concept" but he is only concerned with putting bit-twiddling on a higher plane!!! [Aside: the rationale for this is clear: Modula-2 was designed as a systems programming language, and was used by the ETH group to do bit-twiddling far more than general-purpose programming in which big sets would be useful. Hence BITSETs, module SYSTEM, processes and interrupt handlers] The consequence of this pervasive bias is that most Modula-2 compilers, including all (?) the versions that have come from ETH, only implement word-sized sets. Now, when we compare this situation with Pascal, we find an interesting situation: In the Pascal User Manual and Report (2nd edition) in Chapter 8 : Set Types (page 50) we find: "Implementations of PASCAL may define limits for the size of sets, which can be quite small (eg. the number bits in a word)" At first glance, this might seem to blow my argument away, as Wirth SAYS THE SAME THING about Pascal as he did about Modula-2. However, my argument is rather more subtle (it would be :-), is about the overall tone of the book. Further down page 50, we find: "Examples of set constructors: [13] [i+j,i-j] ['A'..'Z', '0'..'9']" And on page 51: "examples of declarations -and- assignment chset1 : set of 'a'..'z'; chset1 := ['d', 'a', 'g'];" And again, "Set operations are relatively fast, and can be used to eliminate more complicated tests. A simpler test for: if (ch='a') or (ch='b') or (ch='c') or (ch='d') or (ch='z') ... is: if ch in ['a'..'d', 'z'] ..." He then follows this with an example of what to do when you really need a set bigger than whatever maximum size your compiler provides: a Sieve of Eratosthenes to find primes in range 1..10000. He says: "Most implementations would therefore not accept a set with 10000 elements." (!!) I hope it is clear from these quotes that Wirth was envisaging that character-subrange sets (broadly alphanumeric ones) would very probably be supported, whereas sets with several thousand elements would not be supported. [Aside: One should recall, of course, that he was using large CDC6000s then, which had 59 (!) bits in a word, and 64 characters..almost all characters had ordinal values < wordlength] Many of his examples use the idiom: if ch in ['0'..'9'] then The result of the emphasis in the Pascal Report, then, is that compiler writers, while knowing that they were ALLOWED to restrict sets to one machine word, knew that they were EXPECTED to provide an environment in which the example programs would work! So, SET OF CHAR (or failing that the alphanumeric subrange !) was typically allowed, whereas sets with say, a couple of thousand elements were not. Returning to Modula-2, then, I contend that the whole emphasis of the discussions of sets in PIM is designed to give the impression that ONLY word-sized sets must be supported. Note: whether the restriction is Num_elements < wordlength or ORD(lower) < wordlength & ORD(higher) < wordlength is ALSO implementation defined, so not even the set constant {'0'..'9'} is guaranteed! Exit the Pascal idiom :-( In practice, this emphasis means that I, and any other user who is interested in writing programs that are PORTABLE (as well as standard-conforming) cannot safely use SET OF CHAR, or even SET OF ['0'..'9']. The blame for this can only be laid at one door: someone who designs languages for individual projects, sacrificing any generality not needed for that project, then releases them to the world at large, releases new versions of the definition document (PIM) for such reasons as "heh, we've got a new font, so we re-typeset the book" (see preface to 4th edition), and who is clearly incapable of ensuring consistency between the definitive grammar and the railroad diagrams, much less with the grammar snippets and examples given in the book! [jeez... I hope the BSI standard Modula-2 will be better than this... equally, it surely couldn't be worse (could it...) ] Duncan >Alan Lovejoy; alan@pdn; 813-530-2211; AT&T Paradyne: 8550 Ulmerton, Largo, FL. >Disclaimer: I do not speak for AT&T Paradyne. They do not speak for me. >_____________________________Down with Li Peng!________________________________ >Motto: If nanomachines will be able to reconstruct you, YOU AREN'T DEAD YET. [ Reply to: dcw@doc.ic.ac.uk or ...!ukc!icdoc!dcw ] ---------------------------------------------------------------------------- Duncan White, | "In the case of Mr. Lawson, I can't say Dept. Of Computing, | his years as Chancellor have given him Imperial College, | an understanding of the economy, but he can London SW7 | probably find his way around the building" England. | Ian Aitken, Question Time