Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site unc.unc.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!decvax!mcnc!unc!biagioni
From: biagioni@unc.UUCP (Edoardo Biagioni)
Newsgroups: net.lang.mod2
Subject: (long) Modula-2 for Wirth's new machine
Message-ID: <930@unc.unc.UUCP>
Date: Mon, 3-Feb-86 22:07:25 EST
Article-I.D.: unc.930
Posted: Mon Feb 3 22:07:25 1986
Date-Received: Wed, 5-Feb-86 04:48:45 EST
Distribution: net
Organization: CS Dept, U. of N. Carolina, Chapel Hill
Lines: 268

We have just received the following description of the language
supported by the Modula-2 compiler on the new machine developed by
the group of N. Wirth at ETH. We thought it would be of interest to
this newsgroup. The mailing has been reformatted; there have been no
changes in the contents.

Ed Biagioni     decvax!mcnc!unc!biagioni   seismo!mcnc!unc!biagioni
Klaus Hinrichs  decvax!mcnc!unc!hinrichs   seismo!mcnc!unc!hinrichs

------------------------------------------------------------------

Modula-2 for Ceres

N. Wirth
1. 1. 86 / 1. 2. 86

A Modula-2 compiler is now available for Ceres. The accepted language
differs in some details from Modula for Lilith. This memo describes
these differences, and it serves two purposes: First, it is intended
for programmers who wish to port their software to Ceres. Second, it
is a reminder for machine-independent programming and points out what
has to be observed if new programs are to be easily transferable
either to Ceres or other machines. A summary of hints that should be
of interest to all Modula programmers is given at the end.

New Data Types

The primary differences lie in the fact that Ceres is a 32-bit
machine. This becomes manifest in the definition of some standard
data types:

    BITSET   = SET OF [0 .. 31]
    LONGINT    integers in the range -2147483648 .. 2147483647
    LONGREAL   a new type representing real numbers by 64 bits

All set and pointer types use 32 bits.
These are extensions, and therefore do not require changes to
existing programs, unless one wishes to profit from the new
definitions. For example, routines operating on arrays of sets might
be reprogrammed to save storage, since only half as many elements are
needed, if effective use is made of their extended range.

Constants of type LONGINT are integers with a suffix letter D (e.g.
1986D). Constants of type LONGREAL are distinguished by the use of
the letter D in place of E in the scale factor (or simply a suffix D
if the scale factor is missing). Examples: 1.0D, 37.82D-7.

The new, standard function LONG converts an argument of type INTEGER
or REAL to the types LONGINT or LONGREAL, and the function SHORT
performs the inverse transformation (in the case of integers without
range check). Also, the types INTEGER and CARDINAL are assignment
compatible with LONGINT, REAL with LONGREAL, and LONGREAL with REAL.

The two additional standard functions FLOATD and TRUNCD are analogous
to FLOAT and TRUNC; they yield results of types LONGREAL and LONGINT
respectively. Given the declarations below, the following correct
assignments summarize these new facilities:

    i: INTEGER;  k: LONGINT;  x: REAL;  z: LONGREAL;

    i := SHORT(k);  i := TRUNC(x);  i := TRUNC(z);
    k := i;         k := LONG(i);   k := TRUNCD(x);  k := TRUNCD(z);
    x := z;         x := SHORT(z);  x := FLOAT(i);   x := FLOAT(k);
    z := x;         z := LONG(x);   z := FLOATD(i);  z := FLOATD(k);

The Type CARDINAL

A more subtle change concerns the type CARDINAL. The NS32000
processor supports unsigned arithmetic, but multiplication and
division are cumbersome, requiring double registers. We must also
recognize that through the availability of a type LONGINT the primary
justification for the type CARDINAL, namely to enlarge the range of
positive integers to cover all address values, has vanished.
The second reason, namely to express that a variable assumes only
natural numbers as values, has lost its attraction, simply because
the NS processor does not provide convenient means to check against
overflow. One is therefore tempted to abolish the type CARDINAL.
However, our goal to make Lilith software easily portable to Ceres
requires that Medos be available as an operating system. Medos makes
heavy use of the type CARDINAL, and its elimination would require a
very substantial rewrite of Medos. Our solution to this dilemma is to
provide two compilers:

The Standard Compiler treats the type CARDINAL as the subrange
[0 .. 32767] with base type INTEGER. The implementation offers the
standard range check for assignment. The welcome benefit of this
solution is that the nasty incompatibility of the types INTEGER and
CARDINAL in expressions disappears. (The result type of the functions
ORD and HIGH is now INTEGER, and so is the base type of subranges,
even if the lower bound is not negative.)

The Medos Compiler retains the type CARDINAL as on Lilith
([0 .. 65535]), but does not provide any checks against overflow or
assignment of illegal, negative values of type INTEGER.

We strongly recommend adapting programs to the Standard Compiler,
unless compelling reasons exist against it, and in particular
designing new programs without the use of the type CARDINAL. The
Medos Compiler will not be distributed outside, and we hope that it
can be eliminated after some time.

Type Conversions

So far, conversion seems to pose no severe problems. In fact, genuine
difficulties appear only where machine-dependent features of Modula
were used. They are supposedly highlighted by imports from module
SYSTEM, an import that everybody knows should only be taken as a last
resort. Programmers who have imported from SYSTEM too generously are
now receiving the bill. Another, much less obvious and therefore
easily abused machine-dependence is the type transfer function.
I strongly recommend abstaining from type transfer functions; in
fact, they are not accepted by the Standard Compiler (see below).
Particularly frequent cases of their use are the packing of
characters to and the unpacking from a word file:

    VAR n: CARDINAL; ch0, ch1: CHAR;

    n := 256*CARDINAL(ch0) + CARDINAL(ch1); WriteWord(wf, n)

On Ceres, the transfer function CARDINAL is inapplicable to values of
type CHAR. The use of ORD saves the situation. It becomes more
difficult if we also wish to eliminate the type CARDINAL. Simply
replacing it by INTEGER creates two pitfalls. First, overflow may
occur in multiplication. Second, n is then not compatible with ORD
when using the Lilith or the Ceres-Medos compiler. The suggested
solution is to use a shift function imported from SYSTEM, and to make
the machine-dependence explicitly visible:

    n := LSH(ORD(ch0),8) + ORD(ch1)

The unpacking is the counterpart:

    ReadWord(wf, n); ch0 := CHAR(n DIV 256); ch1 := CHAR(n MOD 256)

In this case, the transfer function CHAR will not be applicable,
because variables of types CARDINAL and CHAR require different
amounts of storage, whereas on Lilith they both use 1 word. Replacing
CHAR by CHR solves that problem. Note, however, that it is customary
to store the "right byte" first on files on Ceres, since byte
numbering in words proceeds from right to left, like bit numbering.
Hence, the two assignments should be interchanged. Here again, the
use of an explicitly system-dependent function is recommended:

    ch1 := CHR(LSH(n,-8)); ch0 := CHR(n)

A much better solution, however, is to refrain from word files
altogether, and to treat all files as byte files.

Integer Arithmetic

Here we point out a discrepancy between the DIV and MOD operators on
Lilith and Ceres, when applying them to negative operands. In fact,
DIV is wrong on Lilith (and probably most other computers) for
negative arguments, as it represents the zero-symmetric integer
division, generally expressed by /.
Therefore, the unpacking cannot be accomplished on Lilith with
DIV/MOD, if the arguments are of type INTEGER. On Ceres, this is
possible, and DIV is implemented by a right shift, whenever the
divisor is a power of 2.

    Lilith:  -10 DIV 3 = -3              Ceres:  -10 DIV 3 = -4
             -10 MOD 3   not allowed             -10 MOD 3 =  2

Module SYSTEM

Finally, there are the explicitly system-dependent features imported
from module SYSTEM, to which we also count the declaration of code
procedures. On Lilith, the module SYSTEM contains the types ADDRESS
and WORD, and the procedures ADR, TSIZE, and LONG. On Ceres, the type
ADDRESS is not compatible with CARDINAL, but rather with LONGINT for
address arithmetic. The new type BYTE represents the unit of
addressable storage. The type WORD is eliminated from the Standard
Compiler, but retained in the Ceres-Medos Compiler for obvious
reasons. Programmers should aim at its elimination.

On Ceres, the module SYSTEM contains a larger number of objects. This
is a reflection of the fact that machine code cannot be defined by
code procedures as on Lilith. The following definition is an attempt
to summarize the facilities contained:

    DEFINITION MODULE SYSTEM; (*NW 7.12.85*)
    (*FOR NS32032.
      for details of instructions refer to manual*)

    TYPE ADDRESS;  (*compatible with type LONGINT and with all
                     pointer types*)
    TYPE BYTE;
    TYPE WORD;     (*16 bit entity; in Medos Compiler only*)

    PROCEDURE ADR(VAR x: T): ADDRESS;  (*address of variable x*)
    PROCEDURE TSIZE(T): INTEGER;
      (*size in bytes of variables of type T*)

    (*T subsequently denotes any type of size <= 4 bytes;
      T0 stands for either INTEGER or LONGINT*)

    PROCEDURE ASH(x: T; n: T0): T;  (* x * 2^n *)
    PROCEDURE LSH(x: T; n: T0): T;  (* x shifted by n positions*)
    PROCEDURE ROT(x: T; n: T0): T;  (* x rotated by n positions*)
    PROCEDURE COM(x: T): T;         (*binary complement of x*)
    PROCEDURE FFS(VAR x: T; VAR n: INTEGER): BOOLEAN;
      (*assign to n the position of the first one bit of x with
        position >= n; FFS = "a one-bit was found" *)
    PROCEDURE GET(a: LONGINT; VAR x: T);
      (*assign value at address a to x*)
    PROCEDURE PUT(a: LONGINT; x: T);
      (*assign x to storage at address a*)
    PROCEDURE MOVE(VAR src, dest: ARRAY OF BYTE; count: T0);
    PROCEDURE VAL(T; x: T1): T;
    END SYSTEM.

The procedures GET and PUT are used to access device registers. The
absolute addressing mode, i.e. variable declarations specifying an
absolute address, is not available. All these facilities must be used
only in a few, low-level modules.

The generic function procedure VAL(T, x) is effectively a replacement
for type transfer functions T(x). Its value is x, interpreted as type
T. No code is generated for this "procedure". Its function is to make
the uses of machine-dependent type transfers more explicit and more
readily locatable.

Another source of potential problems is the inhomogeneous store on
Lilith, manifest in the form of frames. Ceres offers a single, linear
address space, and all programs making use of frames should be
changed by eliminating frames.

Modula on Ceres also offers a code procedure declaration. In contrast
to Lilith-Modula, however, it is used in definition modules only and
serves to introduce procedures implemented by supervisor calls.
The code number n specifies the identification inserted as a byte
after the SVC instruction. Evidently, such definitions are provided
with the operating system used. The format is

    PROCEDURE P(parameter list) CODE n;

On Lilith, procedures can be used as parameters, or can be assigned,
only if they are declared on the global level. This restriction also
holds for Modula on Ceres. Furthermore, they must be declared in a
definition module, or their heading must be followed by an asterisk.

    PROCEDURE Assignable(parameters)*; ...

This not very pleasing rule is necessary, because the NS processor
uses different return instructions for external procedures and
others, and they must correspond with the call instruction used. The
compiler cannot determine the kind when generating a formal call.
Hence we postulate the external mode for all formal and assigned
procedures. Those defined in a definition module are automatically
"external".

Compiling Options

The compiler optionally generates various redundancy checks. They can
be enabled or disabled for each compilation by appending option
characters to the source file name. The occurrence of an option
character signals the inverse of its default value.

    x    array index bound check      default = on
    r    subrange assignment check    default = off
    v    arithmetic overflow check    default = off

Example: SomeName.MOD/rv (all checks on)

Some programming hints

1. Ceres uses byte addressing. However, data are transferred to and
   from memory in 32-bit words. Each type has an alignment factor k.
   Variables are aligned by the compiler to lie at an address a, such
   that a MOD k = 0. Since allocation is sequential, i.e. variables
   are allocated in the order of their textual occurrence, the least
   amount of storage gets wasted through alignment if declarations
   are grouped according to size. The same holds for record fields,
   and in this case is even more important.
   The following are the sizes and alignment factors of types:

   Type                                 Size            Alignment
                                                        Factor
   CHAR, BOOLEAN, enumerations, BYTE    1               1
   INTEGER, CARDINAL, (WORD)            2               2
   LONGINT, REAL, BITSET, sets,
     pointers, procedures               4               4
   LONGREAL                             8               4
   arrays, records                      multiple of 4   4

2. The statements INC(n), DEC(n), INCL(s,n), EXCL(s,n) generate
   considerably denser code than their equivalents n := n+1,
   n := n-1, s := s + {n}, s := s - {n}, even in the case that n or s
   are simple variables. INC and DEC accept a single parameter only.

3. As on Lilith, access to so-called intermediate-level variables is
   slower and requires more code than access to local or global
   variables. Such accesses should therefore be made only after
   careful justification. Intermediate-level variables should be
   considered as implicit, additional procedure parameters.

4. Unlike Lilith, Ceres does not use indirect addressing for
   structured variables; all variables are allocated in sequence with
   ascending addresses. Frequently accessed variables should be
   placed at the beginning of the declaration list in order to obtain
   small addresses. This reduces the size of the code.

Judging from my own experience in porting the compiler, there are
usually more hidden machine-dependent features in a program than one
is likely to assume, even in one's own concoctions. The event of
porting a program is a good occasion to become aware of them and to
eliminate (at least some of) them. I recommend first writing a
version that eliminates CARDINALs, word files, and type transfer
functions, still operating on Lilith. The constructs that make use of
features in which Lilith-Modula and Ceres-Modula genuinely differ can
then be tackled in a separate, second step.

The worst kind of machine-dependence, because it is so well hidden,
is the use of an (untagged) variant record and its misuse by
accessing a value as type T0 which was stored (to an overlaid field)
as type T1. The only valid recommendation is to reprogram the
algorithm.
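
To make hint 1 of the programming hints concrete, here is a small
sketch of how field order can affect record size. The type and field
names are invented for illustration. The offsets in the comments are
worked out by hand from the sizes and alignment factors listed in
hint 1, assuming fields are allocated strictly in textual order and
record sizes are rounded up to a multiple of 4; they are not compiler
output.

```modula-2
TYPE
  (* Hypothetical record with fields in mixed order:       *)
  (* padding is needed before every field with k > 1.      *)
  Scattered = RECORD
    c0: CHAR;       (* offset  0, size 1                      *)
    k0: LONGINT;    (* offset  4, size 4 (3 bytes of padding) *)
    c1: CHAR;       (* offset  8, size 1                      *)
    i0: INTEGER;    (* offset 10, size 2 (1 byte of padding)  *)
    c2: CHAR;       (* offset 12, size 1                      *)
    k1: LONGINT     (* offset 16, size 4 (3 bytes of padding) *)
  END;              (* 20 bytes in all, 7 of them padding     *)

  (* The same fields grouped by decreasing size. *)
  Grouped = RECORD
    k0, k1: LONGINT;    (* offsets 0 and 4                    *)
    i0: INTEGER;        (* offset  8                          *)
    c0, c1, c2: CHAR    (* offsets 10, 11 and 12              *)
  END;                  (* 13 bytes used, rounded up to 16    *)
```

Under these assumptions the grouped declaration saves 4 of 20 bytes
per record, which adds up quickly in arrays of records.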

Some General Hints for Programming in Modula

One of the main purposes of using a high-level language is to
eliminate dependence on a particular implementation and computer.
Even if only a single computer (type) is ever used, it is advisable
to refrain from using machine-specific features of a language. The
following are suggestions for programming on Lilith in general; they
obtain additional relevance if later on the use of Ceres is
envisaged.

1. Refrain from using the type CARDINAL, unless use of values
   >= 32768 is relevant.

2. Refrain from the use of type transfer functions.

3. Refrain from the use of untagged variant records. Make sure that
   only fields of the variant indicated by the current tag field
   value are accessed.

4. Use byte (character) files rather than word files.

5. Make reference to module SYSTEM only in carefully isolated places
   (modules).
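
Hint 3 might look as follows in practice. This is only a sketch: the
module, type, and field names are hypothetical, and it relies on
nothing beyond the ordinary tagged variant record syntax, the CASE
statement, and ORD.

```modula-2
MODULE TagDemo;  (* hypothetical example, not part of the memo *)

TYPE
  Kind = (number, letter);
  Item = RECORD
    CASE kind: Kind OF          (* explicit tag field *)
      number: n: INTEGER |
      letter: ch: CHAR
    END
  END;

VAR
  x: Item;
  i: INTEGER;

BEGIN
  (* Set the tag together with the field it selects. *)
  x.kind := number;
  x.n := 1986;

  (* Read only the field that belongs to the current tag value;  *)
  (* a value stored as n is never reinterpreted through ch.      *)
  CASE x.kind OF
    number: i := x.n |
    letter: i := ORD(x.ch)
  END
END TagDemo.
```

Because every access goes through the tag, the program does not
depend on how the variants happen to be overlaid in storage.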