Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!ubvax!ardent!mac
From: mac@mrk.ardent.com (Michael McNamara)
Newsgroups: comp.arch
Subject: Re: Double Width Integer Multiplication and Division
Message-ID:
Date: 5 Jul 89 17:21:57 GMT
References: <1035@aber-cs.UUCP> <1370@l.cc.purdue.edu> <1333@sunset.MATH.UCLA.EDU>
Sender: news@ardent.UUCP
Organization: Ardent Computer Corporation, Sunnyvale, CA
Lines: 212
In-reply-to: pmontgom@sonia.math.ucla.edu's message of 3 Jul 89 18:59:23 GMT

[ An ongoing discussion of the lack of certain interesting math
primitives in HLLs, the unease of assembly programming on RISC chips,
and the need for extensible HLLs ].

On the extensible HLL front, why not go to the authors of the
extensible editor?  The FSF's C compiler, GCC, has extended asm macro
support which allows you to symbolically hook up your asm routines to
HLL variables.  An excerpt from the gcc manual appears below.  You can
obtain gcc via ftp from a number of sites, as well as via uucp from
osu-cis.

Note this is a bit of a long posting, but gcc is USEFUL to allow
mortals to use the whole machine...

> In article <1333@sunset.MATH.UCLA.EDU> pmontgom@sonia.math.ucla.edu
> (Peter Montgomery) writes:
> Given integers A, B, C where 0 <= A, B, < C, I want to be able to
> find q, r such that A*B = q*C + r and 0 <= r < C.  I do multiple-precision
> arithmetic with large numbers, and this is such an important operation that
> I cannot afford to call a subroutine every time I do it.
> So rather than have just a small assembly routine to do this function,
> I write the entire loop or the entire procedure in assembly code.
>
> I want to be able to define primitives like this in my language,
> telling the compiler which sequence of instructions to generate whenever it
> encounters my primitive (this sequence of instructions will be defined
> ONCE, in the machine dependent part of my program, but the code
> referencing the primitives will be scattered throughout).
> Many languages allow one to define user primitives in terms of other
> language elements (macros), but few languages allow us to go deeper and
> say things like (MC68020)
>
> "DEFINE QUOT_REM_64(arg1:unsigned long, register type D;
>                     arg2:unsigned long, register type D;
>                     arg3:unsigned long, register type D),
>         RETURNS    (arg4:unsigned long, register type D;
>                     arg5:unsigned long, register type D);
>     LOCAL upper: register type D;
>     LOCAL lower: register type D;
>     movl  arg1, lower
>     mulul arg2, upper:lower     /* 64-bit product arg1*arg2 */
>     divul arg3, upper:lower     /* Divide by arg3 */
>     movl  lower, arg4           /* quotient */
>     movl  upper, arg5           /* remainder */
>  END QUOT_REM;"
>
> When the compiler subsequently encounters an expression like
> (q, r) := QUOT_REM_64(A, B, C), the compiler would evaluate A, B, and C,
> converting them to unsigned long if necessary.  Each time these are
> referenced in the body, the values would be moved to a D register and
> the appropriate operation done.  The outputs get assigned to q and r.
> With a good optimizing compiler, the movl's could probably be eliminated
> (and the compiler would be allowed to interchange arg1 and arg2 in the
> mulul since the instruction is computationally commutative).  The
> programmer expresses his algorithm in terms of the available instructions,
> while the compiler worries about the things it is good at (e.g., storage
> and register allocation, common subexpression recognition, loop invariants).
> The body of the definition would be allowed to reference more registers than
> are available, with the compiler responsible for handling the overflow.
>
> Note the ASM primitive of C is unsatisfactory, for it forces
> the programmer to know where the compiler has put the operands.
> I once used FORTRAN statement functions on the Control Data 7600 to do
> double-length integer multiplies (the same hardware instruction was
> used for the upper half of floating and integer multiplications, and
> I was able to tell the compiler to treat my original operands
> as floating point without changing the bit-pattern), but nowhere
> else have I succeeded.
> --------
>         Peter Montgomery
>         pmontgom@MATH.UCLA.EDU

From the GCC info node, Extended Asm Support:

Assembler Instructions with C Expression Operands
=================================================

In an assembler instruction using `asm', you can now specify the
operands of the instruction using C expressions.  This means no more
guessing which registers or memory locations will contain the data you
want to use.

You must specify an assembler instruction template much like what
appears in a machine description, plus an operand constraint string
for each operand.

For example, here is how to use the 68881's `fsinx' instruction:

     asm ("fsinx %1,%0" : "=f" (result) : "f" (angle));

Here `angle' is the C expression for the input operand while `result'
is that of the output operand.  Each has `"f"' as its operand
constraint, saying that a floating-point register is required.  The
constraints use the same language used in the machine description
(*Note Constraints::).

Each operand is described by an operand-constraint string followed by
the C expression in parentheses.  A colon separates the assembler
template from the first output operand, and another separates the last
output operand from the first input, if any.  Commas separate output
operands and separate inputs.  The number of operands is limited to
the maximum number of operands in any instruction pattern in the
machine description.

Output operand expressions must be lvalues; the compiler can check
this.  The input operands need not be lvalues.
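
For a concrete case that can be tried on a more common machine, here is
a minimal sketch of the same output/input operand syntax.  It is not
from the manual: it assumes an x86 target and a GCC-compatible
compiler, the helper name highest_set_bit is hypothetical, and the loop
in the #else branch is only a portable stand-in.  The keyword is
spelled __asm__ so that it is also accepted in strict -std= modes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the GCC manual): index of the highest
   set bit of a nonzero 32-bit value. */
static inline unsigned highest_set_bit(uint32_t x)
{
    assert(x != 0);   /* bsr leaves its destination undefined for 0 */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t index;

    /* %0 is the output operand (any general register, write-only);
       %1 is the input operand (register or memory).  AT&T operand
       order is source first, destination second. */
    __asm__ ("bsrl %1,%0" : "=r" (index) : "rm" (x));
    return index;
#else
    /* Portable stand-in for targets without the bsr instruction. */
    unsigned index = 0;

    while (x >>= 1)
        index++;
    return index;
#endif
}
```

The output is an lvalue (index) while the input is an ordinary
expression, which mirrors the rule stated above.
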
The compiler cannot check whether the operands have data types that
are reasonable for the instruction being executed.  It does not parse
the assembler instruction template and does not know what it means, or
whether it is valid assembler input.  The extended `asm' feature is
most often used for machine instructions that the compiler itself does
not know exist.

If there are no output operands, and there are input operands, then
you should write two colons in a row where the output operands would
go.

The output operands must be write-only; GNU CC will assume that the
values in these operands before the instruction are dead and need not
be generated.  For an operand that is read-write, or in which not all
bits are written and the other bits contain useful information, you
must logically split its function into two separate operands, one
input operand and one write-only output operand.  The connection
between them is expressed by constraints which say they need to be in
the same location when the instruction executes.  You can use the same
C expression for both operands, or different expressions.  For
example, here we write the (fictitious) `combine' instruction with
`bar' as its read-only source operand and `foo' as its read-write
destination:

     asm ("combine %2,%0" : "=r" (foo) : "0" (foo), "g" (bar));

The constraint `"0"' for operand 1 says that it must occupy the same
location as operand 0.  Only a digit in the constraint can guarantee
that one operand will be in the same place as another.  The mere fact
that `foo' is the value of both operands is not enough to guarantee
that they will be in the same place in the generated assembler code.
The following would not work:

     asm ("combine %2,%0" : "=r" (foo) : "r" (foo), "g" (bar));

Various optimizations or reloading could cause operands 0 and 1 to be
in different registers; GNU CC knows no reason not to do so.
For example, the compiler might find a copy of the value of `foo' in
one register and use it for operand 1, but generate the output operand
0 in a different register (copying it afterward to `foo''s own
address).  Of course, since the register for operand 1 is not even
mentioned in the assembler code, the result will not work, but GNU CC
can't tell that.

Unless an output operand has the `&' constraint modifier, GNU CC may
allocate it in the same register as an unrelated input operand, on the
assumption that the inputs are consumed before the outputs are
produced.  This assumption may be false if the assembler code actually
consists of more than one instruction.  In such a case, use `&' for
each output operand that may not overlap an input.  *Note Modifiers::.

Some instructions clobber specific hard registers.  To describe this,
write a third colon after the input operands, followed by the names of
the clobbered hard registers (given as strings).  For example, on the
vax,

     asm volatile ("movc3 %0,%1,%2"
                   : /* no outputs */
                   : "g" (from), "g" (to), "g" (count)
                   : "r0", "r1", "r2", "r3", "r4", "r5");

Usually the most convenient way to use these `asm' instructions is to
encapsulate them in macros that look like functions.  For example,

     #define sin(x)       \
     ({ double __value, __arg = (x);   \
        asm ("fsinx %1,%0": "=f" (__value): "f" (__arg));  \
        __value; })

Here the variable `__arg' is used to make sure that the instruction
operates on a proper `double' value, and to accept only those
arguments `x' which can convert automatically to a `double'.

Another way to make sure the instruction operates on the correct data
type is to use a cast in the `asm'.  This is different from using a
variable `__arg' in that it converts more different types.  For
example, if the desired type were `int', casting the argument to `int'
would accept a pointer with no complaint, while assigning the argument
to an `int' variable named `__arg' would warn about using a pointer
unless the caller explicitly casts it.
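
As a rough sketch of how the QUOT_REM_64 primitive requested in the
quoted article might be wrapped with this facility, here is a version
for x86-64 rather than the MC68020 (an assumption, chosen only because
it is easy to try).  The names quot_rem and quot_rem_64 are
hypothetical, an inline function stands in for the macro form, and the
#else branch assumes the compiler provides unsigned __int128.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: quotient and remainder of a*b divided by c. */
struct quot_rem {
    uint64_t quot;
    uint64_t rem;
};

/* Sketch of QUOT_REM_64 for x86-64 (an assumption; the article targets
   the MC68020).  Requires a < c and b < c, which keeps the quotient
   below 2^64 so that divq cannot raise a divide fault. */
static inline struct quot_rem quot_rem_64(uint64_t a, uint64_t b, uint64_t c)
{
    struct quot_rem result;

    assert(a < c && b < c);
#if defined(__GNUC__) && defined(__x86_64__)
    uint64_t lower, upper;

    /* mulq: rdx:rax = rax * operand.  The "0" constraint puts input a
       in the same place as output 0, which "=a" pins to rax. */
    __asm__ ("mulq %3"
             : "=a" (lower), "=d" (upper)
             : "0" (a), "rm" (b)
             : "cc");
    /* divq: rax = rdx:rax / operand, rdx = rdx:rax % operand. */
    __asm__ ("divq %4"
             : "=a" (result.quot), "=d" (result.rem)
             : "0" (lower), "1" (upper), "rm" (c)
             : "cc");
#else
    /* Fallback, assuming the compiler supports unsigned __int128. */
    unsigned __int128 product = (unsigned __int128) a * b;

    result.quot = (uint64_t) (product / c);
    result.rem = (uint64_t) (product % c);
#endif
    return result;
}
```

With this, (q, r) := QUOT_REM_64(A, B, C) would be written as
struct quot_rem qr = quot_rem_64(A, B, C).  Keeping one instruction per
asm statement means no output needs the `&' modifier, and the compiler
remains free to choose where b and c live.
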
GNU CC assumes for optimization purposes that these instructions have
no side effects except to change the output operands.  This does not
mean that instructions with a side effect cannot be used, but you must
be careful, because the compiler may eliminate them if the output
operands aren't used, or move them out of loops, or replace two with
one if they constitute a common subexpression.  Also, if your
instruction does have a side effect on a variable that otherwise
appears not to change, the old value of the variable may be reused
later if it happens to be found in a register.

You can prevent an `asm' instruction from being deleted, moved or
combined by writing the keyword `volatile' after the `asm'.  For
example:

     #define set_priority(x)  \
     asm volatile ("set_priority %0": /* no outputs */ : "g" (x))

It is a natural idea to look for a way to give access to the condition
code left by the assembler instruction.  However, when we attempted to
implement this, we found no way to make it work reliably.  The problem
is that output operands might need reloading, which would result in
additional following "store" instructions.  On most machines, these
instructions would alter the condition code before there was time to
test it.  This problem doesn't arise for ordinary "test" and "compare"
instructions because they don't have any output operands.

--
 _________________
Michael McNamara
mac@ardent.com