Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!usc!elroy.jpl.nasa.gov!swrinde!ucsd!dog.ee.lbl.gov!elf.ee.lbl.gov!torek
From: torek@elf.ee.lbl.gov (Chris Torek)
Newsgroups: comp.arch
Subject: Re: bizarre instructions
Message-ID: <10244@dog.ee.lbl.gov>
Date: 25 Feb 91 19:27:07 GMT
References: <9102220245.AA14853@ucbvax.Berkeley.EDU> <1991Feb25.134714.23523@linus.mitre.org>
Reply-To: torek@elf.ee.lbl.gov (Chris Torek)
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 118
X-Local-Date: Mon, 25 Feb 91 11:27:08 PST

In article <1991Feb25.134714.23523@linus.mitre.org> bs@gauss.mitre.org
(Robert D. Silverman) writes:
>In article <9102220245.AA14853@ucbvax.Berkeley.EDU> JBS@IBM.COM writes:
>... what one usually wants is (A*B + C)/D and (A*B + C) mod D.
>Even on machines that support double length integer multiplies, one
>cannot put the above operations into HLL because the compiler will not
>generate the double length multiply (say 32 x 32 --> 64) nor will it
>then do the (64 /32 --> 32 bit quotient & remainder). Since A*B can overflow
>32 bits one is FORCED to call assembler routines to do this.

Ah yes.  Clearly the following does not work....

	/*
	 * return quotient and remainder from (a*b + c) divrem d
	 */
	#if 0
	/*
	 * This is the way we would like to do it, but gcc emits one extra
	 * instruction, as it is not smart enough to completely eliminate the
	 * addressing on r (it uses a register for r, rather than a pointer,
	 * but never quite goes all the way).
	 */
	static __inline int divrem(int a, int b, int c, int d, int *r) {
		int q;
		double tmp;	/* force reg pair allocation */

		asm("emul %1,%2,%3,%0" : "=g"(tmp) : "g"(a), "g"(b), "g"(c));
		asm("ediv %3,%2,%0,%1" : "=g"(q), "=g"(*r) : "r"(tmp), "g"(d));

		return q;
	}
	#else
	/*
	 * So instead we will use rather peculiar gcc syntax.
	 * Note that the macro uses a, b, c, d, q, and r exactly once each,
	 * and thus side effects (*p++, etc.) are safe.
	 */
	#define divrem(q, r, a, b, c, d) ({ \
		double divrem_tmp; \
		asm("emul %1,%2,%3,%0" : "=g"(divrem_tmp) : \
		    "g"(a), "g"(b), "g"(c)); \
		asm("ediv %3,%2,%0,%1" : "=g"(q), "=g"(r) : \
		    "r"(divrem_tmp), "g"(d)); \
	})
	#endif

	int a[100], b[100], c[100], d[100];
	int q[100], r[100];

	void doit(int n) {
		int i;

		for (i = 0; i < n; i++) {
	#if 0
			q[i] = divrem(a[i], b[i], c[i], d[i], &r[i]);
	#else
			divrem(q[i], r[i], a[i], b[i], c[i], d[i]);
	#endif
		}
	}

But wait!  Maybe, just *maybe*, we should try it out before dismissing it.

Well goll-ee, it seems to work!

Compiled on a Tahoe (the Tahoe is a `RISC'---a `Reused Instruction
Set Computer'; its emul and ediv are just like those on the VAX) with
`gcc -O -S', this becomes (compiler comments and other drek stripped):

	_doit:
		.word 0x3c0
		movl 4(fp),r4
		clrl r2
		cmpl r2,r4
		jgeq L13
		movab _a,r9
		movab _b,r8
		movab _c,r7
		movab _q,r6
		movab _r,r5
		movab _d,r3
	L12:
		emul (r9)[r2],(r8)[r2],(r7)[r2],r0
		ediv (r3)[r2],r0,(r6)[r2],(r5)[r2]
		incl r2
		cmpl r2,r4
		jlss L12
	L13:
		ret

(Note that the Tahoe does not have auto-increment addressing modes,
and this is in fact the best that can be done.)

On the VAX the loop changes to (gcc 1.37.1, -fstrength-reduce -mgnu):

	L12:
		emul (r2)+,(r3)+,(r4)+,r0	# the registers
		ediv (r5),r0,r1,r0		# are allocated in
		movl r1,(r6)+			# a different order.
		movl r0,(r7)+
		addl2 $4,r5
		jaoblss r9,r8,L12

Apparently the machine-dependent part of gcc has not been taught to
combine the auto-increment operands into `ediv' properly; the loop
should be:

	L12:
		emul (r2)+,(r3)+,(r4)+,r0
		ediv (r5)+,r0,(r6)+,(r7)+
		jaoblss r9,r8,L12

A bit of work on vax.md should fix it.

This has its drawbacks: the syntax is distinctly un-pretty, and it
requires gcc, and it is machine-dependent.  It does, however, work.
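
For comparison, here is a minimal portable sketch of the same operation,
useful mainly as a reference to check the asm version against.  It
assumes a compiler with a 64-bit integer type (int64_t from <stdint.h>);
the name divrem_ref is made up for illustration.  Whether a given
compiler turns this into a single multiply and a single divide is a
separate question; this only pins down the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable reference for (a*b + c) divrem d.
 *
 * Assumptions (not from the original post): int64_t is available,
 * d is nonzero, and the quotient fits in 32 bits.  Neither condition
 * is checked here.  divrem_ref is a hypothetical name.
 */
static int32_t divrem_ref(int32_t a, int32_t b, int32_t c, int32_t d,
    int32_t *r)
{
	/* Widen before multiplying so that a*b cannot overflow 32 bits. */
	int64_t t = (int64_t)a * b + c;

	/* C99 division truncates toward zero; % takes the dividend's sign. */
	*r = (int32_t)(t % d);
	return (int32_t)(t / d);
}
```

With a = b = 100000, c = 7, and d = 1000000, the product 10^10 does not
fit in 32 bits, yet the call returns quotient 10000 and remainder 7.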
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

