Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!usc!snorkelwacker!paperboy!meissner
From: meissner@osf.org (Michael Meissner)
Newsgroups: comp.sys.m88k
Subject: Re: Fixed point multiply overflow detection
Message-ID: <MEISSNER.90Aug10151834@osf.osf.org>
Date: 10 Aug 90 19:18:34 GMT
References: <12425@encore.Encore.COM> <12439@encore.Encore.COM>
Sender: news@OSF.ORG
Organization: Open Software Foundation
Lines: 59
In-reply-to: jkenton@pinocchio.encore.com's message of 9 Aug 90 19:23:41 GMT

In article <12439@encore.Encore.COM> jkenton@pinocchio.encore.com
(Jeff Kenton) writes:

| From article <12425@encore.Encore.COM>, I wrote:
| > 
| > The hardware doesn't provide any direct way (as you've noticed).  Two choices
| > come to mind:
| > 
| > 	* use floating point (double precision for enough accuracy)
| > 	  and check the results.
| > 
| > 	* use ff1 instructions to find the magnitude of both operands
| > 	  to see if overflow can occur.
| > 
| 
| Two more things come to mind on this subject:
| 
| 	* If you use "ff1" instructions to see how big your arguments
| 	  are, you discover that there is an indeterminate case where
| 	  the answer is at least 2^31 but might be > 2^32.  You can't
| 	  tell for sure without doing as much work as the multiply.
| 
| 	* Depending on the details of your problem, there may be a
| 	  way to condition your values so that overflow can't occur.
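
The indeterminate case in the ff1 approach can be made concrete with a
small C model.  This is only a sketch: ff1_model() is a hypothetical
stand-in for the ff1 instruction (its result for a zero operand is my
own convention, not the hardware's), and classify_umul() and the enum
names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for ff1: index of the most significant set bit
   (0..31).  Returning -1 for zero is an assumption of this sketch. */
int ff1_model(uint32_t x)
{
    int bit = -1;
    while (x != 0) {
        x >>= 1;
        bit++;
    }
    return bit;
}

enum mul_range { MUL_FITS, MUL_OVERFLOWS, MUL_UNKNOWN };

/* Classify an unsigned 32x32 multiply from operand magnitudes alone. */
enum mul_range classify_umul(uint32_t a, uint32_t b)
{
    int s;

    if (a == 0 || b == 0)
        return MUL_FITS;                /* product is zero */

    /* a is in [2^i, 2^(i+1)) and b is in [2^j, 2^(j+1)),
       so a*b is in [2^s, 2^(s+2)) where s = i + j. */
    s = ff1_model(a) + ff1_model(b);

    if (s <= 30)
        return MUL_FITS;                /* a*b < 2^32 */
    if (s >= 32)
        return MUL_OVERFLOWS;           /* a*b >= 2^32 */
    return MUL_UNKNOWN;                 /* s == 31: 2^31 <= a*b < 2^33 */
}
```

For example, 0x10000 * 0x8000 and 0x1FFFF * 0xFFFF both land in the
MUL_UNKNOWN band, yet the first product fits in 32 bits and the second
does not, so only the multiply itself can settle it.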

Another thing that comes to mind is that the multiply on the 88k is
pipelined (unlike, say, the MIPS).  This means you can actually have 4
multiplies going on at the same time.  Thus you could restructure the
problem to do four 16-bit multiplies and recombine the results.

It's been six months since I programmed on an 88k, but if you have x
in r2 and y in r3, an unsigned dmul (returning the high word of the
product in r2 and the low word in r3) would look like:

dmul:
	extu	r4,r2,16<16>		; top half of r2
	extu	r5,r2,16<0>		; bottom half of r2
	extu	r6,r3,16<16>		; top half of r3
	extu	r7,r3,16<0>		; bottom half of r3
	mul	r8,r4,r7		; calc middle bits#1
	mul	r9,r5,r6		; calc middle bits#2
	mul	r3,r5,r7		; calc bottom bits
	mul	r2,r4,r6		; calc top bits
					; wait for 2nd mul
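	addu.co	r10,r8,r9		; combine middle bits, keep carry out
	addu.ci	r13,r0,r0		; r13 = carry out of the middle sum
	mak	r13,r13,1<16>		; that carry is worth 2^16 in the upper word
	addu	r2,r2,r13		; fold it into the top bits
	extu	r11,r10,16<16>		; bits to add to r2
	mak	r12,r10,16<16>		; bits to add to r3
	addu.co	r3,r3,r12		; add middle bits to lower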
	jmp.n	r1			; return to caller
	addu.ci	r2,r2,r11		; add middle bits to upper + carry
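
As a cross-check on the recombination, here is a rough C model of the
same idea: split each operand into 16-bit halves, form the four
partial products, and add the middle terms into both words.  The name
dmul_model and the hi/lo out-parameters are hypothetical.  The model
also tracks the carry out of the middle sum, since two 32-bit partial
products can add up to more than 32 bits.

```c
#include <stdint.h>

/* Hypothetical C model: 32x32 -> 64-bit unsigned product built from
   four 16x16 -> 32-bit multiplies.  *hi plays the role of r2 and
   *lo the role of r3. */
void dmul_model(uint32_t x, uint32_t y, uint32_t *hi, uint32_t *lo)
{
    uint32_t xh = x >> 16, xl = x & 0xFFFFu;   /* halves of x */
    uint32_t yh = y >> 16, yl = y & 0xFFFFu;   /* halves of y */

    uint32_t mid1 = xh * yl;                   /* middle bits #1 */
    uint32_t mid2 = xl * yh;                   /* middle bits #2 */
    uint32_t low  = xl * yl;                   /* bottom bits */
    uint32_t high = xh * yh;                   /* top bits */

    uint32_t mid       = mid1 + mid2;          /* may wrap past 32 bits */
    uint32_t mid_carry = (mid < mid1);         /* 1 if it wrapped */

    uint32_t low_sum   = low + (mid << 16);    /* low half of middle */
    uint32_t low_carry = (low_sum < low);      /* carry into upper word */

    *lo = low_sum;
    *hi = high + (mid >> 16) + (mid_carry << 16) + low_carry;
}
```

On a host with a 64-bit integer type, the pair (hi, lo) can be
compared against (uint64_t)x * y for a handful of operands, including
0xFFFFFFFF * 0xFFFFFFFF, which exercises both carries.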

Thus if r2 != 0, the result can't fit in an unsigned 32-bit int.  I
believe that for a signed dmul the test would be to sign-extend bit 31
of r3 with an ext instruction; the result (either 0 or -1) should
match r2.
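
The two tests can be sketched in C as follows.  The function names are
hypothetical, and the signed version assumes hi and lo are the two
halves of a two's complement 64-bit signed product, which the unsigned
routine above does not compute for negative operands.

```c
#include <stdint.h>

/* Unsigned: the product fits in 32 bits iff the high word is zero. */
int fits_unsigned32(uint32_t hi, uint32_t lo)
{
    (void)lo;                      /* low word does not affect the test */
    return hi == 0u;
}

/* Signed: the product fits iff the high word is the sign extension of
   bit 31 of the low word, i.e. all zeros or all ones (0 or -1). */
int fits_signed32(uint32_t hi, uint32_t lo)
{
    uint32_t sign_fill = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    return hi == sign_fill;
}
```

So hi = 0, lo = 0x80000000 passes the unsigned test but fails the
signed one (the value is +2^31), while hi = 0xFFFFFFFF with the same
lo is -2^31 and fits as a signed result.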

--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?