Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!usc!snorkelwacker!paperboy!meissner
From: meissner@osf.org (Michael Meissner)
Newsgroups: comp.sys.m88k
Subject: Re: Fixed point multiply overflow detection
Message-ID:
Date: 10 Aug 90 19:18:34 GMT
References: <12425@encore.Encore.COM> <12439@encore.Encore.COM>
Sender: news@OSF.ORG
Organization: Open Software Foundation
Lines: 59
In-reply-to: jkenton@pinocchio.encore.com's message of 9 Aug 90 19:23:41 GMT

In article <12439@encore.Encore.COM> jkenton@pinocchio.encore.com (Jeff Kenton) writes:

| From article <12425@encore.Encore.COM>, I wrote:
| >
| > The hardware doesn't provide any direct way (as you've noticed).  Two choices
| > come to mind:
| >
| >     * use floating point (double precision for enough accuracy)
| >       and check the results.
| >
| >     * use ff1 instructions to find the magnitude of both operands
| >       to see if overflow can occur.
| >
|
| Two more things come to mind on this subject:
|
|     * If you use "ff1" instructions to see how big your arguments
|       are, you discover that there is an indeterminate case where
|       the answer is at least 2^31 but might be > 2^32.  You can't
|       tell for sure without doing as much work as the multiply.
|
|     * Depending on the details of your problem, there may be a
|       way to condition your values so that overflow can't occur.

Another thing that comes to mind is that the multiply on the 88k is
pipelined (unlike, say, the MIPS).  This means you can actually have
four multiplies going on at the same time.  Thus you could restructure
the problem to do four 16-bit multiplies and recombine the results.
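As a rough C model of that restructuring (not 88k code, and not taken
from the original post): the sketch below splits each operand into
16-bit halves, forms the four partial products, and recombines them
into a 64-bit result held as two 32-bit words.  The function names
(dmul_u32, umul_overflows, smul_overflows, dmul_selfcheck) are made up
for illustration, and the signed high-word correction is an assumption
added here so that the "upper word must equal the sign fill" test can
be applied to negative operands.

```c
#include <assert.h>
#include <stdint.h>

/* 32x32 -> 64 unsigned multiply from four 16x16 partial products.
   Each partial product fits in 32 bits because both inputs are < 2^16. */
static void dmul_u32(uint32_t x, uint32_t y, uint32_t *hi, uint32_t *lo)
{
    uint32_t xh = x >> 16, xl = x & 0xFFFFu;   /* halves of x */
    uint32_t yh = y >> 16, yl = y & 0xFFFFu;   /* halves of y */

    uint32_t mid1 = xh * yl;                   /* middle bits #1 */
    uint32_t mid2 = xl * yh;                   /* middle bits #2 */
    uint32_t low  = xl * yl;                   /* bottom bits */
    uint32_t high = xh * yh;                   /* top bits */

    /* The sum of the two middle products can exceed 32 bits,
       so its carry out has to be kept. */
    uint32_t mid       = mid1 + mid2;
    uint32_t mid_carry = (mid < mid1) ? 1u : 0u;

    uint32_t add_lo = mid << 16;                       /* goes to low word */
    uint32_t add_hi = (mid >> 16) + (mid_carry << 16); /* goes to high word */

    uint32_t lo_sum = low + add_lo;
    uint32_t carry  = (lo_sum < low) ? 1u : 0u;        /* carry into high word */

    *lo = lo_sum;
    *hi = high + add_hi + carry;
}

/* Unsigned test: the product fits in 32 bits iff the high word is zero. */
static int umul_overflows(uint32_t x, uint32_t y)
{
    uint32_t hi, lo;
    dmul_u32(x, y, &hi, &lo);
    return hi != 0;
}

/* Signed test (assumption: the unsigned high word is first corrected to
   the signed high word by subtracting the other operand once for each
   negative input).  The product fits iff the high word equals bit 31
   of the low word replicated, i.e. 0 or all ones. */
static int smul_overflows(int32_t x, int32_t y)
{
    uint32_t hi, lo;
    dmul_u32((uint32_t)x, (uint32_t)y, &hi, &lo);
    if (x < 0) hi -= (uint32_t)y;
    if (y < 0) hi -= (uint32_t)x;
    uint32_t sign_fill = (lo >> 31) ? 0xFFFFFFFFu : 0u;
    return hi != sign_fill;
}

/* A few spot checks, callable from any test driver. */
static void dmul_selfcheck(void)
{
    assert(!umul_overflows(65535u, 65537u));   /* 2^32 - 1 fits */
    assert(umul_overflows(65536u, 65536u));    /* 2^32 does not */
    assert(!smul_overflows(-1, -1));           /* +1 fits */
    assert(smul_overflows(0x40000000, 2));     /* 2^31 does not fit signed */
}
```

Calling dmul_selfcheck() from a test driver exercises both tests; the
same 2^31 product that overflows as a signed value is still fine as an
unsigned one.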
It's been six months since I programmed on an 88k, but if you have x in
r2 and y in r3, an unsigned dmul would look like:

dmul:   extu    r4,r2,16<16>    ; top half of r2
        extu    r5,r2,16<0>     ; bottom half of r2
        extu    r6,r3,16<16>    ; top half of r3
        extu    r7,r3,16<0>     ; bottom half of r3
        mul     r8,r4,r7        ; calc middle bits #1
        mul     r9,r5,r6        ; calc middle bits #2
        mul     r3,r5,r7        ; calc bottom bits
        mul     r2,r4,r6        ; calc top bits
                                ; wait for 2nd mul
        addu.co r10,r8,r9       ; combine middle bits, keep the carry out
        addu.ci r13,r0,r0       ; r13 = carry out of the middle sum
        mak     r13,r13,1<16>   ; that carry is worth 2^16 in the upper word
        addu    r2,r2,r13       ; fold it into the top bits
        extu    r11,r10,16<16>  ; bits to add to r2
        mak     r12,r10,16<16>  ; bits to add to r3
        addu.co r3,r3,r12       ; add middle bits to lower
        jmp.n   r1              ; return to caller
        addu.ci r2,r2,r11       ; add middle bits to upper + carry

Thus if r2 != 0, the result can't fit in an unsigned 32-bit int.  I
believe for signed dmul's the test would be to extract bit 31 of r3
with an ext instruction, and it should match r2 (ie, either 0 or -1).
(That assumes r2 holds the upper word of the signed product, so
negative operands need their sign accounted for first.)
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?