Xref: utzoo comp.misc:8444 comp.lang.misc:4377 comp.arch:14507 Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!apple!bionet!ames!sgi!rpw3@rigden.wpd.sgi.com From: rpw3@rigden.wpd.sgi.com (Rob Warnock) Newsgroups: comp.misc,comp.lang.misc,comp.arch Subject: Re: Modulus Message-ID: <53067@sgi.sgi.com> Date: 9 Mar 90 01:51:11 GMT References: <981@m1.cs.man.ac.uk> <12115@goofy.megatest.UUCP> <2573@castle.ed.ac.uk> <1710@tws8.cs.tcd.ie> Sender: rpw3@rigden.wpd.sgi.com Reply-To: rpw3@sgi.com (Rob Warnock) Organization: Silicon Graphics, Inc., Mountain View, CA Lines: 48 In article <1710@tws8.cs.tcd.ie> emcmanus@cs.tcd.ie (Eamonn McManus) writes: +--------------- | ...If a programming language supported | this definition of negative division, its compilers could optimize | divisions by powers of two into shifts without having to know whether the | quantity being divided was negative. On many (most?) processors shifts are | faster than divisions even by powers of two. [Unsupported assertion.] +--------------- This keeps coming up (comp.arch.topics.perennial anyone?). (The first time I ran into it was in the early 1970's when the PDP-10 Fortran compiler got the obvious bug fixed.) These days, all good compilers know how to optimize this anyway, with a pre- or post-shift correction step that makes this work "right" even for negative dividends. In C, with post-shift correction: #define DIVISOR (1 << SHIFT) #define MASK (DIVISOR - 1) result = (dividend >> SHIFT); if (dividend < 0 && (dividend & MASK) != 0) result += 1; or with pre-shift correction: if (dividend < 0) dividend += MASK; result = (dividend >> SHIFT); The really cute trick is that for most machines this can be done in just a few extra cycles *without* a branch (and therefore without a pipeline break) by computing "MASK" from the dividend's sign bit. For example, on the Am29000, to divide register "a0" by a power-of-2 constant giving "v0" costs four cycles total (most other architectures have similar short sequences): sra v0, a0, SHIFT - 1 ; replicate the sign-bit by the shift-width srl v0, v0, 32 - SHIFT ; right-justify. t0 = (a0 < 0)? MASK : 0 add v0, v0, a0 ; correct the dividend sra v0, v0, SHIFT ; do the division -Rob ----- Rob Warnock, MS-9U/510 rpw3@sgi.com rpw3@pei.com Silicon Graphics, Inc. (415)335-1673 Protocol Engines, Inc. 2011 N. Shoreline Blvd. Mountain View, CA 94039-7311