Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/17/84; site abic.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!cwruecmp!abic!jst
From: jst@abic.UUCP (Shack Toms)
Newsgroups: net.lang,net.lang.c
Subject: Re: Integer division
Message-ID: <738@abic.UUCP>
Date: Sat, 15-Feb-86 09:24:08 EST
Article-I.D.: abic.738
Posted: Sat Feb 15 09:24:08 1986
Date-Received: Sun, 16-Feb-86 06:11:23 EST
References: <11603@ucbvax.BERKELEY.EDU> <4917@alice.UUCP> <11671@ucbvax.BERKELEY.EDU> <731@abic.UUCP> <261@hropus.UUCP>
Organization: Allen-Bradley Co., Highland Heights, OH 44143
Lines: 102
Xref: watmath net.lang:2109 net.lang.c:7869

[Kenneth Almquist @ Bell Labs, Holmdel, NJ writes (single >)]
> > Perhaps K&R thought that the performance penalty of implementing a
> > consistent modulus (or divide) was not justified, since negative
> > integers are rarely encountered in "C" [this comment cannot be traced
> > to K&R.]  However, this performance penalty can be avoided simply by
> > declaring unsigned integers as "unsigned int".
> 
> On the VAX, the unsigned remainder function is implemented as a
> subroutine call.  The standard divide instruction cannot be used
> because the divide instruction does not work on unsigned integers.

Thanks for getting me to check this out!  On my VAX using VAXC V2.1, the
unsigned division is done with very ugly inline code...  Thats ok,
but it is incorrect ugly inline code.  Seems that it does not
correctly perform x%y where x = 0x7fffffff, y = 0xfffffff2.  It gets
the answer 1, rather than 0x7fffffff.

However, the following inline fragment should work ok.

	clrl	r1
	movl	x,r0		; load x into r0,r1 as quadword
	movl	y,r2		; y in r2, set condition codes
	bgeq	1$		; y < 2**31, ediv is ok
	cmpl	r2,r0		; y >= 2**31, we know x < 2*y
	bgtru	2$		; y > x ==> x%y == x
	subl2	r2,r0		; y <= x < 2*y ==> x%y == x-y
	brb	2$		;
1$:	ediv	r2,r0,r1,r0	; y in range for ediv, x%y put in r0
2$:				; r0 now contains x%y

For signed integers x%y,  the following seems about optimal, [this
is from the VAXC code (almost)]

	emul	#0,#0,x,r0	; place x in r0,r1 as signed quadword
	ediv	y,r0,r1,r0	; r0 now contains x%y

Comparing the two fragments, the code for the unsigned divide will
probably run about as fast as the code for the signed divide, so long as
the value of y is less than 2**31.  This is because the VAX lacks not
only an unsigned divide, but also a signed longword to quadword
conversion (except for the trick with emul).  And the signed modulus
instruction (ediv) requires a quadword dividend.

The clear winner in all of this is probably x%y, where x is unsigned
and y is a short (either signed or unsigned).  The code is then:

	clrl	r1
	movl	x,r0		; Unsigned x in r0,r1
	movzwl	y,r2		; Unsigned short y in r2 (cvtwl for short y)
	ediv	r2,r0,r1,r0	; r0 contains x%y

But enough of implemention details...

> In Ada, division truncates towards zero, and there are separate
> remainder and modulo operators.  The % operator in C is a remainder
> operator; we could have a new operator %% for modulo.  (On the other
> hand, maybe not.  We don't want C to get as large as Ada!)

I certainly agree with that last remark.  You say, however, that
the % operator is a remainder operator.  Sure, the definition is
that a/b can round up or down [except when both a and b are positive],
but a/b*b+a%b must equal a.  The problem is that this selection of
properties is shared by at least 8 possible funcions from Z X Z --> Z.
Indeed, there may be many more than 8 functions.  As I read my
K&R there is nothing wrong with a compiler evaluating division
of constants (constant folding) in a different manner
from the run-time evaluation of the same values.  Furthermore,
the preprocessor may evaluate division in compile time expressions (say in
#if statements) using a different algorithm. (These possibilities
are probably greater with cross-compilers).  Furthermore,
if the denominator is a constant power of two, then the compiler
might generate shifts and masks, and produce a result different
from that of a division of the same values had they both been stored in
variables.  Similarly, for certain values of the numerator, the
division can be optimized into a comparison with a result generated
according to its own rule.  The requirement seems to be only that,
in each case, the a/b*b+a%b == a rule must hold.  That is, whenever
an optimization can be made for an expression involving "/", that
the corresponding optimization also must be made for "%".  Giving this operator
a simple name ("remainder operator") belies its fundamental ambiguity.

A very simple solution, which is upward compatible with the current
language definition, is to define division to always round in a
precise way, and then to keep % as the remainder function.  That way,
the language need not be cluttered with a %% operator.  The problem
is to choose that definition which has the most widespread application.

Now to get back to the original subject [which way to round],
in my opinion the most useful of these eight choices for division
is made by adding the constraint that the sign of any non-zero remainder
should be equal to the sign of the divisor.  My experiences have
led me to believe that this is the most convenient choice.  These
experiences were gained largely using languages which do integer division
equivalently to: a/b = trunc(float(a)/float(b)) (i.e. the "wrong" way).

*Whenever* I have sought the result of a%b with b>0, I have wanted the
smallest non-negative remainder, regardless of the sign of a.  The
symmetric choice is to then have the sign of a%b determined by the sign
of b.  This choice may lead [as would any choice] to [marginally] increased
overhead on machines which have asymmetric support for signed vs. unsigned
integers, but it is no secret that C runs best on machines with symmetric
architectures.