Xref: utzoo comp.theory:375 comp.misc:8292 comp.lang.misc:4224
Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uwm.edu!psuvax1!mordor!callahan
From: callahan@mordor.endor.cs.psu.edu (Paul B. Callahan)
Newsgroups: comp.theory,comp.misc,comp.lang.misc
Subject: Re: Modulus (Re: hashing function for strings)
Message-ID: <E*er-q@cs.psu.edu>
Date: 23 Feb 90 19:44:32 GMT
References: <Er$cfg@cs.psu.edu> <12099@goofy.megatest.UUCP> <98399@linus.UUCP> <E5$4$m@cs.psu.edu> <98717@linus.UUCP>
Sender: news@cs.psu.edu (Usenet)
Organization: Johns Hopkins University Computer Science (posting from PSU)
Lines: 62

In article <98717@linus.UUCP> bs@linus.UUCP (Robert D. Silverman) writes:
>
>Oh, for blanks sake. When are you CS weenies going to learn some math?
                .     .    .
>Choosing a,b \in Z --> a/b = floor[a/b] is the best,
>mathematically consistent way to handle integer division. 

[this gives the correct definition: a mod b = a-b*floor[a/b]]

Here's at least one CS weenie that agrees with you.  On at least one occasion
I've gone crazy trying to track down errors related to the incorrect
implementation of the % (so called modulus) operator in C.  I have this 
funny notion in my head that "a mod b" ought to be in the range 0 to b-1.
I remember once thinking after a core dump "hmmm... maybe '%' is returning
a negative value... naaah, they couldn't have been that dumb; it must
be my fault," and wasting another half hour trying to track down a 
non-existent bug.

When I first learned that floor[x] was the largest integer less than
or equal to x (a long time ago, in high school) it did strike me as strange, 
since my intuitive notion of turning a real into an integer involved throwing
away all the digits to the right of the decimal point.  After seeing
enough mathematics, I was ultimately convinced that my intuition was at fault.

That is, the idea of digit truncation is an _artifact_ of the way we 
_represent_ numbers, and is not elegantly defined in terms of the
numbers themselves (as many have pointed out, you have to throw in 
these nasty absolute values to make it work).

On the other hand, as a computer weenie, I like to have direct access
to the operations the machine is actually performing.  If we assume that
floating point arithmetic is done using a sign-magnitude representation
then it is reasonable to conclude that when the machine coerces a float
to an integer, it does so by throwing away the digits to the right of the
decimal point.  This is somewhat easier to implement, though it gives
an incorrect value for floor.  Assuming the machine is giving out this
value for free, I would rather have the *option* of using it, than be *forced* 
to use a version of floor or mod which is correct but requires a software
test.

When I say I would like to have the option, this is not to say that I
would be inclined to exercise the option often.  Normally, I would
be quite happy to pay a time penalty if it meant I could remember 
exactly what the function is supposed to return.  A nice and easy
way to get around this problem is to define your own "defensive"
mod procedure, which assumes that mod only works for positive numbers.  
The obvious way to do this involves an extra software test.  If you're 
careful, it can be made completely portable.

Incidentally, the way I got around this problem in practice was 
simply to add a big multiple of b to a when computing "a mod b."
That is, 
                    a mod b = kb+a mod b

So, if we choose k large enough to insure kb+a is always positive, then
we get the correct value for mod.  As long as b is constant, we can
precompute kb.  Admittedly, this is a kludge, but it is probably faster 
than sticking in an extra "if." 

Paul Callahan
callahan@crabcake.cs.jhu.edu
callahan@psuvax1.cs.psu.edu