Xref: utzoo comp.theory:375 comp.misc:8292 comp.lang.misc:4224 Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uwm.edu!psuvax1!mordor!callahan From: callahan@mordor.endor.cs.psu.edu (Paul B. Callahan) Newsgroups: comp.theory,comp.misc,comp.lang.misc Subject: Re: Modulus (Re: hashing function for strings) Message-ID: Date: 23 Feb 90 19:44:32 GMT References: <12099@goofy.megatest.UUCP> <98399@linus.UUCP> <98717@linus.UUCP> Sender: news@cs.psu.edu (Usenet) Organization: Johns Hopkins University Computer Science (posting from PSU) Lines: 62 In article <98717@linus.UUCP> bs@linus.UUCP (Robert D. Silverman) writes: > >Oh, for blanks sake. When are you CS weenies going to learn some math? . . . >Choosing a,b \in Z --> a/b = floor[a/b] is the best, >mathematically consistent way to handle integer division. [this gives the correct definition: a mod b = a-b*floor[a/b]] Here's at least one CS weenie that agrees with you. On at least one occasion I've gone crazy trying to track down errors related to the incorrect implementation of the % (so called modulus) operator in C. I have this funny notion in my head that "a mod b" ought to be in the range 0 to b-1. I remember once thinking after a core dump "hmmm... maybe '%' is returning a negative value... naaah, they couldn't have been that dumb; it must be my fault," and wasting another half hour trying to track down a non-existent bug. When I first learned that floor[x] was the largest integer less than or equal to x (a long time ago, in high school) it did strike me as strange, since my intuitive notion of turning a real into an integer involved throwing away all the digits to the right of the decimal point. After seeing enough mathematics, I was ultimately convinced that my intuition was at fault. That is, the idea of digit truncation is an _artifact_ of the way we _represent_ numbers, and is not elegantly defined in terms of the numbers themselves (as many have pointed out, you have to throw in these nasty absolute values to make it work). On the other hand, as a computer weenie, I like to have direct access to the operations the machine is actually performing. If we assume that floating point arithmetic is done using a sign-magnitude representation then it is reasonable to conclude that when the machine coerces a float to an integer, it does so by throwing away the digits to the right of the decimal point. This is somewhat easier to implement, though it gives an incorrect value for floor. Assuming the machine is giving out this value for free, I would rather have the *option* of using it, than be *forced* to use a version of floor or mod which is correct but requires a software test. When I say I would like to have the option, this is not to say that I would be inclined to exercise the option often. Normally, I would be quite happy to pay a time penalty if it meant I could remember exactly what the function is supposed to return. A nice and easy way to get around this problem is to define your own "defensive" mod procedure, which assumes that mod only works for positive numbers. The obvious way to do this involves an extra software test. If you're careful, it can be made completely portable. Incidentally, the way I got around this problem in practice was simply to add a big multiple of b to a when computing "a mod b." That is, a mod b = kb+a mod b So, if we choose k large enough to insure kb+a is always positive, then we get the correct value for mod. As long as b is constant, we can precompute kb. Admittedly, this is a kludge, but it is probably faster than sticking in an extra "if." Paul Callahan callahan@crabcake.cs.jhu.edu callahan@psuvax1.cs.psu.edu