Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!rpi!sarah!bingnews!kym
From: kym@bingvaxu.cc.binghamton.edu (R. Kym Horsell)
Newsgroups: comp.arch
Subject: Re: RISC integer multiply/divide (was Re: Snake)
Message-ID: <1991Apr7.182332.15121@bingvaxu.cc.binghamton.edu>
Date: 7 Apr 91 18:23:32 GMT
References: <1991Apr4.213550.8106@bingvaxu.cc.binghamton.edu> <1991Apr5.191136.21806@bingvaxu.cc.binghamton.edu> <3276@charon.cwi.nl>
Organization: State University of New York at Binghamton
Lines: 32

In article <3276@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
>Might be, however, it is better to have correct code than fast code.

Sure. But I am not sure it is required for the purposes of comparison
(better to compare rotten apples if them's all you got).

As you pointed out, my routines didn't handle integer overflow in an
intermediate result. But as the same is true of both routines, fixing
it would not affect the comparison much (if measurably at all).

As I pointed out before, the main point of the exercise was to test
the claims (a) that you need 17x17->34 arithmetic to implement this
technique, and (b) that it isn't (ever) worth the trouble.

I think I've illustrated, despite the problems, that (a) is not the
case. While (b) appears true in some cases, there are extant
architectures on which it is not correct.

I realize the SS1 already performs 32x32->64 multiply; this is really
beside the point. 32x32->64 was what I benchmarked. The same should be
approximately true of 3m vs 4m algorithms for larger operands.

Although I've tried to kid others into writing up the appropriate code
(to check 64x64->128, for instance), no one has volunteered.

My prediction from the extant data is that 3m vs 4m on the SS1 will
come out about 10% on average in favor of 3m for a 64x64->128
multiply, with peaks at 25% (signed or not, correct or not).

Anyone care to check it?
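
For anyone who wants to try the comparison, here is a minimal sketch of
the two approaches for unsigned 32x32->64. It is NOT the code that was
benchmarked (that isn't reproduced in this post); it assumes the common
sign-and-magnitude form of the three-multiply trick, and the names
mul16, mul32_4m and mul32_3m are made up for illustration. The point it
tries to show is (a): taking differences of the 16-bit halves instead
of sums keeps every partial product within 16x16->32, so no 17x17->34
multiply is needed. Unlike the routines discussed above, the sketch
carries intermediate sums in 64 bits, so it doesn't have the
intermediate overflow.

```c
#include <stdint.h>

/* Hypothetical primitive: the only multiply either routine uses is 16x16->32. */
static uint32_t mul16(uint16_t x, uint16_t y)
{
    return (uint32_t)x * (uint32_t)y;
}

/* "4m": schoolbook method, four 16x16->32 multiplies. */
uint64_t mul32_4m(uint32_t a, uint32_t b)
{
    uint16_t a0 = (uint16_t)(a & 0xFFFFu), a1 = (uint16_t)(a >> 16);
    uint16_t b0 = (uint16_t)(b & 0xFFFFu), b1 = (uint16_t)(b >> 16);

    uint64_t lo  = mul16(a0, b0);
    uint64_t hi  = mul16(a1, b1);
    uint64_t mid = (uint64_t)mul16(a1, b0) + mul16(a0, b1);

    return (hi << 32) + (mid << 16) + lo;
}

/* "3m": three 16x16->32 multiplies, using
 *   a1*b0 + a0*b1 = a1*b1 + a0*b0 + (a1 - a0)*(b0 - b1).
 * |a1 - a0| and |b0 - b1| each fit in 16 bits; the sign of their
 * product is tracked separately, so no 17-bit operand ever appears. */
uint64_t mul32_3m(uint32_t a, uint32_t b)
{
    uint16_t a0 = (uint16_t)(a & 0xFFFFu), a1 = (uint16_t)(a >> 16);
    uint16_t b0 = (uint16_t)(b & 0xFFFFu), b1 = (uint16_t)(b >> 16);

    uint64_t lo = mul16(a0, b0);
    uint64_t hi = mul16(a1, b1);

    uint16_t da = (uint16_t)(a1 >= a0 ? a1 - a0 : a0 - a1);
    uint16_t db = (uint16_t)(b0 >= b1 ? b0 - b1 : b1 - b0);
    int negative = (a1 < a0) != (b0 < b1);   /* sign of (a1-a0)*(b0-b1) */

    uint64_t cross = mul16(da, db);
    /* The middle term is never negative, so the subtraction cannot wrap. */
    uint64_t mid = negative ? hi + lo - cross : hi + lo + cross;

    return (hi << 32) + (mid << 16) + lo;
}
```

The 3m version trades one multiply for two subtractions, two compares
and a conditional add/subtract, so whether it wins depends on how
costly a multiply is relative to those on the machine at hand. The
same split applied to 32-bit halves would give the 64x64->128 case.
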
-- 
-kym
main(){int i=0,b=0,c=0,n=0;while(i<1920){if((b>>=1)<2) b=n*n++;printf("\n%c"+!!(++i%80)," #"[(c^=1)^(b&1)]);}}