Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!purdue!mentor.cc.purdue.edu!l.cc.purdue.edu!cik
From: cik@l.cc.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: benchmark for evaluating extended precision
Summary: Use an appropriate test, using machine instructions
Keywords: extended precision,multiply,benchmark,arithmetic
Message-ID: <2550@l.cc.purdue.edu>
Date: 13 Sep 90 13:05:35 GMT
References: <3989@bingvaxu.cc.binghamton.edu> <1990Sep12.223253.9574@csc.ti.com>
Distribution: usa
Organization: Purdue University Statistics Department
Lines: 31

In article <1990Sep12.223253.9574@csc.ti.com>, bmk@csc.ti.com (Brian M Kennedy) writes:
> =>It has been claimed that a lack of 32x32->64 multiplication
> =>makes a factor of 10 difference in the running time of
> =>typical extended precision arithmetic routines. Although it
> =>obviously makes _a_ difference in run time I do not measure
> =>an order of magnitude difference.

			............................

> Instead I will measure
> an upper-bound on the performance increase by comparing:
>
> 64*64->64 via 32*32->32 vs. 32*32->32

[Long description deleted.]

The original problem was 32x32 -> 64 compared to 32x32 -> 32. For a
reasonable test, consider the general problem of NxN -> 2N vs.
NxN -> N.

To do this properly, one should remember that on the machine with
NxN -> N, N is the longest word length available. Thus, in adding two
N-bit numbers, one must use a test-for-carry to detect a bit in
position N (starting the count from 0).

Also, the comparison should not depend on the peculiarities of a
particular compiler, but should be done at the machine-language level.
The code is not long. To carry out the benchmark, one could use N = 16
(or even 8) to get a general idea.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet) {purdue,pur-ee}!l.cc!cik(UUCP)
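As a rough illustration of the N = 16 case suggested above, here is a minimal C sketch of what the NxN -> N machine has to do to obtain a full 2N-bit product. The names add16, mul16_lo, mul16_full_from_narrow, mul16_full_wide and self_check are hypothetical and do not come from the post. C is used only as a stand-in: the post asks for the comparison to be made at the machine-language level, so timing this code would measure the compiler as much as the hardware. It is meant to show the extra multiplies and carry tests, not to serve as the benchmark itself.

```c
#include <assert.h>
#include <stdint.h>

/* Add two 16-bit words on a machine whose widest word is 16 bits.
 * The bit in position 16 is lost, so it is recovered with a
 * test-for-carry: the wrapped sum is smaller than an operand. */
static uint16_t add16(uint16_t a, uint16_t b, unsigned *carry)
{
    uint16_t s = (uint16_t)(a + b);
    *carry = (s < a);
    return s;
}

/* Models an N x N -> N multiply (N = 16): only the low word survives. */
static uint16_t mul16_lo(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * b);
}

/* Full 16 x 16 -> 32 product built only from mul16_lo and add16.
 * Each operand is split into 8-bit halves so that every partial
 * product fits in 16 bits. */
static void mul16_full_from_narrow(uint16_t a, uint16_t b,
                                   uint16_t *hi, uint16_t *lo)
{
    uint16_t a0 = a & 0xFF, a1 = a >> 8;
    uint16_t b0 = b & 0xFF, b1 = b >> 8;

    uint16_t p00 = mul16_lo(a0, b0);   /* weight 2^0  */
    uint16_t p01 = mul16_lo(a0, b1);   /* weight 2^8  */
    uint16_t p10 = mul16_lo(a1, b0);   /* weight 2^8  */
    uint16_t p11 = mul16_lo(a1, b1);   /* weight 2^16 */

    unsigned c_mid, c_lo;
    uint16_t mid = add16(p01, p10, &c_mid);          /* carry has weight 2^24 */
    *lo = add16(p00, (uint16_t)(mid << 8), &c_lo);   /* carry has weight 2^16 */

    /* Cannot overflow: the true product fits in 32 bits. */
    *hi = (uint16_t)(p11 + (mid >> 8) + (c_mid << 8) + c_lo);
}

/* The N x N -> 2N machine gets the same result from one instruction. */
static void mul16_full_wide(uint16_t a, uint16_t b,
                            uint16_t *hi, uint16_t *lo)
{
    uint32_t p = (uint32_t)a * b;
    *hi = (uint16_t)(p >> 16);
    *lo = (uint16_t)p;
}

/* Checks that the two routes agree on a few edge-case operands. */
static void self_check(void)
{
    static const uint16_t v[] = { 0, 1, 0x00FF, 0x0100, 0x8000, 0xFFFF };
    for (unsigned i = 0; i < 6; i++) {
        for (unsigned j = 0; j < 6; j++) {
            uint16_t h1, l1, h2, l2;
            mul16_full_from_narrow(v[i], v[j], &h1, &l1);
            mul16_full_wide(v[i], v[j], &h2, &l2);
            assert(h1 == h2 && l1 == l2);
        }
    }
}
```

Under these assumptions the narrow route costs four multiplies, two carry-tested additions and some shifting and masking for every double-length product, against one multiply on the wide machine. That per-product ratio is what a machine-level version of the benchmark would be measuring.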