Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!rpi!bu.edu!purdue!mentor.cc.purdue.edu!l.cc.purdue.edu!cik From: cik@l.cc.purdue.edu (Herman Rubin) Newsgroups: comp.arch Subject: Re: What about hardware implementations of optimized algorithms? Summary: Fast multiplication; asymptotics and what to do for a given precision on a given machine Message-ID: <2522@l.cc.purdue.edu> Date: 7 Sep 90 12:36:07 GMT References: <4047@husc6.harvard.edu> <889@dg.dg.com> <908@intelisc.isc.intel.com> Organization: Purdue University Statistics Department Lines: 26 In article <908@intelisc.isc.intel.com>, hays@isc.intel.com (Kirk Hays) writes: > In article <889@dg.dg.com> publius@dg-gw.dg.com (Publius) writes: > >In article <4047@husc6.harvard.edu> cleary@husc9.UUCP (Kenneth Cleary) writes: > >>Knuth talks about getting faster multiplication than order-n**2 behavior. > >>He describes some methods of getting behavior of n**(lg3) or n**1.585 ................... > Have you a different reference for multiplication faster than O(n**1.59)? Knuth describes faster ones. The fastest known algorithm, asymptotically, is the Sch"onhage-Strassen algorithm, which is O(n logn loglogn). It is known that this is within a few powers of loglogn of the best possible asymptotics (Cook and Anderaa). But this says nothing about what to do with a given precision and a given architecture. Even the apparently simple Karatsuba-Ofman method runs into questions of overflow and/or sign, which can made it slower than the usual one for a given precision. The n**1.585 algorithm is easily programmed, and I doubt that hardware could do much better than a microcoded program, except for adding one bit to numbers before multi- plication to take care of the above problem. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet) {purdue,pur-ee}!l.cc!cik(UUCP)