Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!rpi!bu.edu!purdue!mentor.cc.purdue.edu!l.cc.purdue.edu!cik
From: cik@l.cc.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: What about hardware implementations of optimized algorithms?
Summary: Fast multiplication; asymptotics and what to do for a given
	 precision on a given machine
Message-ID: <2522@l.cc.purdue.edu>
Date: 7 Sep 90 12:36:07 GMT
References: <4047@husc6.harvard.edu> <889@dg.dg.com> <908@intelisc.isc.intel.com>
Organization: Purdue University Statistics Department
Lines: 26

In article <908@intelisc.isc.intel.com>, hays@isc.intel.com (Kirk Hays) writes:
> In article <889@dg.dg.com> publius@dg-gw.dg.com (Publius) writes:
> >In article <4047@husc6.harvard.edu> cleary@husc9.UUCP (Kenneth Cleary) writes:
> >>Knuth talks about getting faster multiplication than order-n**2 behavior.
> >>He describes some methods of getting behavior of n**(lg3) or n**1.585

			...................

> Have you a different reference for multiplication faster than O(n**1.59)?

Knuth describes faster ones.  The fastest known algorithm, asymptotically,
is the Sch"onhage-Strassen algorithm, which is O(n logn loglogn).  It is
known that this is within a few powers of loglogn of the best possible
asymptotics (Cook and Anderaa).

But this says nothing about what to do with a given precision and a
given architecture.  Even the apparently simple Karatsuba-Ofman method
runs into questions of overflow and/or sign, which can made it slower
than the usual one for a given precision.  The n**1.585 algorithm is
easily programmed, and I doubt that hardware could do much better than
a microcoded program, except for adding one bit to numbers before multi-
plication to take care of the above problem.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)