Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!uakari.primate.wisc.edu!aplcen!boingo.med.jhu.edu!haven.umd.edu!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.arch
Subject: Re: More on Linpack pivoting: isamax and instruction set design
Message-ID:
Date: 21 Jun 91 21:37:10 GMT
References: <396@validgh.com> <1991Jun13.234834.22970@neon.Stanford.EDU> <1991Jun14.134338.4673@linus.mitre.org> <6357@goanna.cs.rmit.oz.au>
Sender: usenet@ee.udel.edu
Organization: College of Marine Studies, U. Del.
Lines: 25
Nntp-Posting-Host: perelandra.cms.udel.edu
In-reply-to: mac@gold.kpc.com's message of 21 Jun 91 17:24:05 GMT

>>>>> On 21 Jun 91 17:24:05 GMT, mac@gold.kpc.com (Mike McNamara) said:

Mike> [....] the discussion is about linpack's IDAMAX routine.  I
Mike> think the current champion hardware for this loop is the ardent
Mike> titan vector unit.  It has the DRAMAX instruction.  This is
Mike> translated Double precision Reduction Absolute MAXimum value
Mike> instruction.  Run over a vector, it returns in a scalar the
Mike> maximum absolute value of that vector.  ( The machine supports
Mike> the single precision, minimum absolute value, and non absolute
Mike> variants, just so no one would think the instruction was
Mike> inserted just for Dongarra. :-)

The Cyber 205/ETA-10 contain an instruction which returns *both* the
extreme value and its index.  It can optionally ignore the sign bit.
Timing is something like 1 cycle per element plus 16 cycles for every
new maximum found.

On one of the Livermore Loops we saw a very large speedup going from
double-precision to single precision.  It turns out that the vector to
be tested was a *very* slowly increasing function, so that the 64-bit
version found lots more new maxima than the 32-bit version....
--
John D. McCalpin                        mccalpin@perelandra.cms.udel.edu
Assistant Professor                     mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.      J.MCCALPIN/OMNET
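
The Livermore Loop effect described above can be sketched in a few lines of Python. This is only an illustration, not the Cyber 205/ETA-10 instruction itself: the helper names (to_f32, abs_max_scan, est_cycles), the test vector, and the cost model of 1 cycle per element plus 16 per new maximum (the rough figure quoted above) are all assumptions made for the example.

```python
import struct


def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def abs_max_scan(v):
    """Scan v once, IDAMAX-style.

    Returns (index, abs_value, new_maxima), where new_maxima counts how
    often a strictly larger absolute value replaced the running maximum
    after the first element.
    """
    if not v:
        raise ValueError("empty vector")
    best_i, best = 0, abs(v[0])
    new_maxima = 0
    for i in range(1, len(v)):
        a = abs(v[i])
        if a > best:  # ties keep the earlier index
            best_i, best = i, a
            new_maxima += 1
    return best_i, best, new_maxima


def est_cycles(n, new_maxima, per_elem=1, per_update=16):
    """Hypothetical cost model: per_elem * n + per_update * new_maxima."""
    return per_elem * n + per_update * new_maxima


# A *very* slowly increasing vector (assumed example data).
v64 = [1.0 + i * 1e-9 for i in range(1000)]
v32 = [to_f32(x) for x in v64]

r64 = abs_max_scan(v64)
r32 = abs_max_scan(v32)

print("64-bit new maxima:", r64[2], "est. cycles:", est_cycles(len(v64), r64[2]))
print("32-bit new maxima:", r32[2], "est. cycles:", est_cycles(len(v32), r32[2]))
```

On a vector like this one, rounding to 32 bits collapses many neighbouring values into ties, so the scan sees far fewer strict new maxima and the estimated cycle count drops accordingly.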