Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!validgh!dgh
From: dgh@validgh.com (David G. Hough on validgh)
Newsgroups: comp.arch
Subject: More on Linpack pivoting: isamax and instruction set design
Message-ID: <396@validgh.com>
Date: 13 Jun 91 21:08:18 GMT
Organization: validgh, PO Box 20370, San Jose, CA 95160
Lines: 31

Referring to isamax in my previous Linpack posting brought to mind the
quandary that it represents for computer instruction set architects.

I believe that Cray first noticed that when LU-factoring matrices of
dimension n, although saxpy operations (X += c * Y for scalar c and
vectors X and Y) are n times more frequent than isamax operations
(find i such that Xi has the largest magnitude in X), you can gain a
lot more from aggressive vector hardware design and compiler technology
on saxpy than on isamax, so that eventually isamax becomes the
bottleneck.

The heart of isamax is

      do 30 i = 2,n
         if(abs(dx(i)).le.dmax) go to 30
         isamax = i
         dmax = abs(dx(i))
   30 continue

What kinds of additional instructions belong in an instruction set
architecture that is intended to be scalable, from one-chip systems
with inexpensive memory to very high performance systems implemented
with multiple paths to memory and perhaps multiple functional units
(integer, float, branch) that may, however, be relatively distant, so
that conditional branches become quite expensive?

A simple max/min instruction won't help, since isamax needs the index
of the largest element rather than its value. How can multiple
comparisons be overlapped? Can the "abs" be concealed?
-- 

David Hough

dgh@validgh.com		uunet!validgh!dgh	na.hough@na-net.ornl.gov
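
To make the closing questions concrete, here is a rough C sketch of one
way the isamax loop might be restructured in software so that several
comparisons are in flight at once and the loop body contains no
data-dependent "go to". It is not from the post, nor is it the Linpack
source. The names isamax_ref, isamax_lanes and LANES are made up for
illustration, indices are 0-based rather than Fortran's 1-based, n is
assumed to be at least 1, and NaN inputs are not considered. Whether
the selects turn into conditional moves or vector operations depends
entirely on the compiler and the target.

```c
#include <math.h>
#include <stddef.h>

/* Reference loop: 0-based index of the first element of largest
   magnitude. Assumes n >= 1. */
size_t isamax_ref(size_t n, const float *x)
{
    size_t best = 0;
    float vmax = fabsf(x[0]);
    for (size_t i = 1; i < n; i++) {
        if (fabsf(x[i]) > vmax) {
            vmax = fabsf(x[i]);
            best = i;
        }
    }
    return best;
}

#define LANES 4  /* hypothetical number of independent running maxima */

/* Same result as isamax_ref, but with LANES independent (value, index)
   pairs updated by selects instead of branches. Assumes n >= 1. */
size_t isamax_lanes(size_t n, const float *x)
{
    float vmax[LANES];
    size_t imax[LANES];
    size_t i, k;

    if (n < LANES)
        return isamax_ref(n, x);

    /* Seed each lane with one of the first LANES elements. */
    for (k = 0; k < LANES; k++) {
        vmax[k] = fabsf(x[k]);
        imax[k] = k;
    }

    /* Main loop: the LANES compare/select pairs do not depend on each
       other, so they can be overlapped. */
    for (i = LANES; i + LANES <= n; i += LANES) {
        for (k = 0; k < LANES; k++) {
            float a = fabsf(x[i + k]);
            int gt = a > vmax[k];
            vmax[k] = gt ? a : vmax[k];
            imax[k] = gt ? i + k : imax[k];
        }
    }

    /* Leftover elements are folded into lane 0. */
    for (; i < n; i++) {
        float a = fabsf(x[i]);
        int gt = a > vmax[0];
        vmax[0] = gt ? a : vmax[0];
        imax[0] = gt ? i : imax[0];
    }

    /* Merge the lanes; on equal magnitude keep the lowest index so the
       answer matches the sequential loop. */
    size_t best = imax[0];
    float bv = vmax[0];
    for (k = 1; k < LANES; k++) {
        int better = vmax[k] > bv || (vmax[k] == bv && imax[k] < best);
        bv = better ? vmax[k] : bv;
        best = better ? imax[k] : best;
    }
    return best;
}
```

Each lane carries its index along with its running maximum, which is
the part a plain max/min instruction does not provide. On IEEE 754
formats the "abs" amounts to clearing the sign bit, so it is cheap
relative to the compare. The cost that remains is the final merge
across lanes, which is paid once per call rather than once per element.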