Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!wuarchive!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.arch
Subject: Re: RISC vs. CISC -- SPECmarks
Message-ID:
Date: 7 May 91 13:26:45 GMT
References: <3423@charon.cwi.nl> <11602@mentor.cc.purdue.edu>
 <1991Apr30.163153.18568@midway.uchicago.edu>
 <1991May2.162909.9165@news.arc.nasa.gov> <819@cadlab.sublink.ORG>
 <1991May7.061500.7485@marlin.jcu.edu.au>
Sender: usenet@ee.udel.edu
Organization: College of Marine Studies, U. Del.
Lines: 52
Nntp-Posting-Host: perelandra.cms.udel.edu
In-reply-to: csrdh@marlin.jcu.edu.au's message of 7 May 91 06:15:00 GMT

>> On 7 May 91 06:15:00 GMT, csrdh@marlin.jcu.edu.au (Rowan Hughes) said:

Rowan> I'm a little puzzled by the discussions involving vector vs.
Rowan> risc s-scalar. Given similar hardware, and an appropriate
Rowan> (vectorizable) algorithm the vector method should always be
Rowan> much faster.

I am not sure exactly what question you are implying in this
statement. If you are just saying that a vectorizable algorithm will
run faster if you vectorize it, then I agree.

However, it is often the case that non-vectorizable algorithms on
fast scalar machines can outperform a vector algorithm for the same
problem on a vector machine of similar technology.

The primary difference is usually computational complexity --- for
example, Gaussian elimination for tridiagonal matrices requires O(N)
work and is not vectorizable, while Cyclic reduction requires
O(N log N) work and is vectorizable. The relative performance of the
algorithms is thus a balance between the extra work required by the
vector algorithm and the extra performance of the vector hardware.

A secondary difference concerns memory bandwidth. Most of the
machines that we have been discussing have insufficient memory
bandwidth to run long vector operations at full speed.
Thus, algorithms that avoid excess memory accesses (like the inner
product algorithm for matrix multiplies) will run faster than an
algorithm of the same computational complexity that uses a standard
"vector" approach (like the SAXPY in the inner loop of the outer
product algorithm for matrix multiplies).

Rowan> Risc s-scalar machines are still essentially SISD.
Rowan> Also is a true vector machine using risc hardware likely to
Rowan> emerge soon? Hope my ignorance isn't too obvious.

Vector instructions are also essentially SISD at the hardware level.
When you execute a vector instruction on a vector machine, it is
doing (almost) exactly the same thing as a Killer Micro running a
tight loop feeding data into a pipelined FPU.

To extend a Killer Micro to the functionality of a Cray Y-MP will
still require quite a bit of work. The ability of the Cray to handle
2 vector loads, one vector store, one vector add, and one vector
multiply simultaneously does not easily fit into the RISC paradigm,
since it depends on the existence of multi-cycle instructions.
Superscalar does not seem exactly the way to go, unless the
load-store units are made independent of the integer and float units.
To reproduce the functionality of the Cray Y-MP seems closer to VLIW
than most of the RISC approaches in use now....
-- 
John D. McCalpin                    mccalpin@perelandra.cms.udel.edu
Assistant Professor                 mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.  J.MCCALPIN/OMNET
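
To put rough numbers on the memory-bandwidth point above, here is a
minimal Python sketch that counts memory references for the two
matrix-multiply formulations. It assumes an idealized machine with no
cache, where a scalar accumulator stays in a register; the function
names and the counting model are illustrative assumptions, not a
measurement of any real machine.

```python
def matmul_inner(A, B):
    """C = A*B using inner products. Returns (C, loads, stores)."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    loads = stores = 0
    for i in range(n):
        for j in range(n):
            s = 0.0                      # accumulator assumed to live in a register
            for k in range(n):
                s += A[i][k] * B[k][j]   # two loads per multiply-add, no store
                loads += 2
            C[i][j] = s                  # one store per finished element
            stores += 1
    return C, loads, stores


def matmul_saxpy(A, B):
    """C = A*B as rank-1 (outer product) updates with a SAXPY inner loop."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    loads = stores = 0
    for k in range(n):
        for j in range(n):
            b = B[k][j]                  # scalar operand, loaded once per SAXPY
            loads += 1
            for i in range(n):
                C[i][j] += A[i][k] * b   # load A and C, store C, per multiply-add
                loads += 2
                stores += 1
    return C, loads, stores


if __name__ == "__main__":
    n = 4
    A = [[float(i + j) for j in range(n)] for i in range(n)]
    B = [[float(i * j + 1) for j in range(n)] for i in range(n)]
    for f in (matmul_inner, matmul_saxpy):
        _, loads, stores = f(A, B)
        print(f.__name__, "loads:", loads, "stores:", stores)
```

Under this model both versions do the same n^3 multiply-adds, but the
inner product form needs about two memory references per multiply-add
(2n^3 loads, n^2 stores) while the SAXPY form needs about three
(2n^3 + n^2 loads, n^3 stores). For n = 4 that is 128 loads and 16
stores against 144 loads and 64 stores. The SAXPY pattern of two
loads and one store per add and multiply is the same mix of
operations that the Cray is described as sustaining simultaneously.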