Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!ucsd!ames!ames.arc.nasa.gov!lamaster
From: lamaster@ames.arc.nasa.gov (Hugh LaMaster)
Newsgroups: comp.arch
Subject: Re: KM's vs. Supers (medium)
Message-ID: <40049@ames.arc.nasa.gov>
Date: 8 Jan 90 20:07:12 GMT
References: <34030@mips.mips.COM> <4322@nttmhs.ntt.JP> <39807@ames.arc.nasa.gov> <4328@scolex.sco.COM>
Sender: usenet@ames.arc.nasa.gov
Organization: NASA - Ames Research Center
Lines: 64

In article <4328@scolex.sco.COM> seanf@sco.COM (Sean Fagan) writes:
>In article <39807@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:

>Ok. I've just sat down with a list, and tried to figure something out;
>hopefully, *somebody* can help me on this.

>according to Eugene Brooks, a 66MHz R6000 will outperform a Cray-2, which
>runs at, what, 250MHz?

>So, *how* does it do this?

Once upon a time, I wrote 4 benchmarks which showed that each of the
following machines was faster: IBM 3033, Cray-1/S, CDC Cyber 203, and
CDC 7600. I was able to do it by knowing something about the weaknesses
of each machine. *It all depends on your applications.* Eugene Brooks's
application is unusually hard on the Cray, it appears. On the other hand,
even for codes which aren't so hard on the Cray, there is now a cost
advantage in many cases for the KMs even if the Cray is still much faster.

>processors, and this might be best; but, then again, maybe not). But which
>ones? Seymour has been using 8 registers per set for the last 25 or more
>years (I don't know about the CDC 3x00 series);

This is not really correct for the Crays. You can't forget about the
second level scalar registers ("programmable cache") or the vector
registers.

> would more registers allow

He already has a lot more scalar registers than MIPSCo. Better to ask why
the extra registers don't seem to produce a gross advantage in cycles per
instruction, which they don't seem to.

>for faster code to be generated, up to a certain point? How about register
>windows?

Do register windows produce fewer loads and stores? The results seem to
indicate that they don't make much difference. Not that they seem to hurt,
either. They are, it seems, no big deal, just another design choice.

>I guess part of what I'm saying, and asking, is this: there is little
>reason why a Cray *must* be slower than a MIPS chip, and, if nothing else,

The Cray is generally faster. The question is, rather, is it enough faster
to justify the cost. Also, don't forget that the Cray is still the fastest
data engine around. More throughput than anybody else. You might even see
a Cray used as a fileserver for a farm of Killer Micros someday :-)

>there is more room on the Cray to put stuff directly in hardware (such as,
>oh, a 1 cycle multiply, or, better yet, a 1 cycle divide 8-)). What needs
>to be done, and why hasn't it been done?

****************************************

Another speculation: would superscalar instruction issue of Cray scalar
instructions be possible? What are the conditions necessary for issue of
multiple instructions per cycle?

  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035
  Phone:  (415)694-6117