Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!sco!seanf
From: seanf@sco.COM (Sean Fagan)
Newsgroups: comp.arch
Subject: KM's vs. Supers (medium)
Message-ID: <4328@scolex.sco.COM>
Date: 7 Jan 90 08:09:34 GMT
References: <34030@mips.mips.COM> <4322@nttmhs.ntt.JP> <39807@ames.arc.nasa.gov>
Reply-To: seanf@sco.COM (Sean Fagan)
Organization: The Santa Cruz Operation, Inc.
Lines: 52

In article <39807@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
>And looking just a little further ahead, an "R9000"
>(just making this up out of whole cloth) with a starting clock speed of
>100MHz, scaled up to 200 MHz by 1992 or 1993, could put Cray out of business.
>SGI will build scalar graphics workstations with half the power of the then
>current Crays at 1/100 the cost, and Ardent will do the vector version, with
>a similar price advantage.

An amusing idle speculation (?)

Ok. I've just sat down with a list, and tried to figure something out;
hopefully, *somebody* can help me on this.

My "list" was a list of timings for a CDC Cyber 170/760, running at 40MHz
(thanks Brian!). From experience, I'd say that the list is correct (i.e.,
not a lie 8-)). Excluding loads, divides, and other memory references,
average cycle count is between 2 and 3 clocks / instruction.

Actually, all that was for background (and to plug Cybers and Seymour 8-)).
On with the point: the Cyber is 25 years old; I seriously doubt that any of
the Crays have *slower* performance (despite being 64-bit two's complement
instead of 60-bit one's complement, I have a high opinion of Seymour), yet,
according to Eugene Brooks, a 66MHz R6000 will outperform a Cray-2, which
runs at, what, 250MHz? So, *how* does it do this?
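
To put rough numbers on the question, here is a back-of-the-envelope sketch
using only the figures quoted above. It is not a benchmark: the function names
are made up for illustration, the 250MHz Cray-2 clock is the guess given in the
text, and the Cyber cycle counts exclude loads, divides, and other memory
references, so the MIPS range is an upper bound for register-only code.

```python
# Rough arithmetic on the figures quoted in the post.
# Helper names are hypothetical; 250 MHz for the Cray-2 is the post's guess.

def native_mips(clock_mhz, clocks_per_instruction):
    """Millions of instructions per second at a given average cycle count."""
    return clock_mhz / clocks_per_instruction

def break_even_work_per_clock(slow_clock_mhz, fast_clock_mhz):
    """How much more work per clock the slower-clocked machine must do
    just to match the faster-clocked one."""
    return fast_clock_mhz / slow_clock_mhz

# Cyber 170/760 at 40 MHz, 2 to 3 clocks per instruction (register-only code).
cyber_best = native_mips(40.0, 2.0)
cyber_worst = native_mips(40.0, 3.0)

# 66 MHz R6000 against a Cray-2 assumed to run at 250 MHz.
r6000_vs_cray2 = break_even_work_per_clock(66.0, 250.0)

print(f"Cyber 170/760: {cyber_worst:.1f} to {cyber_best:.1f} native MIPS")
print(f"R6000 must do about {r6000_vs_cray2:.1f}x the Cray-2's work per clock")
```

Under those assumptions the R6000 would need close to four times the useful
work per clock just to break even, which is the gap the question is about.
Native MIPS on different instruction sets are not directly comparable, so the
Cyber range is context only.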
The things I could come up with were: the Cray-2 has slower cycles than the
Cyber, which is frightening; the larger register count on the MIPS chip
helps that much (possible, but I don't know; that's why I'm asking); and /
or the R6000 has more functional units than a Cray-2.

So, now for speculation of my own. Eugene has said that he thinks the
supercomputers of the future will be merely bunches of KM's running in
parallel; I'm not sure. I think that Seymour (and his ilk) is definitely
going to have to adopt some of the advantages of the KM's; I see no doubt of
that (otherwise, we *will* end up with thousands of KM's with vector
processors, and this might be best; but, then again, maybe not). But which
ones? Seymour has been using 8 registers per set for the last 25 or more
years (I don't know about the CDC 3x00 series); would more registers allow
for faster code to be generated, up to a certain point? How about register
windows?

I guess part of what I'm saying, and asking, is this: there is little reason
why a Cray *must* be slower than a MIPS chip, and, if nothing else, there is
more room on the Cray to put stuff directly in hardware (such as, oh, a 1
cycle multiply, or, better yet, a 1 cycle divide 8-)). What needs to be
done, and why hasn't it been done?

Sorry for the length, but this seems like a good discussion for this group
to get started on.

-- 
Sean Eric Fagan  | "Time has little to do with infinity and jelly donuts."
seanf@sco.COM    |    -- Thomas Magnum (Tom Selleck), _Magnum, P.I._
(408) 458-1422   | Any opinions expressed are my own, not my employers'.