Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!decwrl!pyramid!pesnta!hplabs!hpda!hpisoa2!hpitg!lll-crg!brooks@lll-crg
From: brooks%lll-crg@lll-crg.UUCP
Newsgroups: net.arch
Subject: Re: CRAY Question
Message-ID: <1417@lll-crg>
Date: Fri, 2-May-86 08:42:00 EDT
Article-I.D.: lll-crg.1417
Posted: Fri May 2 08:42:00 1986
Date-Received: Sun, 11-May-86 15:46:00 EDT
References: <905@harvard>
Lines: 19

How can the Cray 1 M, with slower memory, be faster than a Cray 1 S
for a special set of circumstances?

The S has memory with a 4 clock cycle time. There are 16 banks.
For stride 1 vector fetch each bank is revisited only once every 16
clocks, so the cycle time of the memory is 4 times faster than it
really needs to be.

Suppose you want good scalar performance?
Suppose you want good performance on stride 4 vector fetch?
Suppose you want good performance on stride 2 vector fetch?
(In this case you only need 8 banks.)

How can a Cray 1 M be as fast?
Suppose your application is stride 1 vector fetch and 99.99% vectorized.
Even if the MOS RAM is 4 times slower, taking a 16 clock latency, the CPU
will get full bandwidth. Suppose the slightly slower memory causes a
chain slot that would otherwise be missed, one enabling parallel use of
the adder and multiplier, to be hit. The machine with MOS memory could
then be faster.

This is of course for very special circumstances; do not look for
faster performance on average. High scalar speed is where it's at!

Eugene
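
The bank arithmetic above can be checked with a toy model. This is only a sketch under stated assumptions, not a description of the real Cray memory system: one word is requested per clock, requests issue in order, a bank is busy for `bank_cycle` clocks after it is accessed, and access latency and chaining are ignored. The function name `fetch_clocks` and its parameters are made up for illustration.

```python
def fetch_clocks(stride, n_elements, n_banks=16, bank_cycle=4):
    """Clocks needed to issue n_elements strided word fetches.

    Toy model (assumed, not the actual hardware): at most one request
    issues per clock, in order; a request stalls until its bank is free;
    a bank stays busy for bank_cycle clocks after each access.
    """
    free_at = [0] * n_banks          # clock at which each bank is free again
    clock = 0                        # earliest clock for the next request
    for i in range(n_elements):
        bank = (i * stride) % n_banks
        start = max(clock, free_at[bank])   # stall if the bank is still busy
        free_at[bank] = start + bank_cycle
        clock = start + 1
    return clock


if __name__ == "__main__":
    n = 64  # one vector register's worth of words
    for label, cycle in (("4 clock memory ", 4), ("16 clock memory", 16)):
        for stride in (1, 2, 4):
            clocks = fetch_clocks(stride, n, bank_cycle=cycle)
            print(f"{label} stride {stride}: {clocks} clocks for {n} words")
```

In this model the 16 clock memory still delivers a word per clock at stride 1, since each of the 16 banks is touched only once every 16 clocks, but it falls behind at stride 2 and stride 4, where the 4 clock memory keeps up. Calling `fetch_clocks(2, 64, n_banks=8, bank_cycle=4)` likewise shows no stalls, which matches the remark that stride 2 needs only 8 banks.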