Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!decwrl!pyramid!pesnta!hplabs!hpda!hpisoa2!hpitg!lll-crg!brooks@lll-crg
From: brooks%lll-crg@lll-crg.UUCP
Newsgroups: net.arch
Subject: Re: CRAY Question
Message-ID: <1417@lll-crg>
Date: Fri, 2-May-86 08:42:00 EDT
Article-I.D.: lll-crg.1417
Posted: Fri May  2 08:42:00 1986
Date-Received: Sun, 11-May-86 15:46:00 EDT
References: <905@harvard>
Lines: 19

How can the Cray 1 M with slower memory be faster than a Cray 1 S for
a special set of circumstances?

The S has memory with a 4 clock cycle time.  There are 16 banks.  For
stride 1 vector fetch each bank is referenced only once every 16 clocks, so
the cycle time of the memory is 4 times faster than it really needs to be.
But suppose you want good scalar performance?  Suppose you want good
performance on stride 4 vector fetch, which comes back to the same bank
every 4 clocks?  Then the fast memory earns its keep.  Suppose you want
good performance on stride 2 vector fetch?  (In this case you only need 8
banks.)
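
The bank arithmetic above can be checked with a toy model.  This is only a
sketch under stated assumptions, not a description of the real Cray memory
controller: one reference is issued per clock at best, element i goes to
bank (i * stride) mod n_banks, and a reference to a busy bank simply stalls
until that bank is free.  The name fetch_clocks and its parameters are made
up for illustration.

```python
def fetch_clocks(stride, n_elements=64, n_banks=16, bank_cycle=4):
    """Clocks needed to issue n_elements vector references.

    Toy model (an assumption, not the actual hardware): at most one
    reference issues per clock, and a reference to a bank that is still
    cycling waits until the bank becomes free.
    """
    free_at = [0] * n_banks  # first clock at which each bank accepts a reference
    clock = 0
    for i in range(n_elements):
        bank = (i * stride) % n_banks
        clock = max(clock, free_at[bank])   # stall while the bank is busy
        free_at[bank] = clock + bank_cycle  # bank is tied up for one cycle time
        clock += 1                          # next reference issues no earlier than this
    return clock


if __name__ == "__main__":
    for stride in (1, 2, 4, 8):
        print(stride, fetch_clocks(stride))
```

In this model strides 1, 2 and 4 all finish 64 references in 64 clocks with
16 banks and a 4 clock bank cycle, and stride 2 still does so with
n_banks=8.  Stride 8 only touches 2 banks and takes roughly twice as long.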

How can a Cray 1 M be as fast?  Suppose your application is stride 1 vector
fetch and 99.99% vectorized.  Even if the MOS RAM is 4 times slower, taking
a 16 clock latency, the CPU will get full bandwidth.  Now suppose the
slightly slower memory shifts the timing so that a chain slot which would
otherwise be missed, one enabling parallel use of the adder and multiplier,
is hit.  Then the machine with MOS memory could be faster.  This is of course
for very special circumstances, so do not look for faster performance on
average.  High scalar speed is where it's at!
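
The full-bandwidth claim can be put as a one-line rule.  Again a sketch
under assumptions of mine, not Cray's: with simple interleaving a constant
stride returns to the same bank every n_banks / gcd(stride, n_banks)
references, and the stream runs at one word per clock as long as that
interval is no shorter than the bank cycle time.  The function name is
hypothetical, and chaining effects are not modelled at all.

```python
from math import gcd


def sustains_full_bandwidth(stride, n_banks, bank_cycle):
    """True if a constant-stride stream never waits on a busy bank.

    Assumes simple interleaving (element i lives in bank
    (i * stride) mod n_banks) and one reference issued per clock.
    """
    revisit_interval = n_banks // gcd(stride, n_banks)  # clocks between hits on one bank
    return revisit_interval >= bank_cycle


if __name__ == "__main__":
    # 16 banks; bank_cycle=4 stands in for the S, 16 for memory 4 times slower.
    for bank_cycle in (4, 16):
        ok = [s for s in (1, 2, 4, 8) if sustains_full_bandwidth(s, 16, bank_cycle)]
        print(bank_cycle, ok)
```

With 16 banks, a 16 clock bank cycle still passes the test at stride 1 but
fails at strides 2 and 4, which the 4 clock memory handles without stalls.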

							Eugene