Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!tut.cis.ohio-state.edu!ucsd!ucsdhub!celit!ps
From: ps@fps.com (Patricia Shanahan)
Newsgroups: comp.arch
Subject: Re: Cache Size
Keywords: garbage collection, locality of reference, cache size
Message-ID: <7021@celit.fps.com>
Date: 28 Feb 90 16:49:57 GMT
References: <7393@pdn.paradyne.com> <76700146@p.cs.uiuc.edu> <1990Feb26.022057.28461@Neon.Stanford.EDU> <8189@pt.cs.cmu.edu> <8848@boring.cwi.nl>
Sender: daemon@fps.com
Reply-To: ps@fps.com (Patricia Shanahan)
Organization: FPS Computing Inc., San Diego CA
Lines: 59

In article <8848@boring.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
>In article <8189@pt.cs.cmu.edu> koopman@a.gp.cs.cmu.edu (Philip Koopman) writes:
> > So, that's why most supercomputers seem to use vector register
> > files instead of caches for their vector units.
> >
>Well, no (depends on your definition of supercomputer of course; let us
>assume vector processor).  There are systems without cache that use vector
>registers (Cray, NEC), or have memory to memory operations (Cyber 205).
>And there are processors with cache.  Possibilities are:
>1.  No vector registers, bypass cache (Cyber 995).
>2.  Vector registers, bypass cache (i know none).

The FPS Model 500 is in this category. The scalar processors use caches
for instructions and scalar data references. The vector processors are
connected to the System Memory Bus and fetch data from memory without
going through cache. The only interaction is that the data caches snoop
on the bus, and stores by the vector processors cause corresponding
purges in the data caches.

>3.  No vector registers, through cache (again, i know none).
>4.  Vector registers, through cache (IBM 3090, Convex, Alliant, Gould).

I would add the VAX 9000 to this list, but I question whether the Convex
C2 belongs here. I think it is a vector-register, bypass-cache system.

>So, no, vector registers are not a replacement for cache.
>--
>dik t. winter, cwi, amsterdam, nederland
>dik@cwi.nl

Vector registers and cache both perform the same basic function: holding
data that is likely to be used in future calculations close to the
arithmetic unit that is going to do those calculations. They also serve
as a buffer between the arithmetic unit's need for individual numbers
and the memory system's preference for highly parallel processing of
larger blocks.

Caches work very well when reference patterns are basically
unpredictable, but a recent reference to a data item, or to something
close to it in memory, is positively correlated with the probability of
a reference in the near future. They work by fetching a block
surrounding a referenced item (to exploit spatial locality) and keeping
it around for a while (to exploit temporal locality).

Vector memory references are not like that. The reference pattern is set
at compile time and known to the compiler. During a strided fetch there
is little or no spatial locality. There is frequently a strong negative
correlation between recent past reference and near future reference,
because of the large-scale sequential nature of vector processing.

Any cache substantially smaller than main memory can be flooded by some
vector jobs. The very big manufacturers, especially IBM, are something
of an exception because they can persuade third-party software vendors
to do a lot of special tuning for their systems. It is usually possible,
with care, to arrange a vector job so that it fits well with a
particular cache size.
--
	Patricia Shanahan
	ps@fps.com
	uucp : {decvax!ucbvax || ihnp4 || philabs}!ucsd!celerity!ps
	phone: (619) 271-9940
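
The locality argument in the post can be illustrated with a toy cache
simulation. This is only a sketch under stated assumptions: the 1 KB
direct-mapped cache, 32-byte blocks, 8-byte words, and the names
hit_rate and sweep are all hypothetical choices made for the example.
None of them describe the FPS Model 500 or any other machine named in
the thread.

```python
WORD_BYTES = 8  # assumed element size (one 64-bit word)


def sweep(n_elements, stride_words=1, passes=1):
    """Yield byte addresses for `passes` sweeps over a vector.

    Element k lives at byte address k * stride_words * WORD_BYTES.
    """
    for _ in range(passes):
        for k in range(n_elements):
            yield k * stride_words * WORD_BYTES


def hit_rate(addresses, cache_bytes=1024, block_bytes=32):
    """Fraction of references that hit in a toy direct-mapped cache.

    Cache size and block size are illustrative assumptions only.
    """
    num_blocks = cache_bytes // block_bytes
    resident = [None] * num_blocks  # block number held in each cache slot
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // block_bytes
        slot = block % num_blocks
        if resident[slot] == block:
            hits += 1
        else:
            resident[slot] = block  # miss: fetch the block, evict the old one
        total += 1
    return hits / total if total else 0.0


# Small working set, reused four times: fits in the cache, so only the
# first touch of each block misses.
small_reuse = hit_rate(sweep(64, stride_words=1, passes=4))

# Long unit-stride vector, swept four times: each block is used while it
# is resident, but it has been evicted long before the next pass.
long_unit_stride = hit_rate(sweep(4096, stride_words=1, passes=4))

# Same vector length with a stride of one cache block: every reference
# touches a new block, so there is no spatial locality to exploit.
long_block_stride = hit_rate(sweep(4096, stride_words=4, passes=4))

print(small_reuse, long_unit_stride, long_block_stride)
```

With these assumed parameters the small reused working set hits almost
every time, the long unit-stride sweep hits only on the remaining words
of each freshly fetched block, and the block-sized stride never hits at
all. The long sweep also shows the "negative correlation" point: the
block referenced most recently is the one least likely to be needed
again before it is evicted.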