Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!thumper!ulysses!andante!mit-eddie!bu-cs!purdue!decwrl!labrea!sri-unix!garth!smryan
From: smryan@garth.UUCP
Newsgroups: comp.arch
Subject: Re: Memory latency / cacheing / scientific programs
Message-ID: <840@garth.UUCP>
Date: 30 Jun 88 21:16:29 GMT
References: <243@granite.dec.com> <779@garth.UUCP> <2033@pt.cs.cmu.edu> <803@garth.UUCP> <11023@ames.arc.nasa.gov>
Reply-To: smryan@garth.UUCP (Steven Ryan)
Organization: INTERGRAPH (APD) -- Palo Alto, CA
Lines: 24

>>Actually, I don't know if a 205 even has a cache. If it does, it is well
>>hidden from the CPU. I think the main memory is supposed to be as fast as
>>cache memory on all these weeny machines (ha-ha).

I was reminded the 205 does use an instruction stack (aka cache), hence the
existence of the VSB (void stack and branch) instruction. (The only VSB I
know of lives in DFBM.)
 
>Neither the 205 nor the Cray has a cache.  The philosophy is to put in
>enough registers that a cache is unnecessary.  The 256 registers on
>the 205 were plenty for any module that I saw.

Actually, for the code I ran, even 256 was too small. The usual case of a
program expanding to consume all available memory, I guess.

>                                                The place where this
>approach hurts is in scalar codes that have very frequent procedure
>calls

An interesting idea for the far future is to analyze subprogram calling
sequences to minimise swaps. This was done for the FTN200 runtime library.
Typically, one swap is necessary when going from user code to the library,
and then library routines share the register file, avoiding swap outs
as much as possible.
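
To make the swap counting concrete, here is a rough sketch in Python. It is a
toy model and not how FTN200 actually did it: the routine names and register
demands are made up, and it assumes a swap is forced only when a callee's
registers no longer fit alongside those already live in the 256-entry file.

```python
REGISTER_FILE_SIZE = 256  # register count of the 205, as quoted above


def count_swaps_naive(call_chain):
    """Swap the register file on every call in a nested call chain."""
    return max(len(call_chain) - 1, 0)


def count_swaps_shared(call_chain, demand, size=REGISTER_FILE_SIZE):
    """Let callees share the file; swap only when the next routine will not fit.

    call_chain: routine names, outermost caller first (hypothetical names).
    demand: registers each routine needs (hypothetical numbers).
    """
    swaps = 0
    in_use = 0
    for routine in call_chain:
        need = demand[routine]
        if in_use + need > size:
            swaps += 1   # live registers are swapped out to memory
            in_use = 0   # the callee starts with an empty file
        in_use += need
    return swaps


# User code fills the whole file; the library routines are small.
chain = ["user", "lib_a", "lib_b", "lib_c"]
demand = {"user": 256, "lib_a": 40, "lib_b": 60, "lib_c": 30}

print(count_swaps_naive(chain))           # a swap per call
print(count_swaps_shared(chain, demand))  # a swap on entering the library only
```

In this model the only swap happens on the user-to-library transition, since
the three library routines together need 130 registers and fit in one file.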