Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!thumper!ulysses!andante!mit-eddie!bu-cs!purdue!decwrl!labrea!sri-unix!garth!smryan
From: smryan@garth.UUCP
Newsgroups: comp.arch
Subject: Re: Memory latency / cacheing / scientific programs
Message-ID: <840@garth.UUCP>
Date: 30 Jun 88 21:16:29 GMT
References: <243@granite.dec.com> <779@garth.UUCP> <2033@pt.cs.cmu.edu> <803@garth.UUCP> <11023@ames.arc.nasa.gov>
Reply-To: smryan@garth.UUCP (Steven Ryan)
Organization: INTERGRAPH (APD) -- Palo Alto, CA
Lines: 24

>>Actually, I don't know if a 205 even has a cache. If it does, it is well
>>hidden from the CPU. I think the main memory is supposed to be as fast as
>>cache memory on all these weeny machine (ha-ha).

I was reminded that the 205 does use an instruction stack (aka cache), hence
the existence of the VSB (void stack and branch) instruction. (The only VSB I
know of lives in DFBM.)

>Neither the 205 nor the Cray has a cache. The philosophy is to put in
>enough registers that a cache is unnecessary. The 256 registers on
>the 205 were plenty for any module that I saw.

Actually, for the code I ran, even 256 was too small. The usual case of a
program expanding to consume all available memory, I guess.

> The place where this
>approach hurts is in scalar codes that have very frequent procedure
>calls

An interesting idea for the far future is to analyze subprogram calling
sequences to minimise swaps. This was done for the FTN200 runtime library.
Typically, one swap is necessary when going from user code to the library,
and then library routines share the register file, avoiding swap-outs as
much as possible.
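
To make the swap-minimising idea concrete, here is a rough Python sketch that counts register-file swaps over a call trace under two policies. It is only a toy model. The routine names, the "user"/"library" grouping, and the function count_swaps are all made up for illustration and are not taken from FTN200 or the 205. It assumes a swap costs the same wherever it happens and that routines in one group never need a swap among themselves.

```python
# Hypothetical model: every name here is illustrative, not taken from FTN200.

def count_swaps(trace, group_of, shared=True):
    """Count register-file swaps for a sequence of control transfers.

    trace    -- list of routine names, in the order control enters them
                (a return shows up as the caller's name appearing again)
    group_of -- maps a routine name to its group ("user" or "library")
    shared   -- if True, routines in the same group share the register
                file, so a swap happens only when the group changes;
                if False, every transfer of control costs one swap
    """
    swaps = 0
    for prev, cur in zip(trace, trace[1:]):
        if not shared or group_of[prev] != group_of[cur]:
            swaps += 1
    return swaps


# Assumed grouping: one user routine, three library routines.
group_of = {
    "main": "user",
    "sin": "library",
    "reduce_arg": "library",
    "poly": "library",
}

# main calls sin, sin calls two helpers in turn, then control returns to main.
trace = ["main", "sin", "reduce_arg", "sin", "poly", "sin", "main"]

naive_swaps = count_swaps(trace, group_of, shared=False)
shared_swaps = count_swaps(trace, group_of, shared=True)
print(naive_swaps, shared_swaps)
```

In this toy trace the swap-on-every-transfer policy pays for all six transfers, while the shared policy pays only on the way into the library and on the way back out to user code. A real analysis would also have to check that the library routines' combined register needs actually fit in the file.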