Path: utzoo!attcan!uunet!cs.utexas.edu!rutgers!apple!versatc!mips!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: Caches
Message-ID: <22220@winchester.mips.COM>
Date: 25 Jun 89 05:26:07 GMT
References: <799@acorn.co.uk> <95@altos86.Altos.COM> <41770@bbn.COM> <25114@shemp.CS.UCLA.EDU>
Reply-To: mash@mips.COM (John Mashey)
Organization: MIPS Computer Systems, Inc.
Lines: 59

In article <25114@shemp.CS.UCLA.EDU> frazier@cs.ucla.edu (Greg Frazier) writes:
>In article <41770@bbn.COM> slackey@BBN.COM (Stan Lackey) writes:
>>Looking at the RISC trend, it seems natural to assume that the next
>>step is to have a writeback cache with no "snooping" (as it has been
>>called) for either I/O reads OR writes, and solve the problem in
>>software.
>>
>>-Stan
>
>I really don't want to start a "I know what RISC _really_ is"
>sort of argument, but the RISC philosophy would only put the
>cache consistancy functions in software if that made the system
>faster. The basic idea of RISC is hardware minimization -> speed,
>not hardware minimization for the sake of minimization. Since
>one of the keys to high-speed computing is keeping the memory
>"close" to the processor, I doubt moving the caching functions
>to software would ever be a win.

You might be surprised; in some cases, it's a perfectly reasonable tradeoff.
For example, although R3000s permit external invalidation of the data
cache (to allow I/O input coherency, for example), about the only systems
that use it are multiprocessors, which want it for other reasons.
Of course, the primary data cache is write-thru, which means you don't
have to flush it out to memory.

Suppose you had a system with a 1-level cache, and the choice of snooping
the I/O bus, or not:
    Snoop: extra hardware watches I/O for memory writes, and either
    stalls the CPU to snoop, or snoops in an (extra) set of duplicate tags.
        Hardware cost: some
        Performance cost: whatever CPU degradation happens at the
        times when it wants to access the data cache and the snooper
        is in there doing something.
    No snoop:
        Hardware cost: none
        Performance cost: the operating system needs to flush a cache
        page either before or after doing a read, or it's got to use
        uncached accesses to retrieve the data (which is usually
        slower, for linesize>1 word, but also doesn't pollute the
        cache), or it can get sneaky, like using timestamp algorithms
        and occasionally flushing the cache to "Clean" the entire
        freelist, which can then absorb DMA inputs without requiring
        further flushes.

Depending on the kind of system you're doing, you can probably justify
either answer. However, I think you'll find that the extra hardware
can often be hard to justify, at least in smaller systems.
[Numbers-running left as exercise for the reader. 10 points :-)]
-- 
-john mashey    DISCLAIMER:
UUCP:   {ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:    408-991-0253 or 408-720-1700, x253
USPS:   MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
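
For anyone who wants a starting point on the numbers-running for the
snoop / no-snoop choice, here is a minimal back-of-envelope sketch in
Python. Everything in it is an assumption for illustration: the function
names, the clock rate, the I/O rates, the stall and flush costs are
made-up placeholders, not measurements of an R3000 or any real system.
It also ignores second-order effects such as the cache misses that
follow a flush.

```python
# Back-of-envelope model of the snoop vs. no-snoop tradeoff.
# All names and numbers here are hypothetical placeholders.

def snoop_overhead(io_writes_per_sec, conflict_prob, stall_cycles, clock_hz):
    """Fraction of CPU cycles lost when the snooper blocks the data cache.

    conflict_prob is the assumed chance that a snooped I/O write collides
    with a CPU cache access; with duplicate tags it would be close to 0.
    """
    lost_cycles_per_sec = io_writes_per_sec * conflict_prob * stall_cycles
    return lost_cycles_per_sec / clock_hz


def flush_overhead(dma_pages_per_sec, lines_per_page, cycles_per_line, clock_hz):
    """Fraction of CPU cycles the OS spends flushing pages around DMA reads."""
    lost_cycles_per_sec = dma_pages_per_sec * lines_per_page * cycles_per_line
    return lost_cycles_per_sec / clock_hz


if __name__ == "__main__":
    CLOCK_HZ = 25_000_000  # assumed CPU clock

    # Assumed: 250,000 snooped I/O writes/sec, 10% collide, 2-cycle stall.
    snoop = snoop_overhead(250_000, 0.1, 2, CLOCK_HZ)

    # Assumed: 250 DMA'd pages/sec, 1024 lines per page, 1 cycle per line.
    flush = flush_overhead(250, 1024, 1, CLOCK_HZ)

    print(f"snoop stalls:   {snoop:.3%} of CPU")
    print(f"software flush: {flush:.3%} of CPU")
```

Plugging in figures for a particular machine and I/O load shows how much
CPU the software approach costs, which is the number to weigh against
the cost of the extra snooping hardware.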