Path: utzoo!attcan!uunet!decwrl!wuarchive!cec2!news
From: rick@wucs1.wustl.edu (Rick Bubenik)
Newsgroups: comp.sys.intel
Subject: Re: Zero wait state and caches
Keywords: wait states, cache
Message-ID: <1990Oct8.172021.27417@cec1.wustl.edu>
Date: 8 Oct 90 17:20:21 GMT
References: <188@nat-3.UUCP>
Organization: Washington University, St. Louis MO
Lines: 60

In article <188@nat-3.UUCP> root@nat-3.UUCP (nat-3 System Administrator) writes:
>
>Hello --
>
>	I have a 25 MHz 386 motherboard with no cache, but with
>plenty of 70 ns DRAM that provides zero wait state performance.
>Is there any reason that a cache would boost performance on my
>machine?  My (very limited, probably incorrect, software-oriented)
>reasoning is NO:

Your analysis is understandable, but incorrect.  It turns out that when
a computer is advertised as 0 wait state, what they really mean is 0
wait states when pipelined memory modules are used.  Also, only reads
operate with 0 wait states; writes take 1 wait state.

Here's how it works:  Without pipelining, memory accesses take from 2
to N cycles.  In the first cycle, the CPU places the address on the bus.
In the second cycle, the device either responds (if it is fast enough)
or it inserts a wait state.  This repeats until the device is able to
respond.  With pipelining, the CPU puts the address on the bus in the
last (Nth) cycle of the previous access.  This gives the device an extra
cycle within which to respond.  However, writes take one more cycle than
reads for reasons that I don't quite understand (and that are not
explained in my 386 data book).

Even when using pipelining, not all memory accesses can be pipelined.
Your DRAM modules must be interleaved to achieve 0 wait state
performance.  If two back-to-back accesses to the same bank occur, no
pipelining can be done since the DRAMs require a precharge time (you
can't precharge a bank while that bank is being accessed).  Also, the
CPU only pipelines when back-to-back accesses are occurring.
If the bus goes idle for any reason (such as to execute a "long"
instruction), no pipelining will be done.

For your 25 MHz system, the cycle time is 40 ns, so clearly the only way
it could achieve 0 wait state performance is by using pipelining (70 ns
DRAM cannot respond within a single 40 ns cycle).

Assuming the cache is static RAM and (approximately) 40 ns or faster, it
will operate with true 0 wait state performance.  Also, SRAMs don't need
precharging or refresh, which speeds access further.  Of course, caches
are only effective on cache hits, so the cache needs to be large enough
to deliver a hit ratio close to 100% to be most effective.

In spite of all that was just said, I don't think that a cache will
improve the performance of your system much.  Most applications do many
more reads than writes, and most of the reads are probably going to be
pipelined (due, largely, to instruction prefetch).  Also, other factors,
such as disk transfer and access rates, have a large impact on many
applications.

rick

Rick Bubenik                    rick@cs.wustl.edu
Research Associate
Department of Computer Science
Washington University
Campus Box 1045
One Brookings Drive
St. Louis, Missouri 63130-4899
(314) 726-7530
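
To put rough numbers on the argument above, here is a small
back-of-the-envelope sketch in Python.  It is only a model under stated
assumptions, not a measurement of any real board: the function names,
the read/write mix, the pipelined fraction, the hit ratio, and the
one-wait-state penalty for a non-pipelined read are all made up for
illustration.  It also assumes a write-through cache (writes still go to
DRAM) and that a cache miss costs the same as an ordinary DRAM read.

```python
# Back-of-the-envelope model of average bus clocks per memory access.
# All fractions and the non-pipelined read penalty are illustrative
# assumptions, not measurements.

CLOCK_NS = 40.0   # one clock at 25 MHz
BASE_CLOCKS = 2   # address cycle + response cycle, i.e. 0 wait states


def dram_read_clocks(pipelined_frac, unpipelined_waits=1):
    """Average clocks for a DRAM read.

    Pipelined reads run with 0 wait states; the rest are assumed to
    take `unpipelined_waits` extra clocks (an assumed value).
    """
    slow = BASE_CLOCKS + unpipelined_waits
    return pipelined_frac * BASE_CLOCKS + (1.0 - pipelined_frac) * slow


def average_clocks(read_frac, pipelined_frac, hit_ratio=0.0, write_waits=1):
    """Average clocks per access for a given read/write mix.

    hit_ratio=0.0 models a board with no cache.  Cache hits are assumed
    to be 0 wait state reads, misses cost the same as an uncached DRAM
    read, and writes always go to DRAM (write-through assumption).
    """
    miss = dram_read_clocks(pipelined_frac)
    read = hit_ratio * BASE_CLOCKS + (1.0 - hit_ratio) * miss
    write = BASE_CLOCKS + write_waits
    return read_frac * read + (1.0 - read_frac) * write


if __name__ == "__main__":
    no_cache = average_clocks(read_frac=0.8, pipelined_frac=0.9)
    cached = average_clocks(read_frac=0.8, pipelined_frac=0.9, hit_ratio=0.95)
    print("no cache: %.3f clocks (%.1f ns)" % (no_cache, no_cache * CLOCK_NS))
    print("cached:   %.3f clocks (%.1f ns)" % (cached, cached * CLOCK_NS))
```

With these made-up inputs (80% reads, 90% of reads pipelined, 95% hit
ratio) the model comes out at about 2.28 clocks per access without a
cache and about 2.20 with one, a difference of only a few percent.
Lowering the pipelined fraction or assuming a larger penalty for
non-pipelined reads makes the cache look better, so the conclusion
depends heavily on those guesses.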