Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!uwm.edu!uakari.primate.wisc.edu!pikes!boulder!unicads!les
From: les@unicads.UUCP (Les Milash)
Newsgroups: comp.arch
Subject: Re: How Caches Work
Message-ID: <639@unicads.UUCP>
Date: 13 Sep 89 15:39:40 GMT
References: <21936@cup.portal.com> <1082@cernvax.UUCP> <3985@phri.UUCP> <84g302iO55GB01@amdahl.uts.amdahl.com>
Reply-To: les@unicads.UUCP (Les Milash)
Organization: Unicad Boulder, CO
Lines: 28

ok, since we're talking about "How Caches Work", here's a question.
i realize that a cache line can be filled faster than N individual
read cycles, either by

	0.   making the mem-to-cache path wider than the
	     cache-to-processor path,
	1.0  doing burst (page-mode) accesses to a bank of dram, or
	1.1  using interleaved banks.

(a toy cost model of all three is in the p.s. below.)

so my question is: who does #0?  i don't think i've seen it done in
a micro, i.e. i don't think i've seen any few-chip CMMU that
comprehends (say) 16-byte-wide paths to the dram.  (i think i heard
that the Lilith fetched 8-byte instructions into an 8-byte-wide by
1-deep prefetch buffer.)  now i have no problem imagining Mr. Cray
and the Fast Crowd doing wide things, but in the micro world i
haven't heard of it (yet; it seems obvious now that big pin-count
packages are starting to appear).  also, what other obstacles that
i haven't seen keep this from being an easy performance boost?

the gentleman from japan who was asking about integrating cache and
dram on one chip sounded like he was onto a good idea.  faster,
faster, till the thrill of gips outweighs the cost of pins!

thanks folks,
Les Milash
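
p.s. for anybody who wants rough numbers, here's a toy C model of the
fill time for a 16-byte line under each scheme.  all the cycle counts
(6-cycle first access, 2-cycle page-mode beat, 1-cycle interleave beat,
4-byte processor path) are invented for illustration, not taken from
any real dram datasheet, and the interleave model assumes the banks
start back-to-back with no bus contention.

	/* toy model of cache-line fill time.  all cycle counts are
	 * made-up illustrative numbers, not measurements. */
	#include <stdio.h>

	#define LINE_BYTES 16

	/* scheme 0: mem-to-cache path as wide as the line,
	 * so the whole line arrives in one access */
	static int fill_wide(int first_access)
	{
	    return first_access;
	}

	/* scheme 1.0: page-mode burst -- pay the full latency once,
	 * then each remaining beat comes at the fast page-mode rate */
	static int fill_burst(int first_access, int burst_cycle,
	                      int path_bytes)
	{
	    int beats = LINE_BYTES / path_bytes;
	    return first_access + (beats - 1) * burst_cycle;
	}

	/* scheme 1.1: interleaved banks -- accesses start overlapped,
	 * so after the first result the rest arrive one bus cycle
	 * apart (ignoring contention) */
	static int fill_interleaved(int first_access, int bus_cycle,
	                            int path_bytes)
	{
	    int beats = LINE_BYTES / path_bytes;
	    return first_access + (beats - 1) * bus_cycle;
	}

	int main(void)
	{
	    printf("naive, 4 separate reads: %d cycles\n", 4 * 6);
	    printf("0   wide path:           %d cycles\n",
	           fill_wide(6));
	    printf("1.0 page-mode burst:     %d cycles\n",
	           fill_burst(6, 2, 4));
	    printf("1.1 interleaved banks:   %d cycles\n",
	           fill_interleaved(6, 1, 4));
	    return 0;
	}

with those (made-up) numbers the wide path wins: 6 cycles, vs 12 for
the burst, 9 for the interleave, and 24 for four separate reads.
which is exactly why it looks like an easy boost once you can afford
the pins.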