Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!uwm.edu!uakari.primate.wisc.edu!pikes!boulder!unicads!les
From: les@unicads.UUCP (Les Milash)
Newsgroups: comp.arch
Subject: Re: How Caches Work
Message-ID: <639@unicads.UUCP>
Date: 13 Sep 89 15:39:40 GMT
References: <21936@cup.portal.com> <1082@cernvax.UUCP> <3985@phri.UUCP> <84g302iO55GB01@amdahl.uts.amdahl.com>
Reply-To: les@unicads.UUCP (Les Milash)
Organization: Unicad Boulder, CO
Lines: 28

ok, since we're talking about "How Caches Work", here's a question.
i realize that a cache line can be filled faster than N individual
read cycles, either by

	0.   making the mem-to-cache path wider than the
	     cache-to-processor path,
	1.0  doing burst (page-mode) accesses to a bank of dram, or
	1.1  using interleaved banks.

(a toy cost model of all three is in the p.s. below.)

so my question is: who does #0?  i don't think i've seen it done in
a micro, i.e. i don't think i've seen any few-chip CMMU that
comprehends (say) 16-byte-wide paths to the dram.  (i think i heard
that the Lilith fetched 8-byte instructions into an 8-byte-wide by
1-deep prefetch buffer.)  now i have no problem imagining Mr. Cray
and the Fast Crowd doing wide things, but in the micro world i
haven't heard of it (yet; it seems obvious now that big pin-count
packages are starting to appear).  also, what other obstacles that
i haven't seen keep this from being an easy performance boost?

the gentleman from japan who was asking about integrating cache and
dram on one chip sounded like he was onto a good idea.  faster,
faster, till the thrill of gips outweighs the cost of pins!

thanks folks,
Les Milash
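
p.s. for anybody who wants rough numbers, here's a toy C model of the
fill time for a 16-byte line under each scheme.  all the cycle counts
(6-cycle first access, 2-cycle page-mode beat, 1-cycle interleave beat,
4-byte processor path) are invented for illustration, not taken from
any real dram datasheet, and the interleave model assumes the banks
start back-to-back with no bus contention.

	/* toy model of cache-line fill time.  all cycle counts are
	 * made-up illustrative numbers, not measurements. */
	#include <stdio.h>

	#define LINE_BYTES 16

	/* scheme 0: mem-to-cache path as wide as the line,
	 * so the whole line arrives in one access */
	static int fill_wide(int first_access)
	{
	    return first_access;
	}

	/* scheme 1.0: page-mode burst -- pay the full latency once,
	 * then each remaining beat comes at the fast page-mode rate */
	static int fill_burst(int first_access, int burst_cycle,
	                      int path_bytes)
	{
	    int beats = LINE_BYTES / path_bytes;
	    return first_access + (beats - 1) * burst_cycle;
	}

	/* scheme 1.1: interleaved banks -- accesses start overlapped,
	 * so after the first result the rest arrive one bus cycle
	 * apart (ignoring contention) */
	static int fill_interleaved(int first_access, int bus_cycle,
	                            int path_bytes)
	{
	    int beats = LINE_BYTES / path_bytes;
	    return first_access + (beats - 1) * bus_cycle;
	}

	int main(void)
	{
	    printf("naive, 4 separate reads: %d cycles\n", 4 * 6);
	    printf("0   wide path:           %d cycles\n",
	           fill_wide(6));
	    printf("1.0 page-mode burst:     %d cycles\n",
	           fill_burst(6, 2, 4));
	    printf("1.1 interleaved banks:   %d cycles\n",
	           fill_interleaved(6, 1, 4));
	    return 0;
	}

with those (made-up) numbers the wide path wins: 6 cycles, vs 12 for
the burst, 9 for the interleave, and 24 for four separate reads.
which is exactly why it looks like an easy boost once you can afford
the pins.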