Path: utzoo!utgpu!watmath!watdragon!watsol!tbray
From: tbray@watsol.waterloo.edu (Tim Bray)
Newsgroups: comp.arch
Subject: How Caches Work
Message-ID: <16306@watdragon.waterloo.edu>
Date: 10 Sep 89 20:32:15 GMT
References: <21936@cup.portal.com> <1082@cernvax.UUCP>
Sender: daemon@watdragon.waterloo.edu
Reply-To: tbray@watsol.waterloo.edu (Tim Bray)
Organization: U. of Waterloo, Ontario
Lines: 33

In article <1082@cernvax.UUCP> hjm@cernvax.UUCP (Hubert Matthews) writes:
+You may be running software that has a very low cache hit rate if you
+are doing CAD work or scientific calculations.  Take this little loop
+for example:
+
+      SUM = 0.0
+      DO 10 I = 1, 1000000
+         SUM = SUM + VEC(I)
+ 10   CONTINUE
+
+A data cache is *no use at all* for this problem.  You will get a
+cache miss on every data access.

Now hold on just a dag-blaggin' minute.  I'm a software weenie who's never
built a cache, but I thought I understood how they work.  If this is right,
obviously I don't at all.  Somebody who knows should either debunk this or
explain what's really going on, because I'm probably not alone in my
ignorance.

I thought caches respected the principle of locality.  And this code has
really good locality.  In fact, I thought they were block- or page-based.
And when VEC(I) hits a page for the first time, that page will be cached,
and then the loop will keep hitting the cache (bar nasty context switches,
etc.) until VEC(I) moves off that page.  One cache miss per page; in the
worst case, if SUM and VEC are DOUBLE PRECISION (8 bytes) and the page size
is 512 bytes, you miss once per 64 elements, so you do 64 times as well as
hitting main memory on every loop iteration.  Nyet?

+Similarly, copying data from one bit
+of memory to another will be limited by the raw memory speed.

Say what?

Tim Bray, New OED Project, U of Waterloo, Ontario
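
The one-miss-per-block arithmetic in the post can be checked with a toy model.
What follows is only a sketch under stated assumptions, not a description of
any real machine: a direct-mapped cache, VEC starting on a block boundary at
address 0, 8-byte elements, SUM held in a register, and nothing else touching
the cache.  The function name count_misses and the num_lines value are
hypothetical; the 512-byte block is the figure used in the post.

```python
def count_misses(n_elements, elem_size=8, block_size=512, num_lines=128, base=0):
    """Count data-cache misses for one sequential pass over an array.

    Toy direct-mapped cache: each memory block maps to exactly one line,
    and a line remembers which block it currently holds.
    All parameters are illustrative assumptions.
    """
    tags = [None] * num_lines          # block number held by each line
    misses = 0
    for i in range(n_elements):
        addr = base + i * elem_size    # byte address of VEC(I)
        block = addr // block_size     # memory block containing it
        line = block % num_lines       # the line that block maps to
        if tags[line] != block:        # first touch of this block: miss
            tags[line] = block
            misses += 1
    return misses


if __name__ == "__main__":
    n = 1_000_000
    for bs in (512, 8):
        m = count_misses(n, block_size=bs)
        print(f"block size {bs:3d} bytes: {m} misses in {n} accesses")
```

In this model the 512-byte block gives 15625 misses for 1000000 accesses, one
per 64 elements, which is the factor quoted in the post.  Shrinking the block
to a single 8-byte element makes every access miss, which is the only case in
this model where "a cache miss on every data access" holds.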