Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!usc!apple!sun-barr!newstop!sun!amdahl!mat
From: mat@uts.amdahl.com (Mike Taylor)
Newsgroups: comp.arch
Subject: Re: How Caches Work
Message-ID: <84g302iO55GB01@amdahl.uts.amdahl.com>
Date: 12 Sep 89 20:14:23 GMT
References: <21936@cup.portal.com> <1082@cernvax.UUCP> <3985@phri.UUCP>
Organization: Amdahl Corporation, Sunnyvale CA
Lines: 39

In article <3985@phri.UUCP>, roy@phri.UUCP (Roy Smith) writes:
> >       SUM = 0.0
> >       DO 10 I = 1, 1000000
> >          SUM = SUM + VEC(I)
> > 10    CONTINUE
> >
> > A data cache is *no use at all* for this problem. You will get a
> > cache miss on every data access.
>
>       Well, not quite. True, you're going to miss on all the VEC(I)
> references, but you should hit on all the I and SUM references. Rewrite
> the loop as follows (and no, I don't care if I got the test-at-the-top or
> test-at-the-bottom backwards)
>
>       10 IF (I .GT. 1000000) GOTO 20    # fetch I
>          I = I + 1                      # fetch I, store I
>          SUM = SUM + VEC(I)             # fetch I, fetch SUM, store SUM
>          GOTO 10
>       20 CONTINUE
>
> and you can count 4 data fetches (as pseudo-commented above), all of which
> should be cache hits. This far outnumbers the 1 cache miss per loop due to
> the main data array accesses. Some FORTRAN compilers have even been known
> to put constants in memory and refer to them indirectly; if that is true
> here, you get another two cache hits for the 1 and the 1000000. You could
> argue, of course, that I and SUM might be in registers, but that's a whole
> other argument.

All perfectly right, but the cache miss rate also depends on the cache
line size - how many words are fetched per miss - which might be (say)
128 bytes or 16 words. In this case, you'd get a miss rate of 1/16 on
the vec(i) references. Also, some mainframe caches are smart enough to
look ahead and prefetch the next sequential line (usually only on
I-fetch, but sometimes on data as well). This could result in an actual
miss rate of zero for the example. Even without this, you have an
operand miss rate of 1/80 or .0125. Quite respectable.
-- 
Mike Taylor                        ...!{hplabs,amdcad,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's. ]
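
For anyone who wants to check the 1/16 and 1/80 figures, here is a rough Python sketch of the loop's operand fetches run against a toy cache. Nothing in it comes from a real machine: LRUCache and simulate are hypothetical names, the 8-byte word is simply 128 bytes divided by 16 words, the addresses of I, SUM and VEC are made up, and the cache is assumed to be a 4-line fully associative LRU with no prefetch. Only fetches are counted (4 scalar fetches plus 1 VEC(I) fetch per iteration), which is the counting that gives 1/80.

```python
from collections import OrderedDict

LINE_BYTES = 128   # cache line size used in the post
WORD_BYTES = 8     # assumption: 128 bytes / 16 words


class LRUCache:
    """Toy fully associative cache with LRU replacement (hypothetical model)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()   # resident line numbers, oldest first
        self.fetches = 0
        self.misses = 0

    def fetch(self, addr):
        """Record one operand fetch; return True on a hit, False on a miss."""
        self.fetches += 1
        line = addr // LINE_BYTES
        if line in self.lines:
            self.lines.move_to_end(line)      # mark as most recently used
            return True
        self.misses += 1
        self.lines[line] = True
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)    # evict least recently used line
        return False


def simulate(n, num_lines=4):
    """Run the summation loop over n elements.

    Returns (total_fetches, total_misses, vec_misses).
    """
    cache = LRUCache(num_lines)
    # Assumed layout: I and SUM share one line, VEC starts on a line boundary.
    addr_i, addr_sum, vec_base = 0, WORD_BYTES, 1024
    vec_misses = 0
    for i in range(n):
        cache.fetch(addr_i)      # fetch I for the loop test
        cache.fetch(addr_i)      # fetch I for I = I + 1
        cache.fetch(addr_i)      # fetch I to index VEC(I)
        cache.fetch(addr_sum)    # fetch SUM
        if not cache.fetch(vec_base + i * WORD_BYTES):   # fetch VEC(I)
            vec_misses += 1
    return cache.fetches, cache.misses, vec_misses
```

Calling simulate(n) and dividing vec_misses by n gives the miss rate on the VEC references alone; dividing the total misses by the total fetches gives the overall operand miss rate, which sits a hair above 1/80 because of the single cold miss on the line holding I and SUM. A direct-mapped or smaller cache, or a different placement of the scalars, could add conflict misses that this sketch does not model.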