Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!brutus.cs.uiuc.edu!apple!sun-barr!rutgers!phri!roy
From: roy@phri.UUCP (Roy Smith)
Newsgroups: comp.arch
Subject: Re: How Caches Work
Message-ID: <3985@phri.UUCP>
Date: 11 Sep 89 15:20:11 GMT
References: <21936@cup.portal.com> <1082@cernvax.UUCP> <16306@watdragon.waterloo.edu> <RANG.89Sep10184900@derby.cs.wisc.edu>
Reply-To: roy@phri.UUCP (Roy Smith)
Organization: Public Health Research Inst. (NY, NY)
Lines: 31

>       SUM = 0.0
>       DO 10 I = 1, 1000000
>       SUM = SUM + VEC(I)
> 10    CONTINUE
>
> A data cache is *no use at all* for this problem.  You will get a
> cache miss on every data access.  

	Well, not quite.  True, you're going to miss on all the VEC(I)
references, but you should hit on all the I and SUM references.  Rewrite
the loop as follows (and no, I don't care if I got the test-at-the-top or
test-at-the-bottom backwards)

	I = 0
10	IF (I .GE. 1000000) GOTO 20	# fetch I
	I = I + 1			# fetch I, store I
	SUM = SUM + VEC(I)		# fetch I, fetch SUM, store SUM
	GOTO 10
20	CONTINUE

and you can count 4 data fetches per iteration (as pseudo-commented above),
all of which should be cache hits.  This far outnumbers the 1 cache miss per
iteration due to the main data array access.  Some FORTRAN compilers have
even been known to put constants in memory and refer to them indirectly; if
that is true here, you get another two cache hits for the 1 and the 1000000.
You could argue, of course, that I and SUM might be in registers, but that's
a whole other argument.
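
	For anyone who wants to check the arithmetic, here is a rough sketch
in Python that replays the data references of the loop against a toy cache.
Everything about the cache is an assumption made for illustration, not a
model of any real machine: it is fully associative with LRU replacement,
each line holds one word (so every VEC(I) reference misses, as in the
quoted claim), stores allocate a line but are not counted as fetches, and
I, SUM, and the constants live in memory rather than in registers.  The
names LRUCache and simulate, and the addresses, are made up.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with one-word lines and LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> True, oldest entry first
        self.hits = 0
        self.misses = 0

    def _touch(self, addr):
        # Make addr the most recently used line, evicting the LRU line if full.
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)

    def fetch(self, addr):
        # Count the fetch as a hit or a miss, then update recency.
        if addr in self.lines:
            self.hits += 1
        else:
            self.misses += 1
        self._touch(addr)

    def store(self, addr):
        # Assumed write-allocate; stores are not counted as fetches.
        self._touch(addr)


def simulate(n, capacity=64):
    """Replay the data references of the summing loop for n iterations."""
    addr_i, addr_sum, vec_base = 0, 1, 1000  # hypothetical addresses
    cache = LRUCache(capacity)
    cache.store(addr_i)    # I = 0
    cache.store(addr_sum)  # SUM = 0.0
    for i in range(1, n + 1):
        cache.fetch(addr_i)        # loop test: fetch I
        cache.fetch(addr_i)        # I = I + 1: fetch I ...
        cache.store(addr_i)        # ... store I
        cache.fetch(addr_i)        # index for VEC(I): fetch I
        cache.fetch(addr_sum)      # fetch SUM
        cache.fetch(vec_base + i)  # fetch VEC(I): a new address every time
        cache.store(addr_sum)      # store SUM
    cache.fetch(addr_i)            # final loop test that exits the loop
    return cache.hits, cache.misses


if __name__ == "__main__":
    hits, misses = simulate(1000)  # far fewer than 1000000, same ratio
    print(hits, misses, hits / (hits + misses))
```

Under those assumptions the I and SUM fetches all hit and only the VEC(I)
fetches miss, so about four fetches in five are hits no matter how long the
vector is.  Give the cache multi-word lines and the VEC(I) miss rate drops
too, but that is outside what this sketch models.
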
-- 
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
{att,philabs,cmcl2,rutgers,hombre}!phri!roy -or- roy@alanine.phri.nyu.edu
"The connector is the network"