Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!rutgers!sri-spam!mordor!lll-lcc!pyramid!prls!mips!mash
From: mash@mips.UUCP
Newsgroups: comp.sys.m68k
Subject: Re: move sr/move ccr: is bigger better?
Message-ID: <110@winchester.mips.UUCP>
Date: Sat, 7-Feb-87 19:53:02 EST
Article-I.D.: winchest.110
Posted: Sat Feb  7 19:53:02 1987
Date-Received: Mon, 9-Feb-87 01:36:14 EST
References: <809@imagen.UUCP> <561@elmgate.UUCP> <1090@msudoc.UUCP>
Reply-To: mash@winchester.UUCP (John Mashey)
Distribution: world
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 72
Keywords: logic? CCR Protection Memory Management

In article <822@sauron.Columbia.NCR.COM> campbell@sauron.UUCP (Mark Campbell) writes:
>In article <109@winchester.mips.UUCP> mash@winchester.UUCP (John Mashey) writes:
>>It will be interesting to see whether people turn the data-caching on or
>>not: depending on the benchmark and memory design, a tiny data cache
>>can actually make a system run slower, unlike the more usual speedup from
>>(even a small) I-cache. Just out of curiosity, does anybody out there
>>have any simulations for a 68K with this cache design?
>
>I believe that this holds only if the miss penalty for the small D-cache
>causes one or more wait states to be induced when referencing the missed
>location(1).  Since the best case access time of the MC68030 to external
>memory is two wait states (synchronous mode) with or without the D-cache
>I don't believe that the D-cache can cause a penalty in performance.
>
>If anyone can think of cases in which this might not be true (i.e., cases
>in which a small D-cache can cause a performance penalty under the stated
>conditions) I'd appreciate your posting examples.  Thanks.

Here is a very simple analysis, partially derived from what people said about
the Moto presentation at ICCD in October, i.e., that the D-cache hit rate
was around 50% [if this is wrongly quoted, please tell me; I was not there].

let X = number of cycles to fetch 1 word from memory outside the chip
let Y = number of cycles to fetch 4 words [the way the 68030 D-cache works]
let Z = number of cycles to fetch 1 word from on-chip D-cache
let M = Miss rate in D-cache [0..1.0]

then (grossly):
	cost to fetch data without cache : X
	cost to fetch with cache on: (1-M) * Z + M * Y
		(i.e., part of the time you hit, and each time it costs Z,
		and part of the time you miss, in which case it costs Y.)
One can assume that Z < X < Y.

Let's assume that Z = 0 (best case).  Thus, the 2 cases reduce to:
	cost without cache: X
	cost with cache: M*Y
Thus, if X < M*Y, it is better not to use the cache.
For example, if X is 2, Y is 4, and M is .5, then the two costs are equal
(M*Y = 2 = X).  However, if Y is even 5 (M*Y = 2.5 > X), or if Z is not zero,
then you do better without the cache.

All of this is NOT intended to indicate real numbers, but to show that
you have to compute the miss-rate, and that high miss-rates may cost you.
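
A minimal sketch of the cost model above, in Python, for anyone who wants to
plug in their own numbers.  The function names are mine, not from any tool;
x, y, z, and m are the X, Y, Z, and M defined earlier, and the values in the
comments are the illustrative ones from the example, not measurements.

```python
def cost_without_cache(x):
    """Cycles per data fetch with the D-cache off: always X."""
    return x


def cost_with_cache(m, y, z=0.0):
    """Cycles per data fetch with the D-cache on: (1-M)*Z + M*Y."""
    return (1.0 - m) * z + m * y


def break_even_miss_rate(x, y, z=0.0):
    """Miss rate at which cache-on and cache-off cost the same.

    Solves (1-M)*Z + M*Y = X for M; requires Y > Z.
    Above this miss rate, the cache costs more than it saves.
    """
    return (x - z) / (y - z)


if __name__ == "__main__":
    # Illustrative numbers from the example: X = 2, Z = 0, M = 0.5.
    for y in (4, 5):
        on = cost_with_cache(0.5, y)
        off = cost_without_cache(2)
        print(f"Y={y}: cache on {on}, cache off {off}, "
              f"break-even M={break_even_miss_rate(2, y)}")
```

The break-even form is the handy one: given X, Y, and Z for a particular
memory design, it says how low the miss rate must be before the cache pays.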

OK, now perhaps some real examples.  The 68030 D-cache contains 16 lines
of 16 bytes each, direct-mapped, i.e., it wraps around each 256 bytes,
so that the line starting at 0, and the line starting at 256 cannot both
be present at once.  If you have programs whose access is primarily sequential,
then all is well.  If not, then you may continually be fetching data
that doesn't get a chance to be used before it is kicked out of the cache,
but which cost you cycles to get.  Examples:
	a) Vector-processing code where the vectors line up in memory so
	that they clash with each other (e.g., they start a multiple of
	256 bytes apart and are walked in step).
	b) Kernel code, which often walks all over memory looking at just
	a few bits in each structure.
Note: the miss rates in I-caches are almost always much better than for
D-caches, hence even a small I-cache usually wins, mainly due to linearity
of access.  Even a small D-cache will probably help function-return time,
but it may not help the rest of the code much.
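
To make case (a) concrete, here is a toy Python model of a direct-mapped
cache with 16 lines of 16 bytes, as described above.  It is a sketch under
loud assumptions, not a 68030 simulator: reads only, 4-byte words, every miss
fills the whole line, and the address streams are made up for illustration
(they are not 68K traces).

```python
LINE_BYTES = 16   # bytes per cache line
NUM_LINES = 16    # lines in the cache, so it wraps every 256 bytes


def miss_rate(addresses):
    """Fraction of reads that miss in a direct-mapped cache."""
    resident = [None] * NUM_LINES      # which memory block each line holds
    misses = 0
    total = 0
    for addr in addresses:
        block = addr // LINE_BYTES     # 16-byte block containing addr
        index = block % NUM_LINES      # the only line that block can use
        total += 1
        if resident[index] != block:
            misses += 1
            resident[index] = block    # fill the line, evicting the old block
    return misses / total


def interleaved(base_a, base_b, words):
    """Hypothetical loop reading a[i] then b[i], with 4-byte elements."""
    stream = []
    for i in range(words):
        stream.append(base_a + 4 * i)
        stream.append(base_b + 4 * i)
    return stream


sequential = [4 * i for i in range(64)]    # one pass over 256 bytes
clashing = interleaved(0, 256, 32)         # vectors 256 bytes apart
friendly = interleaved(0, 128, 32)         # vectors 128 bytes apart

if __name__ == "__main__":
    print("sequential:", miss_rate(sequential))
    print("clashing:  ", miss_rate(clashing))
    print("friendly:  ", miss_rate(friendly))
```

In this toy model the sequential pass misses once per 4-word line (0.25),
the vectors 128 bytes apart do the same, and the vectors 256 bytes apart
miss on every single read, because a[i] and b[i] keep evicting each other
from the same line.  With Z = 0, a miss rate of 1.0 means every access
costs Y instead of X.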

In general, intuition on any of this is highly suspect [that's why I asked in
the first place if people had simulated this particular cache on 68K address
traces].

Bottom line: you must be careful with high-miss-rate caches.  If there is
any penalty for filling the cache (over just fetching the data), then
a cache can actually reduce performance.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD:  	408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086