Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!rutgers!sri-spam!mordor!lll-lcc!pyramid!prls!mips!mash
From: mash@mips.UUCP
Newsgroups: comp.sys.m68k
Subject: Re: move sr/move ccr: is bigger better?
Message-ID: <110@winchester.mips.UUCP>
Date: Sat, 7-Feb-87 19:53:02 EST
Article-I.D.: winchest.110
Posted: Sat Feb 7 19:53:02 1987
Date-Received: Mon, 9-Feb-87 01:36:14 EST
References: <809@imagen.UUCP> <561@elmgate.UUCP> <1090@msudoc.UUCP>
Reply-To: mash@winchester.UUCP (John Mashey)
Distribution: world
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 72
Keywords: logic? CCR Protection Memory Management

In article <822@sauron.Columbia.NCR.COM> campbell@sauron.UUCP (Mark Campbell) writes:
>In article <109@winchester.mips.UUCP> mash@winchester.UUCP (John Mashey) writes:
>>It will be interesting to see whether people turn the data-caching on or
>>not: depending on the benchmark and memory design, a tiny data cache
>>can actually make a system run slower, unlike the more usual speedup from
>>(even a small) I-cache. Just out of curiosity, does anybody out there
>>have any simulations for a 68K with this cache design?
>
>I believe that this holds only if the miss penalty for the small D-cache
>causes one or more wait states to be induced when referencing the missed
>location(1). Since the best case access time of the MC68030 to external
>memory is two wait states (synchronous mode) with or without the D-cache
>I don't believe that the D-cache can cause a penalty in performance.
>
>If anyone can think of cases in which this might not be true (i.e., cases
>in which a small D-cache can cause a performance penalty under the stated
>conditions) I'd appreciated your posting examples. Thanks.

Here is a very simple analysis, partially derived from what people said
about the Moto presentation at ICCD in October, i.e., that the D-cache
hit rate was around 50% [if this is wrongly quoted, please tell me; I was
not there].

let X = number of cycles to fetch 1 word from memory outside the chip
let Y = number of cycles to fetch 4 words [the way the 68030 D-cache works]
let Z = number of cycles to fetch 1 word from on-chip D-cache
let M = miss rate in D-cache [0..1.0]

then (grossly):
	cost to fetch data without cache:  X
	cost to fetch with cache on:       (1-M) * Z + M * Y
(i.e., part of the time you hit, and each time it costs Z, and part
of the time you miss, in which case it costs Y.)

One can assume that Z < X < Y. Let's assume that Z = 0 (best case).
Thus, the 2 cases reduce to:
	cost without cache:  X
	cost with cache:     M*Y

Thus, if X < M*Y, it is better not to use the cache. For example, if
X is 2, and Y is 4, and M is .5, then it's equal (M*Y = .5 * 4 = 2 = X).
However, if Y is even 5, or if Z is not zero, then you do better without
the cache. All of this is NOT intended to indicate real numbers, but to
show that you have to compute the miss rate, and that high miss rates
may cost you.

OK, now perhaps some real examples. The 68030 D-cache contains 16 lines
of 16 bytes each, direct-mapped, i.e., it wraps around every 256 bytes,
so that the line starting at 0 and the line starting at 256 cannot both
be present at once. IF you have programs whose access is primarily
sequential, then all is well. If not, then you may continually be
fetching data that doesn't get a chance to be used before it is kicked
out of the cache, but which cost you cycles to get. Examples:
a) Vector-processing code where the vectors line up in memory so that
   they clash with each other.
b) Kernel code, which often walks all over memory looking at just a few
   bits in each structure.

Note: the miss rates in I-caches are almost always much better than for
D-caches, hence even a small I-cache usually wins, mainly due to
linearity of access. Even a small D-cache will probably help
function-return time, but it may not help the rest of the code much.
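
[The cost model and the clashing-vectors case above can be tried out
with a short Python sketch. It is not part of the original post. The
function names, the 4-byte word size, the read-only access pattern, and
the sample values X = 2, Y = 4, Z = 0 are all assumptions chosen for
illustration; only the 16-line, 16-byte, direct-mapped geometry and the
formula (1-M)*Z + M*Y come from the text. It ignores writes, bus
timing, and everything else a real trace-driven simulation would need.]

```python
# Hypothetical sketch of the gross cost model: X, Y, Z in cycles, M in [0, 1].

def cost_without_cache(x):
    """Average cycles per data fetch with the D-cache off."""
    return x

def cost_with_cache(m, y, z=0.0):
    """Average cycles per data fetch with the D-cache on: (1-M)*Z + M*Y."""
    return (1.0 - m) * z + m * y

def break_even_miss_rate(x, y, z=0.0):
    """Miss rate at which both costs are equal: solve (1-M)*Z + M*Y = X."""
    return (x - z) / (y - z)

# Cache geometry as described in the post: 16 lines of 16 bytes, direct-mapped.
LINE_BYTES = 16
NUM_LINES = 16

def miss_rate(addresses):
    """Fraction of accesses that miss in a direct-mapped cache (reads only)."""
    resident = [None] * NUM_LINES          # which memory line each slot holds
    misses = 0
    for addr in addresses:
        line = addr // LINE_BYTES          # memory line containing this byte
        slot = line % NUM_LINES            # wraps around every 256 bytes
        if resident[slot] != line:
            resident[slot] = line          # fill on miss, evicting the old line
            misses += 1
    return misses / len(addresses)

# Assumed access patterns, 4-byte words:
# one vector walked sequentially through 1024 bytes ...
sequential = list(range(0, 1024, 4))
# ... and two vectors 256 bytes apart, read alternately (a[i], b[i], ...).
clashing = [base + 4 * i for i in range(64) for base in (0, 256)]

if __name__ == "__main__":
    x, y = 2, 4                            # illustrative numbers only
    for name, trace in (("sequential", sequential), ("clashing", clashing)):
        m = miss_rate(trace)
        print(name, "miss rate", m, "cost with cache", cost_with_cache(m, y),
              "cost without", cost_without_cache(x))
```

[Under these assumptions the sequential walk misses once per 4-word
line, while the two vectors 256 bytes apart evict each other on every
access, so the second pattern lands on the wrong side of the
X < M*Y test.]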

In general, intuition on any of this is highly suspect [that's why I
asked in the first place if people had simulated this particular cache
on 68K address traces].

Bottom line: you must be careful with high-miss-rate caches. If there
is any penalty for filling the cache (over just fetching the data),
then a cache can actually reduce performance.
-- 
-john mashey	DISCLAIMER:
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD: 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086