Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!crdgw1!sixhub!davidsen
From: davidsen@sixhub.UUCP (Wm E. Davidsen Jr)
Newsgroups: comp.unix.i386
Subject: Re: 386 Motherboards
Message-ID: <898@sixhub.UUCP>
Date: 2 May 90 22:37:54 GMT
References: <15966@cbnews.ATT.COM> <332@hub.cs.jmu.edu> <883@sixhub.UUCP> <292@zds-ux.UUCP>
Reply-To: davidsen@sixhub.UUCP (bill davidsen)
Organization: *IX Public Access UNIX, Schenectady NY
Lines: 52

In article <292@zds-ux.UUCP> gerry@zds-ux.UUCP (Gerry Gleason) writes:
| In article <883@sixhub.UUCP> davidsen@sixhub.UUCP (bill davidsen) writes:
| >In article <332@hub.cs.jmu.edu> arch@hub.cs.jmu.edu (Arch Harris) writes:
| > I believe there are, but my personal experience is that over 64k you
| >hit diminishing returns (actually 32k does a lot).
|
| This fits with some tables Intel published on various cache organizations
| and sizes, and with my experience. However, the data also makes it clear
| that organization makes a big difference too, that is 32k 4way set-assoc.
| could end up about the same hit rates as 128k direct mapped (I don't have
| the data in front of me, but I remember that set-assoc. buys you quite
| a bit). Also, keep in mind that those last few percent of hit rate can
| make a big difference if the main memory is very slow.

  A good point. I stand clarified if not actually corrected. Something
which has 95% hit instead of 90% goes to main memory only half as much.
Now the question is, what does that cost in terms of wait states? That's
very machine dependent, but let's assume two extra (4w/s on a miss
against 2w/s on a hit). Then,

    0.90 * 2w/s + 0.10 * 4w/s = 2.2w/s (effective)
    0.95 * 2w/s + 0.05 * 4w/s = 2.1w/s

A gain of 5%. If you assume 6w/s on a miss (slow but not unheard of)

    0.90 * 2w/s + 0.10 * 6w/s = 2.40
    0.95 * 2w/s + 0.05 * 6w/s = 2.20

A gain of 8%. Finally, if you assume 6w/s and 16 bit memory (yecch)

    0.90 * 2w/s + 0.10 * 12w/s = 3.00
    0.95 * 2w/s + 0.05 * 12w/s = 2.50

A gain of 17%.
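
  For anyone who wants to plug in their own numbers, here is a minimal
sketch in Python of the weighted average used in the cases above. The
function names are made up for illustration, and "gain" is taken to mean
the fractional reduction in effective wait states, which is an assumption
about how the percentages were figured.

```python
def effective_wait_states(hit_rate, hit_ws, miss_ws):
    """Average wait states per access: hits and misses weighted by how often they occur."""
    return hit_rate * hit_ws + (1.0 - hit_rate) * miss_ws


def gain(old_ws, new_ws):
    """Fractional reduction in effective wait states (assumed definition of 'gain')."""
    return (old_ws - new_ws) / old_ws


if __name__ == "__main__":
    HIT_WS = 2  # wait states on a cache hit, as assumed in the post
    # Miss costs for the three cases: 4w/s, 6w/s, and 6w/s with 16 bit memory (12w/s).
    for miss_ws in (4, 6, 12):
        at_90 = effective_wait_states(0.90, HIT_WS, miss_ws)
        at_95 = effective_wait_states(0.95, HIT_WS, miss_ws)
        print(f"miss={miss_ws:2d}w/s  90% hit: {at_90:.2f}  "
              f"95% hit: {at_95:.2f}  gain: {gain(at_90, at_95):.1%}")
```
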
If you make the main memory slow enough you can justify a lot of cache,
and be making a good decision. I think the last case is pretty unlikely,
but I'm told that you can get 12 clock access with 32 bit memory when
the memory is non-interleaved.

  The question then gets cloudier when you consider that doubling the
cache size doesn't double the hit rate (or more accurately halve the miss
rate) for most loads. So while it's probably true to say that more cache
or faster memory will always make your system faster, it's not true that
any kind of memory will always be the most cost effective improvement to
a system. Maybe adding an FPU or a better disk controller will do more.

  People are still getting master's theses from this, so we're not going
to solve it in a screen or two.
--
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me