Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!think.com!spool.mu.edu!olivea!apple!voder!nsc!amdahl!JUTS!duts!kls30
From: kls30@duts.ccc.amdahl.com (Kent L Shephard)
Newsgroups: comp.sys.amiga.advocacy
Subject: Re: 680x0 vs 80x86
Message-ID:
Date: 28 Jun 91 18:16:01 GMT
References: <92@ryptyde.UUCP> <4671.tnews@templar.actrix.gen.nz> <1154@stewart.UUCP> <1991Jun25.165516.13021@mintaka.lcs.mit.edu> <1991Jun27.064123.27492@neon.Stanford.EDU>
Sender: netnews@ccc.amdahl.com
Reply-To: kls30@DUTS.ccc.amdahl.com (PUT YOUR NAME HERE)
Organization: Amdahl Corporation, Sunnyvale CA
Lines: 126

In article <1991Jun27.064123.27492@neon.Stanford.EDU> torrie@cs.stanford.edu (Evan Torrie) writes:
>kls30@duts.ccc.amdahl.com (Kent L Shephard) writes:
>
>>In article <1991Jun25.165516.13021@mintaka.lcs.mit.edu> rjc@churchy.gnu.ai.mit.edu (Ray Cromwell) writes:
>>>
>>>   But this is different, Jerry, because in this case the OS KNOWS
>>>how to clear caches. If a lot of MS-DOG programs used self-modifying
>>>programs, or if the OS itself doesn't know how to treat caches,
>>>code will break. Hence, Intel probably keeping I&D unified to avoid
>>>an MS-DOG nightmare.
>
>>Wrong.  Intel decided to go with a unified cache for one because it is
>>simpler to implement.  Also if you have a 4 way set assoc. cache you
>>have basically 4 small caches.
>

According to Hennessy and Patterson, Computer Architecture: A Quantitative
Approach, pp. 423-425:

Assuming 53% of references are for instructions, and comparing an 8k unified
cache against a 4k instruction cache plus a 4k data cache, the results are
as follows.

                     Miss Rates
SIZE    Instruction only    Data only    Unified
8KB          5.8%             6.8%         8.3%

You get an overall miss rate of 6.27% for separate instruction and data
caches (0.53 x 5.8% + 0.47 x 6.8%).
You get an overall miss rate of 8.3% for unified.

>   But you still have only one internal path from the CPU to the cache, thus
>cutting your bandwidth in half vs a split I/D Harvard architecture.
>For an example of why this is important, check out the parallelism in any of
>today's microprocessors' pipelines.
>   Does Intel still use their 386 instruction prefetch buffer in the
>486?  I suppose that should shore up some of the performance loss from
>having a unified cache.
>

We know both companies claim a hit rate above 90% for their caches.
You also forget that the replacement algorithm makes a lot of difference in
the hit rate. Also, separate caches require a replacement algorithm for each
cache. You also need hardware for both control circuits, two sets of tag
RAMs, etc.

Intel traded a 2-3% performance improvement for less chip area and a simpler
design. They also got their product out the door a LOT faster than Motorola.

>>Also in Intel processors you have
>>instuctions that have data included or immediately following.  Kind of
>>hard to separate data and instuctions.
>
>   An example of such an instruction?  The 68K has data in its instructions,
>the ADDQ #x, Dn for example, but this doesn't stop an I/D cache (since the
>data is non-modifiable).
>
>>Moto went with a seperate cache because the architecture is different.
>>The type of instructions are different.
>
>   Moto went with separate caches because of their performance.

Let's face it, during design you make trade-offs. Intel made one; Moto made
another.

>
>>As for self modifying code.  The machines that use Moto processors are
>>more guilty of this.  The Mac and Atari machines uses self modifying code
>>for copy protection.  When Moto started putting small caches on their
>>chips it created a nightmare.
>
>   So the copy-protection schemes don't use self-modifying code anymore
>[in fact, most Mac programs don't use copy-protection other than
>manual-type methods].
>

At least Motorola could do this, unlike Intel. Mac programs now don't use
self-modifying code.
They did before, and it broke a lot of software when Moto started putting
instruction and data caches (small, but there) on the 68k line of chips.

>
>>Self modifying code would have broken the 386 with cache.
>
>   Not with a unified cache.

Yes, with a unified cache you can break code that does weird things.
With a separate cache you can break ill-behaved code.

>
>>Also if a cache is designed properly it should be completly transparent to
>>software.
>
>   Transparent to user software, perhaps, but often the OS has
>to be intimately aware of the cache, just as it has to be aware of the
>TLB.

The OS does not have to be aware of the cache unless it wants to turn it
on or off. The CPU has to know the cache intimately; the OS does not need
to know it is there.

The OS needs to know about the TLB because the OS handles page faults,
the loading of descriptor tables, and the overall handling of virtual
memory. The OS knows nothing about a cache miss unless the page was swapped
to disk. You would then get a page fault, bring the page into physical
memory, and then the CPU would handle the cache miss.

A cache should be transparent; if someone tells you otherwise, they are
mistaken. I've designed memory management and cache controller units. The
cache controller has always been transparent to the software. Even in
multiprocessor systems the cache is transparent. You would use a cache
coherency protocol like MSI, MESI, MOESI, etc., and you would implement all
your algorithms in hardware.

>--
>------------------------------------------------------------------------------
>Evan Torrie.  Stanford University, Class of 199?   torrie@cs.stanford.edu
>Murphy's Law of Intelism: Just when you thought Intel had done everything
>possible to pervert the course of computer architecture, they bring out the 860
--
/*  -The opinions expressed are my own, not my employer's.   */
/*      For I can only express my own opinions.              */
/*                                                           */
/*   Kent L. Shephard  : email - kls30@DUTS.ccc.amdahl.com   */
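
A footnote on the miss-rate figures quoted from Hennessy and Patterson earlier
in this post: the overall miss rate for split caches is just a weighted
average of the instruction and data miss rates, weighted by the share of
references each one serves. The sketch below reproduces that arithmetic using
only the numbers given in the post. The function name combined_miss_rate and
the constant names are made up for illustration and do not come from the book
or the post.

```python
def combined_miss_rate(instr_fraction, instr_miss, data_miss):
    """Overall miss rate for split I/D caches.

    instr_fraction: share of memory references that are instruction fetches.
    instr_miss, data_miss: miss rates of each cache, in percent.
    """
    data_fraction = 1.0 - instr_fraction
    return instr_fraction * instr_miss + data_fraction * data_miss


# Figures as quoted in the post (percent).
INSTR_FRACTION = 0.53   # 53% of references are instruction fetches
INSTR_MISS = 5.8        # instruction-only cache
DATA_MISS = 6.8         # data-only cache
UNIFIED_MISS = 8.3      # unified cache

split = combined_miss_rate(INSTR_FRACTION, INSTR_MISS, DATA_MISS)
print(f"split caches:  {split:.2f}%")
print(f"unified cache: {UNIFIED_MISS:.2f}%")
print(f"difference:    {UNIFIED_MISS - split:.2f} points")
```

The split-cache result should agree with the 6.27% stated above. Note that
this compares miss rates only; it says nothing about chip area, control
logic, or bandwidth, which are the other trade-offs argued over in the thread.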