Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!elroy!orion.cf.uci.edu!uci-ics!ucla-cs!marc
From: marc@oahu.cs.ucla.edu (Marc Tremblay)
Newsgroups: comp.arch
Subject: Re: RISC as a "technology window"?
Message-ID: <22202@shemp.CS.UCLA.EDU>
Date: 24 Mar 89 17:15:59 GMT
References: <1552@vicom.COM> <15690@cup.portal.com> <1562@vicom.COM> <15702@clover.ICO.ISC.COM> <27681@apple.Apple.COM> <15695@winchester.mips.COM> <22974@ames.arc.nasa.gov> <51@microsoft.UUCP>
Sender: news@CS.UCLA.EDU
Reply-To: marc@cs.ucla.edu (Marc Tremblay)
Organization: UCLA Computer Science Department
Lines: 59

In article <51@microsoft.UUCP> w-colinp@microsoft.uucp (Colin Plumb) writes:
>lamaster@ames.arc.nasa.gov (Hugh LaMaster) wrote:
>> So, my question is: If you ASSUME that you have to have high speed
>> arithmetic, what is the best way to partition functions between chips?
>> I believe that the best way is Control, ALU/FPU, and instruction cache
>> on one chip, and data cache/MMU on another chip.  Why doesn't the market
>> agree with me?

I also believe that putting the integer unit and the FPU on the same chip
makes sense. These two units have to communicate quickly, possibly sharing
registers, and the FPU depends on the core section for its flow of
instructions. I think that the trend is toward putting them on the same
chip anyway.

Floating-point coprocessors were very detached from the processor when
they first came out (although, surprisingly enough, the 8087 was a little
closer), especially when you consider that just setting up an FPU
instruction could take around 10 cycles! The MIPS approach, i.e. making
the coprocessor (R3010) closely coupled, is a huge improvement, especially
regarding the instruction-issuing overhead.

The new trend? The FPU needs the core unit, so put it on-chip (both the
Motorola 88000 and the Intel i860 have the FPU on-chip).
Since you *currently* have to go off-chip to access reasonably large
caches, you might as well put the MMU with the caches.
Hugh LaMaster's partitioning above may introduce problems for accessing
the on-chip instruction cache, though, especially if it is physically
addressed, since instruction fetches would then depend on the off-chip MMU
for translation.

>Well, given that latency to memory is a serious problem these days, and
>that MMU address translation is often on the critical path, moving
>it off-chip doesn't sound like such a good idea.

My reasoning is:

	access to reasonable cache         -> need to go off-chip
	MMU is used to access cache        -> need to go off-chip
	since you need to go off-chip anyway -> put MMU off-chip

	floating-point computations        -> can be done internally
	FPU *needs* the integer unit       -> put it close to the processor
	close to the processor             -> at least closely coupled,
	                                      better on-chip.

>I've said it before: I'm *astounded* nobody else has used this idea.
>It's such a great Win.  Cache control is the custom bit, so do it
>in custom logic.  With all the rest of the custom logic: on the
>microprocessor.  Cache RAM is very generic.  So don't re-invent the
>wheel.

FPU is also quite custom! :-)  --> put it on the same chip!

>Has anyone out there (other than MIPS, of course) considered this scheme
>and then rejected it?  Is my enthusiasm blind to some Great Problem?

I think that one of the reasons why some companies have rejected it is
that a chip with integer unit + FPU is HUGE. The R3010, a great FPU
coprocessor, with all its custom logic and its 75000 transistors, is quite
large (about 8.4 * 8.8 mm), especially when you compare it to an MMU.
It is easier (in terms of area) to put an MMU on-chip than an FPU,
at least for a good FPU!

					Marc Tremblay
					marc@CS.UCLA.EDU
					Computer Science Department, UCLA