Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!purdue!mentor.cc.purdue.edu!l.cc.purdue.edu!cik
From: cik@l.cc.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: RISC as a "technology window"?
Summary: Integer arithmetic is similar to fp
Message-ID: <1188@l.cc.purdue.edu>
Date: 25 Mar 89 13:09:23 GMT
References: <1552@vicom.COM> <15690@cup.portal.com> <1562@vicom.COM> <717@m3.mfci.UUCP>
Organization: Purdue University Statistics Department
Lines: 65

In article <717@m3.mfci.UUCP>, rodman@mfci.UUCP (Paul Rodman) writes:
> In article <22974@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
> >
> >So, my question is: If you ASSUME that you have to have high speed arithmetic,
> >what is the best way to partition functions between chips?  I believe that the
> >best way is Control, ALU/FPU, and instruction cache on one chip, and data
> >cache/MMU on another chip.  Why doesn't the market agree with me?
> >
>
> Personally, I think the optimal partitioning for large f.p. problems would
> be to split the f.p. unit and registers onto another chip. The amount of
> comms required between the integer domain and floating domain is very
> small and extra cycles to go from one to the other aren't a problem (speaking
> from our experience with partitioning the cpu in just this way).
>
> I haven't
> thought about how to solve the problems in splitting integer data caches
> and floating data caches, but I'm sure there would be an acceptable solution.
> Assuming your compiler guys are up to it, :-)
>
> The main advantages here are:
<
<     - You can get more pins for the f.p. chip for more loads/stores per
<       clock on the f-unit. Also you can get more than 16 d.p. registers
<       (which isn't enough, in our experience, for two piped fu's).
<
<     - The i-chip, which made no use of the funit hardware, has more area
<       for integer goodies, including a larger on-chip data cache for
<       integer data. I would rather have the MMU on this chip to make sure
<       that the memory pipeline for explicit loads is one cycle shorter,
<       i.e. save a chip crossing here.
<
< Now the guys that don't use floating point can just buy the i-chip, those
< that want screaming f.p. perf buy both.
<
< I just don't see the point in doing hairy-chested cramming of f.p. hardware
< on the same chip as the integer stuff, when the two functional units
< are so nicely separable, to the benefit of each.

I can see the point of having separate address arithmetic and low-precision
multiplication for address purposes.  But restricting the term "integer
arithmetic" to that is destructive of computing power.

I am not arguing one way or the other on partitioning functions among chips.
I suspect it is a good idea, but this is not the point.

A floating point operation consists of separating the exponents from the
mantissas, differencing the exponents and shifting for addition and
subtraction, performing the fixed point operation, and performing the
necessary shifting and exponent calculation.  The cost is greatest for
multiplication and division, where the similarities between fixed and
floating point are greatest.  Indeed, many architectures with a floating
point accelerator do integer multiplication in that unit.

But suppose you want high precision arithmetic, integer, fixed point, or
floating point?  You now want a good integer arithmetic machine; if floating
point arithmetic must be used, integer arithmetic must be emulated in it,
which is quite clumsy.  The computational equipment for high precision
multiplication and division is largely the same for integer, fixed point,
and floating point.  For high-precision addition and subtraction, the
overlap is still great.

An architecture, language, or programmer not capable of taking advantage of
this must be considered limited.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
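
To make the decomposition of a floating point operation described above
concrete, here is a minimal sketch of a floating point multiply carried out
with integer operations only.  It is an illustration, not a description of
any particular machine.  The names split, fp_multiply, and MANT_BITS are
hypothetical, the 53-bit mantissa width assumes IEEE double precision, and
only finite operands whose product neither overflows nor underflows are
handled.

```python
import math

MANT_BITS = 53  # assumed mantissa width (IEEE double precision)

def split(x):
    """Separate x into an integer mantissa m and exponent e, x == m * 2**e."""
    frac, exp = math.frexp(x)  # x == frac * 2**exp with 0.5 <= |frac| < 1
    mantissa = int(math.ldexp(frac, MANT_BITS))  # exact: frac has <= 53 bits
    return mantissa, exp - MANT_BITS

def fp_multiply(a, b):
    """Multiply two floats via an integer multiply plus exponent arithmetic."""
    ma, ea = split(a)        # separate the exponents from the mantissas
    mb, eb = split(b)
    product = ma * mb        # the fixed point operation: one integer multiply
    exponent = ea + eb       # exponent calculation
    # Final shifting: convert the (up to 106-bit) integer product back to a
    # float, which rounds it to 53 bits, then scale by the summed exponent.
    return math.ldexp(product, exponent)

if __name__ == "__main__":
    print(fp_multiply(1.5, 2.25))
```

The integer multiply in the middle is the expensive step, and it is the same
operation a high precision integer or fixed point routine needs; the rest is
bookkeeping on the exponents.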