Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!samsung!noose.ecn.purdue.edu!mentor.cc.purdue.edu!pop.stat.purdue.edu!hrubin
From: hrubin@pop.stat.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: Hardware considerations
Message-ID: <12344@mentor.cc.purdue.edu>
Date: 15 May 91 10:08:59 GMT
References: <28297C23.6984@tct.com> <1991May15.003712.5909@jetsun.weitek.COM>
Sender: news@mentor.cc.purdue.edu
Lines: 55

In article <1991May15.003712.5909@jetsun.weitek.COM>, weaver@jetsun.weitek.COM (Mike Weaver) writes:
> In article <12295@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
> >.... Now division is, admittedly, a major headache, but is
> >there any good reason not to use essentially the same hardware for
> >integer and floating multiplication?
>
> One reason: the number of bits in the significand of a floating point
> number is less than the number of bits in the corresponding integer.
> For example, a full array multiplier (a common way to make a fast
> multiplier) has a size that may scale as the product of the sizes in
> bits of the two inputs. Thus for IEEE floating point, I estimate the
> increase in the size of the array (of adding elements) as follows:
>
>            n    n**2    m    m**2   ratio
> Single    24     576   32    1024    1.78
> Double    53    2809   64    4096    1.46
>
> n = significand bits, m = integer bits, ratio = m**2/n**2, is my
> estimate of the expansion in the size of the array to make a
> floating point multiplier into an integer multiplier.
>
> Also, when you do a floating point multiply, you know you will throw
> away the least significant half of the product. All you really need to
> know (for IEEE) is the most significant half of the product, plus the
> next three bits of the product, and whether or not the remaining bits
> were zero. This can lead to some hardware savings as the individual
> wires for these bits do not need to be carried through to the
> normalization stage, only a zero/non-zero indicator bit.
> My point is that there is a significant cost here (but you can use
> a 53x53->56 bit multiplier for a 32x32->32 bit multiplier).

On a decent number-crunching problem, 24-bit accuracy is not too useful.
It would make far more sense to call the IEEE single precision
half-precision and their double single precision. Some hardware
recognizes this by not even providing "single precision" hardware. I know
of at least one which uses 48 (really 47) for full precision and 24
(really 23) for half precision, and provides access to the least
significant part of the product. It also allows unnormalized arithmetic,
and has essentially no separate integer arithmetic.

The 11-bit IEEE exponent is also not always adequate.

If any more accuracy is needed, it is NECESSARY to go to integer
arithmetic. With unnormalized floating point, 32x32 -> 52 would not be
too difficult, but with IEEE, there is lots of overhead, so I am not
convinced that 26x26 -> 52 in the floating unit would be faster than
16x16 -> 32 in the integer unit. This is even more so if there are
separate integer and floating registers.

There are too many situations where forced normalization is a major
headache.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)
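
The arithmetic in this exchange can be checked with a short sketch. It is
an illustration only, not code from either poster, and it models the bit
counts in software rather than any real multiplier design. The function
names (array_ratio, kept_bits_and_sticky, exact_in_double) are
hypothetical, and the last function assumes that Python's float is an
IEEE double with a 53-bit significand, which holds on common platforms.
It recomputes the m**2/n**2 table, shows the "top half plus three bits
plus a zero/non-zero indicator" that the quoted text says an IEEE
multiply needs, and compares a 26x26 product with a 32x32 product when
both are formed in double precision.

```python
def array_ratio(n, m):
    """Ratio m**2 / n**2 of full-array sizes for m-bit vs n-bit operands."""
    return (m * m) / (n * n)


def kept_bits_and_sticky(a, b, n):
    """Multiply two n-bit significands a and b (given as integers).

    Returns the top n + 3 bits of the 2n-bit product field and a flag
    that is True when any of the dropped n - 3 low bits is non-zero.
    """
    full = a * b                      # exact product, up to 2n bits
    drop = n - 3                      # low bits that are not carried forward
    kept = full >> drop               # top half plus the next three bits
    sticky = (full & ((1 << drop) - 1)) != 0
    return kept, sticky


def exact_in_double(a, b):
    """True when float(a) * float(b) equals the exact integer product.

    Assumption: float is an IEEE double (53-bit significand).
    """
    return int(float(a) * float(b)) == a * b


if __name__ == "__main__":
    # Recompute the table from the quoted article.
    for name, n, m in (("Single", 24, 32), ("Double", 53, 64)):
        print(name, n, n * n, m, m * m, round(array_ratio(n, m), 2))

    # Largest 53-bit significand squared: 56 kept bits plus the indicator.
    big = (1 << 53) - 1
    print(kept_bits_and_sticky(big, big, 53))

    # 26x26 fits in 52 bits; 32x32 needs 64 bits and gets rounded.
    print(exact_in_double((1 << 26) - 1, (1 << 26) - 1))
    print(exact_in_double((1 << 32) - 1, (1 << 32) - 1))
```

For n = 53 the kept field is 56 bits wide, which is the 53x53->56
multiplier mentioned in the quoted text. The last two lines show why the
pieces have to be cut down to about 26 bits each if a double-precision
multiply is to return every bit of the product.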