Path: utzoo!attcan!uunet!seismo!sundc!pitstop!sun!quintus!ok
From: ok@quintus.uucp (Richard A. O'Keefe)
Newsgroups: comp.arch
Subject: Re: RISC v. CISC --more misconceptions
Message-ID: <623@quintus.UUCP>
Date: 3 Nov 88 04:45:46 GMT
References: <156@gloom.UUCP> <18931@apple.Apple.COM> <40@sopwith.UUCP> <19762@apple.Apple.COM> <1002@l.cc.purdue.edu> <19811@apple.Apple.COM>
Sender: news@quintus.UUCP
Reply-To: ok@quintus.UUCP (Richard A. O'Keefe)
Organization: Quintus Computer Systems, Inc.
Lines: 49

In article <19811@apple.Apple.COM> baum@apple.UUCP (Allen Baum) writes:
[Talking about integer multiplication and division.]
>I'm going further than that. I'm saying they are rare because the are
>unnecessary. They are rare because in the USUAL case they can be strength
>reduced to additions by an optimizing compiler. This is faster than using
>the obvious multiply instruction.

Did you notice the implicit assumption that multiplications are only for
address calculations?

Avoidable multiplications are rare because a generation of programmers
has been brainwashed that Hardware Rules, and if some potentially useful
operation is expensive it is their job to avoid it rather than have the
hardware and compiler people get it right.  People are still avoiding
procedure calls (and RISC designers are assuming that procedure calls are
not deeply nested) because old designs made procedure calls expensive.

The one which is really painful is division.  When one codes up a hash
table, one knows (having read the literature) that remainder with a prime
is a Good Thing.  But one also knows that whizzbang machine X has no
hardware support for division, so to avoid a subroutine call to a routine
not known for its speed one sighs, puts in X & 4095 (instead of
X1 % 4097 or whatever), and wishes...

But to be realistic about this, let's compare a couple of CISCs with
what a good RISC might do.
The issue is not absolute speed, but the intensity of the temptation to
distort your code to avoid a function perceived as expensive.  I measure
this as cost/(cost of ADD).

                MC68020  80386   generic  R2000  WBMX   88k
MULS.L/ADD.L    ~ 20     ~ 5-10  ~18      14     ~44    4
DIVS.L/ADD.L    ~ 45     ~ 20    ~35      35     ~150   39+possible trap

MC68020 figures from manufacturer's manual
80386   figures from manufacturer's manual
generic figures assume 2-bit-at-a-time multiply step, 1-b.a.a.t. divide step
WBMX    multiply from Whizzbang Ltd's manual, divide _estimated_ from
        manual; figures include procedure call overhead.
        WBMX has no divide step.
88k     figures from article <4759@pdn.UUCP> (Alan Lovejoy)
R2000   figures from article <7472@winchester.mips.COM> (Charlie Price)
        (1 for main op, + delay time 12 or 33, + 1 to pick up result)

The 80386 and 88k multiplies deliver 32 bits, the others 64.
The R2000 figures are worst case: other integer operations can be
overlapped with all but two of these cycles.  It would be interesting to
know how often this pays off.

The bottom line is that architectures should _support_ the operations
programmers find useful, but that some architects have shown that good
enough support can be had by doing part of an operation in hardware,
part in software.  Too bad about Whizzbang.
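
For concreteness, the hash-table temptation described earlier might look
like the following in C.  This is only a sketch: the function names and
both table sizes are made up for illustration (4093 is used because it is
the largest prime below 4096; it is not a figure from the discussion
above).

```c
#include <stdint.h>

/* Hypothetical table sizes, chosen only for illustration. */
#define TABLE_SIZE_POW2  4096u  /* power of two: slot found by masking */
#define TABLE_SIZE_PRIME 4093u  /* largest prime below 4096 */

/* The "distorted" version: a single AND, no divide required.
   Only the low 12 bits of the key influence the slot. */
uint32_t slot_by_mask(uint32_t key)
{
    return key & (TABLE_SIZE_POW2 - 1u);
}

/* The textbook version: remainder by a prime.  Every bit of the key
   influences the slot, but the operation needs a divide. */
uint32_t slot_by_prime(uint32_t key)
{
    return key % TABLE_SIZE_PRIME;
}
```

With keys that are multiples of 4096 (page-aligned addresses, say), the
masked version sends every key to slot 0: slot_by_mask(0x1000) and
slot_by_mask(0x2000) are both 0, whereas slot_by_prime gives 3 and 6.
That loss of spread is the price paid for dodging the divide.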