Path: utzoo!mnetor!uunet!husc6!rutgers!sdcsvax!ucbvax!decvax!decwrl!spar!hunt
From: hunt@spar.SPAR.SLB.COM (Neil Hunt)
Newsgroups: comp.lang.c
Subject: Re: C machine
Message-ID: <91@spar.SPAR.SLB.COM>
Date: 1 Jan 88 21:22:45 GMT
References: <7535@alice.UUCP> <8226@steinmetz.steinmetz.UUCP> <461@auvax.UUCP> <9961@mimsy.UUCP> <166@teletron.UUCP>
Reply-To: hunt@spar.UUCP (Neil Hunt)
Organization: SPAR - Schlumberger Palo Alto Research
Lines: 73
Summary: Justification for 32 bit ints on 68k machines.

In article <166@teletron.UUCP> andrew@teletron.UUCP (Andrew Scott) writes:
>
>[...] Our 68000 compiler has 16 bit shorts, 32 bit longs (which
>make sense) and 32 bit ints (which doesn't always make sense).
>
>A lot of code I've come across uses scratch variables (array indices etc.) of
>type int. Of course, 32 bit arithmetic must be used.

Since the 68000 has 32 bit registers, there is frequently a penalty
on 16 bit operations: what does the compiler do about the other 16 bits
in the registers? At least in the Sun compilers, it is very hard
to persuade the compiler not to put a sign-extend `extl dn' instruction
after every load of a short variable into a register, and a clear
`moveq #0,dn' instruction before each load of an unsigned short value
into a register.

Another (perhaps less defensible) reason is that a lot of code tends
to be rather cavalier about exchanging pointers and ints (particularly
in function return values, for example), and a 16 bit int would break
all of this code.

>However, the 68000 has
>16 bit divide and multiply instructions, which are *much* faster than the
>subroutine calls to the 32 bit arithmetic routines. The case could be made
>that a 16 bit quantity is the "natural" size for arithmetic operations for
>the 68000.

Indeed the 68000/8/10/12 have a 16x16->32 bit multiply instruction,
and a function call is required for a long multiply.
Note, however, that when both operands fit into 16 bits, this fact
is quickly discovered and the short multiply is used instead:

        jsr     lmult           ; 20

lmult:                          ; d0 and d1 are the operands.
        movl    d2,sp@-         ; 14
        movl    d0,d2           ;  4
        orl     d1,d2           ;  6    ; OR all the bits together.
        clrw    d2              ;  4    ; clear bits 0..15, leaving 16..31.
        tstl    d2              ;  4
        bnes    ...             ;  6    ; if 16..31 are not zero, branch to ...
        mulu    d1,d0           ; 40    ; do the simple multiply
        movl    sp@+,d2         ; 12
        rts                     ; 16
                                -----
                                 126 cycles

This is using 68010 timings, with some assumptions.

We see that, even counting the entire function call overhead,
there is only a factor of 3.1 (126 cycles against 40 for the bare `mulu')
between the function call and the use of the hardware instruction directly.
Things are perhaps not so bad!

The Sun compiler is also smart enough to recognise when a multiply
by a constant is possible in a 16 bit instruction, and uses it rather
than the function call in these cases.

Finally, the 68020 has three sizes of multiply instruction:
16x16->32, 32x32->32, and 32x32->64. On this machine there is little
penalty in having 32 bit ints, and the other advantages still apply.
A compiler writer aware that any 16/32 bit decision for ints would apply
across all 68k machines would probably not decide upon 16 bits just because
some of the machines are slightly slower on one instruction,
especially when all the machines would have to pay the penalty
of maintaining the high bits in the registers if 16 bits were the decision.

Neil/.

PS: Try using:

        a = (int)((short)x * (short)y);

if you really need that factor of 3 back in the multiply instruction -- 
on a Sun 2 this generates a `muls' instruction!
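
For readers more at home in C than in 68k assembler, the fast-path test
in the lmult listing can be sketched roughly as below. This is only an
illustration, not the Sun library source: the names fits_in_16_bits,
lmult_sketch, short_mul and lmult_selfcheck are made up here, the general
path (elided as "..." in the listing) is stood in for by an ordinary
32 bit multiply, and the fixed-width types from <stdint.h> are assumed to
be available.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mirrors orl/clrw/tstl in the listing.
   OR the operands together and look only at bits 16..31. */
int fits_in_16_bits(uint32_t a, uint32_t b)
{
    return ((a | b) & 0xFFFF0000u) == 0;
}

/* Hypothetical sketch of lmult for unsigned operands. */
uint32_t lmult_sketch(uint32_t a, uint32_t b)
{
    if (fits_in_16_bits(a, b)) {
        /* Fast path: a 16x16->32 multiply, which is what mulu provides. */
        return (uint32_t)(uint16_t)a * (uint32_t)(uint16_t)b;
    }
    /* General path: not shown in the post. A plain 32 bit multiply
       (result modulo 2^32) is used here as a stand-in. */
    return a * b;
}

/* The trick from the PS, with fixed-width types: narrow both operands to
   16 bits so that a single 16x16->32 signed multiply is sufficient.
   Only meaningful when x and y really are in the range of a short. */
int32_t short_mul(int32_t x, int32_t y)
{
    return (int32_t)(int16_t)x * (int32_t)(int16_t)y;
}

/* Small self-check: both paths must agree with an ordinary multiply. */
void lmult_selfcheck(void)
{
    assert(lmult_sketch(300u, 200u) == 300u * 200u);     /* fast path */
    assert(lmult_sketch(65536u, 3u) == 65536u * 3u);     /* general path */
    assert(short_mul(-300, 100) == -30000);
}
```

Whether a given compiler turns short_mul into a single `muls' depends on
the compiler and the target; the post reports that behaviour only for
the Sun 2.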