Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!tektronix!orca!tekecs!frip!andrew
From: andrew@frip.wv.tek.com (Andrew Klossner)
Newsgroups: comp.arch
Subject: RISC vs unaligned data
Message-ID: <11222@tekecs.GWD.TEK.COM>
Date: 31 Mar 89 18:47:49 GMT
References: <355@bnr-fos.UUCP> <13@microsoft.UUCP> <16058@cup.portal.com> <370@bnr-fos.UUCP>
Sender: andrew@tekecs.GWD.TEK.COM
Organization: Tektronix, Wilsonville, Oregon
Lines: 45

[]

    "the historical trend is to be progressively more tolerant of
    misalignment, e.g. IBM /360 /370, Motorola 68K families. All the
    "tolerant" machines always attach a *penalty* to misalignment. It
    is only the very recent crop of so-called RISC chips that is
    requiring alignment again."

Many contributors to this discussion seem to hold the opinion that, if
unaligned access isn't supported by hardware, it isn't supported at
all. But one of the points of RISC is to move complexity from hardware
to software. Why not just let the compiler do it?

If the compiler knows the alignment of a word (the low two bits of the
address are a compile-time constant, as for an unaligned word within an
aligned structure), it can do a (slightly) better job than if it is
totally clueless about the runtime address. PL/I provided the
"UNALIGNED" specifier to advantage on the 360/370 machines. A system
supplier willing to extend their C language could add a similar
construct to C.

For example, on the 88k, an architecture that doesn't have particularly
good support for unaligned data, the compiler might generate code like
this to fetch a word from an address that it knows will be odd:

    ; address of unaligned word to fetch is in r10
    ld.bu   r1,r10,0
    ld.hu   r2,r10,1
    ld.bu   r3,r10,3
    mak     r1,r1,8<24>
    mak     r2,r2,16<8>
    or      r1,r1,r2
    or      r1,r1,r3
    ; word is in r1

If the word is in the data cache, this takes seven cycles and wastes
two scratch registers (r2 and r3). (The code to fetch from an even but
unaligned address takes five cycles.)
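For readers who don't speak 88k assembler, here is roughly the same
sequence as a C sketch. It is an illustration only, not compiler
output: the function names are made up for this example, big-endian
byte order is assumed (as on the 88k), and load_half_be() merely stands
in for an aligned halfword load so that the code runs on any host. The
even-but-unaligned variant is a guess at what such code would look
like; the sequence itself is not given above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an aligned big-endian halfword load
 * (what ld.hu does on the 88k); built from byte loads so it is portable. */
static uint32_t load_half_be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 8) | (uint32_t)p[1];
}

/* Fetch a 32-bit big-endian word from an address known to be odd:
 * byte, aligned halfword, byte, then shift and merge. */
static uint32_t fetch_word_odd(const uint8_t *p)
{
    assert(((uintptr_t)p & 1u) == 1u);    /* caller promises an odd address */
    uint32_t hi  = p[0];                  /* ld.bu r1,r10,0 */
    uint32_t mid = load_half_be(p + 1);   /* ld.hu r2,r10,1 (p+1 is even)  */
    uint32_t lo  = p[3];                  /* ld.bu r3,r10,3 */
    return (hi << 24) | (mid << 8) | lo;  /* mak, mak, or, or */
}

/* Assumed variant for an address that is even but not word aligned
 * (address mod 4 == 2): two aligned halfword loads are enough. */
static uint32_t fetch_word_even_unaligned(const uint8_t *p)
{
    assert(((uintptr_t)p & 3u) == 2u);    /* caller promises addr % 4 == 2 */
    uint32_t hi = load_half_be(p);
    uint32_t lo = load_half_be(p + 2);
    return (hi << 16) | lo;
}
```

The point of the two variants is that the compiler picks one at compile
time from the known low address bits; neither contains a run-time test
of the address (the asserts are only there to document the
precondition).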

With hardware support it could do a better job ... but is it necessary
to fetch an unaligned word in fewer than seven cycles? That fetch
takes fewer nanoseconds than it does on the modern,
unalignment-forgiving CISC machine that I'm typing this on, which after
all is the bottom line in RISC vs CISC.

  -=- Andrew Klossner   (uunet!tektronix!orca!frip!andrew)      [UUCP]
                        (andrew%frip.wv.tek.com@relay.cs.net)   [ARPA]