Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!lll-winken!uunet!tektronix!reed!mdr
From: mdr@reed.UUCP (Mike Rutenberg)
Newsgroups: comp.arch
Subject: Re: RISC vs unaligned data
Message-ID: <12264@reed.UUCP>
Date: 31 Mar 89 23:19:54 GMT
References: <355@bnr-fos.UUCP> <13@microsoft.UUCP> <16058@cup.portal.com> <370@bnr-fos.UUCP> <11222@tekecs.GWD.TEK.COM>
Reply-To: mdr@reed.UUCP (Mike Rutenberg)
Organization: Reed College, Portland OR
Lines: 31

Andrew Klossner writes:
>For example, on the 88k, an architecture that doesn't have particularly
>good support for unaligned data, the compiler might generate code like
>this to fetch a word from an address that it knows will be odd:
[code example]
>If the word is in the data cache, this takes seven cycles and wastes
>two scratch registers (r2 and r3).  (The code to fetch from an even but
>unaligned address takes five cycles.)  With hardware support it could
>do a better job ... but is it necessary to fetch an unaligned word in
>fewer than seven cycles?  That fetch takes fewer nanoseconds than it
>does on the modern, unalignment-forgiving CISC machine that I'm typing
>this on, which after all is the bottom line in RISC vs CISC.

The main problem is that the 7 instruction sequence you gave takes up
i-cache space and as you indicated needs registers.  If the processor
can deal with unaligned accesses, with the clearly associated
performance hit it implies over aligned data, you may get better icache
performance.  This becomes a bigger deal with larger programs and too
much unaligned data.

The Intel 80960KA has a nice memory interface that is fast and allows
unaligned data references.  Among things which assist in unaligned
accesses is a 3*(1-8 byte) fifo for outstanding memory write requests
(a similar fifo is for read requests).
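
To make the trade-off concrete, here is a rough C sketch (not the 88k
sequence quoted above, which was elided) of what software has to do when
the hardware will not fetch an unaligned word itself.  The function
names are hypothetical, and big-endian byte order is assumed for the
byte-assembly version purely for illustration.  Every inlined copy of
the first routine costs several loads, shifts and ORs plus scratch
registers, which is the i-cache and register pressure at issue; the
memcpy version leaves the choice to the compiler, which may emit a
single load on a machine that tolerates unaligned access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit word from any address one byte at
 * a time, so no unaligned word access is ever issued.  Big-endian byte
 * order is an assumption made for this sketch. */
uint32_t load_u32_bytes(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: the same read in the machine's native byte
 * order.  memcpy has no alignment requirement on its arguments, so the
 * compiler is free to pick whatever sequence the target needs. */
uint32_t load_u32_native(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Calling load_u32_bytes(buf + 1) on a byte buffer reads the four bytes
starting at the odd address buf + 1 without ever forming a misaligned
word pointer.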

Mike
-- 
Mike Rutenberg      Reed College, Portland Oregon     (503)239-4434 (home)
BITNET: mdr@reed.bitnet      UUCP: uunet!tektronix!reed!mdr
Note: These are personal remarks and represent no known organization --mdr