Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!lll-winken!uunet!auspex!guy
From: guy@auspex.UUCP (Guy Harris)
Newsgroups: comp.arch
Subject: Re: Unaligned Accesses (was Re: How to use silicon)
Message-ID: <1334@auspex.UUCP>
Date: 30 Mar 89 22:47:46 GMT
References: <844@bnr-rsc.UUCP> <150@mirsa.inria.fr>
Reply-To: guy@auspex.UUCP (Guy Harris)
Organization: Auspex Systems, Santa Clara
Lines: 56

>From experience, I can say that not having to bother with alignment can
>make networking code much faster. Messages exchanged over nets include all
>sort of variable size components, and padding is both frowned upon and
>somewhat ineffective -- often, one cannot predict where a structure will
>start.

Well, for TCP, UDP, and IP headers, at least, items are aligned on their
"natural" boundaries.  Not *all* protocols start stuff on random byte
boundaries....

>Thus, being able to simply define:
>
>#define integer_value(mess,x) (*(int *)(mess + x))
>
>rather than:
>
>#define integer_value(mess,x) \
>	(((((mess[x]<<8)|mess[x+1])<<8)|mess[x+2]<<8)|mess[x+3])
>
>can speed up a lot the decoding of these messages.

Umm, the latter appears to have the advantage that it works regardless
of the byte order of your processor; messages exchanged over nets are
often exchanged between big-endian and little-endian processors, so you
can't necessarily just point at some arbitrary location in your message
and suck up an "int", even if your processor *does* support unaligned
references.

>Indeed, one can object the relative frequency of network operations
>vs computing, but the same holds for compression programs,

Well, "compress" (a common Lempel-Ziv compression program) works on
bytes, or on bit strings; the former don't have alignment problems, and
the latter have alignment problems that the sort of unaligned references
support that is being talked about here won't fix....

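To illustrate the bit-string case: a minimal sketch, assuming variable-width
codes (here 1 to 16 bits) packed low-order bit first into a byte buffer. The
names put_bits and get_bits are hypothetical and are not taken from the
"compress" source. The point is that a code can start at any *bit* offset, so
the shift-and-mask work remains even on a processor that allows unaligned
word loads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store the low nbits of code at bit offset bitpos,
 * low-order bit first.  The buffer is assumed to start out zeroed. */
void put_bits(unsigned char *buf, size_t bitpos, unsigned nbits, unsigned code)
{
    unsigned i;

    for (i = 0; i < nbits; i++) {
        size_t pos = bitpos + i;
        if ((code >> i) & 1u)
            buf[pos / 8] |= (unsigned char)(1u << (pos % 8));
    }
}

/* Hypothetical helper: fetch the nbits-wide code that starts at bit offset
 * bitpos.  Only the bytes the code actually occupies are read (at most
 * three), one at a time, so no alignment is assumed.  The caller must make
 * sure the code lies entirely inside the buffer. */
unsigned get_bits(const unsigned char *buf, size_t bitpos, unsigned nbits)
{
    unsigned long acc = 0;                    /* bytes gathered so far */
    unsigned got = 0;                         /* number of bits in acc */
    size_t byte = bitpos / 8;                 /* first byte touched */
    unsigned shift = (unsigned)(bitpos % 8);  /* bit offset in that byte */

    assert(nbits >= 1 && nbits <= 16);
    while (got < shift + nbits) {
        acc |= (unsigned long)buf[byte++] << got;
        got += 8;
    }
    /* Drop the bits below the code, then mask off the bits above it. */
    return (unsigned)((acc >> shift) & ((1UL << nbits) - 1));
}
```

With 9-bit codes, for example, the second code starts at bit 9 and the third
at bit 18, so neither begins on a byte boundary, let alone a word boundary.
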
(The original "compress" had VAX bit-field instructions jammed into it;
it now has that controlled by "#ifdef vax" - has anybody bothered using
other processors' bit-field instructions, and has it actually made any
difference?)

>disk accesses,

Huh?  "Disk accesses", as I think of them, are usually to sectors,
typically of 512 bytes.  As for data *on* disk, well:

	1) if you write out structures in the native format, they will
	   presumably be aligned properly;

	2) if you write them out in the native format for one machine,
	   and read them in on a different machine, you have problems
	   other than alignment, such as big-endian vs. little-endian
	   format - see the comments above on the networking case;

	3) if you write them out in some "common" format (e.g., XDR, or
	   ASN.1), you're already doing processing work, so you don't
	   just point some structure pointer at the data and go.
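
The byte-order point, and the "common format" case in 3), can be sketched as
follows. This is an illustration only: put_be32, get_be32, struct record and
its two fields are hypothetical names, and the layout simply assumes 32-bit
integers stored most significant byte first, as XDR does for its integers.
Note that in the macro as quoted, the third "<<8" binds to mess[x+2] alone,
and a plain "char" buffer may sign-extend; the sketch uses unsigned bytes and
shifts each byte explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v at p as four bytes, most significant byte first.
 * p may point at any byte offset. */
void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Fetch a 32-bit value stored most significant byte first.  The result is
 * the same on big-endian and little-endian hosts, and no alignment of p is
 * assumed. */
uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical record, used only for this illustration. */
struct record {
    uint32_t id;
    uint32_t length;
};

#define RECORD_WIRE_SIZE 8   /* two 4-byte fields, no padding */

/* Encode field by field rather than writing the native struct image. */
void encode_record(unsigned char *out, const struct record *r)
{
    assert(out != NULL && r != NULL);
    put_be32(out, r->id);
    put_be32(out + 4, r->length);
}

/* Decode field by field rather than pointing a struct pointer at the data. */
void decode_record(struct record *r, const unsigned char *in)
{
    assert(r != NULL && in != NULL);
    r->id = get_be32(in);
    r->length = get_be32(in + 4);
}
```

Since every field goes through a conversion routine anyway, whether the
encoded record happens to start on an aligned address makes no difference to
the decoding code.
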