Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!ulysses!allegra!mit-eddie!husc6!think!ames!lll-lcc!pyramid!prls!mips!mash
From: mash@mips.UUCP
Newsgroups: comp.arch
Subject: Re: RISC data alignment
Message-ID: <1399@winchester.mips.COM>
Date: 24 Jan 88 04:39:09 GMT
References: <2635@calmasd.GE.COM>
Reply-To: mash@winchester.UUCP (John Mashey)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 85
Posted: Sat Jan 23 23:39:09 1988

In article <2635@calmasd.GE.COM> gjo@calmasd.UUCP (Glenn Olander) writes:
>Please forgive a possibly neophyte-type question, but is it true that
>there may be an inherent incompatibility between RISC and conventional
>machines? In particular, I believe that many RISC machines require data
>to be aligned on a natural boundary, e.g. longwords must be referenced
>on a 4-byte boundary. This requires compilers to make accommodations to
>ensure that such alignment always occurs, even if it means padding a
>data structure which contains mixed types of data....
>If this is true, then it would seem to also be true that a C structure
>could have different lengths, depending on whether it was compiled
>on a RISC or non-RISC machine. Further, it would seem that
>if that C structure were written out to a file, it could only be read
>properly by a machine of the same type as that which wrote it.
>Does such incompatibility truly exist? If I create a file on a Sun/4
>will I be able to read it on a Sun/3?

This issue is not inherent to RISC versus non-RISC machines.
Here are the implementation choices:

Machine:
M1:  require alignment for every object
        SPARC, 78000, most other RISCs
        WE32xxx, Clipper
M1A: alignment required, except practical unaligned code
        MIPS R2000 (odd case - see later)
M2:  no alignment required
        IBM 370, VAX, MC68020, NSC32xxx, Intel 80X86
M3:  alignment required for some, but not others
        MC68000 (16-bit on 16-bit boundary, 32 on 16-bit also)

C language choices:
C1:  align every piece of data on its natural boundary, padding as
     necessary, including padding the size of a structure to the maximum
     alignment requirement of anything found in that structure
     [this makes arrays of structures work].
C2:  align some things, but not others
(there might have been a C3: align nothing, but I couldn't think of a
C compiler that does it)

It's fairly clear that:
1) C1 (align everything) is the safest choice.
2) All of the M1 machines want C1 alignment.

Here's the matrix, with some examples:

        M1              M2              M3
C1      All M1s         VAX 4.3BSD
        R2000           68020-?         68010-?
C2      -               Sun-68020       Sun-68010

Some vendors (Convergent, at least, unless memory fails me) implemented
C1 even on 68010s that didn't require it. Sun aligned 32-bit items on
16-bit boundaries on 68000s and stayed that way on 68020s, but Sun-4s
indeed require 32 on 32. Thus, structures exist that do not have the same
mapping between Sun-3s and Sun-4s. Fortunately, most C compilers chose
C1 alignment, and many structures were padded by hand by people who
knew that unaligned things cost cycles on almost any machine.

Case M1A is a little odd: the MIPS R2000 requires alignment of things
on natural boundaries; however, "unaligned" load/store instruction pairs
are provided, which can be used either from assembler, or (soon) with
compiler switches to make the compiler use them when alignment is in doubt.
Thus, loading something whose alignment you're not sure of takes the
2 cycles that most systems need for something that is truly unaligned,
and storing is likewise.
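
To make the C1 rule and the Sun-3/Sun-4 difference concrete, here is a minimal sketch in C. It is an illustration only, not any vendor's compiler algorithm. The function name layout_struct and the max_align parameter are hypothetical, and the sketch assumes every member is a scalar whose natural alignment equals its size (1, 2, 4 or 8 bytes). A max_align of 4 models "32 on 32"; a max_align of 2 models the 68000-style "32 on 16" rule.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be > 0). */
static size_t round_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/*
 * Hypothetical helper: lay out n scalar members, in declaration order.
 *   sizes[i]   size in bytes of member i (assumed to be 1, 2, 4 or 8)
 *   max_align  cap on any member's alignment (4 = "32 on 32",
 *              2 = "32 on 16" as on the 68000)
 *   offsets[i] receives the byte offset chosen for member i
 * Returns the structure size, padded up to the largest member alignment
 * so that arrays of the structure keep every element aligned.
 */
size_t layout_struct(const size_t *sizes, size_t n,
                     size_t max_align, size_t *offsets)
{
    size_t offset = 0;
    size_t struct_align = 1;

    assert(max_align > 0);
    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] > 0);
        /* Natural alignment is the member's size, limited by the cap. */
        size_t align = sizes[i] < max_align ? sizes[i] : max_align;

        offset = round_up(offset, align);   /* insert padding if needed */
        offsets[i] = offset;
        offset += sizes[i];
        if (align > struct_align)
            struct_align = align;
    }
    return round_up(offset, struct_align);  /* tail padding */
}
```

Under these assumptions, a 16-bit member followed by a 32-bit member (sizes 2 and 4) lays out as offsets 0 and 4 with total size 8 when max_align is 4, but as offsets 0 and 2 with total size 6 when max_align is 2. That is the kind of structure that maps differently on the two Sun families, and why a raw dump of it to a file is not portable between them.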
Besides string manipulation and some COBOL-related things, a major
motivation for the unaligned load/store instructions is this:

The more serious issue here, actually, is that there is a class of program
that is excruciatingly difficult to port to pure M1-class machines.
Specifically, some large FORTRAN programs (such as old CAD systems), often
in the half-million-line-plus category, contain COMMON+EQUIVALENCE
combinations THAT CANNOT EVER BE ALIGNED UNTIL THE CODE IS HEAVILY
REWORKED. In particular, IBM-derived programs with INTEGER*2 in them
can be nasty. For some reason, some owners of such code (thru which
the armies have marched thru the years) have proved to be less than
excited about porting it...
-- 
-john mashey    DISCLAIMER:
UUCP:   {ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:    408-991-0253 or 408-720-1700, x253
USPS:   MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086