Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!husc6!hao!noao!mcdsun!fnf
From: fnf@mcdsun.UUCP
Newsgroups: comp.lang.c
Subject: Re: Types
Message-ID: <366@mcdsun.UUCP>
Date: Wed, 2-Sep-87 12:36:35 EDT
Article-I.D.: mcdsun.366
Posted: Wed Sep 2 12:36:35 1987
Date-Received: Fri, 4-Sep-87 04:09:34 EDT
References: <7264@brl-adm.ARPA> <734@sdchema.sdchem.UUCP> <293@osupyr.UUCP> <364@mcdsun.UUCP> <320@swlabs.UUCP>
Reply-To: fnf@mcdsun.UUCP (Fred Fish)
Organization: Motorola Microcomputer Division
Lines: 116

In article <320@swlabs.UUCP> jack@swlabs.UUCP (Jack Bonn) writes:
>In article <364@mcdsun.UUCP>, fnf@mcdsun.UUCP (Fred Fish) writes:
>> This only works for jumps to local symbols within the same assembly module,
>> and GREATLY complicates the work of a linker which optimizes both local
>> and global references (since it must now cope with relocation done by
>> the assembler when it inserts or deletes code between the definition and
>> reference of a local symbol).
>
>The algorithm presented requires no such optimization by the linker.  The
>assembler described yields object code with the correct long/short jumps and
>no additional relocation/fixup information.
>
>There is nothing for the linker to fixup.

Sigh, guess we are going to have to go into the gory details.

Assume a hypothetical machine with two forms of a jump instruction.  One
is 32 bits, split evenly: 16 bits of instruction opcode and 16 bits of
PC relative offset.  The other jump instruction is two long words, each
32 bits: the first holds a different 16 bit opcode with the other 16 bits
unused, and the second holds a 32 bit PC relative offset.  I.E.:

    jump.near    <16 bit opcode><16 bit offset>

    jump.far     <16 bit opcode><    unused    >
                 <        32 bit offset        >

Now assume some assembly code such as the following (note the lack of
near/far specifications):

                    .
                    .
    0x00000100          jump  L1        #L1 is local
                    .
                    .
    0x00000200          jump  _strcpy   #_strcpy is defined elsewhere
                    .
                    .
    0x00000300      L1  jump  _exit     #_exit is defined elsewhere
                    .
                    .
        ^
        |
    Offset from start of module

The assembler assumes all jumps are near jumps and simply emits object code
that looks something like:

                    .
    0x00000100          <16 bits of 0>
                    .
                    .
    0x00000200          <16 bits of 0>
                    .
                    .
    0x00000300      L1  <16 bits of 0>

Now along comes the linker.  It gathers up our object module, finds the
reference to external routine _strcpy, and loads the module from the
C library that defines _strcpy (along with lots of other stuff at the
same time).  Assume the definition of _strcpy in another module is
located in the executable 0x00100000 bytes from the jump instruction
above that uses it (heh, this is GNU Emacs :-).  Now obviously 0x00100000
won't fit in 16 bits, so the linker must modify the object code to look
something like:

                    .
    0x00001100          <16 bits of 0>
                    .
                    .
    0x00001200          <16 bits of 0>              #modified opcode
    0x00001204          <     32 bit offset     >   #inserted field
                    .
                    .
    0x00001304      L1  <16 bit opcode><16 bits of 0>
                    .
                    .
    0x00101200
        ^
        |
    Offset from start of executable module

After relocation resolution, the final object code looks like:

    0x00001100          <0x0204>
                    .
                    .
    0x00001200          <0x0000>                    #modified opcode
    0x00001204          <      0x00100000       >   #inserted field
                    .
                    .
    0x00001304      L1  <0x1234>
                    .
                    .
    0x00101200

Suppose the assembler had decided that, since the jump to L1 was a local
jump of "known" offset 0x200, it could stick 0x200 in the offset field and
throw away the relocation information.  The linker would then not have been
able to go back and fix up the 0x200 to 0x204 without scanning all the
executable code looking for references that spanned the insertion point
(at 0x1200-0x1204).  This would have essentially required you to disassemble
the entire load module EVERY TIME you promoted a jump.near instruction to a
jump.far instruction!

Hope this helps to clarify why the assembler SHOULD NOT do any relocation.

-Fred

--
= Drug tests; just say *NO*!
= Fred Fish    Motorola Computer Division, 3013 S 52nd St, Tempe, Az 85282  USA
= seismo!noao!mcdsun!fnf    (602) 438-3614
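
The promotion step walked through in the post can be sketched in C.  This is
only a rough illustration under stated assumptions, not the code of any real
linker: the names jump_reloc, fits_near, shift_after and relax are
hypothetical, the offset is taken as target minus the address of the jump
(which matches the 0x200 and 0x100000 figures above), and only addresses are
tracked, not the instruction bytes themselves.  The point it tries to show is
that when every jump keeps its relocation record, promoting one jump.near to
jump.far is a matter of walking the records, with no disassembly needed.

```c
#include <stddef.h>
#include <stdint.h>

#define NEAR_SIZE 4u  /* jump.near: 16 bit opcode + 16 bit offset */
#define FAR_SIZE  8u  /* jump.far: opcode word + 32 bit offset word */

/* One retained relocation record per jump (hypothetical layout). */
struct jump_reloc {
    uint32_t site;    /* address of the jump instruction */
    uint32_t target;  /* address of the symbol it refers to */
    int      is_far;  /* nonzero once promoted to jump.far */
};

/* Does a PC relative offset fit in the signed 16 bit field? */
static int fits_near(int64_t off)
{
    return off >= INT16_MIN && off <= INT16_MAX;
}

/* Everything located after the grown instruction moves up by delta. */
static void shift_after(struct jump_reloc *r, size_t n,
                        uint32_t point, uint32_t delta)
{
    for (size_t i = 0; i < n; i++) {
        if (r[i].site > point)
            r[i].site += delta;
        if (r[i].target > point)
            r[i].target += delta;
    }
}

/* Promote near jumps that cannot reach their target, repeating until
 * no further promotion is needed.  Returns the number promoted. */
size_t relax(struct jump_reloc *r, size_t n)
{
    size_t promoted = 0;
    int changed;

    do {
        changed = 0;
        for (size_t i = 0; i < n; i++) {
            int64_t off = (int64_t)r[i].target - (int64_t)r[i].site;
            if (!r[i].is_far && !fits_near(off)) {
                r[i].is_far = 1;
                shift_after(r, n, r[i].site, FAR_SIZE - NEAR_SIZE);
                promoted++;
                changed = 1;
            }
        }
    } while (changed);

    return promoted;
}
```

Feeding it the three jumps from the post, with sites 0x1100, 0x1200 and 0x1300
and targets 0x1300, 0x1011FC and 0x2534, promotes only the jump to _strcpy.
The two external target addresses are assumptions, picked so that the result
lines up with the final listing.  Afterwards the jump to L1 has offset 0x204,
the jump to _strcpy has offset 0x100000, and the jump at L1 sits at 0x1304
with offset 0x1234.  Had the record for the jump to L1 been thrown away by the
assembler, shift_after would have had nothing to adjust for it.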