Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!sun-barr!decwrl!hplabs!hpda!hpcuhb!hpcllla!hpclisp!hpclmar!mar
From: mar@hpclmar.HP.COM (Michelle Ruscetta)
Newsgroups: comp.sys.hp
Subject: Re: 9000/835 loader and assembler problems
Message-ID: <1340056@hpclmar.HP.COM>
Date: 8 Jun 89 00:03:41 GMT
References: <14870@comp.vuw.ac.nz>
Organization: Hewlett-Packard Calif. Language Lab
Lines: 219

> I've hit a small problem trying to port KCl to our 835.
>
> KCl uses dynamic loading [ld -A] to load its object files into
> memory. However, it has appended some text to the object file which
> it loads separately. All the lds that I have seen before, allow extra
> rubbish on the end of object files, but the 835 loader says
>
> /bin/ld: foo.o: Not a valid object file (invalid system id)

[ correctly answered in previous response ]

>
> Another problem I am having, is what to do with the object file after
> I have loaded it. I read the object module's header to determine how
> much space I should allocate in memory. I allocate the space, and
> pass the starting address to "ld -A bar -R %x -o baz". The object file
> that I get back has a number of interesting properties
>
> - the starting address of the text segment has been rounded up to a
>   page boundary -- is there anything in the architecture that
>   requires this?
> - the starting address of the data segment has also been rounded up
>   to a page boundary. Again is there any real reason for this?
>

Yes, the HPUX loader requires page alignment of both the text and data
segments. This is primarily because memory protection is done on a
page basis. Even though you will essentially be 'loading' your own code,
this alignment is still performed.

>
> - I want to branch to the first routine in the file. The
>   inter-space stub seems to be at TEXT+4 (*TEXT is a break). Is
>   there a better way to find this?
>

You MUST use the "exec_entry" field in the HPUX auxiliary header (which
immediately follows the standard file header), or use the entry_offset
field in the file header.

> - the header gives a different size than the one I worked out
>   earlier [Surprise, surprise]. Any suggestions on a better size
>   predictor (I am currently using size+PAGESIZE).
>

Sorry, no good size predictor -- it is very difficult to determine
the size of an a.out file, given a relocatable object file, unless you
know that the a.out doesn't include any code from other objects.

> I guess I should write my own linker :-(
>

Good luck! -- The linker for the series 800 is much more complex than the
linkers I have seen for other CISC architectures -- due to some RISCY
requirements.

There are some other things that complicate dynamic linking on the s800
architecture:

  1) HP-UX on the s800 still does not support non-sharable, writable
     text, so dynamically-loaded code must be placed in the data space.
     This means that inter-space "stubs" must be created in order to
     support branching between the code and the data space (this is
     because the standard procedure call and return sequence cannot
     branch across spaces).

  2) The process of "stack unwinding" cannot handle dynamically-loaded
     code, so getting a stack trace from a debugger will be impossible
     when executing within the dynamically loaded code -- this is also
     why the Pascal try/recover (escape()) feature will not work.

  3) Address relocation is complicated by the instruction format, which
     is not a typical "add a constant to a full word" type of patching
     (in fact, for fun take a look at the a.out manual page to see
     what the fixup formats look like).

  4) You have to be careful about flushing the instruction/data caches
     (due to #1 above) before executing the code that has been 'loaded'
     into memory.
Below is an example of a program which uses dynamic linking; it might
give you some help/insight as to what's involved with dynamic linking on
the series 800, using the ld -A option.

The -A option was implemented in the s800, HPUX 3.0 release.

The -A option is used when you want to dynamically link a file from an
existing 'main' program. The link command is called from within the main
program (using 'system()' or 'exec()'), using the main program as the
basefile (ld -A basefile ...) so that any symbols defined in the basefile
will be used to resolve references from the file which is being
dynamically linked (for example, if you want to make calls from a
dynamically linked function to routines which are defined in the main
program).

Normally, space is allocated in the main program's data area using
malloc(), but since you don't know the size of the executable file that
you will be placing into the data area, the malloc size is just a guess.
The address returned from malloc must be page-aligned, and can then be
used in the link command (ld -A basefile -R data_address ...) to tell
the linker to link the file using that address for code placement. The
link command should also specify the -N option to tell the linker to
place the data immediately following the code, since we want code and
data to be contiguous when we read it into the main program's data area.

The executable file resulting from the link can then be read into the
space allocated, using information from the HPUX auxiliary header record
such as the size of text, the file location of the program entry point,
and the size of data. The executable file is read into data, and can
then be executed by dereferencing a function pointer which has been set
to the address of the entry point (found in the HPUX auxiliary header).
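
As a side note, the page-alignment and command-building steps described
above can be sketched in portable C. This is only a minimal sketch: the
helper names page_align_up() and build_link_command() are hypothetical
(they are not HP-UX routines), the page size is passed in by the caller
rather than assumed, and %lx is used in place of %x so the address also
prints correctly where pointers are wider than int. Only the ld options
already mentioned (-A, -R, -N, -o, -e) appear.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: round addr up to the next multiple of page_size.
   page_size must be a power of two; an already-aligned address is
   returned unchanged. */
static uintptr_t page_align_up(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr + page_size - 1) & ~(page_size - 1);
}

/* Hypothetical helper: format the "ld -A ..." command line into buf.
   Returns 0 on success, -1 if buf is too small or formatting failed. */
static int build_link_command(char *buf, size_t buflen,
                              const char *basefile, uintptr_t load_addr,
                              const char *object, const char *output,
                              const char *entry)
{
    int n = snprintf(buf, buflen, "ld -A %s -R %lx -N %s -o %s -e %s",
                     basefile, (unsigned long)load_addr,
                     object, output, entry);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Since rounding up can consume as much as one page minus one byte of the
malloc'ed block, it seems prudent to ask malloc for one extra page so
that the aligned address still has the full guessed size behind it.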
There are other details to be taken care of as well, such as doing a
memset for BSS (to initialize all of bss to zero), since the loader
(exec()) usually does that for you, and we are bypassing the loader.

Basic steps:
(Note: this is not necessarily a complete nor syntactically correct
C program but serves for illustration only):

main()
{
    char *x;
    int (*funcptr)();

    x = malloc(some_large_size);

    /* page align since ld expects page-aligned value for -R */
    x = page_align(x);

    /* get the value of 'x' into the ld command that we are going to call */
    sprintf(cmd_buf, "ld -A basefile -R %x -N dynfunc.o -o dynfunc -e foo", x);

    /* call the linker to link the file */
    system(cmd_buf);

    /* now we open the resulting executable for reading */
    fileptr = fopen("dynfunc", "r");

    /* seek to and read the auxiliary header record */
    fseek(fileptr, sizeof(struct header), 0);
    fread(&filhdr, sizeof(filhdr), 1, fileptr);

    /* determine the size of the executable (start of text through end
       of BSS) -- and see if we allocated enough space */
    dynfunc_size = filhdr.exec_dmem + filhdr.exec_dsize
                   + filhdr.exec_bsize - filhdr.exec_tmem;
    if (dynfunc_size > some_large_size)
    {
        /* do something -- either error, or realloc and relink */
    }

    /* seek to and read in the text area of the dynamically linked file */
    fseek(fileptr, filhdr.exec_tfile, 0);
    fread((char *) filhdr.exec_tmem, filhdr.exec_tsize, 1, fileptr);

    /* seek to and read in the data area of the dynamically linked file */
    fseek(fileptr, filhdr.exec_dfile, 0);
    fread((char *) filhdr.exec_dmem, filhdr.exec_dsize, 1, fileptr);

    /* init the BSS area to zero */
    memset((char *) filhdr.exec_dmem + filhdr.exec_dsize, 0,
           filhdr.exec_bsize);

    /* set the function ptr to the entry point of the dynamically
       linked file */
    funcptr = (int (*)()) (filhdr.exec_entry);

    /* flush the data and instruction caches -- note this must be done
       on the series 800 ! -- see the flush_cache assembly routine below */
    flush_cache(filhdr.exec_tmem, filhdr.exec_tsize);

    /* call the dynamically linked function */
    (*funcptr)();
}

/* END OF PROGRAM */

The following is the routine that can be used to flush the caches,
courtesy of Cary Coutant:

;
; Routine to flush and synchronize data and instruction caches
; for dynamic loading
;
; Copyright Hewlett-Packard Co. 1985
;

        .code

; flush_cache(addr, len) - executes FDC and FIC instructions for every cache
; line in the text region given by the starting address in arg0 and
; the length in arg1. When done, it executes a SYNC instruction and
; the seven NOPs required to assure that the cache has been flushed.
;
; Assumption: the cache line size must be at least 16 bytes.

        .proc
        .callinfo
        .export flush_cache,entry
flush_cache
        .enter
        ldsid   (0,%arg0),%r1
        mtsp    %r1,%sr0
        ldo     -1(%arg1),%arg1
        fdc     %arg1(0,%arg0)
loop    fic     %arg1(%sr0,%arg0)
        addib,>,n -16,%arg1,loop    ; decrement by cache line size
        fdc     %arg1(0,%arg0)
        ; flush first word at addr, to handle arbitrary cache line boundary
        fdc     0(0,%arg0)
        fic     0(%sr0,%arg0)
        sync
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        .leave
        .procend
        .end