Path: utzoo!attcan!uunet!munnari!mimir!hugin!augean!idall
From: idall@augean.OZ (Ian Dall)
Newsgroups: comp.emacs
Subject: Re: Portability problem with gnu-emacs
Message-ID: <401@augean.OZ>
Date: 29 Sep 88 03:02:00 GMT
References: <441@myab.se> <29698@bbn.COM> <1231@xyzzy.UUCP>
Reply-To: idall@augean.OZ (Ian Dall)
Organization: Engineering Faculty, University of Adelaide, Australia
Lines: 104

In article <1231@xyzzy.UUCP> throopw@xyzzy.UUCP (Wayne A. Throop) writes:
>> jr@bbn.com (John Robinson)
>>> lars@myab (Lars Pensjö)
>>>[...GNU emacs could be more portable if it arranged to initialize its
>>> pre-defined lisp routines via source like so: ...]
>>>char lisp_code[] = {
>>>23, 45, 76, 93, -34, 45,
>>>...
>>>};
>>>... etc.
>> But the problem may be that not all compilers support [...putting
>> such objects in the text (that is, shared) section...]. [...]
>> Also, signed chars
>> (they appear in your example) may be a problem. But I merely quibble;
>> it's a great idea. Needs elisp symbol-table hooking but not much more.
>
>I agree that the method Lars points out is more portable, and a good
>deal cleaner, even with the problems with signed chars and the fact
>that it may be unshared data on some systems.
>
>BUT, the real problem solved by unexec that is not solved by source
>generation is that some of the values that go into the initialized
>area are not known until after link time. The addresses of primitive
>routines, for example. As long as the lisp object code refers to
>absolute addresses (and I suspect it must do so for efficiency
>reasons), the initializing code cannot be generated for an object yet
>to be linked, but only for the current *already* *linked* executable.
>Which implies unexec, or some similar subterfuge. Most LISP systems
>have similar problems.

Well, if the "loaded-lisp.c" is last in the list of things linked, it
would be OK on most machines. It still wouldn't be portable to machines
which link things in funny orders.
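
For reference, the generated-source idea quoted above might be sketched
roughly as follows. This is only an illustration under assumptions: the
primitives prim_add1 and prim_sub1, the prim_table layout and call_prim
are all made up here, and it supposes the generator can name each
primitive symbolically instead of emitting a numeric address, which the
byte-compiled Lisp discussed in this thread may not allow.

```c
#include <stddef.h>

/* Hypothetical stand-ins for two primitives.  Real ones would live in
   other object files and only be declared extern here. */
int prim_add1(int x) { return x + 1; }
int prim_sub1(int x) { return x - 1; }

/* Bytes are written as 0..255 in an unsigned char array, so the table
   means the same thing whether plain char is signed or unsigned.
   The -34 in the quoted example becomes 222 (256 - 34). */
const unsigned char lisp_code[] = { 23, 45, 76, 93, 222, 45 };
const size_t lisp_code_len = sizeof lisp_code;

/* Primitives are named, not given as numeric addresses.  A function
   name is an address constant in C, so the linker fills in the real
   address wherever this file falls in the link order. */
typedef int (*prim_fn)(int);

struct prim_entry {
    const char *name;
    prim_fn fn;
};

const struct prim_entry prim_table[] = {
    { "add1", prim_add1 },
    { "sub1", prim_sub1 },
};

/* Apply the primitive stored at the given index of prim_table. */
int call_prim(size_t index, int arg)
{
    return prim_table[index].fn(arg);
}
```

Code that wanted a primitive would then go through the table, e.g.
call_prim(0, 41), rather than jumping to an address baked in at dump
time.
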
My earlier suggestion of turning the Lisp into real C, instead of just a
large initialised array, would not have this link-order problem, but can
it be made to work? After all, can't most "real" Lisps produce compiled
code?

>All that said, I think unexec could be made a good deal cleaner, and
>the machine dependencies could be isolated in a much more palatable
>way.

GNU Emacs makes several non-portable assumptions. Those that spring to
mind are:

(1) Pointers (to Lisp objects) are stored in 24 bits. This means that
    machines which are capable of, AND USE, a virtual address space of
    more than 2^24 bytes won't run GNU Emacs. This is pretty much
    independent of the unexec feature.

(2) ld is assumed to load the concatenated .text sections followed by
    the concatenated .data sections. This allows unexec to work out the
    beginning and end of the sections and also to guarantee that the
    pure data is at the beginning of the .data section.

(3) Various kernels make different assumptions about the alignment of
    the .text and .data in an executable file, presumably to simplify
    the paging process. Emacs must guess what these assumptions are
    when creating the unexeced emacs.

(4) Emacs assumes that C static variables go in .bss if uninitialised
    and in .data if initialised. In fact it uses this as a way of
    forcing which variables end up where. I know of one compiler which
    treats uninitialised static variables as if they were initialised
    with zero (and sticks them in .data).

(5) Emacs needs to be able to read its own .text section. Some systems
    could prevent this if the MMU differentiates between read
    protection and execute protection. Systems with different
    instruction and data spaces would be a problem (not that GNU Emacs
    would run on a PDP-11 anyway).

Assumptions 2 and 4 could disappear if unexec did not attempt to put
data into the .text region. An extra conditional might be useful to say
"don't try to adjust the .text/.data boundary" when unexecing.
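
The extra conditional might look something like the sketch below.
Everything named here (struct image_layout, align_down,
dumped_data_start, the move_boundary flag) is invented for illustration
and is not taken from unexec itself; a real version would more likely be
a compile-time #ifdef in the machine description file than a run-time
flag.

```c
#include <stdint.h>

/* Hypothetical summary of where things sit in the running image.
   These fields are illustrative; they are not unexec's real variables. */
struct image_layout {
    uintptr_t data_start; /* where ld placed the start of .data */
    uintptr_t pure_end;   /* end of the pure data at the front of .data */
    uintptr_t page_size;  /* kernel's alignment unit, a power of two */
};

/* Round addr down to a multiple of align (align must be a power of two). */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    return addr & ~(align - 1);
}

/* Return the address at which the dumped file's .data should begin.
   move_boundary != 0: fold whole pages of pure data into .text (shared).
   move_boundary == 0: leave the boundary where ld put it. */
uintptr_t dumped_data_start(const struct image_layout *l, int move_boundary)
{
    uintptr_t start;

    if (!move_boundary)
        return l->data_start;

    /* Round down so that no writable data ends up in .text. */
    start = align_down(l->pure_end, l->page_size);

    /* Never move the boundary backwards into the original .text. */
    return start < l->data_start ? l->data_start : start;
}
```

With move_boundary set to 0, nothing relies on the pure data sitting at
the front of .data, so assumptions 2 and 4 stop mattering; the boundary
simply stays where ld put it.
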
Leaving the .text/.data boundary alone has a penalty in that the pure
Lisp will not be shared, but at least the speed-up in start-up time will
still be there. Lars's solution would also result in non-shared .data
sections, at least on BSD machines, and any attempt to fix this drawback
would probably have to make the same sort of assumptions as unexec.
Perhaps unix could use a .pdata (pure data) section type.

The file format problems alluded to in (3) are more than just an Emacs
problem; they are a unix problem. The COFF file format for SysV is a
step in the right direction, but there are, unfortunately, some
system-dependent magic numbers defining the alignment of the sections,
which vary from system to system. If these were defined in some standard
include file, things might be more palatable. This information is needed
by any program development tool which creates object files. One way out
of this would be for unexec to create a dirty big assembler file
(consisting entirely of allocation directives) and use the existing
assembler and loader to create the executable file. Of course the
assembler is not exactly portable!

I don't think that there is much that could be done about (5). Is it a
problem?

Disclaimer: I haven't delved into this since version 17, but I don't
think things have changed significantly.

--
 Ian Dall           life (n). A sexually transmitted disease which
                    afflicts some people more severely than others.
 idall@augean.oz