Path: utzoo!attcan!utgpu!utstat!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!mcvax!hp4nl!uva!croes
From: croes@uva.UUCP (Felix A. Croes)
Newsgroups: comp.os.minix
Subject: About the Minix C compiler
Summary: need something better
Keywords: cpp cem opt cg as ld cv
Message-ID: <725@uva.UUCP>
Date: 8 Jun 89 08:57:00 GMT
References: <597@lzaz.ATT.COM> <717@uva.UUCP>
Sender: news@uva.UUCP
Reply-To: croes@uva.UUCP (Felix A. Croes)
Organization: The Courts of Chaos
Lines: 111

Mr. Tanenbaum designed Minix as a teaching operating system. For this
reason Minix includes the source of the OS and of most of the programs.
However, from my point of view Minix has one major drawback: not everything
is in source. In Minix-ST, the compiler, the assembler, the loader and the
archiver are just binaries.

The whole idea of Minix is that you can look at the source, maybe modify it,
and hopefully learn from it. Strangely, this doesn't seem to apply to the
compiler. This results in another problem: for the rest of the programs,
enhancements and bugfixes find their way through the Usenet. But not for the
compiler, which seems to be one of the most buggy programs present in Minix.

For example, do you think the following program works?

	# define qwerty(a, b)	(a-b)
	# define asd		qwerty

	main()
	{
		printf("%d\n", asd (1, 2) < 0x1);
	}

No it doesn't. First, the compiler treats octal and hexadecimal constants as
unsigned ints. Following the C conversion rules, asd(1, 2), which is -1, is
converted to unsigned before the comparison and thus becomes
(unsigned) 0xFFFF, so printf prints a 0. This is not K&R C and certainly not
ANSI C. The -O option doesn't affect this. By the way, the -O option doesn't
seem to affect anything.

Second, this will not even make it through the C preprocessor, which
generates garbage on the macro "asd" (cem does the same).

These are only minor bugs, but typical for the state the compiler is in.
The above problems apply to Minix-ST, but considering the many desperate
attempts to use a Turbo C cross compiler on the PC, the compiler there cannot
be much good either.

I would be prepared to overlook the fact that the compiler is slow and
produces poor code, if at least it WORKED. On the ST, cem-opt-cg are full of
bugs, the assembler changes "move.l #2, d0" into "moveq #2, d0" without
warning, ld seems to work but cv produced incorrect code in the old version,
and doesn't accept correct (well, assembler output anyway) input in the new
version. Besides, cv should have been merged with ld in the first place.

What can be done about it?

 - Wait until patches are sent. This still leaves us without the source. And
   afterwards we have to wait again for the next patches.
 - Wait until Mr. Tanenbaum writes a book called "Compiler construction with
   Minix", sold together with full sources for a compiler (for an incredibly
   low price of course). Probably just wishful thinking. (?)
 - Buy the sources. But then you still won't be able to distribute them.
 - Write, or port, your own (public domain) compiler.

To me, the last approach looks most promising. Several compilers have been
ported to Minix, most notably GNU CC. However, GCC is not really an
alternative, as it is not compatible with ACK and is so enormously large that,
with 1 Mb of memory, it hardly fits and certainly cannot compile itself.
Besides, this is no solution for those poor PC guys.

A good compiler should

 - be available for both PC and ST Minix
 - be almost identical on those two machines, except for the code generator
   and the assembler
 - be ACK compatible (at least use the same object files)
 - be small
 - be reasonably fast
 - produce correct, maybe even good, code.

I understand that the Minix-PC compiler can only produce programs with a
maximum size of 64K (text+data). An option to use 64K text and 64K data on
the PC should be included.

As for the ST, how about this: It IS possible to create real position
independent code on the 68000. Make jsr/jmp relative to a4 (or pc), and
replace rts by

	move.l	(sp)+, d1
	jmp	0(a4, d1.l)

Also make all data references relative to a5. If you consider the frame
pointer as pointing to a linked list of frame pointers, it is easy to see
that all these frame pointer values can have the right offset added when the
data space is moved in memory, so even link and unlk present no problem.

The disadvantages are slower and larger code, no more memory faults on
reference of NULL pointers, and some modifications to the kernel are needed.
The advantages include

 - PIC (you can even use swapping)
 - fewer registers left, but all registers can hold pointers now
 - as an extra option, 2-byte pointers are possible: this would make sure
   that all references are limited to a 64K (+ 2 bytes) data segment, so
   there is no way to crash the operating system by destroying important
   data outside a process's address space. Of course this also means that
   the text segment must be limited to 64K
 - better fork(): no shadowing.

I am not saying that the above solution should replace the present one, but
it would be a nice alternative and it would mean more PC compatibility.
Perhaps SEPARATE_ID programs could be in this format?

SUMMARY: if comp.os.minix is to be split up, it should become comp.os.minix
and comp.os.minix.compiler.

As soon as I have time, I will write a ld that replaces the old ld and cv for
the ST - I will try to keep it PC-compatible.

				Felix Croes

+---------------------------------------|--------------------------------------+
| "GEM is dead -                        | croes@uva.uucp                       |
|  long live Minix!"                    | ...!mcvax!uva!croes                  |
+---------------------------------------|--------------------------------------+