Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!think.com!mintaka!spdcc!esegue!compilers-sender From: Chuck_Lins.SIAC_QMAIL@gateway.qm.apple.com (Chuck Lins) Newsgroups: comp.compilers Subject: Time to Write a Compiler (Oberon) Keywords: Oberon Message-ID: <9011131737.AA13263@internal.apple.com> Date: 13 Nov 90 14:39:31 GMT Sender: compilers-sender@iecc.cambridge.ma.us Reply-To: Chuck Lins Organization: Compilers Central Lines: 109 Approved: compilers@iecc.cambridge.ma.us I have been working on an Object Oberon compiler in my spare time. While Object Oberon is a much simplier language than Fortran90, I have kept track of the hours invested in this project. First some background and a brief status report. Background ---------- The front-end and the shell for a back-end was obtained from ETH Zurich at the cost of 1000 Swiss francs. It contained a portable scanner, lexical analyzer, parser, and abstract syntax tree builder. The front-end covers the language Oberon. Object Oberon is a superset of Oberon. The back-end shell consists of an abstract syntax tree traversal module and stubs for calls to a lower-level code generator. The job of the compiler writer is to fill in these stubs as well as adding a machine dependent low-level code generator. The compiler itself is written in Oberon. The target processor is the Motorola MC68000 family. Target hardware is the Macintosh (tm). Target programming environment is MPW (Macintosh Programmers Workshop). Status ------ I started with a copy of the MacOberon system from ETH. This is a standalone application running on the Macintosh. It contains an Oberon compiler in object form. Thus the first task was to write a cross-compiler running under MacOberon but producing object code for MPW. Once this was done, I created another copy of the compiler source code (based on the cross-compiler). This version was modified to run under MPW and produce code for MPW. The main task here was replacing calls to the Oberon environment/libraries with calls to semantically equivalent or similar MPW libraries. As of today (11/12/90) I have completed work on the cross-compiler and have a working compiler running under MPW. The compiler compiles itself under MPW. How Long it Took ---------------- To date, I have spent 399.5 hours on both the cross-compiler and the native compiler. I started on July 24, 1990. After 286 hours of work, on Oct 15 I got the native compiler to link successfully. After another 44 hours of debugging I was able to start using the native compiler to compile itself. 35 more hours of debugging and the native compiler was able to compile itself. Since then I've been using the native compiler to add enhancements to itself as well as improve its code generation. What Gave Me Problems --------------------- It took me quite a while to get the comparison and branching instructions correct. Mapping the source language operands onto the hardware instructions was very time-consuming. I remember spending quite a bit of my time in MacsBug (the machine level debugger) single-stepping machine instructions from a known 'good' point in the compiler. You had to do this when you jumped off into space somewhere. I had one nasty bug for referencing VAR parameters. Unfortunately, you'd sometimes overwrite part of the return address (or destroy a pointer) and jump into the twilight zone. The good part was that I found four other bugs while searching for this one. Another source of errors was my failure to check for special cases of variable addressing modes. Specifically things like VAR-parameters and accessing parameters at outer nesting levels. What Went Well -------------- When I did the initial (paper) design, I reduced the 68000 addressing modes to four categories: immediate operand, register value, memory operand, and condition code. Then I created a table for each AST node (eg integer ADD, logical AND, procedure CALL, etc). Each AST node has (at most) two operands and one result operand. I then made N rows covering all the operands as input and output values. I was influenced here by Horspool's ideas given in "An Alternative to the Graham-Glanville Code-Generation Method", IEEE Software, May 1987. An example table for integer addition would be as follows (where op = operator, res = result, l = left subtree, r = right subtree, R = register, O = operand, I = immediate, and CC = condition code): op res l r + R R O + R O O + R R R + R O R + R R I + R I R etc. Each row was then filled in with the assembly language source statements necessary to produce the proper result. This was where I forgot the special cases. The generic operand sometimes needed object code to produce a simplier addressing mode. It was easy to fix once I recognised the problem (even though I had all the code written for the code generator). I think using trees worked out well. It was relatively easy to generate correct code from them. It may not be the most highly optimized object code, but it's not terrible. At present, the compiler compiles the largest of its source modules (63K) in about 75 seconds. Improvements remain in the area of array indexing on parameters. My current code involves a multiply. I can take advantage of type information to eliminate most of these. This should yield a good performance improvement. Now I can concentrate on optimizations, adding the object oriented extensions, and the other capabilities necessary for a production quality compiler. Chuck Lins lins@apple.com -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!esegue!compilers. Meta-mail to compilers-request.