Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!think!harvard!seismo!munnari!goanna!rcodi
From: rcodi@goanna.OZ (Ian Donaldson)
Newsgroups: net.lang.f77,net.unix-wizards
Subject: Re: Any decent Fortrans under Unix ? Which machine ?
Message-ID: <208@goanna.OZ>
Date: Sun, 2-Mar-86 09:47:05 EST
Article-I.D.: goanna.208
Posted: Sun Mar  2 09:47:05 1986
Date-Received: Sat, 15-Mar-86 18:31:18 EST
References: <173@cybavax.UUCP> <206@goanna.OZ>
Distribution: net
Organization: Comp Sci, RMIT, Melbourne, Australia
Lines: 104
Xref: watmath net.lang.f77:493 net.unix-wizards:17201

> I forgot to mention that f77 effectively forces you to split your
> programs into many small files when you are on non-virtual machines.
....
> Mike Gigante

I agree with most of Mike's responses (it's a wonder no-one else has
submitted such an article yet).  His last point, that f77 forces you
to split your program into small files, is true, but this is not a
BAD idea!

The whole concept of compilation under UNIX has been to allow the Make
utility to maintain large programs by arranging for minimal
recompilation.  C programs are also broken into manageable segments of
source.  The speed of the {f77,c,pc} compiler is not THAT critical if
this philosophy is observed.  You tend not to do this as much with
Pascal as there is no universal standard way of doing separate
compilation (ps:  despite this, I am a Pascal freak).
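
The "minimal recompilation" rule can be sketched in a few lines of C.
This is only an illustration of the idea, not how Make is written:
needs_rebuild and its arguments are hypothetical names, and the
modification times are plain numbers rather than values read from the
file system.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of Make: returns 1 if the target must
 * be rebuilt, 0 if it is up to date.  A negative target_mtime stands
 * for "the target does not exist yet".
 */
int needs_rebuild(long target_mtime, const long *dep_mtimes, size_t ndeps)
{
    size_t i;

    assert(dep_mtimes != NULL || ndeps == 0);

    if (target_mtime < 0)
        return 1;                       /* nothing built yet */

    for (i = 0; i < ndeps; i++)
        if (dep_mtimes[i] > target_mtime)
            return 1;                   /* a source is newer than the object */

    return 0;                           /* every dependency is older */
}
```

With one object file per small source file, touching one source makes
only that one object out of date, which is why compiler speed matters
less when programs are split up this way.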

It is still nice to have a fast compiler, for the times you are
compiling someone else's code and don't want to spend too much time
porting it.

It would be nice if {ccom, f77pass1, pc0} all skipped the phase of
generating assembly language text, and just produced a load-file
straight from the source.  I can see little advantage in producing
text only for the optimizer (c2) to parse it and write it out again,
and for the assembler (as) to parse it yet again to finally produce a
relocatable binary.  Poor pc users are stuck with the following large
number of passes in compiling programs:  cpp, pc0, pc1 (f1), pc2, c2,
pc3, as, ld.  C programmers have this:  cpp, ccom, c2, as, ld.  It
would be nice if it were just:  cpp, ccom, ld.  Even nicer if cpp were
built into the input parser of each of the compilers (while still
being available as a separate package too).

A lot of the compilation time is spent doing unnecessary disk i/o and
parsing.  I have heard that it has not been done this way due to the
complications of porting it to a new machine.  This argument does not
really hold, as there are large amounts of machine-specific code in
most of the passes anyway (ccom, pc, f77pass1, c2 and as all need to
be extensively modified when porting to new CPUs).  The only
advantages that I can see of producing text assembly source are that
it can be hacked by sed scripts to replace certain instruction
sequences by others that the compiler doesn't generate, and that it
lets you peruse the efficiency (or lack thereof) of the code.


I agree with Mike's comments re: efficiency of the code generated by the
compiler, though - it is far from excellent.  This SHOULD be improved
substantially without introducing extra compile time (this is probably
not a job for the C optimizer either!)

I tend to disagree with the philosophy of just using registers R0 and
R1 for calculations, and assigning the rest for register variables.
ALL intermediate results in calculations should be cached in
registers.  If the compiler can figure out what your code is doing, it
might tend to lock SOME variables in registers to improve efficiency
(eg:  for-loops that have constant range where the range is large, and
known at compile time).  I have looked at a lot of assembly code
produced by various ports of UNIX C compilers and noted that there is
little or no attempt to "carry over" intermediate results from C
statement to C statement.  The same goes for f77 and pc.

What does YOUR compiler generate for the following C code?

	a = z[i][j][k];
	b = z[i][j][k];
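
As a hand-written sketch of the "carry over" in question, assuming z is
a three-dimensional int array (subscripted z[i][j][k] in C), the two
functions below do the same work.  The first mirrors the two
assignments as written; the second forms the element address once and
keeps the loaded value in a local, which is what one would hope to see
in the generated code.  The names load_twice and load_once and the
bounds NI, NJ, NK are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { NI = 4, NJ = 5, NK = 6 };        /* hypothetical array bounds */

/* Naive form: the subscript arithmetic is written out twice. */
void load_twice(int z[NI][NJ][NK], size_t i, size_t j, size_t k,
                int *a, int *b)
{
    assert(i < NI && j < NJ && k < NK);
    *a = z[i][j][k];
    *b = z[i][j][k];
}

/* Carried-over form: one address computation, one load. */
void load_once(int z[NI][NJ][NK], size_t i, size_t j, size_t k,
               int *a, int *b)
{
    const int *p;
    int t;

    assert(i < NI && j < NJ && k < NK);
    p = &z[i][j][k];    /* element offset (i*NJ + j)*NK + k, formed once */
    t = *p;             /* single load; t is a natural register candidate */
    *a = t;
    *b = t;
}
```

Comparing the assembly produced for the two functions is a quick way
to see whether a given compiler already does this for you.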

A good example of cost-effective local optimization can be found in the
RMIT CYBER Pascal 3 compiler (an improved version of Univ. of
Minnesota's V3 compiler courtesy R.S.V. Pascoe and Associates of RMIT
Dept of Computing).  The only thing that makes it unnecessarily slow is
the CYBER's architecture - it is not a byte-addressable machine - most
of the time is spent unpacking and packing words and characters.  Also,
there is no hardware stack, so one must be faked by "dedicating" certain
registers.  (I'm talking about the 60-bit emulation - the 64-bit
emulation I'm not sure about due to lack of readily available
documentation on the instruction set -- and the general lack of ability
to do much at all under NOS/VE unless you are a Fortran or CYBIL
programmer - we are still waiting for a Pascal and C compiler from CDC)
(side question - why did CDC put NOS/VE on their 64 bit emulation,
rather than just porting UNIX to it, native?  UNIX under another O/S
would tend to leave a lot to be desired (such as speed).  )

I might add that the RMIT/Minnesota Pascal compiler does all this in 1 pass
- source to relocatable binary!  COMPAS/Turbo Pascal users are very
pleased with this concept too.  C does not really require more than 1
major pass either if you write your compiler sensibly.
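
A minimal sketch of the one-pass idea, not taken from any of the
compilers named above: a recursive-descent parser for integer
expressions that writes binary code straight into a buffer while it
parses, with no assembly text in between.  The stack-machine opcodes,
the struct emitter type and the functions compile and run are all
invented for this illustration; run exists only so the output can be
checked, and should only be given code for which compile returned 0.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stack-machine opcodes, made up for this sketch. */
enum { OP_HALT = 0, OP_PUSH = 1, OP_ADD = 2, OP_MUL = 3 };

struct emitter {
    const char *src;            /* next unread source character */
    unsigned char code[256];    /* binary output, written directly */
    size_t len;
    int error;
};

static void emit(struct emitter *e, unsigned char byte)
{
    if (e->len < sizeof e->code) e->code[e->len++] = byte;
    else e->error = 1;
}

static void skip_spaces(struct emitter *e)
{
    while (*e->src == ' ') e->src++;
}

static void expr(struct emitter *e);

/* factor := digit+ | '(' expr ')'   (constants limited to 0..255) */
static void factor(struct emitter *e)
{
    skip_spaces(e);
    if (*e->src == '(') {
        e->src++;
        expr(e);
        skip_spaces(e);
        if (*e->src == ')') e->src++; else e->error = 1;
    } else if (isdigit((unsigned char)*e->src)) {
        unsigned value = 0;
        while (isdigit((unsigned char)*e->src)) {
            unsigned digit = (unsigned)(*e->src++ - '0');
            if (value <= 255) value = value * 10 + digit;
        }
        if (value > 255) e->error = 1;
        emit(e, OP_PUSH);
        emit(e, (unsigned char)value);
    } else {
        e->error = 1;
    }
}

/* term := factor ('*' factor)* */
static void term(struct emitter *e)
{
    factor(e);
    for (skip_spaces(e); *e->src == '*'; skip_spaces(e)) {
        e->src++;
        factor(e);
        emit(e, OP_MUL);        /* code is emitted as soon as it is known */
    }
}

/* expr := term ('+' term)* */
static void expr(struct emitter *e)
{
    term(e);
    for (skip_spaces(e); *e->src == '+'; skip_spaces(e)) {
        e->src++;
        term(e);
        emit(e, OP_ADD);
    }
}

/* Compile src into out->code in a single pass; returns 0 on success. */
int compile(struct emitter *out, const char *src)
{
    out->src = src;
    out->len = 0;
    out->error = 0;
    expr(out);
    skip_spaces(out);
    if (*out->src != '\0') out->error = 1;
    emit(out, OP_HALT);
    return out->error ? -1 : 0;
}

/* Tiny interpreter for the emitted code, used only for checking. */
long run(const unsigned char *code)
{
    long stack[128];            /* 256 code bytes allow at most 127 pushes */
    size_t sp = 0;

    for (;;) {
        switch (*code++) {
        case OP_PUSH: stack[sp++] = *code++; break;
        case OP_ADD:  assert(sp >= 2); sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  assert(sp >= 2); sp--; stack[sp - 1] *= stack[sp]; break;
        default:      return sp ? stack[sp - 1] : 0;
        }
    }
}
```

A real compiler has far more to do (symbols, types, relocation), but
the shape is the same: each construct is turned into output bytes at
the moment the parser recognises it.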

The CYBER is designed to crunch numbers, not characters, and it
does this very well.  Pity that a fair percentage of the (daytime) load
on our CYBER comes from student compilations and packages crunching
characters.
A VAX 8650 (running 4.3bsd) that has a set of compilers that generate 
CYBER machine code, along with a (transparent) fast channel link to a 
CYBER would be nice.   Compile & edit on the VAX and execute on the CYBER,
giving rise to many of these:    :-)   :-)  :-)   :-)  .
- DEC & CDC are you listening?

Ian Donaldson,
Dept of Comm & Elec Eng,
Royal Melbourne Institute of Technology,
Melbourne, Australia.

VOICE:  (03) 660-2619  (midday <= BestTimeToCallMe < midnight)     :-)   
ACSnet:  rcodi@Unison6.oz