Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!caen!spool.mu.edu!snorkelwacker.mit.edu!ira.uka.de!ifistg!bs3!schoebel
From: schoebel@bs3.informatik.uni-stuttgart.de (Thomas Schoebel)
Newsgroups: comp.lang.modula2
Subject: Re: TopSpeed 3.0 First Impressions
Message-ID: <11745@ifi.informatik.uni-stuttgart.de>
Date: 26 Jun 91 11:02:18 GMT
Article-I.D.: ifi.11745
References: <9782@sail.LABS.TEK.COM> <11710@ifi.informatik.uni-stuttgart.de> <1991Jun25.172455.12384@leland.Stanford.EDU>
Sender: news@ifistg.uucp
Organization: Informatik, Uni Stuttgart, W. Germany
Lines: 73

In article <1991Jun25.172455.12384@leland.Stanford.EDU> dhinds@elaine18.Stanford.EDU (David Hinds) writes:
>This is quite a strong claim, and I'm not at all sure it is valid.  Many
>very serious compilers (the MIPS RISC compiler backend, for example) that
>are state of the art in optimization pass some parameters in registers.
>The MIPS compilers pass the first three or four word-sized parameters this
>way.  The push orgy you complain about would seem to be equivalent to the
>push orgy that would have been done if the parameters had to be put on the
>stack in the first place, right? 

You are right with respect to RISC architectures: There you have plenty
of registers, and also the stack can be achieved by register windowing.
However, TopSpeed is running at the 8086 where you have only 4 general
purpose 16bit registers. Index registers like SI and DI, and also
segment register could be used for temporary storage, but the 86 architecture
is not very symmetric: Some instructions don't work with all registers
or they are hard-assiged to particular registers. This results in
frequent use of registers for temporary values. An example: If a procedure
has three parameters, where two of them are pointers resp. VAR parameters,
you'd need 5 registers for them. Unfortunately, AX is always choosen
for the first parameter, but AX is (most likely) the most frequently
used register. So the mentioned push orgies are very likely.

>You also over-generalize about properties
>of procedures in large systems.  Can you show some evidence of this claim,
>such as, that the most time-critical procedures in "large systems" tend to
>both have many parameters and also do relatively little work (if they do
>a lot of work, the overhead of passing parameters one way or another is
>insignificant)?

Well, my project is a multi-thread one. At least for this case my claims
are most likely, because you have to avoid static data where possible.
Thus nearly all structures are referenced via pointers, which have to
be passed as parameters.
Of course, if you program FORTRAN or COBOL style, you need not use
parameters, but then there would be no difference between both
calling conventions.

>I guess you could argue that the 80x86 machines don't have enough registers
>to justify using any of them to pass parameters, even transiently.  Or that
>the JPI compiler isn't strong enough at optimization to minimize the overhead.

Yes, the problem is the available number of registers: They are only useful
for parameters if you don't have to much of them.
Secondly: Whenever procedure A calls B where B's parameters are passed
via registers, A has to preserve the old values if they are needed after
the call. Even if procedure B preserves all temporary used registers,
there will be some overhead somewhere. The point is that there is no
general measure for predicting the lifetime of values. Keeping less
used but lately referenced values in registers leads to preserving
overhead. The question is that nobody can predict from a procedure
definition (e.g. in a DEFINITION MODULE) the reference pattern for
the parameters. Allocating registers implies an assumption that
parameters will be used either frequently or can be discarded after
a few statements. But what if the parameter is used at the bottom of
the procedure, of if the register (e. g. AX) is used in a prior
function call for the return value?

>But you should have some timing data before you make this claim (based only
>on program size, as far as I can tell).
>
> -David Hinds
>  dhinds@cb-iris.stanford.edu

Yes, I'm currently doing some timing measurements, but there are limits
because of the file accesses of my program which aren't reproducable
in general.
In a previous article, George Thiruvathukal also suggests a proof for
my claims. I think a mathematical proof is not very hard, for there
are parallels to thrashing effects in paging strategies in operating
systems. The only problem is to "prove" that having only 4 or, say 8
registers makes it very likely that such thrashing may occur.

-- Thomas