Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2(pesnta.1.2) 9/5/84; site scc.UUCP
Path: utzoo!watmath!clyde!cbosgd!ihnp4!zehntel!dual!amd!pesnta!scc!ted
From: ted@scc.UUCP (Ted Goldstein)
Newsgroups: net.lang
Subject: Re: Re: Optimization technique wanted...
Message-ID: <258@scc.UUCP>
Date: Tue, 27-Nov-84 01:01:11 EST
Article-I.D.: scc.258
Posted: Tue Nov 27 01:01:11 1984
Date-Received: Wed, 28-Nov-84 05:17:43 EST
References: <438@ima.UUCP>
Organization: Santa Cruz Computer, Inc, Aptos, Calif.
Lines: 48

> Long vs. short branch optimization is an interesting problem.  It turns out
> that generating perfectly optimal branches is NP-complete, i.e. there's no
> known method that's better than enumerating the possibilities, although it's
> possible that there is one that we haven't found one yet.  (If you find one,
> please let me know.  You could even call collect.)
> 
> Fortunately, there are practical nearly-optimal schemes for real programs.
> One was written up in CACM a few years ago.  In general, you start by assuming
> that all of your branches might have to be long, although you pretend
> that they're all short and assign tentative addresses that way.  Then you
> interate over a table of all of the branch instructions that you have to make
> a decision about, marking those that can definitely be short, even if all of
> the undecided branches in their span turned out to be long.  When you find
> that you can't mark any more, the unmarked ones are long and the marked ones
> are short.
> 
> The Vax Unix assembler does this, and it usually turns out that 2 or 3 passes
> over the table of branches do the trick.  This approach sometimes misses a
> marginal case, but that's better than overoptimizing and generating bad
> code.  I recommend the CACM article if you want to understand it better.
> 
> John Levine, ima!johnl


Well, it depends on the architecture. On the 8086 and family, where jumps
are relative  8 or 16 bit branch to the Program Counter register, we
only have to keep the previous 256 bytes worth of instructions in a ring
buffer.
A jump is first considered to be a long jump. If its target appears before
the jump leaves the ring buffer, it collapses to a short jump.
The test for collapsing it occurs only on entry and on exit to and from
the ring buffer. The list of jump-target symbols is kept in the symbol
table, where it is easily accessed.

I am not a mathmatician, but I belive that this algorithm is order O(n).

Its been a while since I've looked at the VAX instruction set. 
Would this algorithm apply there?

Sincerely yours,

Ted Goldstein
Computer Innovations Inc.


Any comments?