Path: utzoo!mnetor!tmsoft!torsqnt!news-server.csri.toronto.edu!clyde.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!hsdndev!spdcc!ima!iecc!compilers-sender
From: henry@zoo.toronto.edu (Henry Spencer)
Newsgroups: comp.compilers
Subject: Re: Vax span-dependent jumps
Keywords: optimize, code, assembler
Message-ID: <1990Dec30.005507.7842@zoo.toronto.edu>
Date: 30 Dec 90 00:55:07 GMT
References: <9012111913.AA07310@cs-sun-fsa.cpsc.ucalgary.ca> <14692@goofy.megatest.UUCP> <28765@mimsy.umd.edu>
Sender: compilers-sender@iecc.cambridge.ma.us
Reply-To: henry@zoo.toronto.edu (Henry Spencer)
Organization: U of Toronto Zoology
Lines: 38
Approved: compilers@iecc.cambridge.ma.us

In article <28765@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>a special optimization, the assembler will sometimes% use something called
>`branch tunneling'...

Properly known as branch chaining; it's been around since at least the
Bliss-11 compiler.  It wins on code size but typically loses on speed,
especially on modern machines where extra pipeline breaks really hurt.

Our moderator writes:
>[... I wrote the original AIX assembler for the RT PC, another machine with
>span-dependent jumps, by mutating the System III Vax assembler.  The code to
>do span-dependent jumps was pretty straightforward, in pass 1 it assumed
>they'd all be the long form and made a table of all of them, then between
>passes 1 and 2 iterated over that table looking for branches that could be
>converted to the short form...

General comment:  this is theoretically the wrong approach, since there are
sequences where the decisions interact so that branch X can be short only
if branch Y is also short, and a one-at-a-time algorithm that starts with
all of them long won't shorten them.

In practice, almost all branches are short and *any* reasonable algorithm
will work well enough.  Things like iterative scanning of tables are
overkill; a simple FIFO buffer in the output stage, with branches made
long if if they are pushed out the end before their target is known,
will get almost all of them.
-- 
Henry Spencer at U of Toronto Zoology, henry@zoo.toronto.edu   utzoo!henry
[It's true, the scheme that I used in the AIX assembler won't produce optimal
results, but since the code already was there and worked, I went with the
flow.  On the other hand, perfect branch optimization is NP complete so
barring major positive news in the P=NP department, any perfect method is
very slow.  A paper by Tom Szymanski in the CACM ten or so years ago tells
more than you'd ever want to know about the topic, I can dig up the reference
if there's interest. -John]
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.