Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.2(pesnta.1.2) 9/5/84; site scc.UUCP Path: utzoo!watmath!clyde!cbosgd!ihnp4!zehntel!dual!amd!pesnta!scc!ted From: ted@scc.UUCP (Ted Goldstein) Newsgroups: net.lang Subject: Re: Re: Optimization technique wanted... Message-ID: <258@scc.UUCP> Date: Tue, 27-Nov-84 01:01:11 EST Article-I.D.: scc.258 Posted: Tue Nov 27 01:01:11 1984 Date-Received: Wed, 28-Nov-84 05:17:43 EST References: <438@ima.UUCP> Organization: Santa Cruz Computer, Inc, Aptos, Calif. Lines: 48 > Long vs. short branch optimization is an interesting problem. It turns out > that generating perfectly optimal branches is NP-complete, i.e. there's no > known method that's better than enumerating the possibilities, although it's > possible that there is one that we haven't found one yet. (If you find one, > please let me know. You could even call collect.) > > Fortunately, there are practical nearly-optimal schemes for real programs. > One was written up in CACM a few years ago. In general, you start by assuming > that all of your branches might have to be long, although you pretend > that they're all short and assign tentative addresses that way. Then you > interate over a table of all of the branch instructions that you have to make > a decision about, marking those that can definitely be short, even if all of > the undecided branches in their span turned out to be long. When you find > that you can't mark any more, the unmarked ones are long and the marked ones > are short. > > The Vax Unix assembler does this, and it usually turns out that 2 or 3 passes > over the table of branches do the trick. This approach sometimes misses a > marginal case, but that's better than overoptimizing and generating bad > code. I recommend the CACM article if you want to understand it better. > > John Levine, ima!johnl Well, it depends on the architecture. On the 8086 and family, where jumps are relative 8 or 16 bit branch to the Program Counter register, we only have to keep the previous 256 bytes worth of instructions in a ring buffer. A jump is first considered to be a long jump. If its target appears before the jump leaves the ring buffer, it collapses to a short jump. The test for collapsing it occurs only on entry and on exit to and from the ring buffer. The list of jump-target symbols is kept in the symbol table, where it is easily accessed. I am not a mathmatician, but I belive that this algorithm is order O(n). Its been a while since I've looked at the VAX instruction set. Would this algorithm apply there? Sincerely yours, Ted Goldstein Computer Innovations Inc. Any comments?