Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!shelby!agate!dog.ee.lbl.gov!elf.ee.lbl.gov!torek
From: torek@elf.ee.lbl.gov (Chris Torek)
Newsgroups: comp.lang.c
Subject: Re: micro-optimizing loops (was Help with casts)
Message-ID: <10250@dog.ee.lbl.gov>
Date: 25 Feb 91 19:46:16 GMT
References: <1991Feb21.040145.8678@cec1.wustl.edu> <409@ceco.ceco.com> <10191@dog.ee.lbl.gov> <344@smds.UUCP>
Reply-To: torek@elf.ee.lbl.gov (Chris Torek)
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 26
X-Local-Date: Mon, 25 Feb 91 11:46:16 PST

(with any luck this will die its own death after this...)

In article <344@smds.UUCP> rh@smds.UUCP (Richard Harter) writes:
>For reasons that are not clear to me many optimizing compilers will not
>collapse the two machine instructions
>		dec r1
>		bge 1$
>into the available single instruction to do the same thing.  Perhaps
>some of our compiler writers can explain this to us.

Certain machines (grr :-) ) that have subtract-and-branch-on-condition
instructions can only branch a very short distance; compilers for these
must figure out how far the branch goes, or else use assembler pseudo-ops
like `jsobgtr', which expand into a longer sequence if necessary.
Unfortunately, assemblers for the VAX (for one) tend not to have `jsobgtr'
pseudo-ops.  `Fixed in the next release....'
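
To make the "figure out how far the branch goes" step concrete, here is a
rough sketch of the decision a code generator (or an expanding pseudo-op)
has to make.  The names `pick_loop_branch', `USE_SOB' and
`USE_DEC_AND_LONG_BRANCH' are made up for illustration, and the signed
8-bit displacement range is an assumption that matches the VAX sob
instructions as far as I know; other machines will differ.

```c
/* Hypothetical choices for closing a count-down loop. */
enum loop_branch {
    USE_SOB,                 /* single subtract-and-branch instruction */
    USE_DEC_AND_LONG_BRANCH  /* separate decrement, then a longer branch */
};

/*
 * Assumption: the combined instruction takes a signed 8-bit branch
 * displacement, so it can only reach targets -128..127 bytes away.
 * Anything farther must fall back to the two-instruction form.
 */
enum loop_branch pick_loop_branch(long displacement)
{
    if (displacement >= -128 && displacement <= 127)
        return USE_SOB;
    return USE_DEC_AND_LONG_BRANCH;
}
```

The catch is that the displacement is often not known until the loop body
has been assembled, which is why it is convenient to leave the choice to a
pseudo-op in the assembler.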

(Incidentally, I played with timing decl+jgeq vs sobgeq on the VAX and
found that it rarely made any difference.  The single instruction is more
compact, which does not hurt, but it is not really any faster.  Some other
`fancy' VAX instructions actually turn out to be slower than equivalent
sequences of simpler instructions.  Which ones, and by how much, depends on
the particular model: 780s and 8250s have fairly different characteristics.)
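
For reference, the sort of C loop at issue looks something like the sketch
below.  The function name `sum_down' is hypothetical.  The loop test is
"decrement, branch if still >= 0", which is the shape that could be
compiled either as decl+jgeq or as a single sobgeq; whether any particular
compiler emits either form is not something this example guarantees.

```c
/*
 * Sum n elements, walking the index down to zero.
 * The loop-closing test (i-- followed by i >= 0) is the
 * decrement-and-branch pattern under discussion.
 */
long sum_down(const int *a, int n)
{
    long total = 0;
    int i;

    for (i = n - 1; i >= 0; i--)
        total += a[i];
    return total;
}
```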
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov