Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!mailrus!tut.cis.ohio-state.edu!rutgers!rochester!pt.cs.cmu.edu!sei!sei.cmu.edu!firth From: firth@sei.cmu.edu (Robert Firth) Newsgroups: comp.lang.misc Subject: Re: Query re optimising compilers Message-ID: <6376@aw.sei.cmu.edu> Date: 26 Jul 88 14:19:32 GMT References: <561@etive.ed.ac.uk> <5423@ihlpf.ATT.COM> Sender: netnews@sei.cmu.edu Reply-To: firth@bd.sei.cmu.edu.UUCP (Robert Firth) Organization: Carnegie-Mellon University, SEI, Pgh, Pa Lines: 58 In article <5423@ihlpf.ATT.COM> nevin1@ihlpf.UUCP (00704a-Liber,N.J.) writes: >In article <561@etive.ed.ac.uk> db@lfcs.ed.ac.uk (Dave Berry) writes: > >|Would it be sensible to optimise the following code: > >|a := function (); >|b := (complicated expression) * a; > >|to the equivalent of: > >|a:= function (); >|if (a == 0) >| b := 0; >|else >| (complicated expression) * a; > >This is not a good optimization, unless the compiler can determine at compile >time that 'a' is much more likely to be 0 than not. Since this is very >difficult to determine at compile time, it is usually not done. This question seems interesting enough to warrant looking at the probable code. If we take the customary Vax, and pick up just after the return from the function, it looks like this: ; result in R0 MOVL R0, A MULL2 Rx, R0 MOVL R0, B Now, with the optimisation (and here I think the compiler can do better than Dave), we have: ; result in R0 MOVL R0, A BEQL 1$ MULL2 Rx, R0 1$: MOVL R0, B So the cost is a single conditional branch not taken, and the benefit is presumably the cost of the "complicated expression". Making a wild guess, the benefit will be at least 10x the cost, (the multiply alone accounts for half that), and substantially greater if the complicated expression includes things like two-dimensional array elements. So it is probably wrong to assert that A should be "much more likely to be 0 than not"; it should be good enough if A is zero 10% of the time. But that still seems an unlikely thing for a compiler to be able to deduce, so the optimisation is still probably unachievable without some clue. In sum, this seems like an excellent example of an optimisation that a good execution profiler would find, but that a compiler cannot. (Note: the above code assumes integer data; the floating code looks almost identical but the benefit is greater)