Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!mcsun!hp4nl!kunivv1!root
From: root@kunivv1.sci.kun.nl (Privileged Account)
Newsgroups: comp.lang.misc
Subject: Re: Common subexpression optimization
Message-ID: <1041@kunivv1.sci.kun.nl>
Date: 13 Feb 90 14:11:53 GMT
References: <4561@scolex.sco.COM> <14214@lambda.UUCP> <PCG.90Jan29131938@rupert.cs.aber.ac.uk> <2217@sunset.MATH.UCLA.EDU> <838.18:06:33@stealth.acf.nyu.edu> <2115@castle.ed.ac.uk> <5453:23:28:32@stealth.acf.nyu.edu>
Organization: University of Nijmegen, The Netherlands
Lines: 76

(Dan Bernstein) writes:
>(Nick Rothwell) writes:
>> (Dan Bernstein) writes:
>> >Piercarlo is entirely correct. If you care about ``information hiding''
>> >and the ``purity'' of your code more than efficiency, you can't expect
>> >fast code. If you can't bear to put your original code in comments and
>> >write an efficient version with temporary variables, you don't deserve
>> >fast code.
>> 
>> This is rubbish, check out the performance of New Jersey ML
>> (high-level functional language, efficient optimised native code
>> generation).
>
>Are you saying that your compiler is smarter than you are or just that
>it's not as lazy?
>
>Can it figure out, say, Knuth's algorithm for computing the delta2 table
>in Boyer-Moore string searching, starting from a dumb algorithm for the
>same job?
>
After some discussion with my colleague Norbert V\"olker (who doesn't
read news), I can come up with the following reply:

- first, Knuth's algorithm as published in [1] is *WRONG*. This already
shows that not even DEK, who is an *extremely* competent programmer, to my
knowledge, was able to come up with the algorithm. See [2] for the
improved algorithm.

- second, N. V\"olker is currently working on a (transformational) derivation
of the delta-algorithm for BM pattern matching (with H. Partsch). Their
version of the algorithm is going to be different from the delta2
algorithm in [2], it uses more space and less time.
It is Norbert's feeling, however, that in the future compilers will be
able to come up with the delta2 algorithm, since all that is involved in
a derivation by hand is finite differencing (aka strength reduction).
The reason that current compilers are not able to do this is that they
do not ``know'' what lists (strings, sequences, or whatever) *are*. Cf.
[3] for the basic laws about strings.

My view of this whole issue is the following:
- there are two kinds of optimisations:
  routine ones and smart ones.
  Routine optimisations are:
  strength reduction, code motion, loop merging, (`bout everything you
  heard about in your course on Optimizing Compilers)
  Smart optimisations are inventions of new algorithms, like the BM
  pattern matching algorithm *itself*, the non-trivial sorting
  algorithms (cf. Darlington's paper `A Synthesis of Several Sorting
  Algorithms')
  The only problem I don't know for sure how to place is register
  allocation.
- the routine optimisations can be done (eventually) by a compiler; all
the examples of things that `you'd better do yourself because your
compiler is too stupid' mentioned so far fall in this class.
- the smart optimisations can only be done by humans because they
involve ``eureka's'' and design decisions; it is, however, possible
to limit the number of eureka's by inventing strategies (cf. some work
by D.R.Smith and A.Pettorossi) and by gaining knowledge about your
problem domain first (e.g. lists, cf. [3]).

References:
[1] Knuth, Morris, Pratt: Fast Pattern Matching in Strings, Siam J.
Comput. 6, 1977.
[2] Rytter: A correct preprocessing algorithm for Boyer-Moore
string-searching, Siam J. Comput. 9, 1980.
[3] R.S. Bird: An introduction to the theory of lists, in: M.Broy (ed.):
Logic of Programming and Calculi of Discrete Design, ASI Series Vol.
F36, Berlin:Springer 1987.

I'm very sorry for you, mr. Bernstein, that you chose an example that
proves my (and Nick Rothwell's) point.


Eerke Boiten, Department of Informatics, K.U.Nijmegen
Toernooiveld, 6525 AD Nijmegen, The Netherlands.
Tel. +31-80-612236.     eerke@cs.kun.nl