Path: utzoo!utgpu!water!watmath!clyde!ima!think!barmar
From: barmar@think.COM (Barry Margolin)
Newsgroups: comp.lang.misc
Subject: Re: Poor Algorithms
Message-ID: <17658@think.UUCP>
Date: 8 Mar 88 17:57:29 GMT
References: <3821@ihlpf.ATT.COM> <2791@enea.se> <404@tub.UUCP>
Sender: usenet@think.UUCP
Reply-To: barmar@fafnir.think.com.UUCP (Barry Margolin)
Organization: Thinking Machines Corporation, Cambridge, MA
Lines: 73

In article <404@tub.UUCP> net@tub.UUCP (Oliver Laumann) writes:
>All the anecdotes about poor algorithms remind me of an amusing tale
>I have read in the (highly recommendable, by the way) paper "Hints for
>Computer System Design" by Butler W. Lampson of Xerox Palo Alto.
>
>  The remarkable
>point is that this procedure ran in time O(n^2), where n is the length
>of the document!
>
>This result was achieved by first writing a procedure "FindIthField"
>which finds the i-th field in the document.  This, of course, takes
>time O(n).  The procedure "FindNamedField(name)" was then implemented
>in the following "natural" way [quoted directly from the article]:
>
>   FOR i = 0 to NumberOfFields DO
>      FindIthField; IF its name is "name" THEN EXIT
>   ENDLOOP

Whether they got the algorithm right or not depends on which algorithm
you are talking about.  There is a very similar routine in the
Symbolics Genera user interface, used when you want to retrieve a line
that matches a given string from your input history; the input history
is a linked list, and a very similar loop is used, so it is also N^2.
Such a loop is used because of information hiding; the loop is in the
INTERACTIVE-STREAM flavor, while the linked list is hidden in the
HISTORY flavor, and the interface only provides a routine to get the
Nth history element.

So, is this organization flawed?  I don't think so.  However, the
implementation of the ELEMENTn operation could be improved (in fact,
the reason I even know about this is that someone here patched the
routine years ago in the manner I'm about to describe) to make such a
common operation less expensive.  The flaw in Oliver's argument above
is that FindIthField need not always be O(I).  In particular, if it is
common to request field n+<positive> soon after requesting field n, it
would pay to remember where field n ended, and start from there
instead of from the beginning.  This is a time/space trade-off that
really pays back, because an O(N^2) time algorithm is reduced to O(N)
at the cost of O(1) storage.

So, this is really, in my opinion, an example of the way that
abstraction and information hiding make it easier to improve
performance.  In order to fix this performance bug, all that is needed
is two instance variables (assuming an object-oriented implementation)
and a small change to one method.

In the Symbolics case, an alternate fix would be to use an array
rather than a linked list.  However, since in this case it is more
common to add elements to one end of the history (every time the user
types a line) than to search far back in the history, and in the
upcoming release they will allow selective deletion from the middle,
it should be obvious why the linked list representation was chosen.
But I think they made the right choice in not exposing the
implementation choice in the interface; this allows them to try many
different implementations (others I can think of is are a linked list
of medium-sized arrays and or an array of medium-sized arrays -- the
latter is O(1) for both adding a new element and finding the Nth
element).

I concede that there are cases where the redesign must affect many
levels of the implementation.  This often means that the original
design was severely flawed.  I think a definition I recently saw in
the Object-Oriented Design discussion was that such a design can be
considered good if most specification changes result in modifications
to only one or two classes.  Applying this to the above example, the
addition of the specification "find the result in linear time" results
in a simple change to one class.

Barry Margolin
Thinking Machines Corp.

barmar@think.com
uunet!think!barmar