Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!ulysses!allegra!princeton!udel!rochester!cornell!uw-beaver!ssc-vax!cxsea!blm
From: blm@cxsea.UUCP
Newsgroups: comp.unix.questions
Subject: Re: Unix diff(1) algorithm
Message-ID: <2331@cxsea.UUCP>
Date: 16 Jan 88 20:32:01 GMT
References: <11229@brl-adm.ARPA> <1571@aurora.UUCP>
Reply-To: blm@cxsea.UUCP (Brian Matthews)
Organization: Computer X Inc.
Lines: 34
Posted: Sat Jan 16 15:32:01 1988

James A. Woods (jaw@aurora.UUCP) writes:
|also investigate the webb miller / eugene myers code presented in
|"a file comparison program", software practice and experience,
|november 1985. 
|they (miller was acm algorithms editor, now at univ. of arizona)
|claim a factor of four speed increase over standard 'diff' for
|typical files.


I've got a copy of this article, and entered the C implementation
included in the article (OK, I had the secretary here enter it).
Anyways, it does appear to be quite a bit faster than diff (the System V,
Release 2 diff).  The algorithm is a nice algorithm, but the
implementation would need to be tweeked some before I would consider it
usable everyday.  (Note that this should not be construed as a flame
agains the authors of the article.  The implementation is
straight-forward, and demonstrates the algorithm nicely.)  For instance,
it uses a LOT of memory, and when I profiled it, about 94% of the time
was spent in malloc!

One of the things I found most interesting about the algorithm is that it
is basically the same algorithm that Gosling Emacs uses to calculate
optimal screen updates!

If someone plans on implementing a public-domain diff supporting -b and
-c, this algorithm is definitely worth looking at.  Just don't use the
implementation in the article, or spend a lot of time optimizing it and
decreasing it's memory usage.

-- 
Brian L. Matthews                               "A power tool is not a toy.
...{mnetor,uw-beaver!ssc-vax}!cxsea!blm          Unix is a power tool."
+1 206 251 6811
Computer X Inc. - a division of Motorola New Enterprises