Path: utzoo!mnetor!uunet!husc6!psuvax1!gondor.cs.psu.edu!schwartz
From: schwartz@gondor.cs.psu.edu (Scott E. Schwartz)
Newsgroups: comp.unix.questions
Subject: Re: Unix diff(1) algorithm
Message-ID: <3219@psuvax1.psu.edu>
Date: 15 Jan 88 21:37:41 GMT
References: <11229@brl-adm.ARPA> <1571@aurora.UUCP>
Sender: netnews@psuvax1.psu.edu
Reply-To: schwartz@gondor.cs.psu.edu (Scott E. Schwartz)
Organization: Penn State University, University Park, PA
Lines: 23
Summary: more on miller/myers algorythm

In article <1571@aurora.UUCP> jaw@aurora.UUCP (James A. Woods) writes:
>also investigate the webb miller / eugene myers code presented in
>"a file comparison program", software practice and experience,
>november 1985. 
>
>they (miller was acm algorithms editor, now at univ. of arizona)
>claim a factor of four speed increase over standard 'diff' for
>typical files.

Webb is actually at Penn State.  Gene Myers is at Arizona, except that
he is here this year.

Recently they came up with an enhancement to their algorythm
("Haslock's refinement") that doubles (in the worst case) the speed
of the original algorythm.  Best case (in which all edits are insertions or
are all deletions) runs in linear time.

To be fair, there are cases where diff is faster.  Diff does best when
the longest common subsequence is short, i.e. lots of diffs.  Fcomp
(as they call it) does best when the edit script is short, i.e. few diffs.


-- Scott Schwartz            schwartz@gondor.cs.psu.edu