Path: utzoo!mnetor!uunet!husc6!psuvax1!gondor.cs.psu.edu!schwartz From: schwartz@gondor.cs.psu.edu (Scott E. Schwartz) Newsgroups: comp.unix.questions Subject: Re: Unix diff(1) algorithm Message-ID: <3219@psuvax1.psu.edu> Date: 15 Jan 88 21:37:41 GMT References: <11229@brl-adm.ARPA> <1571@aurora.UUCP> Sender: netnews@psuvax1.psu.edu Reply-To: schwartz@gondor.cs.psu.edu (Scott E. Schwartz) Organization: Penn State University, University Park, PA Lines: 23 Summary: more on miller/myers algorythm In article <1571@aurora.UUCP> jaw@aurora.UUCP (James A. Woods) writes: >also investigate the webb miller / eugene myers code presented in >"a file comparison program", software practice and experience, >november 1985. > >they (miller was acm algorithms editor, now at univ. of arizona) >claim a factor of four speed increase over standard 'diff' for >typical files. Webb is actually at Penn State. Gene Myers is at Arizona, except that he is here this year. Recently they came up with an enhancement to their algorythm ("Haslock's refinement") that doubles (in the worst case) the speed of the original algorythm. Best case (in which all edits are insertions or are all deletions) runs in linear time. To be fair, there are cases where diff is faster. Diff does best when the longest common subsequence is short, i.e. lots of diffs. Fcomp (as they call it) does best when the edit script is short, i.e. few diffs. -- Scott Schwartz schwartz@gondor.cs.psu.edu