Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!gorodish!guy
From: guy@gorodish.Sun.COM (Guy Harris)
Newsgroups: comp.unix.questions
Subject: Re: Unix diff(1) algorithm
Message-ID: <39802@sun.uucp>
Date: 23 Jan 88 07:02:10 GMT
References: <11229@brl-adm.ARPA> <7068@brl-smoke.ARPA> <153@ateng.UUCP>
Sender: news@sun.uucp
Lines: 26

> >The trouble with diff(1) is that it assumes that the only changes are
> >insertions and deletions.  So if you swap two functions, it reports
> >two changes.  The method which appeared in CACM can spot moves as well
> >as insertions and deletions.
> 
> The "hdiff" program does move detection.  It's in the comp.sources.unix
> archives.

That's because, according to the comments, it implements the algorithm
described in the CACM paper, which was:

	"A Technique for Isolating Differences Between Files", Paul Heckel,
	Interactive Systems Consultants, CACM, Volume 21, Number 4, April 1978,
	pp. 264-268.

The "h" stands for "Heckel".  The paper mentions the Hunt/McIlroy paper in
passing, saying:

	The only implementation of (Longest Common Subsequence) for file
	comparison of which the author is aware takes linear time and space for
	most practical cases but takes worst-case O(m*n*log n) time and O(m*n)
	space.  As a practical LCS implementation, it uses several clever
	techniques and is neither concise nor easy to understand.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com