Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!ulysses!allegra!princeton!udel!rochester!cornell!uw-beaver!ssc-vax!cxsea!blm From: blm@cxsea.UUCP Newsgroups: comp.unix.questions Subject: Re: Unix diff(1) algorithm Message-ID: <2331@cxsea.UUCP> Date: 16 Jan 88 20:32:01 GMT References: <11229@brl-adm.ARPA> <1571@aurora.UUCP> Reply-To: blm@cxsea.UUCP (Brian Matthews) Organization: Computer X Inc. Lines: 34 Posted: Sat Jan 16 15:32:01 1988 James A. Woods (jaw@aurora.UUCP) writes: |also investigate the webb miller / eugene myers code presented in |"a file comparison program", software practice and experience, |november 1985. |they (miller was acm algorithms editor, now at univ. of arizona) |claim a factor of four speed increase over standard 'diff' for |typical files. I've got a copy of this article, and entered the C implementation included in the article (OK, I had the secretary here enter it). Anyways, it does appear to be quite a bit faster than diff (the System V, Release 2 diff). The algorithm is a nice algorithm, but the implementation would need to be tweeked some before I would consider it usable everyday. (Note that this should not be construed as a flame agains the authors of the article. The implementation is straight-forward, and demonstrates the algorithm nicely.) For instance, it uses a LOT of memory, and when I profiled it, about 94% of the time was spent in malloc! One of the things I found most interesting about the algorithm is that it is basically the same algorithm that Gosling Emacs uses to calculate optimal screen updates! If someone plans on implementing a public-domain diff supporting -b and -c, this algorithm is definitely worth looking at. Just don't use the implementation in the article, or spend a lot of time optimizing it and decreasing it's memory usage. -- Brian L. Matthews "A power tool is not a toy. ...{mnetor,uw-beaver!ssc-vax}!cxsea!blm Unix is a power tool." +1 206 251 6811 Computer X Inc. - a division of Motorola New Enterprises