Path: utzoo!news-server.csri.toronto.edu!rutgers!uwm.edu!ogicse!milton!newton!lgy From: lgy@phys.washington.edu (Laurence G. Yaffe) Newsgroups: comp.lang.perl Subject: Dcmp - recursive directory comparison Summary: recursive, flexible directory comparison program Message-ID: Date: 12 Mar 91 21:11:08 GMT Sender: news@milton.u.washington.edu Lines: 308 I frequently use a recursive directory comparison program to compare the contents of entire directory trees. I recently decided to rewrite my previous shell based 'dcmp' program in Perl (and of course, add various bells and whistles). Appended below is the result (my first serious Perl programming). If others find it useful, great. Comments on Perl programming style welcome. By default, this program reports on files present in one directory tree, but absent in the other, and files present in both trees but with differing contents, ownerships or permissions. 'Diff' may run on differing text files to display differences. Its output is designed to be terse but informative. (I find the sysV "dircmp" unpleasantly verbose and insufficiently flexible.) It's wrapped with its own man page (but not shar'ed). Enjoy -------------------------------------------------------------------------- Laurence G. Yaffe Internet: lgy@newton.phys.washington.edu University of Washington Bitnet: yaffe@uwaphast.bitnet -------------------------------------------------------------------------- #!/usr/local/bin/perl 'di'; 'ig00'; # # Usage: dcmp [options] dir1 dir2 [pattern] # # Compare all (or selected files in) two directories trees # # Options: -r Don't recursively descend into subdirectories. # -a Don't report absent files. # -o Don't report differing ownerships. # -p Don't report differing permissions. # -t Don't report differing file types. # -f Don't follow symbolic links. (Default is to follow.) # -n Don't run 'diff' on differing text files, w/o querying. # -y Do run 'diff' on differing text files, w/o querying. # -q Query about running 'diff', even if stdin/out not ttys. # -v Run 'verbosely'. # # Laurence G. Yaffe # lgy@phys.washington.edu # 3/12/91 # # Check args && fetch directory names. require "stat.pl" ; require "getopts.pl" ; $options = 'afnopqrtvy' ; (&Getopts ($options) && (@ARGV >= 2)) || die "Usage: $0 [-$options] dir1 dir2 [pattern]\n"; $statcmd = ($opt_f) ? 'lstat' : 'stat' ; $rpt_absent = !$opt_a ; $rpt_owner = !$opt_o ; $rpt_perm = !$opt_p ; $rpt_type = !$opt_t ; $recurse = !$opt_r ; $diff_all = $opt_y ; $verbose = $opt_v ; $query = $opt_q || (!$opt_y && !$opt_n && (-t STDIN && -t STDOUT)) ; $pat = splice (@ARGV,2,1) || '^' ; &dcmp (@ARGV) ; sub gripe { local ($cond,$file,$left,$right) = @_ ; local ($lt,$rt) ; if (length ($file) < 20) { $lt = -16 ; $rt = -19 ; } else { $lt = -7 ; $rt = 28 ; } printf (" %-7s %-${lt}s %${rt}s %s\n", $cond, $left, $file, $right) ; } sub dcmp { local ($dir1, $dir2) = @_; local (@recurse,$_) ; undef (%files) ; return unless opendir(DIR1,$dir1) || !warn " Can't open directory $dir1\n"; return unless opendir(DIR2,$dir2) || !warn " Can't open directory $dir2\n"; $rem = (length ($dir1) > 37) ? 0 : 30 ; printf ("Comparing %-37s %-$rem.${rem}s\n", $dir1, $dir2) ; while ($_ = readdir (DIR1)) { $files {$_}++ ; } while ($_ = readdir (DIR2)) { $files {$_}+=2 ; } closedir (DIR1) ; closedir (DIR2) ; if ($rpt_absent) { foreach (sort grep ($files {$_} != 3 && /$pat/o, keys (%files))) { if ($files{$_} == 1) { &gripe ('absent:', $_, '::::::', '') ;} else { &gripe ('absent:', $_, '', '::::::') ;} } } $dir1 .= '/' ; $dir2 .= '/' ; compare: foreach $file (sort grep ($files {$_} == 3, keys (%files))) { next if $file eq "." ; next if $file eq ".." ; $SIG {'INT'} = 'interrupt' ; $flag = &comp ($dir1, $dir2, $file) ; push (@recurse, $file) if ($flag > 0) ; &interrupt if ($flag < 0) ; sub interrupt { if (-t STDIN && -t STDOUT) { print "\nInterrupt: quit, continue, ascend? " ; $reply = ; next if $reply =~ /^c/io ; @recurse = () ; last compare if $reply =~ /^a/io ; } exit 1 ; } } if ($recurse) { foreach $file (@recurse) { &dcmp ($dir1.$file, $dir2.$file, '') ; } } } sub comp { local ($dir1, $dir2, $file) = @_ ; local ($file1) = $dir1 . $file ; local ($file2) = $dir2 . $file ; ($type1, $mod1, $uid1, $gid1, $size1) = &filetype ($file1) ; ($type2, $mod2, $uid2, $gid2, $size2) = &filetype ($file2) ; if ($file =~ /$pat/o) { $ne = 0 ; $ne = &gripe ('owners:', $file, $uid1.'/'.$gid1, $uid2.'/'.$gid2) if ($rpt_owner && ($uid1 != $uid2 || $gid1 != $gid2)) ; $ne = &gripe ('types:',$file,sprintf("%lo",$mod1),sprintf("%lo",$mod2)) if ($rpt_type && ($mod1 | 07777) != ($mod2 | 07777)) ; $ne = &gripe ('perms:',$file,sprintf("%lo",$mod1),sprintf("%lo",$mod2)) if ($rpt_perm && ($mod1 & 07777) != ($mod2 & 07777) && ($mod1 | 07777) == ($mod2 | 07777)) ; if ($type1 eq 'f' && $type2 eq 'f') { $differ = ($size1 != $size2) || &cmp ($file1, $file2) ; if ($differ) { if ($differ > 0 && ($do_diff = $diff_all) || $query) { if ($query) { printf (" %-25s %s ? ", 'diff [ny]', $file) ; $reply = ; exit 0 if $reply =~ /^q/io ; $do_diff = $reply =~ /^y/io ; return 0 if !$do_diff ; } } else { $do_diff = 0 ; } if ($do_diff) { return -1 if (system ("diff $file1 $file2") & 255) ; } else { &gripe ('differ:', $file, '', '') ; } } elsif ($verbose && !$ne) { &gripe ('same:', $file, '', '') ; } return 0 ; } } ($type1 eq 'd' && $type2 eq 'd') ; } sub cmp { local ($FILE1, $FILE2) = @_ ; if (!open FILE1) { warn " Can't open $FILE1\n" ; } elsif (!open FILE2) { warn " Can't open $FILE2\n" ; } else { while () { $len1 = sysread (FILE1, $buf1, 16384) ; $len2 = sysread (FILE2, $buf2, 16384) ; if ((!defined $len1) || (!defined $len2)) { next if $! =~ /^Interrupted/ ; warn " System read error: $!\n" ; last ; } last if ($len1 == 0 && $len2 == 0) ; next if ($len1 == $len2 && $buf1 eq $buf2) ; $wierd = ($buf1 =~ tr/\000\200-\255/\000\200-\255/) ; return $wierd ? -1 : 1 ; } } 0 ; } sub filetype { local ($file) = @_ ; local (@st) = eval "$statcmd (\$file)" ; local ($type) = -d _ ? 'd' : -f _ ? 'f' : -p _ ? 'p' : -S _ ? 'S' : 's' ; return ($type, $st[$ST_MODE], $st[$ST_UID], $st[$ST_GID], $st[$ST_SIZE]) ; } ######################## .00; 'di .nr nl 0-1 .nr % 0 '; __END__ .TH DCMP 1 "March 7, 1991" .AT 3 .SH NAME dcmp \- recursive directory comparison .SH SYNOPSIS .B dcmp [options] dir1 dir2 [pattern] .SH DESCRIPTION .I Dcmp compares all (or selected files within) two directories trees. By default, .I dcmp reports files which are present in one directory tree but not the other, and files present in both trees with differing contents, permissions, or ownerships. When run from a terminal, the user is queried about each differing text file, and may choose to view the differences (produced by 'diff'). The optional .I pattern may be any regular expression (in the style of 'ed' or 'perl'). If given, only files with names matching the pattern are compared. The default bahavior may be modified by the following options: -r Don't recursively descend into subdirectories. -a Don't report absent files. -o Don't report differing ownerships. -p Don't report differing permissions. -t Don't report differing file types. -f Don't follow symbolic links. (Default is to follow symbolic links.) -n Don't run 'diff' on differing text files. -y Do run 'diff' on differing text files. -q Query about running 'diff', even if stdin or stdout are not connected to ttys. -v Run 'verbosely'. Report identical files. .SH ENVIRONMENT No Environment variables used. .SH FILES None. .SH AUTHOR Laurence G. Yaffe (lgy@phys.washington.edu) .SH "SEE ALSO" dircmp(1) .SH DIAGNOSTICS Complains if files or directories are unreadable. -- -------------------------------------------------------------------------- Laurence G. Yaffe Internet: lgy@newton.phys.washington.edu University of Washington Bitnet: yaffe@uwaphast.bitnet