Xref: utzoo comp.unix.questions:28482 comp.lang.perl:3890
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Newsgroups: comp.unix.questions,comp.lang.perl
Subject: Re: Need help with error correction.
Message-ID: <11333@jpl-devvax.JPL.NASA.GOV>
Date: 6 Feb 91 23:42:58 GMT
References: <1991Feb6.142829.20725@dg-rtp.dg.com>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 41

In article meissner@osf.org (Michael Meissner) writes:
: In article <1991Feb6.142829.20725@dg-rtp.dg.com>
: hunt@dg-rtp.rtp.dg.com (Greg Hunt) writes:
:
: | When I've had file transmission problems, I've used sum(1) to produce
: | a checksum of the file on both the sending side machine and the
: | receiving side machine and compared the results.  If they weren't the
: | same, then I knew that something got corrupted in the transmission and
: | I got the file again.
: |
: | If the systems you're working with have sum(1) that might be an easy
: | thing to use.  Also, sum(1) will work for any sort of file, it doesn't
: | just have to be text (which is the only thing diff(1) can look at).
:
: The only hitch is that sum(1) produces different results on System V
: based systems and Berkeley based systems.  I think sum -r on System V
: gives the BSD behavior.

Since this is cross-posted to comp.lang.perl, I suppose it's okay for me
to mention that you can emulate System V sum with

    #!/usr/bin/perl
    undef $/;
    while (<>) {
        print unpack("%32C*", $_) % 65535, " ",
            int((length()+511)/512), " $ARGV\n";
    }

The Book, by the way, is wrong when it says you can emulate sum with
"%16C*".  That is only guaranteed to work on files less than 256 bytes
long (512 if there are no eighth bits).  Teach me to choose my test cases
better...

No, I didn't have any sources to consult.  The man page says sum does a
16-bit checksum, and it lies.
It does modulo 65535 (not 65536).  Ah well.

The above code will only work on files up to 2**24 bytes long or so.
Some machines may need to change the "%32C*" to "%31C*" until 4.0 comes
out, since some machines think that 1 << 32 == 1, GRRR!  I won't mention
any names, because I don't want to get sun4's into trouble...  :-)

Larry
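
For files past the 2**24-byte limit mentioned above, one possible
workaround is to read the file in chunks and reduce modulo 65535 after
each chunk, since the remainder of a sum can be taken piecewise.  What
follows is only a sketch and not part of the original article: the sub
name sysv_sum_chunked and the 64K chunk size are made up for
illustration, and it assumes a perl recent enough to have lexical
filehandles and three-argument open.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical chunk size; any value small enough that 255 * $CHUNK
# fits comfortably in 32 bits would do.
my $CHUNK = 65536;

# Hypothetical helper: returns (checksum, 512-byte block count) for one file.
# It sums the bytes chunk by chunk and reduces modulo 65535 each time, so
# the running total never gets anywhere near 32 bits.
sub sysv_sum_chunked {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    binmode($fh);
    my ($sum, $bytes) = (0, 0);
    while (my $n = read($fh, my $buf, $CHUNK)) {
        # Same unpack checksum as in the article, applied to one chunk.
        $sum = ($sum + unpack("%32C*", $buf)) % 65535;
        $bytes += $n;
    }
    close($fh);
    return ($sum, int(($bytes + 511) / 512));
}

# Print one line per file named on the command line, in the same
# "checksum blocks filename" layout the article's code uses.
for my $file (@ARGV) {
    my ($sum, $blocks) = sysv_sum_chunked($file);
    print "$sum $blocks $file\n";
}
```

For files under the limit this should print the same numbers as the
one-slurp version, because reducing modulo 65535 per chunk and reducing
once at the end give the same remainder.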