Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!caip!topaz!ll-xn!mit-amt!mit-eddie!genrad!decvax!mcnc!ecsvax!emigh
From: emigh@ecsvax.UUCP (Ted Emigh)
Newsgroups: net.sources
Subject: MSDOS program to keep track of CRC's of files on a disk
Message-ID: <1933@ecsvax.UUCP>
Date: Fri, 15-Aug-86 14:25:26 EDT
Article-I.D.: ecsvax.1933
Posted: Fri Aug 15 14:25:26 1986
Date-Received: Sun, 17-Aug-86 10:30:34 EDT
Reply-To: emigh@ecsvax.UUCP (Ted Emigh)
Organization: NC State University
Lines: 1219

The following is a pair of programs to calculate the CRCs of all the
files on a disk, and to compare this list to a previously generated
list. This allows you to keep track of files that have been added,
deleted, modified, or trashed. The programs are written to be compiled
under TURBO Pascal (ver 3.0; I haven't tried earlier versions). See the
documentation for more information.

The three files this archive will produce are: filecrc.doc,
filecrc.pas, and compare.pas.

------------------------------CUT HERE----------------------------------
#-----------------------------------------------------------
#
# filecrc.ar
# use 'sh filename.ar' to extract files from this archive
#
# Do not use csh.
#
echo "Extracting filecrc.doc <-- filecrc.ar"
cat << \===filecrc.doc=== > filecrc.doc
FILECRC
13 August 1986
Ted H. Emigh

FILECRC is a program to help detect when files have been corrupted.
FILECRC creates a list of all the files on the default drive along
with creation date, file size, and a CRC (cyclic redundancy check) for
each file. When FILECRC is run again, the new list is compared with
the old list. For any file, it is possible that:

1)  The file is completely unchanged from the previous time. The file
    name (and directory entry) are the same at the two times, and it
    has not been modified.

2)  The file has been modified in the normal manner, so that the
    directory entry has a new time of creation.
    Files of this sort are counted, but no special treatment is given
    to them.

3)  The file has been deleted since the previous run of FILECRC.
    Files of this sort are counted, but no special treatment is given
    to them.

4)  A new file has appeared that was not on the disk at the time of
    the previous run of FILECRC. Files of this sort are counted, and
    a list is placed in the file FILES$$$.NEW. While it is usual to
    find new files on the disk, this gives an easy way to keep track
    of what files are new, and where they are located. This is
    important when using public domain programs, to make sure they
    are not creating new files without your knowledge.

5)  The directory entry for a file is the same for both runs of the
    program, but the file was modified in some way (its CRC no longer
    matches). This should not occur in normal practice, so the
    program writes a message to the terminal, and a list of these
    files is placed in the file FILES$$$.MOD. This can occur when you
    use NORTON UTILITIES, or other such programs, to modify the disk
    directly, bypassing the normal DOS handling of the files. It also
    can happen when programs 'run wild' (this is what prompted me to
    write this program in the first place).

Running the program prior to each backup will assure you that you are
not backing up files that have been corrupted. Also, in program
development, running the program before and after a test run of your
program can assure you that your program has not messed up the disk.


RUNNING FILECRC

There are three files associated with FILECRC:

FILECRC.COM -- The main program.
COMPARE.CHN -- The comparison program overlay.
COMPARE.COM -- A stand-alone version of the comparison program.

FILECRC is run without command line parameters (although output
redirection is permitted). It will create CHECK$$$.NEW (or
CHECK$$$.CRC if that file does not already exist in the default
directory), which is a list of all the files on the default disk in
all directories.
FILECRC displays the directory names as it goes through them. FILECRC
will then call COMPARE, which will compare the files in CHECK$$$.NEW
with those in CHECK$$$.CRC, noting any differences. When COMPARE is
finished, the old file list will be called CHECK$$$.OLD, and the newly
created one will be called CHECK$$$.CRC.

COMPARE can be run as a stand-alone program by typing

     COMPARE [NEWLIST.FIL [OLDLIST.FIL]]

If NEWLIST.FIL is given, it will be used instead of CHECK$$$.NEW, and
if OLDLIST.FIL is given, it will be used instead of CHECK$$$.CRC. For
example,

     COMPARE CHECK

will compare the file CHECK with CHECK$$$.CRC. If NEWLIST.FIL is
given, CHECK$$$.CRC will not be renamed.

Any files created since the previous time FILECRC was run will be
listed in the file FILES$$$.NEW, and any files that have been modified
in a "NON DOS" manner will be listed in the file FILES$$$.MOD.


PROGRAMMING NOTES

FILECRC is written using Turbo Pascal, Version 3.0 for MSDOS. It has
been tested on an IBM PC/AT using DOS 3.10. This program is not meant
to represent the epitome of programming skill, but it works. Any
improvements and suggestions are welcome, particularly if you can
improve the speed. On my PC/AT, with some 730 files occupying 18MB,
the program takes about 6 minutes to complete. I am convinced that
FILECRC.COM cannot be improved significantly on speed (take that as a
challenge, if you wish), but COMPARE.CHN and COMPARE.COM are
relatively inefficient (but then, of the 6 minutes, 5-1/2 minutes are
spent in FILECRC.COM).

Programming notes in the programs are sparse, but I specifically set
up separate routines for handling each of the file comparison types in
COMPARE (use the procedures file_new, file_updated, file_OK, and
bad_CRC if you would like to do something special for each file
comparison type).

FILECRC will work with any number of files or directories. As written,
COMPARE has a maximum of 200 directories and 1900 files, with any
number of files within any particular directory.
The maximum length of the directory name string is 64 characters. I
have used the program on subdirectories up to 10 levels deep without
any problems. These values for the number of directories and the
number of files use up just about as much memory as TURBO Pascal
allows, so an increase in these numbers would necessitate a redesign
of the program.

Special thanks go to David Dantowitz of Digital Equipment Corporation
(Dantowitz%eagle1.dec@decwrl) for providing the CRC routines
(generate_table_256 and crc_string_256) and the routines for getting a
directory (get_DTA, set_DTA, find_first, and find_next). Of course, he
takes no responsibility for the way I used his code.

Ted H. Emigh
Department of Genetics
North Carolina State University
Box 7614
Raleigh, NC 27695-7614
emigh@ecsvax.uucp
NEMIGH@TUCC.BITNET
===filecrc.doc===
# ----------
echo "Extracting filecrc.pas <-- filecrc.ar"
cat << \===filecrc.pas=== > filecrc.pas
{ PROGRAM TO CREATE A FILE OF THE CRC'S OF THE FILES ON THE DEFAULT DISK }

{
  This program was written by Ted H. Emigh, and has been placed in the
  public domain, to be used at the user's discretion. The CRC routines
  and the discussion of the CRC were written by David Dantowitz,
  Digital Equipment Corporation, Dantowitz%eagle1.dec@decwrl.

  This program calculates the CRC (cyclic redundancy check) for all
  the files on the disk (with the exception of files that are hidden
  system files). The CRCs are placed in a file (CHECK$$$.NEW) to be
  compared with the CRCs calculated at a previous time in the file
  CHECK$$$.CRC. The comparison is done with the program COMPARE.PAS.
  This program is set to automatically chain to COMPARE.PAS to
  automate the procedure, but this can be turned off by deleting the
  lines:

     Assign (chain_file,'COMPARE.CHN');
     Chain(chain_file);

  at the end of this program.

  For a good discussion of polynomial selection see "Cyclic Codes for
  Error Detection", by W. W. Peterson and D. T.
  Brown, Proceedings of the IRE, volume 49, pp 228-235, January 1961.

  A reference on table driven CRC computation is "A Cyclic Redundancy
  Checking (CRC) Algorithm" by A. B. Marton and T. K. Frambs, The
  Honeywell Computer Journal, volume 5, number 3, 1971.

  Also used to prepare these examples was "Computer Networks", by
  Andrew S. Tanenbaum, Prentice Hall, Inc. Englewood Cliffs, New
  Jersey, 1981.

  The following three polynomials are international standards:

     CRC-12    = X^12 + X^11 + X^3 + X^2 + X^1 + 1
     CRC-16    = X^16 + X^15 + X^2 + 1
     CRC-CCITT = X^16 + X^12 + X^5 + 1

  In binary and hexadecimal (written low-order term first, with the
  highest-order term implied):

                        Binary            Hex
     CRC-12    =      1111 0000 0001     $0F01
     CRC-16    = 1010 0000 0000 0001     $A001
     CRC-CCITT = 1000 0100 0000 1000     $8408

  Note that the constant POLY used below is $8404, which differs from
  the CRC-CCITT value $8408 in its low-order hex digit.

  The first is used with 6-bit characters and the second two with
  8-bit characters. All of the above will detect any odd number of
  errors. The second two will catch all 16-bit bursts, a high
  percentage of 17-bit bursts (~99.997%) and also a large percentage
  of 18-bit or larger bursts (~99.998%). The paper mentioned above
  (Peterson and Brown) discusses how to compute the statistics
  presented here, which have been quoted from Tanenbaum.

  (A burst of length N is defined as a sequence of N bits, where the
  first and last bits are incorrect and the bits in the middle are any
  possible combination of correct and incorrect.
  See the paper by Peterson and Brown for more information)
}

{$G512,P512,U+,R+ }

Program FILECRC;

Const
  BufSize = 192;           { Number of 128 byte sectors in the CRC buffer }
  Buffer_Length = 24576;   { BufSize * 128 = Length of the CRC buffer }
  Version = 1.00;
  Version_Date = '13 AUG 86';
  POLY = $8404;            { CRC Polynomial Used }

Type
  Bytes = Array [1..24576] of Byte;   { Length is 1..Buffer_Length }

  Registers = record       { Registers for 8088/8086/80286 }
                ax, bx, cx, dx, bp, si, di, ds, es, flags : integer;
              end;

  DTA_record = record      { DTA as used by MSDOS }
                 dos : array [1..21] of char;
                 attribute : byte;       { Attribute byte }
                 time_of_day : integer;  { Time of Day of File Creation }
                 date : integer;         { Date of File Creation }
                 low_size, high_size : integer;    { Size of the File }
                 filename: array [1..13] of char;  { File Name }
                 junk : array [1..85] of byte;
               end;

  string255 = string[255];

Var
  { Variables used in Calculating the CRC }
  str_length, RecsRead, CRC_value : integer;
  table_256 : Array [0 .. 255] of Integer;  {CRC Table to speed computations}
  byte_string : Bytes;

  { Variables used in setting up the input and output files }
  filvar : file;
  chain_file : file;
  outfile : TEXT[$4000];
  check_crc : boolean;

  { Misc. Variables }
  root : string255;        { Contains the default drive and root directory }
  global_reg : registers;  { Registers for the DOS calls }

Procedure generate_table_256(POLY : Integer);
{
  This routine computes the remainder values of 0 through 255 divided
  by the polynomial represented by POLY. These values are placed in a
  table and used to compute the CRC of a block of data efficiently.
  More space is used, but the CRC computation will be faster.

  This implementation only permits polynomials up to degree 16.
}
Var
  val, i, result : Integer;

Begin
  For val := 0 to 255 Do
  Begin
    result := val;
    For i := 1 to 8 Do
    Begin
      If (result and 1) = 1
        then result := (result shr 1) xor POLY
        else result := result shr 1;
    End;
    table_256[val] := result;
  End
End;

Function crc_string_256(Var s : Bytes; s_length, initial_crc : Integer) : Integer;
{
  This routine computes the CRC value and returns it as the function
  value. The routine takes an array of Bytes, a length and an initial
  value for the CRC. The routine requires that a table of 256 values
  be set up by a previous call to Generate_table_256.

  This routine uses table_256.
}
Begin
  inline(
    $c4/$7e/