Path: utzoo!attcan!uunet!samsung!uakari.primate.wisc.edu!caen!willow.engin.umich.edu!mrice
From: mrice@caen.engin.umich.edu (Michael Rice)
Newsgroups: comp.os.msdos.programmer
Subject: Unique file determination
Summary: How do I determine if two files are the same?
Keywords: Turbo C Checksums
Message-ID: <1990Aug9.193919.2996@caen.engin.umich.edu>
Date: 9 Aug 90 19:39:19 GMT
Sender: news@caen.engin.umich.edu (CAEN Netnews)
Organization: University of Michigan Engineering, Ann Arbor
Lines: 16

What I want to do is store some type of file identifier for each file,
so that when I get a new file I can compute its identifier and check it
against the identifiers of my other files. If they match, then I would
know that the file is a duplicate.

The way I implemented this (the first thing that came into my mind) was
to add up the first X bytes and use the sum as the identifier
(checksum?). With X at about 1000 bytes, I found 5 sets of matches that
really were duplicates and 1 set that were not duplicates. This was with
about 100 files.

Any ideas on a better way to do this (more accurate, etc.)? I am doing
this in Turbo C, if that matters.

Any help is appreciated.

Mike
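
One possible direction, as a rough sketch rather than a definitive answer: a plain sum of the first X bytes ignores byte order and everything past byte X, so unrelated files can land on the same value. The sketch below instead stores the file length together with a Fletcher-style two-part sum over the whole file (a swapped-in technique, not the one described above), and treats a matching identifier only as a hint that is then confirmed by a byte-by-byte comparison. The names `struct file_id`, `make_file_id`, and `files_identical` are hypothetical, and the code assumes plain ANSI C with a 32-bit `unsigned long`, which Turbo C should accept.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-file record: length plus a 32-bit checksum. */
struct file_id {
    unsigned long size;
    unsigned long sum;
};

/* Read the whole file in binary mode and fill in *id.
   Returns 0 on success, -1 if the file cannot be opened or read. */
int make_file_id(const char *path, struct file_id *id)
{
    FILE *fp;
    int c;
    unsigned long a = 0, b = 0;   /* a: running byte sum, b: sum of the a's */

    assert(path != NULL && id != NULL);
    fp = fopen(path, "rb");       /* "rb": no CR/LF or Ctrl-Z translation */
    if (fp == NULL)
        return -1;

    id->size = 0;
    while ((c = getc(fp)) != EOF) {
        a = (a + (unsigned long)c) % 65535UL;
        b = (b + a) % 65535UL;    /* makes the result depend on byte order */
        id->size++;
    }
    id->sum = (b << 16) | a;

    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return 0;
}

/* Byte-by-byte confirmation for files whose identifiers match.
   Returns 1 if the contents are identical, 0 if they differ,
   -1 if either file cannot be opened. A read error is treated
   like end of file in this sketch. */
int files_identical(const char *p1, const char *p2)
{
    FILE *f1, *f2;
    int c1, c2;

    assert(p1 != NULL && p2 != NULL);
    f1 = fopen(p1, "rb");
    if (f1 == NULL)
        return -1;
    f2 = fopen(p2, "rb");
    if (f2 == NULL) {
        fclose(f1);
        return -1;
    }
    do {
        c1 = getc(f1);
        c2 = getc(f2);
    } while (c1 == c2 && c1 != EOF);

    fclose(f1);
    fclose(f2);
    return c1 == c2;              /* both hit EOF together: identical */
}
```

In use, a new file would count as a duplicate only when its `size` and `sum` both equal a stored record and `files_identical` then returns 1. Even a stronger checksum can still collide, so the final comparison is what rules out false matches; the identifier just keeps the number of full comparisons small.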