Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!samsung!dali.cs.montana.edu!uakari.primate.wisc.edu!zaphod.mps.ohio-state.edu!unix.cis.pitt.edu!dsinc!netnews.upenn.edu!vax1.cc.lehigh.edu!cert.sei.cmu.edu!krvw
From: PHYS169@csc.canterbury.ac.nz (Mark Aitchison, U of Canty; Physics)
Newsgroups: comp.virus
Subject: Naming/Identifying new viruses (PC)
Message-ID: <0001.9102221354.AA15356@ubu.cert.sei.cmu.edu>
Date: 21 Feb 91 10:18:00 GMT
Sender: Virus Discussion List
Lines: 61
Approved: krvw@sei.cmu.edu

There is a continual problem with people finding what seems to be a new
virus, but not wishing to broadcast the whole (dangerous) boot sector to
identify it. I have a solution to the identification problem: I have
devised a "hashcode" algorithm specially designed for boot sectors. It
gives a reasonably short (12-character) code (of A-Z, 0-9, '#' and '.')
that pretty well uniquely identifies a boot sector, is very difficult for
virus writers to get around, and is useful in its own right (i.e. you can
look at the code and get a reasonable idea of what the boot sector is
like).

I can let anyone have the source to this (knowing the source doesn't help
virus writers), and I'm happy to make it public domain - in fact I hope
many people adopt the same standard encoding system. For that reason, I
suggest some discussion of the format and method before it is used in a
serious way.

Briefly, what I have done is to generate a code which...

(1)  can be passed through e-mail systems, etc. without distortion
(2)  cannot be used to recreate a live virus
(3)  is a valid DOS filename, and short enough to say over the telephone
     easily
(4)  always starts with the same character, "#", so people can
     immediately recognise it as a hashcode
(5)  has a built-in check against typos (including transposition
     errors), and avoids case distinction or confusing characters (like
     "/" and "\")
(6)  is reasonably easy to calculate
(7)  is the same on all systems (e.g. no floating point arithmetic
     subject to round-off error or different formats on different
     systems)
(8)  includes four (and a half) bytes of high-order polynomial checksum,
     making it difficult for virus writers to give a bad boot sector the
     same code as a good one. (It would involve very lengthy
     trial-and-error methods)
(9)  includes bit flags in the last bytes, indicating the presence of
     dubious code of various types, and the absence of important
     features (such as a reboot), making it useful in itself, and making
     it even harder for virus writers to circumvent!
(10) takes into account the size of the messages, null bytes and code,
     since more sneaky viruses will need more code than a good boot
     sector, so encrypted boot sector viruses would have a tough time
     getting past!!
(11) is the same for DOS 4 diskettes (with serial numbers), irrespective
     of serial number (except in a small number of cases, where the
     serial number happens to contain forbidden instructions).
(12) is similar for minor variations of the same virus (the last 3 bytes
     and first 3 bytes should be the same or close).

The code is not... a sure-fire way of indicating the presence of a virus.
You could simply look at the last byte of the code, and if it isn't '0'
then it is probably a virus. Not a great check, but old viruses
(including Stoned) are easy to spot that way. Or you could have a list of
known good and bad boot sectors, and ring alarms when a disk isn't on the
good list. But that isn't really the aim. It is intended to identify boot
sectors, so somebody can say "I know that disk"... whether you are
describing the disk over the net or over the phone.

I can send the program, BOOTID.PAS, to anyone interested via e-mail;
hopefully it, and its big brother (CHECKOUT.EXE), will soon be available
via anonymous ftp.

Mark Aitchison, Physics, University of Canterbury, New Zealand.
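
As an illustration only, properties (3), (4), (5), (7) and (11) above can be sketched in a few lines of Python. This is not BOOTID.PAS, and nothing here is taken from its source. Everything specific is an assumption made for the example: the 8.3 layout "#xxxxxxx.xxx", the 31-symbol alphabet (chosen so the symbol count is prime), the use of CRC-32 as a stand-in for the polynomial checksum, the null-byte count as the only "feature" recorded, and the function names. The serial-number masking relies on the DOS 4 extended boot record layout (signature byte 0x29 at offset 0x26, four-byte serial at offsets 0x27-0x2A).

```python
import zlib

# Hypothetical alphabet: digits 2-9 plus A-Z without I, L and O.
# It has 31 symbols; 31 is prime, which the check symbol relies on.
ALPHABET = "23456789ABCDEFGHJKMNPQRSTUVWXYZ"
BASE = len(ALPHABET)


def mask_serial(sector: bytes) -> bytes:
    """Zero the DOS 4 volume serial number, if the sector carries one."""
    if len(sector) > 0x2A and sector[0x26] == 0x29:  # extended boot signature
        return sector[:0x27] + b"\x00" * 4 + sector[0x2B:]
    return sector


def to_symbols(value: int, width: int) -> str:
    """Encode a non-negative integer as `width` symbols (integer maths only)."""
    digits = []
    for _ in range(width):
        value, rem = divmod(value, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def check_symbol(payload: str) -> str:
    """Weighted sum modulo 31, with weights 1, 2, 3, ...

    Because the weights are distinct and the modulus is prime, one wrong
    symbol or two swapped symbols always changes the result.
    """
    total = sum((i + 1) * ALPHABET.index(ch) for i, ch in enumerate(payload))
    return ALPHABET[total % BASE]


def boot_id(sector: bytes) -> str:
    """Return a 12-character code shaped like a DOS 8.3 filename."""
    clean = mask_serial(sector)
    crc = zlib.crc32(clean) & 0xFFFFFFFF          # stand-in checksum (assumption)
    nulls = min(clean.count(0), BASE * BASE - 1)  # clamp so it fits 2 symbols
    payload = to_symbols(crc, 7) + to_symbols(nulls, 2)
    body = payload + check_symbol(payload)
    return "#" + body[:7] + "." + body[7:]


def is_well_formed(code: str) -> bool:
    """Check the shape and the check symbol of a code, ignoring case."""
    code = code.upper()
    if len(code) != 12 or code[0] != "#" or code[8] != ".":
        return False
    body = code[1:8] + code[9:]
    if any(ch not in ALPHABET for ch in body):
        return False
    return check_symbol(body[:-1]) == body[-1]
```

With these definitions, boot_id() applied to two sectors that differ only in the serial number bytes gives the same code, and is_well_formed() rejects a code in which two different symbols have been swapped. A plain CRC-32 is not hard for an attacker to match deliberately, so this sketch says nothing about the tamper resistance claimed in (8) to (10).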