Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!watmath!clyde!cbatt!ucbvax!MIMSY.UMD.EDU!dave
From: dave@MIMSY.UMD.EDU.UUCP
Newsgroups: mod.ai
Subject: Re: Dear Abby, Analysis of unknown data.
Message-ID: <8703091401.AA12311@mimsy.umd.edu>
Date: Mon, 9-Mar-87 09:01:51 EST
Article-I.D.: mimsy.8703091401.AA12311
Posted: Mon Mar 9 09:01:51 1987
Date-Received: Sat, 14-Mar-87 10:21:40 EST
References:
Sender: daemon@ucbvax.BERKELEY.EDU
Distribution: world
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 37
Approved: ailist@sri-stripe.arpa

>I guess the idea here is to come up with an expert version of the UNIX file
>program.

The problem with the `file' approach is that it assumes one already has
some knowledge of the "files" one is attacking. So this technique might
become more and more useful, but only "might".

>One of the first things to realize is that there are files for
>which your system is not going to be able to come up with any
>useful information. Try feeding it 156MB of perfectly random
>numbers for example.

Testing for randomness might be the first test; it sure would save a lot
of subsequent computing if the data turned out to be random.

>files. Optionally, the program could try and deduce all the information
>desired from the file, but I think that would be much more difficult to do.

Yep. It would be nice to take a goal-driven, top-down approach, but
sometimes data-driven inference, e.g., auto-correlation, is all there is.

>representation is derived from firing up the appropriate program on the file.
>For example, if you are trying to classify a system executable, you will want
>to run the system debugger (or disassembler) on the file. There is an
>assumption here that files don't exist in a vacuum. If they did, they would
>be useless.

That they would be useless, and whether they exist in a vacuum, are both
assumptions.

-- 
Dave Stoffel (703) 790-5357
seismo!mimsy!dave
dave@Mimsy.umd.edu
Amber Research Group, Inc.
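
A rough sketch of what the "test for randomness first" step and the
data-driven auto-correlation pass could look like, in Python. Nothing
here comes from the original discussion: the function names, the 16-lag
window, and the 7.9 bits/byte and 0.05 thresholds are all assumptions
picked for illustration. Note also that passing this screen does not
prove randomness; compressed or encrypted data will typically pass too.

```python
import math
import random
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def autocorrelation(data: bytes, lag: int) -> float:
    """Sample autocorrelation of the byte values at the given lag."""
    n = len(data)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must satisfy 0 < lag < len(data)")
    mean = sum(data) / n
    var = sum((b - mean) ** 2 for b in data)
    if var == 0:
        return 0.0  # constant data: correlation is undefined, report 0
    cov = sum((data[i] - mean) * (data[i + lag] - mean) for i in range(n - lag))
    return cov / var


def looks_random(data: bytes, max_lag: int = 16,
                 min_entropy: float = 7.9, max_corr: float = 0.05) -> bool:
    """Cheap first-pass screen; thresholds are illustrative assumptions."""
    if len(data) <= max_lag:
        return False  # too little data to say anything
    if byte_entropy(data) < min_entropy:
        return False  # skewed byte histogram: some structure is present
    # Any strong correlation at a short lag suggests record or field structure.
    return all(abs(autocorrelation(data, k)) <= max_corr
               for k in range(1, max_lag + 1))


if __name__ == "__main__":
    rng = random.Random(0)
    noise = bytes(rng.getrandbits(8) for _ in range(20000))
    text = b"the quick brown fox " * 1000
    print("noise looks random:", looks_random(noise))
    print("text looks random: ", looks_random(text))
```

The entropy check is the cheap one and runs first; the auto-correlation
pass only runs on data that already has a nearly flat byte histogram. A
file that fails the screen is one worth handing to the more expensive,
goal-driven analysis.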