Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!ames!ucbcad!ucbvax!INGRES.BERKELEY.EDU!hatcher
From: hatcher@INGRES.BERKELEY.EDU (Doug Merritt)
Newsgroups: comp.sys.amiga
Subject: Further file format problems
Message-ID: <8705250320.AA18887@ingres.Berkeley.EDU>
Date: Sun, 24-May-87 23:20:11 EDT
Article-I.D.: ingres.8705250320.AA18887
Posted: Sun May 24 23:20:11 1987
Date-Received: Mon, 25-May-87 04:44:28 EDT
Sender: daemon@ucbvax.BERKELEY.EDU
Lines: 65

Here follows a report of the magic that I've learned so far, and a
plea for help on the remaining problems.

I am still having problems with distinguishing between different types
of executables (program, font, lib, dev, handler). BTW, although there
is supposed to be a "moveq #0,d0 + rts" in non-program executables,
this is far from dependable.

In general I am disturbed that it would seem that I have to use
trial-and-error, checking several possible data structures, in order
to determine the file type. This implies that I might run into
"coincidental" data that makes it ambiguous as to which type the file
is. Perhaps worse yet, there are some exceptions to the rules that
seem to make some files unidentifiable at all.

Perhaps there is something about hunks that might straighten this out.
Where are they documented???
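
For reference, the "moveq #0,d0 + rts" pair mentioned above assembles
to the four bytes 70 00 4E 75 on the 68000, so that (undependable)
test comes down to a prefix comparison on the hunk contents. A minimal
sketch, assuming the contents of the first hunk have already been read
into a byte string; the function name is made up for illustration:

```python
MOVEQ_0_D0 = bytes([0x70, 0x00])   # 68000 encoding of moveq #0,d0
RTS = bytes([0x4E, 0x75])          # 68000 encoding of rts


def starts_with_moveq_rts(contents):
    """Return True if the bytes begin with moveq #0,d0 followed by rts."""
    return contents[:4] == MOVEQ_0_D0 + RTS
```

A True result is only a hint, since (as noted) plenty of non-program
files do not follow the convention.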

The methods I've arrived at so far all depend on looking at what I
believe is the hunk count to figure out where the first hunk is (my
interpretation of hunks came from staring at hex dumps, so this may be
slightly off the mark):

    000003f3 00000000 00000001 00000000
    (magic #)         (count)  (followed by a list of hunk offsets
                                or sizes)

Then I find the hunk code and a following size, followed by the
contents of the hunk:

    000003e9 000000a0
    (code)   (size)

Now I look at the hunk contents to find an indication as to file type:

See whether the contents are a struct DiskFontHeader, as indicated by
(1) the dfh_DF.LN_TYPE component equals NT_FONT (0x0C) and (2) the
dfh_FileID field equals DFH_ID (0x0f80) [ defined in
libraries/diskfont.h ].

If that fails, search an indeterminate amount forward for
RTC_MATCHWORD == 0x4AFC [ as defined in exec/resident.h ], and if
found, interpret that as the first field of a struct Resident and see
whether the rt_Type field is equal to NT_DEVICE (0x03) or NT_LIBRARY
(0x09).

If that fails, then it is either a regular program, a handler, or a
non-conforming library or device, which is a very uncomfortable amount
of ambiguity.

The translator.library is not identifiable as a library by these
methods, nor is the narrator device nor Matt's pipe device (BTW
Perry's asdg.vdisk.device *is* identifiable). This leads me to believe
that those authors did not follow the full standard, yet they got
their devices/libraries to work anyway. How is that possible???

Finally, I can't find any method for distinguishing handlers from
regular executables at all. Anyone know how?

	Thanks,
	Doug Merritt
	ucbvax!ingres!hatcher

P.S. I still use uucp paths rather than domains because when I mail
*out* using domains, they almost always fail, whereas paths are as
reliable as they ever were. Don't blindly recommend that people use
domain names; they are not yet as robust (especially on some systems)
as they could be.
Also there are still many, many, many, many, MANY systems that don't
know how to talk to domains at all!!! Thus if you care about
*receiving* letters, you will still sign with a *path*.
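
In case it helps anyone reproduce the checks described in the body of
this post, here is a rough sketch of them in Python. Several things in
it are assumptions rather than anything the post confirms: the header
layout (empty name list, hunk count, first and last hunk numbers, then
one size per hunk), the byte offsets of the fields inside struct
DiskFontHeader and struct Resident, and the names first_hunk and
classify, which are invented for illustration. Check the offsets
against your own include files before trusting them.

```python
import struct

HUNK_HEADER = 0x3F3      # magic number from the hex dump
HUNK_CODE = 0x3E9        # hunk code from the hex dump
NT_DEVICE, NT_LIBRARY, NT_FONT = 0x03, 0x09, 0x0C
DFH_ID = 0x0F80
MATCHWORD = 0x4AFC

# Assumed byte offsets within the first hunk's contents.
DFH_TYPE_OFFSET = 12     # 4-byte return code, then two 4-byte node pointers
DFH_FILEID_OFFSET = 18   # 4-byte return code, then a 14-byte struct Node
RT_TYPE_OFFSET = 12      # match word, two pointers, flags, version


def first_hunk(data):
    """Return the first hunk's contents, or None if the layout is unexpected."""
    def u32(offset):
        return struct.unpack_from(">I", data, offset)[0]
    try:
        if u32(0) != HUNK_HEADER or u32(4) != 0:    # assumes an empty name list
            return None
        first, last = u32(12), u32(16)
        if last < first:
            return None
        pos = 20 + 4 * (last - first + 1)           # skip the per-hunk size table
        if u32(pos) != HUNK_CODE:
            return None
        size = 4 * u32(pos + 4)                     # size is counted in longwords
        return data[pos + 8:pos + 8 + size]
    except struct.error:                            # file too short for the layout
        return None


def classify(data):
    """Guess the file type using the font and Resident checks, in that order."""
    hunk = first_hunk(data)
    if hunk is None:
        return "unrecognized"
    if (len(hunk) >= DFH_FILEID_OFFSET + 2
            and hunk[DFH_TYPE_OFFSET] == NT_FONT
            and struct.unpack_from(">H", hunk, DFH_FILEID_OFFSET)[0] == DFH_ID):
        return "font"
    # Scan forward on word boundaries for the match word. Unlike the post,
    # which only describes checking the first hit, this keeps scanning when
    # a hit has some other type.
    for pos in range(0, len(hunk) - RT_TYPE_OFFSET, 2):
        if struct.unpack_from(">H", hunk, pos)[0] == MATCHWORD:
            rt_type = hunk[pos + RT_TYPE_OFFSET]
            if rt_type == NT_DEVICE:
                return "device"
            if rt_type == NT_LIBRARY:
                return "library"
    return "program, handler, or non-conforming library/device"
```

The last return value is deliberately vague: it is exactly the
ambiguity complained about above, and the sketch does nothing to
resolve it. It also inherits the "coincidental data" problem, since
any 0x4AFC word on an even boundary counts as a candidate.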