Path: utzoo!utgpu!water!watmath!clyde!rutgers!sdcsvax!ucbvax!unisoft!gethen!farren
From: farren@gethen.UUCP (Michael J. Farren)
Newsgroups: comp.sys.amiga
Subject: Re: IFF archive proposal
Message-ID: <589@gethen.UUCP>
Date: 16 Jan 88 05:22:35 GMT
References: <6173@j.cc.purdue.edu>
Reply-To: farren@gethen.UUCP (Michael J. Farren)
Organization: There's Unix there in Oakland
Lines: 77

In article <6173@j.cc.purdue.edu> ain@j.cc.purdue.edu (Patrick White) writes:
>
>FORM ARC - Archive.
>    The ARC form is a form for collecting more than one file into one file.
>It can also specify subdirectories to be created before it is unarced, and
>it can contain nested FORM ARCs as well as FLSTs.

Subdirectories should probably always be handled as nested ARCs, as that
will allow the de-arcing utility to determine whether or not to un-arc
them.  There are times when you don't want to unarc all subdirectories,
but need finer control.

>    SBDR - Subdirectory.  This chunk contains a string of characters
>terminated by a null, specifying a subdirectory for this ARC to be unarced
>into.  If the specified subdirectory does not already exist, the unarcing
>program will create it in the current directory (or the directory that a
>parent ARC was unarced into).

Again, should be OPTIONAL.  You might want to unarc into your current
subdirectory, regardless of where the file originally came from.

>    ANAM - Archiver name.  This contains a null-terminated string telling
>the name of the program that created this archive.  This chunk, if included
>at all, should only be included in the top-level ARC chunk.

Not necessary if you agree on archival formats ahead of time.  Why would
you want to (or ever need to) know the name of the program that created
an archive, if the archive were in a standard format?

>    This LEVL - Multilevel marker.  This chunk contains one word of data -
>the number of compression levels in the main chunk.
>For example, an
>archiver may detect that a certain file would be much better off if it was
>crunched and then squeezed.

Almost never needed, and the overhead involved in figuring out that
multiple levels of compression could result in some savings will, very
likely, overwhelm the advantages of doing the multi-compression.  I would
tend to reject this as an unnecessary complication, with little reward.

>    PACK - Packing.
>    CRNC - Crunching.
>    SQEZ - Squeezing.
>    SQSH - Squashing.

Instead of these, how about one chunk: FRMT, which would contain one byte
indicating the packing algorithm used to compress the file chunk,
immediately followed by the file data in compressed form, perhaps called
DATA or BODY or whatever works.  This would allow future expansion more
easily.

Basically, I think this is a real nifty idea.  One thing to keep in mind
is the many methods of compression out there.  There's run-length
encoding, which is the method currently used for ILBM compressed images.
There's Huffman encoding, commonly called 'Squeezing'.  And there's
Lempel-Ziv encoding, which is what is used for Crunching and Squashing,
in all of its different bit-length flavors (12-bit LZ encoding, for
example, is the basic encoding scheme used for Crunching, while Squashing
is the same encoding scheme, only 13 bits instead of 12).  This is why I
recommend the FRMT chunk, since this will allow the use of an arbitrary
number of different schemes, if you like, and will also allow future
expansion of the encoding schemes without requiring you to think of
separate names for each of them.  (ARC, by the way, uses eight different
schemes internally).

Also, it should be the province of the program which creates the ARC to
choose which schemes it prefers.  There are several special purpose
applications which might always prefer one type of encoding over the
other, and this should be allowed.
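
To make the FRMT suggestion concrete, here is a rough sketch (in Python,
purely for illustration) of how one method byte plus a data chunk might
be written and read back.  The chunk layout is the usual IFF one: a
four-character ID, a 32-bit big-endian length, the data, and a pad byte
if the length is odd.  Everything else is an assumption of mine and not
part of the proposal: the method numbers, the function names, the choice
of BODY for the data chunk, and the idea of a decoder table.

```python
import struct

# Hypothetical method codes; neither the proposal nor this reply assigns
# any actual values to the FRMT byte.
METHOD_STORED = 0   # no compression at all
METHOD_RLE = 1      # run-length encoding
METHOD_HUFFMAN = 2  # "Squeezing"
METHOD_LZ12 = 3     # "Crunching"
METHOD_LZ13 = 4     # "Squashing"


def make_chunk(chunk_id, payload):
    """Build one IFF chunk: 4-byte ID, big-endian length, data, pad byte."""
    if len(chunk_id) != 4:
        raise ValueError("chunk ID must be exactly four bytes")
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack(">I", len(payload)) + payload + pad


def read_chunks(data):
    """Yield (chunk_id, payload) pairs from a run of IFF chunks."""
    offset = 0
    while offset + 8 <= len(data):
        chunk_id = data[offset:offset + 4]
        (size,) = struct.unpack(">I", data[offset + 4:offset + 8])
        yield chunk_id, data[offset + 8:offset + 8 + size]
        offset += 8 + size + (size & 1)  # skip the pad byte, if any


def pack_file(method, compressed):
    """Emit a one-byte FRMT chunk followed by a BODY chunk."""
    return make_chunk(b"FRMT", bytes([method])) + make_chunk(b"BODY", compressed)


def unpack_file(data, decoders):
    """Read FRMT, then hand the BODY to whichever decoder matches."""
    chunks = list(read_chunks(data))
    if len(chunks) < 2 or chunks[0][0] != b"FRMT" or chunks[1][0] != b"BODY":
        raise ValueError("expected a FRMT chunk followed by a BODY chunk")
    method = chunks[0][1][0]
    decoder = decoders.get(method)
    if decoder is None:
        raise ValueError("unknown packing method %d" % method)
    return decoder(chunks[1][1])


# Only the trivial "stored" decoder is filled in here; a real de-arcing
# program would register one decoder per method it supports.
DECODERS = {METHOD_STORED: lambda body: body}
```

Adding a new compression scheme then means handing out one more method
number and registering one more decoder; no new chunk name is needed,
and a reader that meets a number it does not know can say so cleanly.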

I can see two types of programs coming out of this: those that are
specially designed to manipulate archives, and those that just use them
as they see fit.  Neat idea.

-- 
Michael J. Farren             | "INVESTIGATE your point of view, don't just
{ucbvax, uunet, hoptoad}!     | dogmatize it!  Reflect on it and re-evaluate
        unisoft!gethen!farren | it.  You may want to change your mind someday."
gethen!farren@lll-winken.llnl.gov ----- Tom Reingold, from alt.flame