Path: utzoo!attcan!uunet!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!moravian.EDU!nicholaA
From: nicholaA@moravian.EDU (Andy Nicholas)
Newsgroups: comp.sys.apple
Subject: (none)
Message-ID: <8905171107.AA18772@batman.moravian.edu>
Date: 17 May 89 11:07:11 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 430

(This letter was originally sent to Floyd Zink to post in CIS and should appear there in a day or so...)

Before I start this tome, let me state that:

My #1 goal for ShrinkIt is to make things as small as possible.
My #2 goal is to make ShrinkIt as easy to use as possible.

With that in mind, read on --

The problem with Binary II as it now stands is that it is not being used as it was intended. If you dig out the original transcripts of the CIS conferences on the standardization of a transfer format, you'll suddenly realize that Binary II was meant for one thing: to keep the attributes of ProDOS files intact while they are transmitted between a user and an online service such as CIS. As soon as BLU included SQueezing files into a Binary II archive, it superseded what Binary II was meant to do.

(This letter was written to Floyd Zink) -- Binary II was not meant as an archive format. You know that yourself, which is why you designed your own archive format for ACU (which, btw, is very well thought out). I talked to Gary Little at Applefest for an extremely brief period on Sunday about this, and he seemed to agree that Binary II, as it is being used now, has gone beyond its original purpose.

Binary II is excellent for its intended purpose. For an archive format, it is not so great.

Now, this brings a quandary to our present situation. Both NuFX and ACU properly store the attribute information of archived files within their respective archive formats. Both archive formats perform the function of Binary II.
Complicating this process were many telecommunications programs which automatically extract the contents of a Binary II file on-the-fly as the file is being received by the user. At the time this was a good thing. It is not a good thing now. Why? Because as soon as BLU included SQueezed files within a Binary II file, the extraction process needed an extra step -- unsqueezing _after_ the file was downloaded. This extra step complicates things because it is 2 steps for the user. The simpler an operation is, the better.

Because terminal programs can use Binary II, to simplify operations for the user, the designer of such a program is faced with 2 options:

(1) Either remove (or shut off) Binary II extractions, or...

(2) Extract the complete contents of a Binary II file, no matter what that file may contain, compressed or uncompressed, by whatever algorithm.

Option 2 is not generally feasible. Why? UnSQueezing the SQueezed members of a Binary II file while it is being received would take up a large portion of the CPU time, effectively slowing down the transmission rate. Any decent terminal package will give priority to the user and transfer stuff as fast as possible out of respect for the user's needs. (Do you like spending more money than you have to by slowing down your transmission time?)

There are several compression algorithms that are so incredibly time-intensive that uncompressing on-the-fly cannot be done without machines that cost a lot of money (and we don't own those). Off the top of my head, if you check the recent Japanese import LHARC and its algorithms, you'll find that it is incredibly slow, but compresses better than any other algorithm to date that is easily implemented on a microcomputer.

This is the reason that the headers within a NuFX archive are not 128 byte aligned for ease of extraction by Xmodem programs.
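
To make the "128 byte aligned" remark concrete, here is a minimal Python sketch of how a receiving program might recognize a Binary II entry header as Xmodem blocks arrive. It is an illustration only, not code from any terminal program. The field offsets (the $0A 'G' 'L' prefix, the $02 ID byte at offset 18, the 3-byte EOF at offset 20, the filename at offset 23, and the files-to-follow count at offset 127) are quoted from memory of the published Binary II specification and are assumptions that should be checked against it. The function names are hypothetical.

```python
XMODEM_BLOCK = 128  # Xmodem data blocks and Binary II headers are both 128 bytes


def looks_like_binary2_header(block: bytes) -> bool:
    """Cheap test a receiver could run on each incoming 128-byte block."""
    return (
        len(block) == XMODEM_BLOCK
        and block[0:3] == b"\x0aGL"   # assumed ID bytes at offset 0
        and block[18] == 0x02         # assumed ID byte at offset 18
    )


def entry_info(block: bytes):
    """Pull out the few fields needed to recreate the file on disk."""
    file_type = block[4]                             # ProDOS file type
    eof = int.from_bytes(block[20:23], "little")     # file length in bytes
    name_len = block[23]
    name = block[24:24 + name_len].decode("ascii")
    files_to_follow = block[127]
    # File data is padded out to a whole number of 128-byte blocks.
    data_blocks = -(-eof // XMODEM_BLOCK)
    return name, file_type, eof, data_blocks, files_to_follow
```

Recognizing the header and copying the following blocks to disk is cheap. The cost the letter objects to comes from decompressing a SQueezed entry while the transfer is still running, which this sketch deliberately leaves out.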
Some data compression algorithms are so incredibly time-intensive that it would be useless and confusing to have an Xmodem program extract records from a NuFX archive while it is downloading.

If Binary II files were known to ONLY contain uncompressed data, then this problem would not exist, because then telecommunications programs could properly extract without slowing transmission time and everything would be right with the world. Almost.

If Binary II files continue to contain compressed data, then I personally see option 1 as a possible solution. Don't use automatic Binary II extraction.

-----------------------
A little history/detour
-----------------------

Some people have thought that there was some sort of great "competition" between the format that ACU uses for storing files and the format ShrinkIt uses for storing files, NuFX. This is not true.

Those of you who are now hollering about the fact that NuFX does not include a Binary II header on the front end of its files, consider this: You have the same problem with ACU files. Floyd told me this past winter that it was almost assured that GEnie, AppleLink PE, and CIS would all use ACU. If CIS *HAD* used ACU (aka BLU II), then you would be faced with the same problem you have today, trying to decide what to do about a standard archive format. As it stands, on AppleLink PE, ACU does not attach a Binary II header to its files. Neither does ShrinkIt, and with good reason.

I did not design NuFX hastily or rashly. I spent the bulk of almost an entire semester designing it and refining it, asking people's opinions and getting it to the point where I could finally live with what I'd written. Morgan Davis, the author of USI's MouseTalk telecom package, cautioned me against trying to take the archive format too far, trying to have it be all things to all people. Morgan's words were very wise, for he has much experience where I do not.

Morgan was not the only person I consulted.
I didn't have the ability to contact all the "big name" people on all the information services at the time because I didn't have a CIS or GEnie account. I talked to programmers within our Apple II community and asked their opinion. Many things were suggested, many things added, many things dropped. I also consulted with the most important people: users. I canvassed small bulletin board systems and listened to anyone, including people who didn't know beans (or care) about archivers or file formats. I _did_ listen.

At one point a suggestion was made to allow variable length headers for NuFX. That required rewriting almost 23 of the 26 pages of the NuFX documentation, but I did it because I knew the fellow who suggested the change was 100% correct.

All in all, it took me about 3 months of real work to properly define NuFX. I don't claim NuFX is perfect. It's not. I don't claim NuFX provides the answer to every Apple II application's needs. It can't. I _do_ claim that as an archive format NuFX is more advanced than Binary II.

----------------------------------------
Why Binary II & NuFX should not be mixed
----------------------------------------

Now, back from that tangent --

I believe that forcing ShrinkIt to include a Binary II header in front of every NuFX archive will accomplish nothing but to confuse more people than would be confused otherwise. Why?

First, because I hope that ShrinkIt doesn't become the sole archival program used by Apple II owners. I would hope to see a group of totally compatible programs which work alike but may have more or fewer features than one another. Having an open standard, like NuFX, encourages this type of growth.

Second, because as it stands, NuFX is aesthetically correct. Adding a Binary II header to the archive destroys this continuity of design.
It forces every subsequent NuFX archival program to have to deal with something extra (and believe me, if you've read the NuFX specs and you'd like to write an archiver for it, it's complex enough without adding something else to worry about). Yes, I included automatic Binary II extraction in ShrinkIt. Not every programmer does stuff like I do, and it is unfair to expect additional work from them when conforming to the NuFX spec is hard enough.

Third, if a Binary II header is added to the beginning of a NuFX archive, then the extractor of that Binary II file *MUST* be able to extract up to 256 separate NuFX archives from the Binary II file or that program will violate the Binary II standard. This is exceedingly hard to implement. Binary II is what it is today because no one has ever grossly violated the existing standard, and I submit that what you propose for ShrinkIt to do would force me to do just that. This is something I feel needs to be avoided. Binary II and NuFX should remain separate insofar as a single archival program is concerned.

Fourth, like point #3, NuFX is what it is because it has not changed grossly since its final revision. A format which changes wildly every 6 months cannot claim to be a standard. Just as Binary II cannot change for NuFX, so too NuFX must not change for Binary II. This is what I originally intended for NuFX, and this is what I hope will continue.

Fifth, because ShrinkIt, as it now stands, can handle what you suggest with an extra 3 keystrokes. If a Binary II header were attached to a file, ShrinkIt would extract the inner NuFX file properly and then the user would simply have to extract the resultant NuFX file. Things are EASIER if the file is transmitted WITHOUT a Binary II header at all (i.e., turn the Binary II auto-extract in your terminal program off and just transfer a NuFX file without the Binary II header).

But! You say: the file's attributes will be lost! You won't be able to determine the filetype of the NuFX file!
Um, not quite. The same thing happens to Binary II files when transferred between a user and an online service... ShrinkIt does not care what the filetype/auxtype combination of a file is. It depends on a signature at the beginning of the file to determine if a file is an archive or not. The file's attributes are kept safely within the NuFX archive.

Sixth, because the structure of ShrinkIt as it now stands does not easily (if at all) allow for the type of extraction you folks would like to see added. It's not impossible, but it's also not very easy, either. This would be (for me) time better spent on something that could possibly earn enough money for me to return to school in the fall. Folks who work with computers seem to often forget I'm human, I'm not a machine.

All in all, it would be far easier for all those involved if NuFX files were uploaded without a Binary II header, and downloaded without a Binary II header.

If they _are_ transferred with a Binary II header, then ShrinkIt will find the Binary II file, extract its contents, and the user will have to decide what to do with the contents, just like he/she would have to do afterwards. If BLU were used to extract such a combination, then the user would be left with a file that ShrinkIt alone could unpack, subsequently adding confusion to the process. Wouldn't it be easier if the NuFX archive just didn't have any extra baggage attached anyway?

--------------------------
We need to make a decision
--------------------------

Ok, so where do we go from here? What do we do?

Those of you with responsibility to your users have a tough decision to make. You have to decide whether or not to abandon Binary II and jump to NuFX, stick with Binary II and abandon NuFX, or perhaps take the middle road and do what the Macintosh users have been doing for quite some time now.

I don't know about CIS, but on other major online services, StuffIt files are uploaded by themselves without any type of baggage.
MacBinary II is used to maintain a file's attributes on (pardon, Gary :-) point-to-point transfers, i.e., from one small machine to another. The sending system adds the MacBinary II header and the receiving system strips the MacBinary II header away from the file, maintaining the file's attributes. I don't see why such a system can't be adopted by the Apple II community. It's intelligent, well thought out, and always leaves your options open.

But you're suggesting we abandon Binary II!

No, I'm not. At least not altogether. I'm suggesting that an adequate solution would be to encourage the usage of ShrinkIt vs Binary II and most of your problems about confusion will go away.

----------------------
But why change at all?
----------------------

First, re-read the NuFX documentation. In the very beginning I state some very valid reasons why Binary II will not function as an archive standard.

(Now is when I get to toot my own horn. I try not to do this too often lest people somehow think I'm stuck up or have some sort of ego problem. In this case, I'm forced to make an exception to prove a point and to prevent a catastrophe).

ShrinkIt is faster. How fast is it? It's roughly 4X faster than BLU or ACU. ShrinkIt is twice as fast as Macintosh StuffIt! And that's only the 8-bit version! I hope to have the 16-bit version 1/3 faster yet.

ShrinkIt makes things smaller. How much smaller? ShrinkIt uses Dynamic 12-bit LZW, which is a better compression algorithm than Huffman SQueeze in _most_ cases. In most cases it gives a 3-10% reduction over Huffman SQ. ShrinkIt achieves its speed through the use of Run-Length-Encoding (RLE) on the data it must compress. Running RLE first on the chunk of data which ShrinkIt has to work on reduces the amount of data the time-intensive LZW algorithm must process. Because of the way that the RLE/LZW is implemented, it's almost impossible for a file to GROW (as is common with other archiving methods).
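
As a rough illustration of the RLE step and of one way a packer can avoid growth, here is a minimal Python sketch. It is not ShrinkIt's code and not the NuFX on-disk encoding: the escape byte, the one-byte "packed or stored" flag, and the function names are all assumptions made for the example, and the LZW stage is omitted entirely. The idea shown is simply that runs are collapsed before the expensive stage sees them, and that a chunk which does not get smaller is stored as-is.

```python
ESC = 0xDB  # escape byte picked for this example only


def rle_encode(data: bytes) -> bytes:
    """Collapse runs of a repeated byte into ESC, byte, count-1."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == b and run < 256:
            run += 1
        if run > 3 or b == ESC:
            # Long runs, and any literal ESC, are written as a 3-byte triple.
            out += bytes([ESC, b, run - 1])
        else:
            out += bytes([b]) * run
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Inverse of rle_encode."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC:
            out += bytes([data[i + 1]]) * (data[i + 2] + 1)
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def pack_chunk(chunk: bytes) -> bytes:
    """Keep the packed form only if it is smaller; otherwise store raw."""
    packed = rle_encode(chunk)
    if len(packed) < len(chunk):
        return b"\x01" + packed   # flag 1: RLE-packed
    return b"\x00" + chunk        # flag 0: stored unchanged


def unpack_chunk(blob: bytes) -> bytes:
    return rle_decode(blob[1:]) if blob[0] == 1 else blob[1:]
```

With this scheme the worst case for any chunk is the one flag byte of overhead, which is one way to read "almost impossible for a file to GROW".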
With SQ it's possible for a section of the compressed file to grow and for other parts to shrink. The resultant compression is the average of what grows and what shrinks. With ShrinkIt, nothing grows, so the effective compression is that much better.

ShrinkIt is more flexible. It is only more flexible because it follows a well thought out, consistent standard, NuFX.

NuFX allows the addition and deletion of records within an archive any time a user wants.
NuFX allows for resources and other entities such as messages (text, graphics, your imagination is the only limit) to be included with a record.
NuFX allows for different programs to "mix and match" different size records within an archive.
NuFX allows for DIFFERENT FILING SYSTEMS to be easily recognized and stored.
NuFX allows for over 4,000,000,000 records within an archive.
NuFX allows for files of over 4,000,000,000 characters in each of the resource, the data, and the optional message.
NuFX allows for error checking of its headers.
NuFX allows the inclusion of when a file was placed INTO an archive, when it was created, and when it was last modified.
NuFX keeps track (internally) of when the entire archive was created and when it was last modified.
NuFX allows for file system dependent information to be translated between filing systems.
NuFX allows for entire disks to be placed in an archive and mixed with other files.
NuFX can eventually have its own scripting language in which a script could give an archiver a specific set of instructions to install a revision of a given program on a user's disk.

NuFX is expandable. The headers have a variable length, so as long as we always use bits & bytes to store information and have files under 4 gigabytes in length, we'll be safe. Since the headers are expandable, you can never "run out of room."

NuFX allows for a more detailed picture of what type of data is in an archive by providing more information, such as a file's compressed size and uncompressed size.
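
The points above about variable-length headers, header error checking, and the 4,000,000,000 limits can be sketched in a few lines of Python. The layout below is hypothetical and is NOT the real NuFX record header: the field order, the use of Python's `binascii.crc_hqx` as the checksum, and the function names are all assumptions chosen to show the idea. A length field lets an older reader skip attributes it does not understand, a CRC lets it reject a damaged header, and 32-bit size fields are where a ceiling of 2^32 = 4,294,967,296 comes from.

```python
import binascii
import struct

# Hypothetical header layout for illustration only:
#   2 bytes  CRC-16 of everything after this field
#   2 bytes  attrib_len: number of attribute bytes that follow
#   N bytes  attributes (fields known today first, newer ones appended)
KNOWN = struct.Struct("<II")  # uncompressed size, compressed size (32-bit each)


def build_header(uncompressed: int, compressed: int, extra: bytes = b"") -> bytes:
    attribs = KNOWN.pack(uncompressed, compressed) + extra
    body = struct.pack("<H", len(attribs)) + attribs
    return struct.pack("<H", binascii.crc_hqx(body, 0)) + body


def parse_header(buf: bytes):
    (crc,) = struct.unpack_from("<H", buf, 0)
    (attrib_len,) = struct.unpack_from("<H", buf, 2)
    if attrib_len < KNOWN.size:
        raise ValueError("header too short")
    body = buf[2:4 + attrib_len]
    if binascii.crc_hqx(body, 0) != crc:
        raise ValueError("header CRC mismatch")
    uncompressed, compressed = KNOWN.unpack_from(buf, 4)
    # Anything between the known fields and next_offset is simply skipped,
    # so a newer writer can append attributes without breaking this reader.
    next_offset = 4 + attrib_len
    return uncompressed, compressed, next_offset
```

A reader written against this layout keeps working when a later writer appends extra attribute bytes, which is the property the letter describes as never "running out of room".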
Alright, NuFX can handle this stuff, but can ShrinkIt?

Yes... at least most of it. :-)

Conforming to the NuFX standard is not an easy thing to do. It's very complex. It's very complex for a reason. NuFX is very hard to implement because it's so flexible. What an archiver is incapable of handling, it can't simply ignore.

ShrinkIt can add files to archives, archive entire disks, mix disks with files, selectively extract the contents of archives, list the contents of archives, and allow up to 60,000 files to be placed in an archive (see, that's the current catch), copy files, type the contents of text and AppleWorks and WordPerfect files, format disks, erase disks, zero the unused blocks on disks so disks compress smaller, create directories, and catalog stuff.

ShrinkIt is easier to use. ShrinkIt's user interface was designed to follow the Apple Human Interface guidelines as closely as was possible on an 8-bit machine while including all the features I did. ShrinkIt currently lacks mouse support. This is being worked on in both the IIgs version and a special IIe version with mouse support and pulldown menus.

*************
 Important!
*************

ShrinkIt is COMPATIBLE with the past. ShrinkIt can extract and unSQueeze the contents of Binary II files, NuFX archives, ACU archives, and .BQY files. It does this transparently as far as the user is concerned. The user never has to know the type of the file. ShrinkIt is intelligent enough to do that for them and adjust itself accordingly.

ShrinkIt has a future. -- Right now (well, not right now, right before finals, anyway... :) I'm working on a IIgs desktop version of ShrinkIt which promises to be fully backwards compatible with the IIe version, archive and extract around 1/3 faster, properly archive and extract GS/OS resources, use extended IIgs memory when archiving/extracting, hopefully include a hypertext help system, and just be tremendously easier to use.

I'm currently also looking into alternate algorithms which give better compression.
The best I've run across is that which is in LHARC, the source to which is publicly available. Because LHARC requires relatively little memory, it may be possible for me to squeeze it into GS/ShrinkIt *AND* IIe ShrinkIt. The compression performance is astounding, but the time to compress is awful. (Decompression is many times faster.)

I'll be writing a set of utilities for NuFX files sometime this summer. I can't yet disclose everything they'll do, but they will make dealing with these types of archives a little easier for the IIe/IIc owner.

I also have a more secret project of sorts which involves ShrinkIt that should make a lot of people very happy. I'm not at liberty to say what it is just yet, but it probably won't be sold commercially (which should make even more people happy).

Several people have suggested that I add ARC, ZIP, ZOO, PKARC, and LHARC support to ShrinkIt. Seeing as how I only have 2 hands (at least I used to until typing this thing... not so sure any more.. :-) I'll probably not be able to add everything everyone wants. I have to take it one step at a time.

------------------------
But wait! There's more!
------------------------

Now that I sound like a Ginsu commercial, I might as well compound your problems by telling you that future versions of ShrinkIt will be able to handle self-decompressing files.

What's a self-decompressing file?

A self-decompressing file (SDF) will decompress the data within its own file once it's executed. ShrinkIt's SDFs will be a step ahead of other methods because you'll be able to selectively extract the contents of the archive if you use ShrinkIt (IIe, IIgs, or ][+) on the SDF. (Otherwise, you just execute the SDF and it'll extract and then execute itself).

*******

Self-decompressing files take the 'trauma' out of using an archive utility for the first-time user.
They allow applications that would otherwise be too big for a single disk to be packaged that way and installed on a hard drive automatically simply by running the application. But to an online service such as CIS or GEnie, they pose a large problem because the self-extracting code not only adds extra space to the file, but is somewhat redundant. (Because the file must maintain its filetype to execute, it too must be placed in an archive. Why do it twice? That's idiotic.)

******

For this reason, I would suggest that these types of archives be banned, in most situations, from major online services. I'm not kidding, either.

******

On a IIe/IIc, you'll be limited to an archive of about 34-36k, but you'll be able to include an entire ARCHIVE full of files, not just a single file as is done in some other archivers. On a IIgs, the size of the file will only be limited by your memory size.

I've begun to take steps to make sure that Apple knows of my intentions and provides a method for identifying these files accordingly. Because circumstances where SDFs are useful are not that common, the actual SDF 'maker' will be included in the companion utilities.

Binary II doesn't allow for this. ACU can't do this. ProPacker can't do this. All those versions of that awful DDD can't do this. Macintosh StuffIt can, and so will ShrinkIt.

-----------------------
Hallelujah! He's done!
-----------------------

Yes, I'm finally finished, but I can't let you get away before answering a few more questions... ;-)

Why did you write ShrinkIt? What did you hope to accomplish?

To tell you the truth, I wrote it originally hoping that my terminal programs would be the only terminal programs to extract files on-the-fly including data compression. That hope quickly faded as I got deeper and deeper into the research involved. I also realized that I needed a standard method of archiving the stuff so the whole scheme would work.
That's where everything went haywire and I ended up with what you see in ShrinkIt 2.02 and the Apple ][+ 1.3 versions of ShrinkIt.

In the future, I would hope that if another archival method/archiver is ever brought forward, the major online services would *NOT* do what they initially did with ShrinkIt: banning it, not investigating it, and generally ignoring it... somehow hoping it would go away. Everything would have been fine except for GS/OS and its introduction of resource forks and FSTs, instantly rendering the current Binary II obsolete.

I hope that whoever comes after me doesn't have to go through the rigmarole I've had to go through to see ShrinkIt through to this point. I'd certainly hope that the people with responsibility on these services would take a sharp look at whatever comes 'after.' Investigate it. Talk about it. Debate it. Make it prove itself and its worth to our Community.

Just as ShrinkIt has. And will.

Sincerely,

Andy Nicholas