Xref: utzoo comp.sources.d:2014 comp.binaries.ibm.pc.d:77
Path: utzoo!mnetor!uunet!husc6!cmcl2!rutgers!iuvax!bobmon
From: bobmon@iuvax.cs.indiana.edu (RAMontante)
Newsgroups: comp.sources.d,comp.binaries.ibm.pc.d
Subject: Re: Standard for file transmission
Message-ID: <8430@iuvax.cs.indiana.edu>
Date: 4 May 88 14:45:49 GMT
References: <292@cullsj.UUCP> <55@psuhcx.psu.edu> <537@csccat.UUCP>
Reply-To: bobmon@iuvax.UUCP (RAMontante)
Organization: Computer Science Dept., Indiana University
Lines: 102
Keywords: protocol compression source
Summary: basically an attack on compress

cullsj.UUCP (Jeffrey C. Fried) writes, among other things:
,
, 1) COMPRESS is a text only compression routine. It will not now, or ever,
, help in the compression of binary files.

This statement made me shell out and run the following quick experiment:

-rwxr-xr-x  1 bobmon      15360 Feb 27 01:22 pgen
-rwxr-xr-x  1 bobmon      10116 May  4 08:46 pgen.Z
-rwxr-xr-x  1 bobmon      14336 Feb 24 08:19 pom
-rwxr-xr-x  1 bobmon       9945 May  4 08:47 pom.Z

Pgen and pom are both executable files (compiled from C). Granted, this
is on a VAX machine, running the full-blown compress. My attempts to run
compress on my 8088 box were frustrating, given its memory requirements,
and I haven't seen enough '.Z' formatted files to be worth the hassle.
But I would assume that if it runs at all on a smaller machine, it will
produce the same results; unlike zoo and arc, it cannot choose one
compression method over another.

, 3) The source for ZOO, PKARC, and the others is NOT available. Therefore
, we are at the whims of whomever is currently supporting (or not supporting)
, them.

Source for arc is available, at least for some Unix boxes. Zoo source has
been promised. Pkarc was originally written in 8088 assembler, not the
friendliest source.
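
Going back to the pgen and pom listing for a moment: the savings can be
worked out from the byte counts alone. What follows is only a quick
sketch of that arithmetic in Python. The numbers are copied from the
listing; the name percent_saved is made up for illustration and is not
part of compress or any other tool mentioned here.

```python
# Byte counts copied from the ls listing: (original, compressed).
sizes = {
    "pgen": (15360, 10116),
    "pom": (14336, 9945),
}


def percent_saved(original, compressed):
    """Percentage of the original size removed by compression."""
    return 100.0 * (original - compressed) / original


for name, (orig, comp) in sizes.items():
    saved = percent_saved(orig, comp)
    print(f"{name}: {orig} -> {comp} bytes, {saved:.1f}% saved")
```

By this reckoning both executables shrink by roughly a third (about
34.1% for pgen and 30.6% for pom).
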

, 4) COMPRESS works faster and better on text files than the ARC routines
, because they use 12 bit compression, where 13-bit (and more) are possible
, under even the PC for COMPRESS (i've tried it on an AT-clone).

I haven't seen source for compress, either. And the executables I've seen
were enormous, and limited to 12-bit LZW on 8088's under MSDOS; just like
zoo and arc (and pkarc's squash method is some sort of 13-bit LZW).

I've never heard anyone claim responsibility for compress, while the
authors of zoo, pkarc, and arc are named, revered, vilified, and flamed
frequently. At least one of them is an active participant on the Usenet.
(Plug: I think that's one strength of zoo, although Rahul might
disagree :-)

, 5) On the weak side, there is as yet, no CRC or checksum for any of these,

Any of WHAT? Zoo and arc certainly have a CRC value. Compress is
compress. Its Unix-origin philosophy says that separate functions should
be done by separate routines with their outputs tied together by the
operating system. I think this is at the heart of some of the debates
here. The philosophy works fine on a big multitasking machine like a VAX
(or a suitably equipped 680x0 or '386?), and the entire news mailer
system is predicated on that principle -- the mailer just calls compress
(EVERYbody has compress, right?) to pack things in for it; it doesn't
worry about whether the result is correct, and neither does compress.
It's up to you to aggregate your files with shar or something.

This piece-at-a-time philosophy is weaker on something like my MSDOS 8088
box. There aren't multiple users all needing similar fundamental tools,
there's just me. And I haven't the resources (memory or CPU cycles) to
support lots of little pieces that work fine individually but need
sophisticated glue to work together; MSDOS's simulation of pipes is
pathetic. In such a situation an integrated package (viz., zoo or arc)
makes a lot more sense.
They can incorporate in a consistent manner all those little pieces that
a system admin. may have put on a Unix box, but which I haven't yet found
while rummaging around BBS's. By integrating everything, a top-down
design is possible, unlike what happens when you bend the problem to fit
the tools you already have.

, but adding it would be something i am willing to take responsibility
, for should enough people decide they would like to take the approach
, which i'm currently suggesting.

At which point it will become yet another uncommon non-standard (like
ARITH?). I don't think adding code will make it fit any better on small
machines, and the big machines can afford to calculate a CRC with an
external routine. Not to mention the question of what you DO with it...
Is the CRC for compress's use? Then it becomes not-quite-compress. Is it
for human use? Then how do I recreate it to find out if the file is still
intact?

...

, 5) LASTLY: I am not trying to criticize the ARC routines, rather i am trying
, to offer an alternative which i feel will reduce the time for transmission
, of files, as well as, providing us with portability. COMPRESS, ARITH,
, UNSHAR and UUENCODE are all available at the source level. COMPRESS and
, ARITH have been tried in at least three different environments: UNIX (BSD),
, VMS and PC/MS-DOS.
, Remember, for those of us who are NOT using the NET at the expense of a
, university, the cost of communication, and therefore the time required
, to transmit a file, are VERY important.

I don't find 1200bps transmission to be a lot of fun to wait for,
either... but I take it that your basic argument is that compress makes
smaller archives than zoo or arc, which are therefore cheaper to
transmit. I don't see that the compression improvement is as significant
as you imply (and your statement about binary is completely at odds with
all my experience).
The other strengths of the integrated packages offer a LOT of
functionality, some of which I would seek out even if there were no
compression involved.

The biggest problem I see is that many news mailers compress everything
blindly, so that an already-compressed file gets bigger. This would also
be true of a sufficiently random file, although I think most executables
aren't that random. And this compress-and-be-damned behavior is not a
strength of the system, it's a weakness. (Even compress will complain if
its result is bigger than its original; does the mailer ignore this, or
are the net.gods lying when they claim they're shipping bigger files
because of the double compression?)
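
For anyone who wants to poke at the double-compression effect
themselves, here is a rough Python sketch. Big caveat: it uses zlib
(DEFLATE), NOT the LZW scheme in compress, purely because zlib ships with
Python, so the byte counts will not match what compress or a news mailer
would produce. The sample inputs and the name compressed_size are
invented for the illustration.

```python
import random
import zlib

# NOTE: zlib implements DEFLATE, not the LZW used by compress. It is a
# stand-in only; treat the sizes as illustrative, not as compress results.


def compressed_size(data):
    """Length in bytes of data after one zlib pass."""
    return len(zlib.compress(data))


rng = random.Random(1988)  # fixed seed so runs are repeatable

# Hypothetical sample inputs, made up for this sketch.
words = ["zoo", "arc", "pkarc", "compress", "shar", "uuencode", "crc", "lzw"]
text = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")
noise = bytes(rng.getrandbits(8) for _ in range(10000))

once = zlib.compress(text)   # what the sender packs
twice = zlib.compress(once)  # what a blind second pass does to it

print("text bytes:       ", len(text))
print("compressed once:  ", len(once))
print("compressed twice: ", len(twice))
print("random bytes:     ", len(noise), "->", compressed_size(noise))

# Both layers must be undone, in order, to get the original back.
assert zlib.decompress(zlib.decompress(twice)) == text
```

The thing to look at is whether the second pass buys anything over the
first, and what happens to the random input, which has no redundancy left
to squeeze out.
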