Xref: utzoo alt.sources.d:654 comp.sources.d:5596
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uwm.edu!bionet!snorkelwacker!bloom-beacon!eru!luth!sunic!dkuug!freja.diku.dk!skinfaxe.diku.dk!thorinn
From: thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen)
Newsgroups: alt.sources.d,comp.sources.d
Subject: Re: Unnecessary tar-compress-uuencodes
Message-ID: <1990Jul12.220553.9482@diku.dk>
Date: 12 Jul 90 22:05:53 GMT
References: <15652@bfmny0.BFM.COM> <1990Jul10.182546.26487@diku.dk> <987@galaxia.Newport.RI.US>
Sender: news@diku.dk (The Netnews System)
Organization: Department Of Computer Science, University Of Copenhagen
Lines: 63

dave@galaxia.Newport.RI.US (David H. Brierley) writes:
>In article <1990Jul10.182546.26487@diku.dk> thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) writes:
>>   name        size     crummy ASCII graphics
>>   ----------  -------  ---------------------
>>   tar         4718592  tar ------- -60.3% ------> tar.Z
>>   tar.Z       1874378   +37.8%                     +37.8%
>>   tar.Z.uu.Z  2229065  tar.uu.Z ------- -6.8% ------> tar.Z.uu.Z

>1) The compressed-uuencoded-compressed file is almost 20% larger than the
>compressed file, therefore you have *increased* my phone bills by 20%. I
>do not exactly appreciate this.

1) As I wrote, IF you have to post uuencoded material, it should
probably be compressed first. I also wrote that I agree with all the
other reasons the original poster gave to AVOID posting uuencoded
stuff.

I'm not advocating that people waste your bandwidth by uuencoding
stuff; I'm trying to prevent a mistaken argument from making people
always post uuencoded stuff non-compressed --- because that often uses
even more bandwidth, and almost always uses much more disk space.

Compressing before uuencoding often saves 60% on disk and 5-10% on the
wire --- but sometimes it will only save ~5% on disk and _waste_ ~20%
on a compressed link (some Sun run-length-encoded rasterfiles behave
that way).
The poster should try to find out how each of his files behaves, and
pack each of them in the cheapest way; as ``bandwidth on compressed
links'' seems to be the most popular cost metric, cheapest probably
means ``smallest after compression''. And then make a shar archive of
the packed files, so people can decide which they want to unpack.

Another problem with this: The result of compressing a single file may
be very misleading when we really want to know how much larger it
makes a compressed batch of news articles. Compress is a very stateful
encoding, and in a given batch it may not be able to compress a
uuencoded file nearly as much as when taken alone. So even the worst
rasterfile example may not affect the size of a batch as much as the
numbers lead one to believe. (Normally, compress gets ~13% after any
uuencode; in these examples, it gets ~30% after uuencode, but only the
usual ~13% after compress-uuencode. In the middle of a batch, the
difference might shrink a lot --- possibly to the point where
compress-uuencode wins again because it starts out 5% smaller.)

2) I hope you realize that a tar archive has binary file headers and
cannot be posted without some sort of encoding, so your 20% is not
immediately applicable. However, anybody who uuencodes something which
would have got through news as well without encoding deserves your
scorn and anger (and in my opinion, this includes anybody who posts a
tar archive consisting of ASCII files).

And I don't understand why ASCII/EBCDIC problems should be an excuse
for uuencode, either. The format uses the ASCII characters '!', '['
and ']', which are among those I've most often seen altered in
ASCII->EBCDIC->ASCII translations. If a uuencoded file gets through
unscathed, odds are that any printable ASCII file would. But maybe
somebody wrote a uudecode which takes input in EBCDIC and outputs in
ASCII?
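
A rough sketch of the per-file measurement suggested under point 1,
in Python. Several things here are assumptions and not from the
discussion itself: zlib stands in for compress(1) (deflate instead of
LZW, so the percentages it produces will not match the ones quoted
above), only the uuencode payload lines are generated (no begin/end
lines), and the names uuencode_body, link_costs, cheapest and
pct_change are made up for illustration. It also measures each file
alone, so the caveat about compressed batches still applies.

```python
import binascii
import zlib


def uuencode_body(data: bytes) -> bytes:
    """Return the uuencode payload lines for data (45 input bytes per line)."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


def link_costs(data: bytes) -> dict:
    """Size of each posting form after the link's own compression pass.

    zlib is an assumed stand-in for compress(1) in both places.
    """
    forms = {
        "uu": uuencode_body(data),                       # plain uuencode
        "Z.uu": uuencode_body(zlib.compress(data, 9)),   # compress, then uuencode
    }
    return {name: len(zlib.compress(body, 9)) for name, body in forms.items()}


def cheapest(data: bytes) -> str:
    """Name of the form that is smallest after link compression."""
    costs = link_costs(data)
    return min(costs, key=costs.get)


def pct_change(before: int, after: int) -> float:
    """Relative size change in percent; negative means smaller."""
    return (after / before - 1.0) * 100.0
```

Applied to the sizes in the quoted table, pct_change(4718592, 1874378)
comes to about -60.3 (tar to tar.Z), and pct_change(1874378, 2229065)
to about +18.9, which is the ``almost 20%'' in question.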
--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn@diku.dk