Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!unisoft!hoptoad!xanth!kent
From: kent@xanth.UUCP (Kent Paul Dolan)
Newsgroups: comp.sources.bugs,news.misc
Subject: Re: sending source code
Message-ID: <3249@xanth.UUCP>
Date: Fri, 6-Nov-87 23:51:01 EST
Article-I.D.: xanth.3249
Posted: Fri Nov 6 23:51:01 1987
Date-Received: Mon, 9-Nov-87 06:20:53 EST
References: <631@louie.udel.EDU> <332@uvicctr.UUCP> <2566@umn-cs.UUCP> <467@srs.UUCP>
Reply-To: kent@xanth.UUCP (Kent Paul Dolan)
Organization: Old Dominion University, Norfolk Va.
Lines: 63
Keywords: compress directly to printable ASCII
Summary: Does anyone want me to do this?
Xref: mnetor comp.sources.bugs:426 news.misc:1108

In article <467@srs.UUCP> dan@srs.UUCP (Dan Kegel) writes:
>Roughly paraphrased:
>> It is desirable to use a compressing archiver (like zoo or arc)
>> to package groups of files for transmission.
>> However, the resulting archives must be uuencoded (or btoa'd) before
>> transmission on Usenet.
>> Why not just make the compressing archiver output in a format
>> suitable for direct transmission on Usenet?
>
>Sounds like a good idea to me. I've always disliked having to go
>thru three steps (paste, uudecode, unarchive) to decode these postings;
>it's work I'd just as soon have a program do for me.
>
>I think that more thought needs to be given to transmitting large
>binary files over Usenet. People often distribute documentation and
>source in this format, so the old objection "But executables don't belong
>on Usenet" no longer applies.
>
>- Dan Kegel

A couple of months ago, I took the algorithm from the June '87 CACM
article "Arithmetic Coding for Data Compression", and recoded it under
contract into FORTRAN 77. Due to stupidities on both sides (mostly
mine, I'm afraid), I didn't get paid, the software didn't get used, and
all copies were destroyed! However, I ended up owner of the neat
algorithm additions I invented.

I had made it work, efficiently, and with one nice wrinkle. The
original CACM algorithm encoded output bits into 8 bit bytes for
compressed data storage. I needed to send the resulting file across a
smart communications line; i.e., the transmitted data had to be
printable. The kermit escape encoding looked too expensive, so I made
a little switch. It turns out that 95*95 = 9025 (95 being the number
of printable ASCII characters) is just a bit more than 2^13 = 8192, so
each pair of printable characters can carry 13 bits. By encoding 6.5
bits of data per byte, doing the obvious mod, multiply and shift for
each byte, I was able to compress into printable ASCII.

The result was about a wash for executables; they were about as big
transmitted (compressed into printable ASCII) as unencoded and
uncompressed, which is probably at least as good as the present
situation. Various kinds of text behaved quite a bit better.

So, would it be worthwhile for me to rewrite this stuff in C, or would
someone else like to go ahead and do this, given these hints? I, too,
think that a one step process to do this would be better. I can do
the compression and decompression routines, and put them on the net,
if someone who does Unix systems stuff could go on from there and make
them into a uuencode-/uudecode-like utility, which is probably beyond
my skills.

I assume that the CACM algorithm is publicly usable.

Comments?

Kent, the (totally weird) man from xanth.

Running for president on a pound of caffeine, an ounce of sense, and a
program of increased exploration and exploitation of space. Support
your (probably non-existent - get busy!) local branch of the Birthright
Party: "The birthright of mankind is the stars!"

Hey, it's better than dwelling on your stock portfolio; at least here
you've got a chance for a laugh or two. ;-)

Yum! Eat them plastic chickens, brethren!

Call me when I'm elected; 'til then, I'm going to take a nap.
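
For anyone who wants to try the printable-ASCII trick described in the
body of this article, here is a minimal sketch in C of packing 13 bits
into two printable characters. It is not the lost FORTRAN 77 code; it
is only one plausible reading of "the obvious mod, multiply and shift".
The names pack13, unpack13, encode_printable and decode_printable are
made up for illustration, the bit order (most significant bit first)
is an assumption, and the space character is counted as printable,
which is what gives 95 symbols. The decoder is told the original byte
count, since the last pair may carry padding bits.

```c
#include <assert.h>
#include <stddef.h>

#define FIRST_PRINTABLE 32   /* ' ' ; printable ASCII runs ' ' .. '~' */
#define PRINTABLE_BASE  95   /* number of printable ASCII characters  */

/* Pack a 13-bit value (0..8191) into two printable characters.
 * Works because 2^13 = 8192 <= 95*95 = 9025. */
static void pack13(unsigned v, char out[2])
{
    assert(v < 8192u);
    out[0] = (char)(FIRST_PRINTABLE + v / PRINTABLE_BASE);
    out[1] = (char)(FIRST_PRINTABLE + v % PRINTABLE_BASE);
}

/* Inverse of pack13: two printable characters back to a 13-bit value. */
static unsigned unpack13(const char in[2])
{
    return (unsigned)(in[0] - FIRST_PRINTABLE) * PRINTABLE_BASE
         + (unsigned)(in[1] - FIRST_PRINTABLE);
}

/* Encode n bytes, most significant bit first. The caller must supply
 * room for 2 * ceil(8n/13) characters. Returns characters written. */
static size_t encode_printable(const unsigned char *in, size_t n, char *out)
{
    unsigned long acc = 0;   /* pending bits, newest at the low end */
    int nbits = 0;           /* number of pending bits in acc       */
    size_t o = 0, i;

    for (i = 0; i < n; i++) {
        acc = (acc << 8) | in[i];
        nbits += 8;
        while (nbits >= 13) {
            nbits -= 13;
            pack13((unsigned)((acc >> nbits) & 0x1FFFu), out + o);
            o += 2;
        }
        acc &= (1ul << nbits) - 1;   /* keep only the leftover bits */
    }
    if (nbits > 0) {                 /* zero-pad the final group    */
        pack13((unsigned)((acc << (13 - nbits)) & 0x1FFFu), out + o);
        o += 2;
    }
    return o;
}

/* Decode nchars characters into at most nbytes bytes; nbytes is the
 * original length, so trailing padding bits are ignored.
 * Returns bytes written. */
static size_t decode_printable(const char *in, size_t nchars,
                               unsigned char *out, size_t nbytes)
{
    unsigned long acc = 0;
    int nbits = 0;
    size_t o = 0, i;

    for (i = 0; i + 1 < nchars && o < nbytes; i += 2) {
        acc = (acc << 13) | unpack13(in + i);
        nbits += 13;
        while (nbits >= 8 && o < nbytes) {
            nbits -= 8;
            out[o++] = (unsigned char)((acc >> nbits) & 0xFFu);
        }
        acc &= (1ul << nbits) - 1;
    }
    return o;
}
```

A real utility along the lines of uuencode/uudecode would still need
line breaks, a length or end marker, and some care about software that
strips trailing spaces; none of that is attempted here.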