Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!mips!bridge2!jarthur!uci-ics!ucla-cs!wales
From: wales@valeria.cs.ucla.edu (Rich Wales)
Newsgroups: comp.binaries.ibm.pc.d
Subject: Re: which archiver/compresser and encoder/decoder to use?
Message-ID: <32208@shemp.CS.UCLA.EDU>
Date: 23 Feb 90 00:06:18 GMT
References: <522.25E37FEB@blkcat.fidonet.org>
Sender: news@CS.UCLA.EDU
Reply-To: wales@CS.UCLA.EDU (Rich Wales)
Organization: UCLA CS Department, Los Angeles
Lines: 48

In article <522.25E37FEB@blkcat.fidonet.org>
Jerry.Andrews@f426.n109.z1.fidonet.org (Jerry Andrews) writes:

    RW> whereas UUENCODE uses the following character output
    RW> function --
    RW>
    RW>     #define ENC(c) (((c) & 077) + ' ')

    How do I apply this function to the source file to get an output
    file?  Looks to me like, if this function is applied to each byte
    of the source file, you'd wipe all the hi bits, and not end up
    with a complete set of printable characters anyway...

I wrote my article with the underlying assumption that the people who
would read it had a thorough understanding of how UUENCODE works.  Since
this assumption may not have been fair, let me elaborate.

UUENCODE takes sets of three 8-bit bytes, rearranges the bits into four
6-bit groups, and translates each of these 6-bit groups into a printable
ASCII character.

Additionally, each line of output from UUENCODE starts with a one-byte
count of the number of 8-bit bytes encoded in that line.  The reason
most lines in a UUENCODE file start with the capital letter "M" is that
this letter corresponds to the number 45 in UUENCODE's scheme -- and
most of the lines in a UUENCODE file contain 61 printable characters
(the initial "M", plus 60 characters that correspond to 45 8-bit bytes).

The "character output function" I described in my article is what
UUENCODE uses to translate its 6-bit groups into printable characters.
It is applied to each 6-bit group, not to each byte of the source file,
so no high bits are lost.
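
To make the bit shuffling concrete, here is a rough sketch in C of the
repacking described above.  It is only an illustration, not the actual
UUENCODE source: the names encode_triple and encode_line are made up for
this example, and only the ENC macro comes from the article being
discussed.  The sketch also assumes that a short final group is padded
with zero bytes.

```c
#include <stddef.h>

/* Character output function quoted earlier in this thread. */
#define ENC(c) (((c) & 077) + ' ')

/* Hypothetical helper: split three 8-bit bytes into four 6-bit groups
 * and store the printable character for each group in out[0..3]. */
void encode_triple(const unsigned char in[3], char out[4])
{
    out[0] = (char) ENC(in[0] >> 2);
    out[1] = (char) ENC(((in[0] << 4) & 060) | ((in[1] >> 4) & 017));
    out[2] = (char) ENC(((in[1] << 2) & 074) | ((in[2] >> 6) & 003));
    out[3] = (char) ENC(in[2] & 077);
}

/* Hypothetical helper: encode n bytes (n <= 45) as one output line.
 * The line starts with the encoded byte count, followed by four
 * characters per group of three input bytes.  The caller must supply
 * at least 62 bytes in out.  Returns the length of the line, not
 * counting the terminating NUL. */
size_t encode_line(const unsigned char *buf, size_t n, char *out)
{
    unsigned char triple[3];
    size_t i, len = 0;

    out[len++] = (char) ENC(n);          /* 45 becomes 'M' */
    for (i = 0; i < n; i += 3) {
        /* Assumption: pad a short final group with zero bytes. */
        triple[0] = buf[i];
        triple[1] = (i + 1 < n) ? buf[i + 1] : 0;
        triple[2] = (i + 2 < n) ? buf[i + 2] : 0;
        encode_triple(triple, out + len);
        len += 4;
    }
    out[len] = '\0';
    return len;
}
```

For example, passing the three bytes "Cat" to encode_line gives the
line "#0V%T": the "#" is the count 3, and the remaining four characters
stand for the 6-bit groups 16, 54, 5 and 52.  A 6-bit group of zero
comes out as a space, so a line of 45 zero bytes would be an "M"
followed by 60 blanks.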

I should probably have added that some versions of UUENCODE output a
grave accent (`) instead of a space -- as protection against some mail
systems that strip trailing blanks from lines in messages.

The only real difference between UUENCODE and XXENCODE is in the
function that translates 6-bit groups into printable characters.  Some
of the characters used by UUENCODE (punctuation marks such as the square
brackets, backslash, and caret) get destroyed -- translated into
question marks, or deleted altogether -- by many machines which use
IBM's EBCDIC character set.  This is why lots of people have been
objecting to UUENCODE and asking for some other program to be used in
its place.  XXENCODE uses a different set of printable characters that
aren't disturbed by any known ASCII/EBCDIC translation schemes in use on
USENET.

-- 
Rich Wales // UCLA Computer Science Department
3531 Boelter Hall // Los Angeles, CA 90024-1596 // +1 (213) 825-5683
"Then they hurl heavy objects. . . .  And claw at you. . . ."