Xref: utzoo comp.sources.d:3754 alt.sources:701
Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!cs.utexas.edu!csd4.milw.wisc.edu!leah!rpi!crdgw1!ge-dab!nbc1!philabs!parrot!per
From: per@parrot.Philips.Com (Paul E. Rutter)
Newsgroups: comp.sources.d,alt.sources
Subject: "btoa Classic", "tarmailchunky", and comments on "new btoa"
Keywords: binary-to-ASCII, tarmail, tarmailchunky
Message-ID: <55590@philabs.Philips.Com>
Date: 13 Jun 89 21:05:51 GMT
Sender: news@philabs.Philips.Com
Lines: 651

This posting contains commentary, and shar source for the btoa-tarmail
package.

It has come to my attention that a "new" version of btoa (binary to
ASCII) from Stefan Parmark d84sp@efd.lth.se is making the rounds.

As the original author of btoa, I have a few comments about it:

1) I hope people find the new version useful in their own work. I CAN
   imagine (non-unix) situations where somebody might benefit from the
   repair features. I am sure the author's intention and code are
   entirely well meaning. However...

2) I object strenuously to the new version being called "btoa", as it
   changes a lot of things, and although it can read and write the "old"
   version, it defaults to writing the "new" way, which I AM SURE WILL
   CAUSE UNNECESSARY CONFUSION FOR CURRENT USERS of btoa. (I would not
   be posting this "clarification" if a different name, say "mailcoder"
   had been used).

3) Personally, I could not benefit from the "features" of the new
   version. I will continue to distribute the original version, which
   is identical to that distributed with "compress", and has been widely
   and successfully used for years now.

-------------

I know the way the net works, and I know that this posting will lead to
more postings and more mail. I intend to post this now, and respond no
further to anyone. I have better things to do. Each person who cares
about this sort of net minutiae will have to decide what they want to do
for themselves (there are lots of "mailcoders" out there).
Since the first and only time I heard about the entirely rewritten "new"
version (where I am still "nicely" listed as first author) was from a
friend forwarding the posting in comp.sources.unix to me -- I will act
similarly by (ab)using this news group. (In fairness, it is of course
entirely possible that mail was sent to me long ago and I never got it).

A little history:

I wrote btoa/atob as an alternative to uuencode. While btoa is a bit
more efficient than uuencode, my real reason for writing it in the first
place was a dislike for uuencode/uudecode doing two things at once: it
serves as a coding/decoding filter, AND it insists on creating a file
with a specific name, owner, and mode. This violation of the philosophy
"do one thing" often led to frustrations with "permission denied". So,
I specifically wrote btoa/atob to be simple, optionless filters that
only did encoding/decoding.

When I posted the source to the net years ago, I included two very
simple shell scripts: "tarmail" and "untarmail", that just piped tar to
btoa to mail. Soon after, the authors of the excellent "compress"
program wanted to bundle btoa in their distribution, and the pipeline in
the tarmail script became:

    tar cvf - $* | compress | btoa | mail

After one early patch to get around a "feature" of bitnet (bitnet did
weird things to blank lines), the only problem since has been occasional
claims of bugs that have always turned out to be caused by other people
"improving" the program and passing it on.

Mr. Parmark says in his readme:

> Btoa is in the public domain. You may use it, give it away, and
> make improvements, as long as the names of the developers are
> mentioned and you don't use it to earn money. It may NOT be used
> commercially without my permission.

As the original author I hereby give permission to use my stuff in any
way you want, even commercially, but PLEASE, issue any improvement or
change under other program names, and without my name.
About the "new" version, I note in passing from the "new" man page:

> KNOWN BUGS
>      Btoa will not work properly unless the input is a true file
>      or a redirected one. This is because file positions are
>      collected during diagnosis for later reference when producing
>      the diagnosis file. The bug is actually in fseek() which
>      only can reposition 'real' files.

I do not consider this to be a "bug in fseek" (the return value of which
is not checked in the "new" source code for an error return). fseek
will never be able to seek arbitrarily on a true pipe, and since shells
like "tarmail" do use pipes -- good luck.

There is also the comment that:

> I removed the feature to exit with no output if there was an error in the
> archive. ... I hope all realize that you shouldn't run a file that was
> created from a corrupted archive.

Well, the reason I did it that way is that most people using a script
like tarmail DO NOT understand that sort of thing in the least.

If I was going to write another binary-to-mail encoder (and I am not!),
I would probably do a few things differently. But I sure would not call
the new one "btoa". (Actually, I wish people would spend their time
working on X.400 mail, so btoa and uuencode would become obsolete...)

--------------------- (clarification off) ----------------------------------

Even though this is comp.sources.d, my original source to btoa is short
enough that I will violate net protocol and put it here now (rsalz is
welcome to put this part in comp.sources.unix if he sees fit). (Indeed,
if one removes whitespace, the source to the decoding program "atob" is
short enough that some people have scripts that tack it on to the front
of their tarmail, enabling "self-decoding".)

As a bonus for those who have waded this far, there is a new
"tarmailchunky" script (from Mark Baushke, thanks) that will help when
sending large tars through machines with low mail size limits.
I did not want to build this chunky feature into the original "tarmail"
script, as I do not want to further encourage brain damaged 64K limits;
for example, between machines equipped with 19200 baud modems, it is a
costly mistake to arbitrarily break mail into a whole bunch of
< 1 minute phone calls. There is an "untarmailchunky" script that puts
the pieces back together without you having to strip off headers
(however it is up to you to feed it all the pieces in the right order).
Fancier scripts for chunking can and have been written -- be my guest.

If you are already using "btoa Classic": Nothing other than the "chunky"
scripts (and related changes to the man page), has been added in the
following shar package.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# The rest of this file is a shell script which will extract:
# Makefile btoa.c atob.c tarmail untarmail tarmailchunky untarmailchunky btoa.man
# Suggested restore procedure:
# Edit off anything above these comment lines,
# save this file in an empty directory,
# then say: sh < file
echo x - Makefile
cat >Makefile <<'!Funky!Stuff!'
# makefile for btoa/atob/tarmail
# Paul E. Rutter
# per@philabs.philips.com
#
# do whatever you want with these programs, but please do not make any
# changes and distribute "new" versions under the same program names.
#
# You need to make BINDIR, SHELLDIR, and MANDIR correct for your situation

BINDIR = /usr/local/bin
SHELLDIR = /usr/local/bin
MANDIR = /usr/man/manl
CC = cc -O

BINS = btoa atob
SHELLS = tarmail untarmail tarmailchunky untarmailchunky
MANS = btoa.man

install: clean man all
	strip $(BINS)
	chmod 755 $(BINS) $(SHELLS)
	cp -p $(BINS) $(BINDIR)
	cp -p $(SHELLS) $(SHELLDIR)
	make clean

all: $(BINS) $(SHELLS)

man: $(MANS)
	chmod 644 $(MANS)
	cp -p btoa.man $(MANDIR)/btoa.l
	cp -p btoa.man $(MANDIR)/tarmail.l

btoa: btoa.c
	$(CC) -o btoa btoa.c

atob: atob.c
	$(CC) -o atob atob.c

shar: Makefile btoa.c atob.c $(SHELLS) $(MANS)
	shar btoa.shar Makefile btoa.c atob.c $(SHELLS) $(MANS)

clean:
	rm -f $(BINS) btoa.shar
!Funky!Stuff!
echo x - btoa.c
cat >btoa.c <<'!Funky!Stuff!'
/* btoa: version 4.0
 * stream filter to change 8 bit bytes into printable ascii
 * computes the number of bytes, and three kinds of simple checksums
 * incoming bytes are collected into 32-bit words, then printed in base 85
 *  exp(85,5) > exp(2,32)
 * the ASCII characters used are between '!' and 'u'
 * 'z' encodes 32-bit zero; 'x' is used to mark the end of encoded data.
 *
 * do whatever you want with these programs, but PLEASE do not make any
 * changes and distribute "new" versions under the same program names.
 *
 *  Paul Rutter    Joe Orost
 */

#include <stdio.h>

#define reg register

#define MAXPERLINE 78

long int Ceor = 0;
long int Csum = 0;
long int Crot = 0;

long int ccount = 0;
long int bcount = 0;
long int word;

#define EN(c) (int) ((c) + '!')

encode(c)
  reg c;
{
  Ceor ^= c;
  Csum += c;
  Csum += 1;
  if ((Crot & 0x80000000)) {
    Crot <<= 1;
    Crot += 1;
  } else {
    Crot <<= 1;
  }
  Crot += c;

  word <<= 8;
  word |= c;
  if (bcount == 3) {
    wordout(word);
    bcount = 0;
  } else {
    bcount += 1;
  }
}

wordout(word)
  reg long int word;
{
  if (word == 0) {
    charout('z');
  } else {
    reg int tmp = 0;

    if (word < 0) { /* Because some don't support unsigned long */
      tmp = 32;
      word = word - (long)(85L * 85 * 85 * 85 * 32);
    }
    if (word < 0) {
      tmp = 64;
      word = word - (long)(85L * 85 * 85 * 85 * 32);
    }
    charout(EN((word / (long)(85L * 85 * 85 * 85)) + tmp));
    word %= (long)(85L * 85 * 85 * 85);
    charout(EN(word / (85L * 85 * 85)));
    word %= (85L * 85 * 85);
    charout(EN(word / (85L * 85)));
    word %= (85L * 85);
    charout(EN(word / 85));
    word %= 85;
    charout(EN(word));
  }
}

charout(c)
{
  putchar(c);
  ccount += 1;
  if (ccount == MAXPERLINE) {
    putchar('\n');
    ccount = 0;
  }
}

main(argc,argv)
  char **argv;
{
  reg c;
  reg long int n;

  if (argc != 1) {
    fprintf(stderr,"bad args to %s\n", argv[0]);
    exit(2);
  }
  printf("xbtoa Begin\n");
  n = 0;
  while ((c = getchar()) != EOF) {
    encode(c);
    n += 1;
  }
  while (bcount != 0) {
    encode(0);
  }
  /* n is written twice as crude cross check*/
  if (ccount == 0) /* ccount == 0 means '\n' just written in charout() */
    ; /* this avoids bug in BITNET, which changes blank line to spaces */
  else
    putchar('\n');
  printf("xbtoa End N %ld %lx E %lx S %lx R %lx\n", n, n, Ceor, Csum, Crot);
  exit(0);
}
!Funky!Stuff!
echo x - atob.c
cat >atob.c <<'!Funky!Stuff!'
/* atob
 * stream filter to change printable ascii from "btoa" back into 8 bit bytes
 * if bad chars, or Csums do not match: exit(1) [and NO output]
 *
 * do whatever you want with these programs, but PLEASE do not make any
 * changes and distribute "new" versions under the same program names.
 *
 *  Paul Rutter    Joe Orost
 */

#include <stdio.h>

#define reg register

#define streq(s0, s1) strcmp(s0, s1) == 0

#define times85(x) ((((((x<<2)+x)<<2)+x)<<2)+x)

long int Ceor = 0;
long int Csum = 0;
long int Crot = 0;
long int word = 0;
long int bcount = 0;

fatal()
{
  fprintf(stderr, "bad format or Csum to atob\n");
  exit(1);
}

#define DE(c) ((c) - '!')

decode(c)
  reg c;
{
  if (c == 'z') {
    if (bcount != 0) {
      fatal();
    } else {
      byteout(0);
      byteout(0);
      byteout(0);
      byteout(0);
    }
  } else if ((c >= '!') && (c < ('!' + 85))) {
    if (bcount == 0) {
      word = DE(c);
      ++bcount;
    } else if (bcount < 4) {
      word = times85(word);
      word += DE(c);
      ++bcount;
    } else {
      word = times85(word) + DE(c);
      byteout((int)((word >> 24) & 255));
      byteout((int)((word >> 16) & 255));
      byteout((int)((word >> 8) & 255));
      byteout((int)(word & 255));
      word = 0;
      bcount = 0;
    }
  } else {
    fatal();
  }
}

FILE *tmp_file;

byteout(c)
  reg c;
{
  Ceor ^= c;
  Csum += c;
  Csum += 1;
  if ((Crot & 0x80000000)) {
    Crot <<= 1;
    Crot += 1;
  } else {
    Crot <<= 1;
  }
  Crot += c;
  putc(c, tmp_file);
}

main(argc, argv)
  char **argv;
{
  reg c;
  reg long int i;
  char tmp_name[100];
  char buf[100];
  long int n1, n2, oeor, osum, orot;

  if (argc != 1) {
    fprintf(stderr,"bad args to %s\n", argv[0]);
    exit(2);
  }
  sprintf(tmp_name, "/usr/tmp/atob.%x", getpid());
  tmp_file = fopen(tmp_name, "w+");
  if (tmp_file == NULL) {
    fatal();
  }
  /* Make file disappear */
  if (unlink(tmp_name) == -1) {
    fatal();
  }
  /*search for header line*/
  for (;;) {
    if (fgets(buf, sizeof buf, stdin) == NULL) {
      fatal();
    }
    if (streq(buf, "xbtoa Begin\n")) {
      break;
    }
  }

  while ((c = getchar()) != EOF) {
    if (c == '\n') {
      continue;
    } else if (c == 'x') {
      break;
    } else {
      decode(c);
    }
  }
  if (scanf("btoa End N %ld %lx E %lx S %lx R %lx\n",
            &n1, &n2, &oeor, &osum, &orot) != 5) {
    fatal();
  }
  if ((n1 != n2) || (oeor != Ceor) || (osum != Csum) || (orot != Crot)) {
    fatal();
  } else {
    /* Now that we know everything is OK, copy tmp file to stdout */
    if (fseek(tmp_file, 0L, 0) == -1) {
      fatal();
    }
    for (i = n1; --i >= 0;) {
      putchar(getc(tmp_file));
    }
  }
  exit(0);
}
!Funky!Stuff!
echo x - tarmail
cat >tarmail <<'!Funky!Stuff!'
#!/bin/sh
if test $# -lt 3; then
  echo "Usage: tarmail mailpath \"subject-string\" directory-or-file(s)"
  exit
else
  mailpath=$1
  echo "mailpath = $mailpath"
  shift
  subject="$1"
  echo "subject-string = $subject"
  shift
  echo files = $*
  tar cvf - $* | compress | btoa | Mail -s "$subject" $mailpath
fi
!Funky!Stuff!
echo x - untarmail
cat >untarmail <<'!Funky!Stuff!'
#!/bin/sh
if test $# -ge 1; then
  atob < $1 | uncompress | tar xvpf -
  mv $1 /tmp/$1.$$
  echo tarmail file moved to: /tmp/$1.$$
else
  atob | uncompress | tar xvpf -
fi
!Funky!Stuff!
echo x - tarmailchunky
cat >tarmailchunky <<'!Funky!Stuff!'
#!/bin/sh
# "tarmailchunky" takes a file or list of files and creates a "tar file" it
# then compresses this data (using compress) and converts it to an ascii
# form (using btoa). If it is "too large" to fit into typical mail
# transport systems (some uucp sites break at 64K bytes), it will split
# the image into multiple parts and send them using the standard "mail"
# command.
if test $# -lt 3; then
  echo "Usage: tarmailchunky mailpath \"subject-string\" directory-or-file(s)"
  echo
  echo "tarmailchunky is a shell script that uses tar, compress, btoa, and split"
  echo "to send arbitrary hierarchies by mail. It sends things as one or"
  echo "more < 64K pieces. (see shell script to change this size)."
  exit
else
  mailpath=$1
  echo "mailpath = $mailpath"
  shift
  subject="$1"
  echo "subject-string = $subject"
  shift
  echo files = $*
  tar cvf - $* | compress | btoa | split -750 - /tmp/tm$$
  n=1
  set /tmp/tm$$*
  for f do
    {
      echo '---start beef'
      cat $f
      echo '---end beef'
    } | Mail -s "$subject - part $n of $#" $mailpath
    echo "part $n of $# sent (" `wc -c < $f` "bytes)"
    n=`expr $n + 1`
  done
  rm /tmp/tm$$*
fi
!Funky!Stuff!
echo x - untarmailchunky
cat >untarmailchunky <<'!Funky!Stuff!'
#!/bin/sh
# "untarmailchunky" takes an ordered list of mail messages (if they were in
# multiple parts, they must be fed to untarmail in order) and recreates
# the data stored by the original "tarmail" reversing each step along
# the way.
if test $# -ge 1; then
  sed '/^---end beef/,/^---start beef/d' $* | atob | uncompress | tar xvpf -
  echo remember to remove the tarmail files: $*
else
  sed '/^---end beef/,/^---start beef/d' | atob | uncompress | tar xvpf -
fi
!Funky!Stuff!
echo x - btoa.man
cat >btoa.man <<'!Funky!Stuff!'
.TH BTOA 1 local
.SH NAME
btoa, atob, tarmail, untarmail, tarmailchunky, untarmailchunky \- encode/decode binary to printable ASCII
.SH SYNOPSIS
.B btoa
< anything > ASCII
.br
.B atob
< btoafile > anything
.br
.B tarmail
who subject-string files ...
.br
.B untarmail
[ file ]
.br
.B tarmailchunky
who subject-string files ...
.br
.B untarmailchunky
[ file ]
.SH DESCRIPTION
.I btoa
is a filter that reads anything from the standard input, and encodes it into
printable ASCII on the standard output. It also attaches a header and checksum
information used by the reverse filter
.I atob
to find the start of the data and to check integrity.
.PP
.I atob
reads an encoded file, strips off any leading and
trailing lines added by mailers, and recreates a copy of the original file
on the standard output.
.I atob
gives NO output (and exits with an error message) if its input is garbage or
the checksums do not check. (The checksum is at the end; giving no output on
checksum error guarantees that no "partial things" will be created by pipe
scripts like untarmail if there was an error in transit).
.PP
.I tarmail
is a shell script that tar's up all the given files, pipes them
through
.IR compress ","
.IR btoa ","
and mails them to the given person. For example:
.PP
.in 1i
tarmail ralph "two files for you" foo.c a.out
.in -1i
.PP
Will package up files "foo.c" and "a.out" and mail them to "ralph", with a
mail subject line of "two files for you".
.PP
.I tarmail
with no arguments will print a short message reminding you what the required
args are. When the mail is received at the other end, that person should use
mail to save the message in some temporary file name (say "xx").
Then, executing
.PP
.in 1i
untarmail xx
.in -1i
.PP
will decode the message and untar it. (In general, you will want to be in
an empty directory, or the "right" directory when you execute this, since
the "untar" will be creating new files).
.I untarmail
can also be used as a filter. By using
.IR tarmail ","
binary files and entire directory structures can be easily transmitted
between machines. Naturally, you should understand what tar itself does
before you use
.IR tarmail "."
.PP
.I tarmailchunky
is a shell script similar to tarmail, but it uses split to break the message
into one or more pieces, each less than 64 Kbytes long. Use it when faced
with mail size limits. On the receiving end, save the pieces as
"xx.01", "xx.02", ... as they come in (they are numbered for you in the
subject line by tarmailchunky). Then, use
.PP
.in 1i
untarmailchunky xx.??
.in -1i
.PP
to decode the message and untar it. untarmailchunky uses sed to strip
off mail headers and trailers on the pieces, so you do not have to do that
manually. You DO have to give it the files in numerical order.
.PP
Other uses for btoa:
.PP
compress < secrets | crypt | btoa | mail ralph
.PP
will mail the encrypted contents of the file "secrets" to ralph. If ralph
knows the encryption key, he can decode it by saving the mail (say in "xx"),
and then running:
.PP
atob < xx | crypt | uncompress
.PP
(crypt requests the key from the terminal,
and the "secrets" come out on the terminal).
.SH AUTHOR
Paul Rutter (with thanks to Joe Orost and Mark Baushke)
.SH FEATURES
.I btoa
uses a compact base-85 encoding so that
4 bytes are encoded into 5 characters (file is expanded by 25%).
As a special case, 32-bit zero is encoded as one character.
This encoding produces less output than
.IR uuencode "(1)."
.SH NOTE
The source for btoa is freely available. Use it any way you want,
but please do not distribute changed versions under these program names.
.SH "SEE ALSO"
compress(1), crypt(1), uuencode(1), mail(1), split(1), sed(1)
!Funky!Stuff!

Paul Rutter
Philips Labs
per@philabs.philips.com
uunet!philabs!per