Xref: utzoo alt.sources.d:651 comp.sources.d:5593
Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!uwm.edu!rpi!uupsi!sunic!lth.se!newsuser
From: bengtl@maths.lth.se (Bengt Larsson)
Newsgroups: alt.sources.d,comp.sources.d
Subject: Re: Unnecessary tar-compress-uuencodes
Message-ID: <1990Jul13.022224.25441@lth.se>
Date: 13 Jul 90 02:22:24 GMT
References: <15652@bfmny0.BFM.COM> <3114@psueea.UUCP> <1990Jul10.203015.27282@eci386.uucp> <5256@plains.UUCP> <3124@psueea.UUCP>
Sender: newsuser@lth.se (LTH network news server)
Reply-To: bengtl@maths.lth.se (Bengt Larsson)
Organization: Lund Institute of Technology, Sweden
Lines: 101

In article <3124@psueea.UUCP> kirkenda@eecs.UUCP (Steve Kirkendall) writes:
  (a list of valid points concerning a "shar" program)
>Did I miss anything? Did I get anything wrong? Does anybody know of an
>existing format that comes close to these specs?

Hmmm, one way to do it would be to write a little "unpacker" program (in C),
and distribute it with the archive (in plain text).

Suggested format for archive:
(borrowed heavily from VMS_SHAR, the "shar" program for VMS)

(unpacker program (optional) in plain text here. Let's call it "unpacker.c"
For those who don't have it, extract it from here, compile it with
"cc -o unpacker unpacker.c" and start unpacking)

-- start part 1 --
file packer.txt 744 23642334
X The filename is on a line started by "file", followed by one space,
X followed by the filename, a space, a (Unix) protection code (octal,
X like for "uuencode") and a checksum. The filename must not contain a space.
X
X The archived file is mostly normal text. Control characters are escaped
X with a backtick followed by three characters with the decimal (octal?)
X value of the escaped character. Like `009 for a tab. The backtick
X is itself escaped like `096.
X
X Long lines are folded. Normal lines start with an "X". Continuation
V lines (like this one) start with a "V" (that is, newlines are to be skipped
V before "V").
X
X Since all lines start with a special character, it is possible to
X archive archives (the archived file ends with a line not starting with
X "X" or "V").
X
X Trailing blanks are escaped, just like control characters.
X Trailing blanks which result from splitting a long line are also
X escaped. When run through the unpacker, all trailing spaces are
X stripped first (trailing blanks may have been added somewhere).
X
X This is a line with some trailing spaces... `032
-- end part 1 --

Anything may come here (News headers, for example). We start the next part
with a line which starts with "-- start part 2". Note that the headers etc.
may be in the middle of a file. All parts in the archive have the same
length. Archived files are split routinely between parts. All the unpacker
has to do is to look for a line starting with "-- end part xx" and then
skip to a line beginning with "-- start part xx+1". The unpacker may (should?)
check that the "xx" numbers are correct and in sequence.

-- start part 2 --
X
X Now we can say something about directories. Let's start a new file
X "subdir.txt" in a subdirectory "doc".
directory doc 744
file doc/subdir.txt 744 2353453
X
X Now we are in the subdirectory. A directory is created by a line
X started by "directory". The subdirectory may already exist (that is no
X error). Anyway, the protection code is specified like for files.
X
X When files in a subdirectory are specified, directory parts are separated
X by "/" (like in Unix). This should make it possible to write unpackers
X for other environments (for example VMS).
X
X Let's say that the archive should be terminated with a line
X "end archive".
X
-- end part 2 --
end archive

The unpacking program could be run like:

% unpacker prog.pck.01 prog.pck.02 prog.pck.03 ...

or (Unix)

% cat prog.pck.?? | unpacker

What do you think? The idea was that the "packer" program may be somewhat
complex, but the "unpacker" should be small (could be distributed with the
archive in plain text).
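
To make the unpacking rules above concrete, here is a rough sketch of the
decoding loop. It is written in Python rather than C purely for brevity, and
it is only one possible reading of the format, not a reference unpacker.
Assumptions that are not in the proposal: the names decode_text and unpack
are made up; escapes are read as decimal (which is what `009, `032 and `096
suggest); everything after the leading "X" or "V" counts as file content;
the checksum is ignored because no algorithm is given; and files are
collected in memory instead of being written to disk.

```python
import re

# Backtick followed by three digits; read as decimal to match the
# examples in the post (`009 = tab, `032 = space, `096 = backtick).
ESCAPE = re.compile(r"`(\d{3})")


def decode_text(text):
    """Strip trailing blanks first, then expand `NNN escapes."""
    text = text.rstrip(" \t")
    return ESCAPE.sub(lambda m: chr(int(m.group(1))), text)


def unpack(lines):
    """Collect archive contents in memory (hypothetical helper).

    Returns (files, dirs): files maps name -> (mode, text) and dirs is a
    list of (name, mode). Nothing is written to disk, and the checksum
    field is ignored because the post does not define how it is computed.
    """
    files, dirs = {}, []
    current = None        # list of decoded lines for the open file
    in_part = False       # True between "-- start part" and "-- end part"
    expected = 1          # next part number we expect to see
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("-- start part "):
            number = int(line.split()[3])
            if number != expected:
                raise ValueError("expected part %d, got %d" % (expected, number))
            in_part = True
        elif not in_part:
            # News headers and the like between parts are skipped.
            if line.strip() == "end archive":
                break
        elif line.startswith("-- end part "):
            in_part = False
            expected += 1
        elif line.startswith("file "):
            _, name, mode, _checksum = line.split()
            current = []
            files[name] = (int(mode, 8), current)
        elif line.startswith("directory "):
            _, name, mode = line.split()
            dirs.append((name, int(mode, 8)))
            current = None
        elif line.startswith("X") and current is not None:
            current.append(decode_text(line[1:]))   # a new line of text
        elif line.startswith("V") and current:
            current[-1] += decode_text(line[1:])    # continuation of last line
    # Join the collected lines into one string per file.
    joined = {name: (mode, "".join(l + "\n" for l in ls))
              for name, (mode, ls) in files.items()}
    return joined, dirs
```

Something like unpack(sys.stdin) would then correspond to the
"cat prog.pck.?? | unpacker" usage; a real unpacker would also create the
directories, write the files and apply the protection codes.
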
The "packer" could accept lots of options (for example, which characters to
escape, the maximum line length, the maximum part size, maybe maximum length
for filenames etc.). Reasonable defaults should be provided. I think the
"packer" should default to the "safest" format (escaping tabs and special
characters for Bitnet).

If the escaping mechanism is turned off, this is just a file splitter/extractor
(may be used to split uuencoded GIF files, for example :-)

Bengt Larsson.
-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl@maths.lth.se             SUNET:    TYCHE::BENGT_L