Xref: utzoo alt.sources.d:663 comp.sources.d:5605
Path: utzoo!attcan!uunet!shelby!bu.edu!rpi!zaphod.mps.ohio-state.edu!wuarchive!texbell!ficc!peter
From: peter@ficc.ferranti.com (Peter da Silva)
Newsgroups: alt.sources.d,comp.sources.d
Subject: Re: Unnecessary tar-compress-uuencodes
Message-ID: <3+M4M=1@xds13.ferranti.com>
Date: 13 Jul 90 14:12:48 GMT
References: <15652@bfmny0.BFM.COM> <3114@psueea.UUCP> <1990Jul10.203015.27282@eci386.uucp> <5256@plains.UUCP> <3124@psueea.UUCP>
Reply-To: peter@ficc.ferranti.com (Peter da Silva)
Organization: Xenix Support, FICC
Lines: 65

In article <3124@psueea.UUCP> kirkenda@eecs.UUCP (Steve Kirkendall) writes:
> Here's an idea: Let's compromise!  Come up with a format that really works!

I've suggested this before... the software tools format.

> 1) The archive should be plain-text.  That is, each text file in the archive
>    should be easy to locate within the archive, and it should be readable
>    without the need to extract it.

Headers and trailers are marked by "-h-" and "-t-". Other sequences could be
added, like "-d-" for directories.

> 2) The format would only be used to combine several text files into a single
>    text file.  If you really must include a non-text file, then uuencode
>    that one file.

Exactly.

> 3) Archives should begin with a table of all printable ASCII characters,
>    so we can tell when transliteration has gone awry.

That's a nice enhancement.

> 4) The archive program should split long lines when the archive is created,
>    and rejoin them during extraction.

Not currently supported, but see below.

> 5) Tabs should be expanded to spaces.  The extraction program should convert
>    groups of spaces back into tabs.

No. Tabs should be converted to a unique escape sequence.

> 6) The program that creates the archive should give a warning message when
>    a file's whitespace is likely to be reformatted.  For example, spaces at
>    the end of a line are a no-no.

No, spaces at the end of a line should be marked.

> 7) The extraction program should be clever enough to ignore news headers and
>    other introductory text, just for the sake of convenience.

Anything not between "-h-" and "-t-" can be safely ignored.

> 8) It should be possible to embed one archive inside another.  This ability
>    probably wouldn't see much use, but lack of the ability could sure be a
>    nasty surprise to somebody.  "What?  You mean it only works on *some*
>    text files?"

Leading dashes are escaped with another dash.

> 9) Should we use trigraphs for some of the more troublesome ASCII characters?
>    The extraction utility could convert them back into real characters.

Yes, but not trigraphs. A two-character sequence should be enough... how
about "@x" for some value of x? @t would be tab, @! would be |, and so on.
Of course "@@" would be "@".

Begin *all* lines between -h- and -t- with X, or C if it's a continuation of
the previous line. Trailing spaces would have a "@" appended. (Of course, some
other escape character could be used... Kernighan and Plauger use "@" for
other software tools tools, is all.)

Or how about this: begin each line with T for text, C for continued text, and
M for uuencoded lines?
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.
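
For what it's worth, here is a rough sketch of the per-line encoding proposed
above, written in Python. It is only one reading of the proposal, not a
reference implementation. The function names encode_line and decode_lines,
the 72-column payload limit, and the rule that every split chunk ending in a
space gets its own "@" marker are all assumptions; the post fixes only the
"@t", "@!" and "@@" pairs, the trailing "@", and the T/C prefixes. The M
prefix for uuencoded lines and the "-h-"/"-t-" framing are left out.

```python
# Escape pairs named in the post; anything else passes through unchanged.
ESCAPES = {"@": "@@", "\t": "@t", "|": "@!"}
UNESCAPES = {pair[1]: ch for ch, pair in ESCAPES.items()}
WIDTH = 72  # assumed payload limit; the post gives no number


def encode_line(line, width=WIDTH):
    """Return the T/C-prefixed physical lines for one logical line."""
    tokens = [ESCAPES.get(ch, ch) for ch in line]
    chunks, current = [], ""
    for tok in tokens:
        # Start a new chunk rather than split an escape pair in two.
        if current and len(current) + len(tok) > width:
            chunks.append(current)
            current = ""
        current += tok
    chunks.append(current)
    out = []
    for i, chunk in enumerate(chunks):
        if chunk.endswith(" "):
            chunk += "@"  # lone "@" marks trailing space (assumed per chunk)
        out.append(("T" if i == 0 else "C") + chunk)
    return out


def decode_lines(physical):
    """Rebuild logical lines from T/C-prefixed physical lines."""
    logical = []
    for raw in physical:
        tag, body = raw[:1], raw[1:]
        text, i = "", 0
        while i < len(body):
            if body[i] == "@":
                if i + 1 < len(body):
                    # Raises KeyError on an escape this sketch does not know.
                    text += UNESCAPES[body[i + 1]]
                    i += 2
                    continue
                break  # lone "@" at end of line: trailing-space marker
            text += body[i]
            i += 1
        if tag == "C" and logical:
            logical[-1] += text  # continuation of the previous line
        elif tag == "T":
            logical.append(text)
    return logical
```

With these assumptions, encode_line("end ") yields the single line "Tend @",
and decode_lines() applied to the output of encode_line() gives back the
original text. Since every encoded line starts with a letter, a nested
archive's "-h-" lines can't be mistaken for the outer ones, which would make
the doubled-dash rule unnecessary in this variant.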