Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/5/84; site mordor.UUCP
Path: utzoo!linus!decvax!wivax!cadmus!harvard!seismo!ut-sally!mordor!jdb
From: jdb@mordor.UUCP (John Bruner)
Newsgroups: net.unix-wizards
Subject: "tar" and non-8-bit byte machines
Message-ID: <370@mordor.UUCP>
Date: Mon, 19-Nov-84 14:36:55 EST
Article-I.D.: mordor.370
Posted: Mon Nov 19 14:36:55 1984
Date-Received: Wed, 21-Nov-84 05:33:03 EST
Distribution: net
Organization: S-1 Project, LLNL
Lines: 56

The S-1 Project at the Lawrence Livermore National Laboratory is
porting UNIX to our own machine, the S-1 Mark IIA. One problem that
we're currently trying to solve is the implementation of "tar".

The crucial facts are:

1) The S-1 memory is organized into 36-bit words (addressable in
   9-bit quarterwords). **Sigh.**
2) On the S-1, characters are nine bits and are stored one per
   quarterword.
3) UNIX does not distinguish file types (e.g. character vs. binary).

The problem is this: we want to be able to read/write "tar" tapes
containing ASCII text files on both the VAX and the S-1. The "obvious"
mapping is for the S-1 to associate each 8-bit byte with the low-order
8 bits of a 9-bit quarterword, discarding or zero-filling the uppermost
bit in the quarterword as appropriate.

A different mapping is required for binary files (because the ninth
bit is significant): the S-1 packs 9-bit quarterwords into 8-bit bytes.
(There is hardware support for this conversion operation.)

The issue is that, in order for the VAX to read S-1 text files and
vice versa, text files must be stored using a different representation
than binary files. There is no reliable way to determine whether a
file should be "text" or "binary" when the tape is written, and no
field in the "tar" header for recording this information even if the
writer could reliably figure it out.

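To make the two mappings concrete, here is a rough sketch in Python. It is only an illustration: the function names are made up for this example, and the bit order used when packing (most significant bit first, with zero padding in the last byte) is an assumption, not a description of what the S-1 hardware actually does.

```python
def text_to_tape(quarterwords):
    """Text mapping: keep the low-order 8 bits of each 9-bit quarterword,
    discarding the ninth (uppermost) bit."""
    return bytes(q & 0xFF for q in quarterwords)


def text_from_tape(data):
    """Reverse text mapping: one byte per quarterword, ninth bit zero-filled."""
    return list(data)


def pack9(quarterwords):
    """Binary mapping: pack 9-bit quarterwords into 8-bit bytes.
    ASSUMPTION: bits are emitted MSB-first; the final byte is zero-padded."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for q in quarterwords:
        if not 0 <= q < 512:
            raise ValueError("quarterword out of 9-bit range")
        acc = (acc << 9) | q
        nbits += 9
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1   # keep only the bits not yet written
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack9(data, count):
    """Inverse of pack9: recover `count` 9-bit quarterwords from packed bytes."""
    acc = 0
    nbits = 0
    out = []
    for b in data:
        if len(out) == count:
            break
        acc = (acc << 8) | b
        nbits += 8
        if nbits >= 9:
            nbits -= 9
            out.append((acc >> nbits) & 0x1FF)
            acc &= (1 << nbits) - 1
    return out


text = [ord(c) for c in "hi\n"]      # an S-1 text file, one character per quarterword
print(text_to_tape(text))            # plain ASCII bytes, readable on the VAX
print(pack9(text))                   # packed form, no longer byte-aligned ASCII
print(unpack9(pack9(text), len(text)) == text)
```

With the text mapping, the bytes on tape are ordinary ASCII. With the packed mapping, every character after the first straddles a byte boundary, so the VAX cannot read the file as text without unpacking it first. Conversely, the text mapping throws away the ninth bit, so it cannot be used for binary files.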
If all files on the "tar" tape are stored with 9-bit quarterwords
packed into 8-bit bytes, text files on the "tar" tape are unusable on
the VAX. (Of course, we have programs which will pack/unpack them, but
this must be done manually and it is a real hassle.)

I don't want to define an incompatible "tar" format for the S-1. I
have used UNIX systems for M68000's which write tapes with byte
reversal problems so that I could not read them directly on our VAX
(it was necessary to pipe the input through "dd conv=swab"), and I
feel that the intent of "tar" format is to provide a standard means
for information exchange. At this point, though, I can't think of any
alternatives to this approach.

P.S. Our next machine will have 32-bit words, but it will also have
hardware tags. An image copy of a file on tape will include both the
32-bit data and a 4-bit tag (probably stored in a fifth byte). While
the 9/8-bit packing problems will go away, the key problem still
remains: a "tar" text file should contain only characters (not tags),
so binary files and text files must be stored in a different format.
I don't see how to do this with the current "tar" definition.
--
  John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
  MILNET: jdb@mordor.ARPA [jdb@s1-c]    (415) 422-0758
  UUCP: ...!ucbvax!dual!mordor!jdb    ...!decvax!decwrl!mordor!jdb