Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84 SMI; site sun.uucp
Path: utzoo!watmath!clyde!cbosgd!ihnp4!ucbvax!decvax!decwrl!sun!guy
From: guy@sun.uucp (Guy Harris)
Newsgroups: net.unix-wizards
Subject: TAR DOES NOT SWAP BYTES
Message-ID: <2818@sun.uucp>
Date: Tue, 24-Sep-85 02:06:40 EDT
Article-I.D.: sun.2818
Posted: Tue Sep 24 02:06:40 1985
Date-Received: Thu, 26-Sep-85 06:21:23 EDT
References: <235@thunder.UUCP> <604@neuro1.UUCP>
Organization: Sun Microsystems, Inc.
Lines: 53

> All I know is the tar program swaps bytes when writing a tape so
> that a VAX running 4.2 must use dd to swab things before un-tar-ing them.

"tar" does no such thing. The control information on a "tar" tape is
in printable ASCII form, so that it's independent of byte order (and,
with any luck, other greasy architectural details). "tar" tapes written
on purely big-endian machines (3[67]0, M68K, etc.), purely little-endian
machines (VAX, etc.), and mixed-up machines (PDP-11), can be read on
machines of any other byte sex. Unless the files in question are text
files, however, the data might not be directly usable on the target
machine, but that's not just a problem of byte order.

"cpio" has a rather stupid byte-swapping option which swaps the data
but *not* the control information. Since most data does not consist of
a huge uniform stream of "short"s or "long"s, an option to swap the data
is useless. The control information, by default, consists of a bunch of
"short"s (yes, even the file size and modification/access time are
stored as pairs of "short"s), which should be swapped if the order of
bytes in a "short" is different on the source and target machines, and
a bunch of "char"s making up the file name which should not be swapped
under any circumstances. This means, BTW, that

	dd if=/dev/rmt0 conv=swab bs= | cpio -ib

doesn't work, since it swaps the bytes in the names of all created
files.
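
To make the point about "dd conv=swab" concrete, here is a minimal sketch in Python. It is an illustration only, not code from "dd" or "cpio": the helper name swab and the two-field record (a single magic-number "short" followed by a file name) are assumptions made for the example, and a real binary "cpio" header carries many more "short"s than this.

```python
import struct

def swab(buf: bytes) -> bytes:
    """Swap each adjacent byte pair, like dd conv=swab (even length assumed)."""
    if len(buf) % 2:
        raise ValueError("this sketch only handles even-length buffers")
    out = bytearray(len(buf))
    out[0::2] = buf[1::2]
    out[1::2] = buf[0::2]
    return bytes(out)

# Hypothetical, heavily simplified record: one header "short" (the magic
# number 070707) as a big-endian machine would store it, then a
# NUL-terminated file name. A real binary cpio header has more fields.
MAGIC = 0o70707
header = struct.pack(">H", MAGIC)
name = b"foo.c\x00"

swapped = swab(header + name)

# Read on a little-endian machine, the swapped "short" is correct again...
magic_seen = struct.unpack("<H", swapped[:2])[0]
# ...but the file name, which is just "char"s, has been scrambled.
name_seen = swapped[2:]

print(oct(magic_seen), name_seen)
```

The "short" survives the blanket swap, but the name "foo.c" comes out as "of.o", a NUL, and a stray "c", which is exactly why swapping the whole stream cannot work.
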
What they *should* have done was detect that the source and target
machines had different byte orders by checking whether the "magic
number" was 070707 or a byte-swapped 070707, and automatically byte-swap
the header "short"s but not the path names or the data.

However, there is a "-c" option to "cpio" which tells it to write the
control information in - you guessed it - printable ASCII! I believe it
had bugs in its System III incarnation, but you can read "cpio -c" tapes
made on a machine with different byte order. The S5 "find" has an
undocumented "-ncpio" option which works like the "-cpio" option, only
it writes "cpio -c" instead of "cpio" tapes. If you must use "cpio", use
"cpio -c"; however, "tar" is more universal - it's in V7, 4.xBSD, and
Systems III and V.

There are known cases of brain-damaged *hardware* swapping bytes. The
case I know of is a big-endian Multibus machine with an extremely
stupidly designed tape controller. If you write a tape on this machine,
and want to read it in on a sane machine, you have to stick "dd" in
front of the "tar" (or "cpio" or whatever).

The rule for correctness of byte order in a tape controller is simple.
If you have the string "Now is the time for all good parties to come to
the aid of man" in memory, and tell the tape controller to write this to
a tape, the first byte in the block should be a capital "n", followed by
a lower-case "o", followed by a lower-case "w", followed by a blank,
etc. Violate this and you'll force everybody who didn't violate this to
swap bytes when reading your tapes.

	Guy Harris