Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!ut-sally!std-unix
From: MIKEMAC%UNBMVS1.BITNET@wiscvm.wisc.edu (Michael MacDonald)
Newsgroups: comp.std.unix
Subject: Re: tar vs. cpio
Message-ID: <8208@ut-sally.UUCP>
Date: Fri, 5-Jun-87 14:04:28 EDT
Article-I.D.: ut-sally.8208
Posted: Fri Jun 5 14:04:28 1987
Date-Received: Wed, 10-Jun-87 00:47:21 EDT
References: <8188@ut-sally.UUCP>
Sender: std-unix@ut-sally.UUCP
Reply-To: MIKEMAC%UNBMVS1.BITNET@wiscvm.wisc.edu
Lines: 77
Approved: jsq@sally.utexas.edu (Moderator, John Quarterman)

I have just finished working on a CPIO tape reader and, about a year
ago, a TAR tape reader for our IBM 3090 180/VF running MVS/XA. The
following comments may be of interest because they come from a slightly
different point of view. I do not have significant *ix experience; these
comments are the result of trying to pick apart such tapes when they
are used for data interchange.

TAR and CPIO are *used* for both backup AND data interchange.

TAR Format comments.

1) Data is written as blocks of 512 bytes. This allows for faster
   processing, which is important for BIG files.

   [ Most implementations allow using tape blocks larger than that. -mod ]

2) There is room left in the header. This allows a site to customize
   the format while still allowing other sites to read the tape without
   the customized version (if they do it right).

3) The NAME and LINKNAME fields are not long enough. Extending them to
   256 bytes would extend the header to 2 blocks, but I think the
   benefit of longer names outweighs the disadvantages.

   [ In addition to

	#define NAMSIZ 100
	char name[NAMSIZ];

   POSIX Section 10.1 also has

	#define PFXSIZ 155
	char prefix[PFXSIZ];

   which is used when name isn't big enough. The total of the two is
   set to match the minimum permissible value of PATH_MAX.
   -mod ]

4) All of the tape drives that I have worked with (not that many) are
   capable of writing a short block. If TAR recognized a physical end
   of file rather than two blocks of hex 00's, this would solve a
   number of problems with TAR.

5) Limited amount of Unix-dependent information in the header. If a
   *backup* system is used for data interchange, is it really necessary
   to add many operating-system-dependent features? Are the advantages
   gained by using these dependencies *really* advantages, even in a
   backup system?

CPIO Format comments.

1) Data is not block oriented. This slows down processing considerably.

2) There is no room left in the header. No customization is possible
   (without also sending the customized program).

3) Is 128 that much better than 100? See TAR note 3.

4) The CPIO end-of-file mark (TRAILER!!!): why not a physical EOF?
   See TAR note 4.

5) When it comes to OS-dependent information, the CPIO header is full
   of it.

6) After writing the CPIO tape reader I came across a ?serious? problem.
   According to the Unix manual page cpio(4), the h_name field is
   "h_namesize rounded to word" long. The header must also begin on a
   word boundary (although this is not documented). The word size of
   the machine is not a CPIO option (as far as I can tell). This means
   CPIO tapes cannot be read on a machine with a different word size.
   I question whether this "feature" should be standardized without at
   least a wordsize option.

Michael MacDonald
Software Specialist, School of Computer Science
University of New Brunswick
P.O. Box 4400
Fredericton, New Brunswick
CANADA E3B 5A3
(506) 453-4566

Netnorth/BITNET: MIKEMAC@UNB

Disclaimer: The opinions stated are mine, no one likes them around
here either.

Volume-Number: Volume 11, Number 50
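
To illustrate CPIO note 6, here is a minimal sketch of what "h_namesize
rounded to word" implies for a reader. The function name and the word
sizes tried are assumptions made for illustration only; they are not
taken from any cpio source or from the manual page. The point is simply
that the same h_namesize yields a different on-tape name length when
the writer's word size differs, so a reader that assumes the wrong
word size loses track of where the file data starts.

```python
def round_up_to_word(length: int, word_size: int) -> int:
    """Round length up to the next multiple of word_size (in bytes).

    word_size is a hypothetical parameter: per the note above, cpio
    itself offers no option to state it.
    """
    if word_size <= 0:
        raise ValueError("word_size must be positive")
    return (length + word_size - 1) // word_size * word_size


if __name__ == "__main__":
    h_namesize = 9  # example value only
    for word_size in (2, 4, 8):
        padded = round_up_to_word(h_namesize, word_size)
        print(f"word size {word_size}: h_name occupies {padded} bytes")
```

With h_namesize = 9, a 2-byte word gives a 10-byte name field and a
4-byte word gives 12 bytes, so the two readers disagree by 2 bytes
about where the data begins.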