Path: utzoo!attcan!utgpu!cunews!bnrgate!brtph3!brchh104!brchs1!bnr.ca!rice.edu!sun-spots-request
From: iapsd!hopi!glenn@uunet.uu.net (Glenn Herteg)
Newsgroups: comp.sys.sun
Subject: Myths about tape block sizes
Keywords: Hardware
Message-ID: <839@brchh104.bnr.ca>
Date: 14 Dec 90 04:39:29 GMT
Sender: news@brchh104.bnr.ca
Organization: Sun-Spots
Lines: 48
Approved: Sun-Spots@rice.edu
X-Sun-Spots-Digest: Volume 9, Issue 405, message 13
X-Note: Submissions: sun-spots@rice.edu, Admin: sun-spots-request@rice.edu

In v9n397, wsrcc!wolfgang@uunet.uu.net (Wolfgang S. Rupprecht) writes:
>SCSI itself has a similar limit.  Thats why one can't get more than 126
>blocks of 512 bytes in one tape read or write.

Ideas like this have tended to propagate into the lore about parameters
you should specify to user-level tape commands.  For example,

    setenv TAPE /dev/nrst8
    tar cvbfle 126 $TAPE tree

has often been considered the way to "efficiently" create a QIC-24 tape
archive.  However, regardless of whether such a limitation exists at the
hardware level, current SunOS releases (I use 4.0.1 on a 3/50) do a good
job of hiding this from the user.  For a long time I, too, didn't
understand this, and I often waited hours as my 1/4" cartridge drive
sawed back and forth.  Recently, though, I have run experiments which
prove that much larger user block sizes work just fine, and FAR FASTER.
For example,

    dd if=diskfile of=$TAPE bs=1000b

can be used to transfer the given diskfile (if its size is a multiple of
512 bytes).  This block size is a big improvement over "bs=126b".
Reading the tape back afterwards with

    dd if=$TAPE bs=1000b | cmp - diskfile

proves that the data was written correctly.  (I don't know how much of a
performance difference it makes, but note that I often access files from
a remote-mounted filesystem [Wren, 3/60] in such transfers.)
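
For reference, the arithmetic behind the two dd block sizes compared
above can be sketched as follows.  This is only an illustration: the
function names and the 20 MB file size are hypothetical and not from the
original experiments.  The one fact relied on is that dd's "b" suffix
multiplies the count by 512 bytes.

```python
# dd's "b" suffix means "multiply by 512 bytes".
DD_BLOCK = 512


def bs_bytes(count_b: int) -> int:
    """Bytes moved per read/write for a dd argument of the form bs=<count>b."""
    return count_b * DD_BLOCK


def writes_needed(file_bytes: int, count_b: int) -> int:
    """Number of write calls dd needs for the file (ceiling division)."""
    return -(-file_bytes // bs_bytes(count_b))


if __name__ == "__main__":
    file_bytes = 20 * 1024 * 1024  # hypothetical 20 MB disk file
    for count in (126, 1000):
        print(f"bs={count}b: {bs_bytes(count)} bytes per write, "
              f"{writes_needed(file_bytes, count)} writes")
```

With these assumed numbers, bs=126b is 64512 bytes per write and
bs=1000b is 512000 bytes per write, so the larger size needs roughly an
eighth as many writes for the same file.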

That leaves two questions.  Now that we know the hardware value is not
the limit, what is the actual limit?  And what is the optimal block size
to specify on tar, dd, and similar commands?

Certainly the optimal size must be a tradeoff between the speed of the
*disk* (and/or network connection) you're reading from / writing to, and
the time penalty for stopping and starting the tape drive.  You want to
overlap disk and tape i/o to advantage, just as network analysts have
found that optimal network throughput is achieved not by huge blocks,
but by balancing the time spent in generating the data with the time
spent in communicating it.  The best performance comes when both the CPU
and the network are simultaneously active, not when one has to wait for
the other to finish handling a large block.  In the case of a QIC tape,
however, the cost of starting and stopping the streaming action seems to
a large extent to outweigh the cost of non-overlapping computation and
communication.

So now that the truth is revealed, has anyone done more extensive
testing, and could they provide some guidance to all of us so we can
collectively save years of wasted time?