Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!husc6!rutgers!labrea!decwrl!pyramid!prls!mips!earl
From: earl@mips.UUCP (Earl Killian)
Newsgroups: comp.arch
Subject: Re: Disk rotational speed vs. striping vs. parallel heads
Message-ID: <600@gumby.UUCP>
Date: Sun, 16-Aug-87 01:02:28 EDT
Article-I.D.: gumby.600
Posted: Sun Aug 16 01:02:28 1987
Date-Received: Tue, 18-Aug-87 03:21:16 EDT
References: <2432@ames.arpa> <3721@well.UUCP> <2838@phri.UUCP> <155@dolphy.UUCP> <653@ima.ISC.COM>
Lines: 86

In article <653@ima.ISC.COM>, johnl@ima.ISC.COM (John R. Levine) writes:
> In article <5557@prls.UUCP> weaver@prls.UUCP (Michael Gordon Weaver) writes:
> > If higher transfer rates were required by a large part of the
> > market, the rotation speed of the drives could be increased from
> > the current typical 3600 rpm to say 10,000 rpm. This would be
> > expensive, but much cheaper than using a vacuum.
> If your computer had the I/O speed, there are all sorts of
> straightforward tricks to speed up the disk data rate. For example,
> you could run all of the disk heads at once so that a disk cylinder
> rather than appearing as N tracks each M bytes long looks like one
> track N*M bytes long, but with the same time to read or write a
> track.

I was wondering when this would come up. Such drives are made. For
example, I believe I've seen an ad for a parallel head version of the
nifty little Fujitsu 8in drives. Unfortunately, it was a lot more
expensive than the regular version. I can't imagine why, other than
it is a specialty item not produced in high volume. Instead of
transferring at 2.4 Mbyte/s, it probably transfers at something like
14.4 Mbyte/s (I'm guessing that it has 6 surfaces). I don't know what
they did to the SMD interface to make it work at these speeds.

The problem is that with today's software, this may not make much
difference. Suppose you do disk i/o in 8K byte chunks. At 2.4 Mbyte/s
it takes 3.4ms to transfer your chunk.
The average seek time is 20ms, so shrinking the transfer time to
.57ms saves you very little. To take advantage of these drives, you
need software (either in the os, or in the disk controller) that
reads a cylinder at a time. Since some disk controllers do read and
cache a full track at a time, this is not implausible. Doing caching
in the controller also limits the need for high bandwidth i/o buses,
since only the controller would see the 14.4 Mbyte/s; transfers of
individual blocks from its cache would go at the maximum i/o bus
rate, which is probably less than 14.4 Mbyte/s with today's i/o
buses.

To get back to the issue that started this all, it seems to me that
there are three reasons to do disk striping.

(1) you're already using parallel head transfers and you need still
    more bandwidth for a single task.

(2) you need more bandwidth for a single task and parallel head
    drives are too expensive.

(3) you can't use parallel head transfers because your i/o bus can't
    hack the bandwidth. Striping can increase the disk thruput within
    certain limits without increasing the peak bandwidth requirement
    for randomly scattered (i.e. non-contiguous) files.

To see (3), consider reading a file on (a) one drive, and (b) many
drives.

(a) The thruput is blocksize / (seek + rotate + blocksize / trate).
The peak transfer rate is trate. If blocksize / trate << seek +
rotate, then this is seek time limited.

(b) You simultaneously seek on all N drives, and transfer blocks one
at a time. The peak transfer rate is still trate, because you
serialize the actual transfers. The thruput is
N * blocksize / (seek + rotate + queue + blocksize / trate)
where queue is the time spent waiting to serialize. If
N * blocksize / trate << seek + rotate, then the queuing delay is
small and can be ignored, so you've achieved N times the thruput.
(My queuing theory text doesn't have a formula for an M/D/1 system
with finite population, and I'm not going to grind through the Markov
model to figure it out, so I'll leave it as "small" if <<.) This is of
course the effect of overlapped seeks in a time sharing system made
to work for a single task.

However, you'd probably have been better off simply using contiguous
allocation instead of striping, to get blocksize / trate > seek +
rotate. For example, going from 8K byte to 64K byte chunks raises
your overall thruput from 350 Kbyte/s to 1385 Kbyte/s, which is
almost a 4x improvement. With 64K byte transfers the disk is now
spending 58% of the time actually doing i/o instead of just 15%.
Maybe fragmentation makes striping attractive compared to contiguous
allocation?

If you allow simultaneous transfers in addition to simultaneous
seeks, then striping's peak bandwidth requirement is no longer the
transfer rate of a single disk, but N times that (but the queuing
delay goes away). No advantage over a parallel head disk then.

Anyway, I guess this is a long-winded way of saying striping looks
interesting once you've gotten yourself a parallel head disk, done
contiguous disk allocation, and found you still don't have enough
bandwidth for a single task. And you've got enough i/o channel
bandwidth to support it all. And I guess that's exactly where some of
the small memory supercomputers are these days.
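
For anyone who wants to check the arithmetic, here is a rough Python
sketch of the thruput formulas from cases (a) and (b). It is only an
illustration: the function and variable names are made up, and it
assumes K = 1024 bytes, Mbyte = 10**6 bytes, and 20ms for seek +
rotate combined, which is what seems to reproduce the 350 and 1385
Kbyte/s figures. The queuing delay is left as an input rather than
modelled, just as in the text.

```python
# Sketch of the thruput formulas in cases (a) and (b).
# Assumptions (not stated outright in the post): K = 1024 bytes,
# Mbyte = 10**6 bytes, and seek + rotate together cost 20 ms.

TRATE = 2.4e6      # single-head transfer rate, bytes per second
POSITION = 0.020   # seek + rotate, seconds


def transfer_time(blocksize, trate=TRATE):
    """Seconds spent actually moving blocksize bytes."""
    return blocksize / trate


def thruput(blocksize, n_drives=1, queue=0.0,
            position=POSITION, trate=TRATE):
    """Bytes per second with overlapped seeks and serialized transfers.

    n_drives=1, queue=0 is case (a). For case (b), queue is the time
    spent waiting to serialize; it is an input here, not derived.
    """
    per_block = position + queue + transfer_time(blocksize, trate)
    return n_drives * blocksize / per_block


def busy_fraction(blocksize, position=POSITION, trate=TRATE):
    """Fraction of the time one drive spends transferring data."""
    t = transfer_time(blocksize, trate)
    return t / (position + t)


if __name__ == "__main__":
    for kbytes in (8, 64):
        size = kbytes * 1024
        print(f"{kbytes}K chunks: "
              f"{thruput(size) / 1e3:.0f} Kbyte/s, "
              f"{busy_fraction(size):.0%} of the time doing i/o")
```

Calling thruput(8 * 1024, n_drives=4) gives the idealized 4x figure
for case (b) with the queuing delay ignored; passing a nonzero queue
shows how quickly serialization eats into that.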