Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!ll-xn!ames!pioneer!eugene
From: eugene@pioneer.arpa (Eugene Miya N.)
Newsgroups: comp.arch
Subject: Re: Disk Striping (description and references) plus class brief
Message-ID: <2432@ames.arpa>
Date: Mon, 3-Aug-87 17:08:09 EDT
Article-I.D.: ames.2432
Posted: Mon Aug 3 17:08:09 1987
Date-Received: Tue, 4-Aug-87 05:06:25 EDT
Sender: usenet@ames.arpa
Reply-To: eugene@pioneer.UUCP (Eugene Miya N.)
Organization: NASA Ames Research Center, Moffett Field, Calif.
Lines: 125

This is a follow-up. (I got lots of letters, so I hope interest can be stirred and more work done in this area, striping.) I have changed the order of Chuck's questions to do the simpler one first.

>OK. I'll bite. And what classes are defined and what do they mean?

Supercomputers purchased by the U.S. Govt. were (are) rated for their performance. The rating is informal and unofficial (emphasis), done for procurement purposes. The work is done by the Dept. of Energy (prior to that ERDA, and prior to that the AEC). The rating is arbitrary and does not involve any official measurement tool. What I say is my understanding of how the rating works. The rating was developed by Sid Fernbach and George Michael while the two were at Lawrence Livermore Lab (before it became LLNL). I have seen charts on the wall at LLNL which detail some of this.

Supercomputers come in 6 "classes." Each class should be a factor of 4 to 16 more "powerful" than the preceding class, depending on who you talk to. Classes are defined more by the existing general-purpose machines which sit in a class. A Class 6 machine is something with the power of a Cray 1 or Cyber 205, or any of the Japanese machines. Class 5 computers included the ILLIAC IV and CDC 7600; Class 4, the CDC 6600 and IBM 370/195. There are discrepancies: the ILLIAC had an I/O bandwidth higher than the Cray could ever have in the near-term future.

Classes came about for the same reason as the Berkeley Unix numbering of 1.0, 2.0, 3.0, and 4.0 BSD: lawyers. Whenever new agreements or rules had to be written, a new class or distribution had to be negotiated. (E.g., what would an agreement for a 5.0BSD look like? ...shudder!)

Now, a frequent question: where is `my' machine (typically an Apple, VAX, or Sun)? These machines don't rate. The definition of a supercomputer is relative, so at any given time, those given machines don't rate, and a class is closed. Sid said a VAX could be a "Class 1/2."

Problems with classes: the most obvious problem is handling parallelism. The MPP and the Connection Machine are good cases which don't fit this rating scheme. This includes problems like I/O. The new database machines should also make some of this rating interesting.

Personal note: when I first saw classes I was reminded of rating climbs (rock, etc.), which had early versions running from 1 to 6 (or I to VI) [why not 1 to 5 or 1 to 10?]. This got me curious, and I eventually met George and Sid. Also, I know that lots of DOE/ex-AEC people are or were climbers, like E. Teller. What is interesting is that climbing is going through a similar problem with its closed-ended rating system (breaking into 7s). Sorry for the digression. This is the second time I have described classes to the arch group.

>What is "disk stripping"?
>-- Chuck

Oh, you caught my typo! Disk stripping is the process of cleaning the surface of a platter before the magnetic material is deposited ;-). I meant to say DISK STRIPING. This is the distribution of data across multiple "spindles" in order to 1) increase total bandwidth, 2) provide fault tolerance (like Tandems), or 3) serve other miscellaneous purposes. Very little work has been done on the subject, yet a fair number of companies have implemented it: Cray, CDC/ETA, the Japanese manufacturers, Convex, and Pyramid (so I am informed), and I think Tandem, etc.

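To make the bandwidth case concrete, here is a minimal sketch of the simplest possible placement: logical blocks dealt round-robin across the spindles. This is an assumption for illustration only; I am not claiming any of the vendors named above do it this way. The names stripe_location, num_disks, and stripe_unit are made up for the example.

```python
def stripe_location(logical_block, num_disks, stripe_unit=1):
    """Map a logical block number to (disk index, block offset on that disk).

    Hypothetical round-robin layout: runs of `stripe_unit` consecutive
    blocks (a "chunk") are dealt out to the disks in turn.
    """
    if num_disks < 1 or stripe_unit < 1:
        raise ValueError("num_disks and stripe_unit must be positive")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")

    # Which chunk the block falls in, and its position inside that chunk.
    chunk, within = divmod(logical_block, stripe_unit)
    # Chunks go to disks in rotation; each full rotation is one "row".
    row, disk = divmod(chunk, num_disks)
    # Offset on the chosen disk, counted in blocks.
    offset = row * stripe_unit + within
    return disk, offset


def disks_touched(first_block, count, num_disks, stripe_unit=1):
    """Return the set of disks a sequential transfer would keep busy."""
    return {
        stripe_location(b, num_disks, stripe_unit)[0]
        for b in range(first_block, first_block + count)
    }
```

With 4 disks and a stripe unit of 1, an 8-block sequential read touches all 4 spindles, so in the best case the transfers can overlap. Note this sketch covers only reason 1); it does nothing for fault tolerance.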
Now for important perspective: it seems that striping over 3-4 disks, as in a personal computer, is a marginal proposition. Striping over 40 disks: now there is some use. The break-even point is probably between 8 and 16 disks (excepting the fault-tolerance case). A person I know at Amdahl boiled the problem down to 3600 RPM running on a 60 Hz wall clock: the mechanical bottlenecks of getting data into and out of a CPU from a disk. The work is not as glamorous as making CPUs, yet is just as difficult (consider the possibility of losing just one spindle).

The two most cited papers I have seen are:

%A Kenneth Salem
%A Hector Garcia-Molina
%T Disk Striping
%R TR 332
%I EE CS, Princeton University
%C Princeton, NJ
%D December 1984

%A Miron Livny
%A Setrag Khoshafian
%A Haran Boral
%T Multi-Disk Management Algorithms
%R DB-146-85
%I MCC
%C Austin, TX
%D 1985

Both of these are pretty good reports, but more work needs to be done in this area; hopefully, one or two readers might take it up seriously. The issue is not simply one of sequentially writing bits out to sequentially lined-up disks.

I just received:

%A Michelle Y. Kim
%A Asser N. Tantawi
%T Asynchronous Disk Interleaving
%R RC 12496 (#56190)
%I IBM TJ Watson Research Center
%C Yorktown Heights, NY
%D Feb. 1987

This looks good, but what is interesting is that it does not cite either of the two above reports, but quite a few others (RP^3 and Ultracomputer based). Kim's PhD dissertation is on synchronous disk interleaving, and she has a paper in IEEE TOC.

Another paper I have is Arvin Park's paper on IOStone, an I/O benchmark. Park is also at Princeton under Garcia-Molina (massive memory VAXen). I have other papers, but these are the major ones. Just start thinking Terabytes and Terabytes.

From a badge I got at ACM/SIGGRAPH:

	Disk Space: The Final Frontier

From the Rock of Ages Home for Retired Hackers:

--eugene miya
NASA Ames Research Center
eugene@ames-aurora.ARPA
"You trust the `reply' command with all those different mailers out there?"
"Send mail, avoid follow-ups. If enough, I'll summarize."
{hplabs,hao,ihnp4,decwrl,allegra,tektronix,menlo70}!ames!aurora!eugene