Path: utzoo!attcan!uunet!samsung!zaphod.mps.ohio-state.edu!uwm.edu!lll-winken!sun-barr!newstop!sun!khb
From: khb@chiba.Eng.Sun.COM (Keith Bierman - SPD Advanced Languages)
Newsgroups: comp.arch
Subject: Re: Extremely Fast Filesystems
Message-ID:
Date: 7 Aug 90 20:29:32 GMT
References: <13285@yunexus.YorkU.CA> <30728@super.ORG> <13667@cbmvax.commodore.com> <1990Aug7.190719.7907@caen.engin.umich.edu>
Sender: news@sun.Eng.Sun.COM
Organization: Sun MegaSystems
Lines: 55
In-reply-to: pha@caen.engin.umich.edu's message of 7 Aug 90 19:07:19 GMT

In article <1990Aug7.190719.7907@caen.engin.umich.edu> pha@caen.engin.umich.edu (Paul H. Anderson) writes:

   ... Populations Studies Center, for example, would like nothing
   better than to quickly analyze 5 gigabyte datasets (hence my
   earlier request for large RAM systems). Furthermore, many such
   datasets exist. The 1990 census is just one 5 gigabyte file - there
   are similar files for the last 100 years or more. Likewise for
   China, Russia, Europe, and more. Analyzing these things quickly is
   not currently very easy, but that doesn't mean that people don't
   want to do it. ...

Hmm. In estimation problems there are lots of ways to skin cats.
Algorithms which have huge datasets, but "small" models, do not require
huge "core" storage.

In the satellite tracking biz, some experiments (like GPS baselines) go
on for years, and Tb of data could be necessary if one formed the
obvious A**T A and proceeded to use elimination from there.

Back when I did that sort of work, we employed Square-Root Information
Filters, and/or UDU**T decomposition techniques. If, for the sake of
argument, your model has 70 independent variables, the bulk of the
"core" needed is (70*71)/2 = 2485 words of storage (one triangular
70x70 factor), _independent_ of the size of the dataset. Of course, one
also gets estimates in "real time" (viz., as fast as the data are
available). The "naive" approach would require that the entire dataset
fit in "core".
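
A minimal sketch of the sequential idea, under stated assumptions: it is not the SRIF code from the book cited below, just a plain least-squares accumulator that folds each scalar measurement into an upper-triangular factor R and a vector z using Givens rotations. The class and method names (SequentialLeastSquares, update, estimate, storage_words) are hypothetical, there is no prior information and no process noise, and each measurement is assumed to have unit weight. The point it illustrates is that the state kept in "core" is n(n+1)/2 + n words however many measurements stream past.

```python
import math
import random


class SequentialLeastSquares:
    """Hypothetical square-root information accumulator for y = a.x + noise."""

    def __init__(self, n):
        self.n = n
        # Packed upper triangle: row j holds the n - j entries R[j][j..n-1].
        self.R = [[0.0] * (n - j) for j in range(n)]
        self.z = [0.0] * n
        self.rss = 0.0  # running sum of squared residuals

    def storage_words(self):
        # Size of the retained state; it does not grow with the data.
        return sum(len(row) for row in self.R) + len(self.z)

    def update(self, a, y):
        """Fold one measurement row `a` and observation `y` into (R, z)."""
        a = list(a)  # work on a copy; the caller's row is left untouched
        for j in range(self.n):
            if a[j] == 0.0:
                continue  # nothing to eliminate in this column
            row = self.R[j]
            r = math.hypot(row[0], a[j])
            c, s = row[0] / r, a[j] / r
            # Givens rotation that zeroes a[j] against the diagonal R[j][j].
            for k in range(j, self.n):
                rjk = row[k - j]
                row[k - j] = c * rjk + s * a[k]
                a[k] = -s * rjk + c * a[k]
            self.z[j], y = c * self.z[j] + s * y, -s * self.z[j] + c * y
        self.rss += y * y  # what is left of y is this measurement's residual

    def estimate(self):
        """Back-substitute R x = z. Needs at least n independent measurements."""
        x = [0.0] * self.n
        for j in reversed(range(self.n)):
            tail = sum(self.R[j][k - j] * x[k] for k in range(j + 1, self.n))
            x[j] = (self.z[j] - tail) / self.R[j][0]
        return x


if __name__ == "__main__":
    random.seed(1)
    truth = [2.0, -1.0, 0.5]  # made-up model parameters for the demo
    f = SequentialLeastSquares(len(truth))
    for _ in range(100000):  # each measurement is used once, then discarded
        a = [random.uniform(-1.0, 1.0) for _ in truth]
        y = sum(ai * xi for ai, xi in zip(a, truth)) + random.gauss(0.0, 0.01)
        f.update(a, y)
    print(f.estimate(), f.storage_words())
```

An estimate can be pulled out with estimate() at any point once n independent measurements have arrived, which is the "real time" property mentioned above; the data themselves can be read once from disk or tape and never held in memory.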

I am sure that there are many problems which require really huge
memories ... but I am certain that use of appropriate algorithms can
limit the number of such "hogs" considerably.

Those interested in SRIF and UD techniques might wish to peruse

	Factorization Methods for Discrete Sequential Estimation
	ISBN 0 12 097350 2

--
----------------------------------------------------------------
Keith H. Bierman    kbierman@Eng.Sun.COM | khb@chiba.Eng.Sun.COM
SMI 2550 Garcia 12-33                    | (415 336 2648)
   Mountain View, CA 94043