Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!ucsd!ucbvax!ucsfcgl!seibel
From: seibel@cgl.ucsf.edu (George Seibel)
Newsgroups: comp.arch
Subject: Re: Extremely Fast Filesystems
Message-ID: <14923@cgl.ucsf.EDU>
Date: 8 Aug 90 21:40:03 GMT
References: <5539@darkstar.ucsc.edu> <13285@yunexus.YorkU.CA> <30728@super.ORG> <13667@cbmvax.commodore.com> <5286@mace.cc.purdue.edu>
Sender: daemon@cgl.ucsf.edu
Reply-To: seibel@cgl.ucsf.edu (George Seibel)
Organization: Computer Graphics Lab, UCSF
Lines: 39

In article <5286@mace.cc.purdue.edu> nvi@mace.cc.purdue.edu (Charles C. Allen) writes:
>> I submit that your situation is something of an unusual case, and is
>> likely to remain unusual for at least a decade, perhaps 2.  Few machines
>> (percentage-wise) even have 4 GB of storage, let alone files larger than 4GB
>> (I've never even seen a file larger than 100MB, even on mainframes).
>
>Until recently, the "standard" media for transporting files has been
>9-track 6250 tape, which holds around 200M.  Until recently, all our
>data files were less than 200M (hmm... I hope you see the
>correlation).  Now that we have some 8mm tape drives, we routinely
>have 400M files.  We'd have bigger ones, but all our disks are little
>SCSI 600-700M thingies (access time is not very critical), and we
>can't easily have a single file span volumes.  This is for high energy
>physics data analysis.

The important question here is: "what are these large files worth to you?"
It sounds as though you've always had datasets larger than the limits
imposed on you by hardware/software, and that you likely got by in the
past (and present) by splitting data into multiple files.  I generate a
lot of data from MD simulations, but find that it's more convenient to
split it into manageable chunks that are far smaller than 4GB.  The size
of "manageable" is of course determined by a variety of hardware/software
performance/capacity issues, plus economics and politics.
At any rate, I've been splitting data files up for years, and I bet
everyone else has been as well.  I already have the software in place to
deal with multiple files, and don't expect that the ability to have a
gigantic single file will make a vast improvement in my life.

I'm sure that someone out there needs huge files, but I also suspect
there is a price to be paid for going to the next higher increment of
address size.  I would rather not pay that price until the performance
level of network, cpu, memory, mass storage, etc. has come to such a
level that my "manageable" chunks of data are approaching the GB range.
I guess it's up to market analysis to decide when "enough" people have
reached the point where the benefits of a larger address space are worth
the cost.  This will of course depend on the good work of you designers
and engineers.  It's a balancing act.

George Seibel, UCSF
seibel@cgl.ucsf.edu