Path: utzoo!attcan!utgpu!watserv1!watmath!uunet!genbank!apple!brutus.cs.uiuc.edu!psuvax1!rutgers!phri!roy
From: roy@phri.nyu.edu (Roy Smith)
Newsgroups: bionet.molbio.genbank
Subject: Re: Distributing GenBank over the Internet
Message-ID: <1989Dec15.004724.16304@phri.nyu.edu>
Date: 15 Dec 89 00:47:24 GMT
References: <1989Dec7.213027.8591@phri.nyu.edu> <573@mcclb0.med.nyu.edu>
Sender: news@phri.nyu.edu (News System)
Reply-To: roy@alanine.UUCP (Roy Smith)
Organization: Public Health Research Institute, NYC
Lines: 62

In <573@mcclb0.med.nyu.edu> smith@mcclb0.med.nyu.edu (Ross Smith) writes:
> We are at the end of a slow link and FTPing all or most of a GENBANK
> release, apart from the system manager time taken up, would be horribly
> slow.

Why should it take any system manager time? To the contrary, I envision
one of the prime advantages of network access vs. physical media being
the savings in human time and effort. I don't know about the braindead
VMS system you run [note to outsiders: Ross works about 1/2 kilometer
from me and we often abuse each other about our respective tastes in
operating systems] but under Unix it would be simple to set up a totally
automated system whereby each night the genbank ftp server was polled
for new files, those files downloaded, and our local customizations
done, perhaps ending in mailing a note to all interested parties saying
that a new update was installed. With tape, no matter how much I
automate the installation process, somebody still has to open the box
and mount the reel of tape on the drive. Not to mention the labor saved
at the other end in making and mailing all those tapes.

By way of analogy, many people use news (and not even NNTP, just plain
old b-news over dialup 2400 bps uucp) to distribute the uucp map files
and process the raw data into a uucp path database totally without human
intervention. I think the current size of the uucp map database is about
4 Mbytes.
An order of magnitude smaller than genbank, but still a substantial
amount of data.

I'm also not sure why it should be slow. I regularly get about
4 kbytes/sec to anywhere on NSFNet. At that rate, even the absurdly
large 10/30 update should only take about 15 minutes in its compressed
form. Remember, the idea is to just download the changes, not the whole
database.

Of course, reality is determined to show me up; I just got gb1030.seq.Z
(3.7 Mbytes) from genbank.bio.net using ftp and only got 2.2 kbytes/sec.
But then again, it's 19:00 here, so it's 16:00 in California, still
prime time. My guess is that if I tried it again in 6 or 8 hours I'd get
double the throughput. It also looks like the NYSERNet gateway is down
and we're going via JVNC; I don't know how much, if any, that hurts
performance. Besides, when our 56 kbps link is replaced by fiber...

> I think that the increasing size of the bank is a strong argument for
> KEEPING the tape distributions not abandoning them since these tapes
> serve as complete 'backup' of the bank each quarter.

Why do you need (or want) a backup? I recycle the genbank tapes for
other uses after a while. Should anything ever happen and you lose your
on-line database, you can just ftp another copy from the source.
Certainly, the data is so precious that the Keepers Of The Sacred
Knowledge should be making backups of the master copy of the database,
and storing duplicate tapes in fireproof vaults (they do, don't they?),
but subscribers like us can always just go back to the source when that
rare disaster strikes.

> But I think it is a big mistake to think that the quarterly distribution
> of Genbank via tape (1600/6250) can or should be abandoned any time soon.

I probably agree with you that it can't be eliminated, but not that it
shouldn't. Certainly, anybody who has decent Internet connectivity
should be getting it over the net. And those who don't should be
scheming to find a way to get that connectivity. Down with physical
media!
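
The nightly poll and the transfer-time arithmetic discussed above can be
sketched in Python. This is only an illustration under assumptions, not
anything the post specifies: the function names are made up, the host
and directory are supplied by the caller, the server is assumed to allow
anonymous FTP and to return plain file names from a listing, and the
scheduling (e.g. from cron), local customizations, and mail notification
are left out.

```python
from ftplib import FTP
from pathlib import Path


def transfer_minutes(size_bytes: float, bytes_per_sec: float) -> float:
    """Estimated transfer time in minutes at a steady throughput."""
    return size_bytes / bytes_per_sec / 60.0


def new_files(remote_names, local_dir: Path):
    """Names offered by the server that are not yet in local_dir."""
    have = {p.name for p in local_dir.iterdir()} if local_dir.is_dir() else set()
    return sorted(n for n in remote_names if n not in have)


def poll_once(host: str, remote_dir: str, local_dir: Path):
    """Download every file we do not already have; return the names fetched.

    Hypothetical helper: assumes anonymous FTP and that nlst() lists
    only regular files in remote_dir.
    """
    local_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    with FTP(host) as ftp:
        ftp.login()  # no arguments means anonymous login
        ftp.cwd(remote_dir)
        for name in new_files(ftp.nlst(), local_dir):
            tmp = local_dir / (name + ".part")
            with open(tmp, "wb") as out:
                ftp.retrbinary("RETR " + name, out.write)
            # Rename only after a complete download, so an interrupted
            # transfer is retried on the next poll.
            tmp.rename(local_dir / name)
            fetched.append(name)
    return fetched
```

With the figures quoted above, and taking 1 Mbyte as 10^6 bytes,
transfer_minutes(3.7e6, 4e3) comes to roughly 15 minutes and
transfer_minutes(3.7e6, 2.2e3) to roughly 28.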
--
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
roy@alanine.phri.nyu.edu -OR- {att,philabs,cmcl2,rutgers,hombre}!phri!roy
"My karma ran over my dogma"