Path: utzoo!utgpu!watserv1!watmath!uunet!samsung!brutus.cs.uiuc.edu!apple!genbank!NET.BIO.NET!kristoff
From: kristoff@NET.BIO.NET (Dave Kristofferson)
Newsgroups: bionet.general
Subject: Re: new seqs
Message-ID:
Date: 2 Dec 89 14:20:29 GMT
Sender: daemon@genbank.BIO.NET
Lines: 53

> Much thanks for making available weekly updates via anonymous ftp.
>
> Unfortunately, I had some problems with two of the compressed files,
> gb1030.seq.Z and gb1120.seq.Z, which were downloaded late Wednesday (Nov 22)
> afternoon. First, we noticed that the compressed versions are older than
> their uncompressed counterparts, which may mean nothing. However, after
> uncompressing I discovered the following:
>
> gb1030.seq:
>     YSCAR01 ended in the middle of the TITLE line
>     M27235 ended in the middle of the sequence
>     a few LOCUS lines were missing
> gb1120.seq:
>     the line "gb-newdata@genbank.ig.com" occurred 8 times before the
>       LOCUS line of various entries.
>
> Coincidentally, these are the only two gb files we downloaded in compressed
> form. All the other files seem to be fine.
>
> I noticed a couple of other things in gb1030.seq, which may be features,
> not bugs, but I thought you might find them interesting:
>
>     entries occur multiple times (e.g., there are 4 entries for M27235,
>       not counting the truncated entry)
>     there are both unannotated and annotated versions of some sequences
>       (e.g., M27891/HUMCYS3A3, M28061/PMCTHYSY)
>
> I'd greatly appreciate learning whether your versions have these problems.
> If not, I'll figure out what we're doing wrong on our end. Much thanks for
> any insights you can provide.
>
> Colin Watanabe
>

Sorry for the delay in responding as I have been on vacation the last
week. I will refer the problem to our systems programmer in charge of
this and he'll look into it. Thanks for taking the time to bring these
problems to our attention.

I should warn you and the rest of our users, though, that the new
entries do not normally go through all of the more rigorous quality
control procedures that are performed on major releases, simply due to
lack of available time. We are working on improving error detection,
but there are a number of other issues to be resolved, such as handling
corrected entries sent to us from LANL and EMBL. Unfortunately, we
still have not achieved the "100% instant quality" goal.

				Sincerely,

				Dave Kristofferson
				GenBank On-line Service Manager

				kristoff@net.bio.net