Path: utzoo!utgpu!watserv1!watmath!uunet!clyde.concordia.ca!ccu.umanitoba.ca!frist
From: frist@ccu.umanitoba.ca
Newsgroups: bionet.molbio.genbank
Subject: Re: GenBank Release 64 was incomplete
Message-ID: <1990Aug25.220335.25701@ccu.umanitoba.ca>
Date: 25 Aug 90 22:03:35 GMT
Organization: University of Manitoba, Winnipeg, Canada
Lines: 87

In article of bionet.molbio.genbank Dave Benton (benton@karyon.bio.net) writes:
>After checking the GenBank release 64 files which were on-line in the
>genbank.bio.net ftp directory (~ftp/pub/db/gb-rel64) from early July
>to mid-August, I can say unequivocally that GenBank Release 64.0 was
>*not* incomplete, either as it appeared in that directory or as
>distributed on mag tape. In the course of preparing floppy disk-
>format files from those GenBank data files, we discovered a systematic
>error in the way certain feature locations were formatted in the
>files. This error affected about 1250 of the 185,079 features in
>Release 64.0. I, therefore, applied a global correction to the files
>and replaced the files in the ftp directory with the corrected files.

First, I just wanted to make absolutely sure of the meaning of the
posting. Is it correct to say that the location formatting error you
spoke of affected Release 64.0 on ALL media released from early July
to mid Aug, and not just the floppy disk version? Specifically, would
this error be reflected in SUN tar tapes dated Jun 1990?

Also, a quick browse through Release 64.0 indicates another systematic
error. In all entries that I have looked at in which a gene is divided
up into several exons, the feature key 'mRNA' is used to denote what
should be 'prim_transcript'. (An example is shown below.) In fact, I
have not yet found an example of a mature (ie. spliced using join())
mRNA in any entries that I have examined. Admittedly, I have not had a
chance to really do a thorough search.

While we're on the subject, why do some entries have mRNA and CDS, and
others just CDS?
For example, many cDNAs have both features, which have identical
locations, whereas others have only the CDS.

Example:

LOCUS       CHKACACB     5462 bp ds-DNA             VRT       04-AUG-1986
DEFINITION  Chicken cardiac alpha-actin gene, clone lambda-AC7, complete cds.
ACCESSION   X02212 K02256
KEYWORDS    actin; alpha-actin; alpha-cardiac actin.
SOURCE      Chicken genomic DNA, clone lambda-AC7 [1],[2].
  ORGANISM  Gallus gallus
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Aves;
            Neornithes; Neognathae; Galliformes; Phasianidae; Gallus gallus.
REFERENCE   1  (bases 841 to 897)
  AUTHORS   Chang,K.S., Zimmer,W.E.Jr., Bergsma,D.J., Dodgson,J.B. and
            Schwartz,R.J.
  TITLE     Isolation and characterization of six different chicken actin
            genes
  JOURNAL   Mol. Cell. Biol. 4, 2498-2508 (1984)
  STANDARD  full staff_review
REFERENCE   2  (bases 1 to 5462)
  AUTHORS   Chang,K.S., Rothblum,K.N. and Schwartz,R.J.
  TITLE     The complete sequence of the chicken alpha-cardiac actin gene: A
            highly conserved vertebrate gene
  JOURNAL   Nucleic Acids Res. 13, 1223-1237 (1985)
  STANDARD  full staff_review
COMMENT     [1] also sequenced part of the 3' terminal fragment.
FEATURES             Location/Qualifiers
     mRNA            299..5075
                     /note="actin mRNA"
     intron          339..820
                     /note="actin mRNA intron A"
     intron          970..1801
                     /note="actin cds intron B"
     intron          2127..2751
                     /note="actin cds intron C"
     intron          2914..3025
                     /note="actin cds intron D"
     intron          3218..4215
                     /note="actin cds intron E"
     intron          4398..4756
                     /note="actin cds intron F"
     CDS             join(841..969,1802..2126,
                     2752..2913,3026..3217,
                     4216..4397,4757..4900)
                     /note="cardiac alpha-actin"
BASE COUNT     1376 a   1280 c   1179 g   1627 t
ORIGIN      3 bp upstream of SmaI site.

===============================================================================
Brian Fristensky
frist@ccu.umanitoba.ca
Assistant Professor
Dept. of Plant Science
University of Manitoba
Winnipeg, MB R3T 2N2 CANADA
Office phone: 204-474-6085
FAX: 204-275-5128
===============================================================================