Path: utzoo!attcan!uunet!bionet!life!jeh
From: jeh@life (Jamie Hayden)
Newsgroups: bionet.molbio.genbank.updates
Subject: (none)
Message-ID: <9003151805.AA03699@life.LANL.GOV>
Date: 15 Mar 90 18:05:03 GMT
Sender: daemon@genbank.BIO.NET
Lines: 1963

LOCUS       ALELTR2       346 bp ss-RNA             VRL       16-DEC-1985
DEFINITION  Rous-associated virus 2 (RAV-2) LTR with reverse transcriptase
            endonuclease cleavage sites.
ACCESSION   K00993
KEYWORDS    cleavage site; endonuclease cleavage site; long terminal repeat;
            reverse transcriptase endonuclease cleavage site.
SOURCE      Rous-associated virus 2 replication form I DNA, clone RAV2-2 [1],
            and plasmids pPG1 and pGJ14 [2].
  ORGANISM  Rous associated virus type 2
            Viridae; ss-RNA enveloped viruses; Positive strand RNA virus;
            Retroviridae; Oncovirinae; Type C oncovirus group; Avian leukosis
            viruses.
REFERENCE   1  (bases 1 to 346)
  AUTHORS   Duyk,G., Leis,J., Longiaru,M. and Skalka,A.M.
  TITLE     Selective cleavage in the avian retroviral long terminal repeat
            sequence by the endonuclease associated with the alpha-beta form
            of avian reverse transcriptase
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 80, 6745-6749 (1983)
  STANDARD  full staff_review
REFERENCE   2  (bases 1 to 346)
  AUTHORS   Skalka,A.M., Duyk,G., Longiaru,M., DeHaseth,P., Terry,R. and
            Leis,J.
  TITLE     Integrative recombination -- a role for the retroviral reverse
            transcriptase
  JOURNAL   Cold Spring Harb. Symp. Quant. Biol. 49, 651-659 (1984)
  STANDARD  full staff_review
COMMENT     Reverse transcriptase associated endonuclease (purified from avian
            sarcoma virus) cleavage sites have been mapped in two tandemly
            linked Rous-associated virus-2 LTR sequences. The enzyme may be
            involved in viral cDNA integration in the host, since it generates
            a 6 bp staggered overlap that spans the junction. The clone
            sequence (RAV2-2) corresponds to the unintegrated replicative form
            (RF) I of RAV-2.
            Draft entry for [2] kindly provided by A. Skalka, 15-AUG-1985.
FEATURES       from  to/span    description
    LTR     <     1    154      ltr A
    LTR         155 >  346      ltr B
    cutss       105    104 (c)  rt-endonuclease cleavage site
    revision    131    131      a in [2]; c in [1]
    cutss       130    131      rt-endonuclease secondary cleavage site
    cutss       151    152      rt-endonuclease primary cleavage site
    cutss       158    157 (c)  rt-endonuclease primary cleavage site
    cutss       214    215      rt-endonuclease secondary cleavage site
    cutss       223    222 (c)  rt-endonuclease secondary cleavage site
    cutss       245    244 (c)  rt-endonuclease secondary cleavage site
    cutss       304    305      rt-endonuclease secondary cleavage site
BASE COUNT       96 a     69 c     89 g     92 t
ORIGIN      EcoRI site.
        1 aattccgcat tgcagagata ttgtatttaa gtgcctagct cgatacaata aacgccattt
       61 gaccattcac cacattggtg tgcacctggg ttgatggtcg gaccgttgat tccctgacga
      121 ctacgagcac atgcatgaag cagaaggctt cattaatgta gtcttatgca atactcttgt
      181 agtcttgcaa catgcttatg taacgatgag ttagcaacat gccttataag gagagaaaaa
      241 gcaccgtgca tgccgattgg tgggagtaag gtggtatgat cgtggtatga tcgtgccttg
      301 ttaggaaggc aacagacggg tctaacacgg attggacgaa ccactg
//
LOCUS       CHKMHT1       616 bp ds-DNA             VRT       04-AUG-1986
DEFINITION  Chicken cellular mht gene (analogue to v-mht oncogene), exon 2.
ACCESSION   K03047
KEYWORDS    mht oncogene; oncogene; proto-oncogene.
SEGMENT     1 of 2
SOURCE      Chicken fetal liver DNA (library of B.Paterson), clone
            lambda-c-mht1.
  ORGANISM  Gallus gallus
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Aves;
            Neornithes; Neognathae; Galliformes; Phasianidae.
REFERENCE   1  (bases 1 to 616)
  AUTHORS   Flordellis,C.S., Kan,N.C., Lautenberger,J.A., Samuel,K.P.,
            Garon,C.F. and Papas,T.S.
  TITLE     Analysis of the cellular proto-oncogene mht/raf: Relationship to
            the 5' sequences of v-mht in avian carcinoma virus MH2 and v-raf
            in murine sarcoma virus 3611
  JOURNAL   Virology 141, 267-274 (1985)
  STANDARD  full staff_review
COMMENT     The c-mht mRNA transcript is 4.0 kb in length, and can be detected
            with a probe made of DNA sequence 5' to the sequence reported
            here. Therefore, it is probable that an exon or exons exist 5' to
            the first exon reported here (exon 2).
Using v-mht as a probe two larger RNAs of 5.5 kb and >10 kb were detected. These are believed to be precursors to the 4.0 kb mRNA. FEATURES from to/span description pept / 387 + 533 mht protein, exon 2 (AA at 387) IVS < 1 386 mht intron A IVS 534 > 616 mht intron B BASE COUNT 179 a 113 c 120 g 204 t ORIGIN 111 bp upstream of DdeI site. 1 agcttattaa attaatagcc tgaaatgaga ggtttaaatg taactgcctg aattaattaa 61 tccaacagag ccagtaattt atcaatgtat taaatgtgat ttaatgggga ctgaggtgaa 121 ctgtggcaca tttcataatg gcagaagtag acttaaacgt gttagatctg tgtgccaagt 181 ttttaaagtt ttttcttaag ttaaactgta aatggtttgg ctgaccaaat agtttagtta 241 acatgtaaat ctcaacattg cccatgtttg aaatttcgtg tatattactg cttttcgtag 301 attacactga ggagtatttt ttttgactgc tagagggggg ggaaaaataa ccacaagttt 361 ctctctcttt tcttcttagc tcccagcaca ggtattctac acctcatgtc tttacattca 421 acacatcaaa tccttcctct gagggcaccc tttcccaaag acagcgatct acatccacac 481 caaatgtcca catggttagc actacaatgc cagtagacag ccggataatt gaggtaatat 541 tggtgaggat ggatcaccat tcagttacgc tgaatttgtg actggctatc tgatagtcac 601 tggttggtgt tttttc // LOCUS CHKMHT2 311 bp ds-DNA VRT 02-MAY-1986 DEFINITION Chicken cellular mht gene (homologue of v-mht oncogene), exons 3 and 4. ACCESSION K03048 KEYWORDS mht oncogene; oncogene; proto-oncogene. SEGMENT 2 of 2 SOURCE Chicken fetal liver DNA (library of B. Paterson), clone lambda-c-mht1. ORGANISM Gallus gallus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Aves; Neornithes; Neognathae; Galliformes; Phasianidae. REFERENCE 1 (bases 1 to 311) AUTHORS Flordellis,C.S., Kan,N.C., Lautenberger,J.A., Samuel,K.P., Garon,C.F. and Papas,T.S. TITLE Analysis of the cellular proto-oncogene mht/raf: Relationship to the 5' sequences of v-mht in avian carcinoma virus MH2 and v-raf in murine sarcoma virus 3611 JOURNAL Virology 141, 267-274 (1985) STANDARD full staff_review COMMENT See comment in segment 1. 
FEATURES from to/span description pept + 39 66 mht protein, exon 3 157 / 284 mht protein, exon 4 IVS < 1 38 mht intron B IVS 67 156 mht intron C IVS 285 > 311 mht intron D BASE COUNT 80 a 66 c 65 g 99 t 1 others ORIGIN About 3.4 kb after segment 1. 1 atacacttta ttaactgttg gtattttctc tgttttagga tgcaattcga aaccatagtg 61 aatcaggtat ggcttccggg ggtagaagta tgataatcct cttttgtgtt tattttatgg 121 tttgtnactt ctttttcttc tgttttcaca ttgaagcttc accctccgct ctgtctggga 181 gtcctaacaa tatgagcccg actggctggt ctcagcccaa aacgccagtc ccagcccaga 241 gggagagagc ccccggaacg aatacacagg agaaaaataa aattgtaagt attttcaaac 301 cactgtatgt g // LOCUS COBHSN 6361 bp ss-RNA VRL 15-MAR-1990 DEFINITION Bovine coronavirus hemagglutinin-esterase (HE), spike protein, and 4.9, 4.8, 12.7, and 9.5 kDa nonstructural protein RNAs, complete cds, and integral membrane protein, 5' end. ACCESSION M30612 M30613 M30614 KEYWORDS hemagglutinin-esterase protein; integral membrane protein; nonstructural protein; spike protein. SOURCE Bovine coronavirus (strain Mebus), cDNA to viral RNA. ORGANISM Bovine coronavirus Viridae; ss-RNA enveloped viruses; Positive strand RNA virus; Coronaviridae. REFERENCE 1 (bases 1 to 1307) AUTHORS Kienzle,T.E., Abraham,S., Hogue,B.G. and Brian,D.A. TITLE Structure and orientation of expressed bovine coronavirus hemagglutinin-esterase protein JOURNAL Unpublished (1989) Univ. TN, Dept. of Microbiology, Knoxville, TN. STANDARD full staff_review REFERENCE 2 (bases 1248 to 5396) AUTHORS Abraham,S., Kienzle,T.E., Lapps,W. and Brian,D.A. TITLE Deduced amino acid sequence of bovine coronavirus spike protein and identification of the internal proteolytic cleavage site JOURNAL Unpublished (1989) Univ. TN, Dept. of Microbiology, Knoxville, TN. STANDARD full staff_review REFERENCE 3 (bases 5063 to 6361) AUTHORS Abraham,S., Kienzle,T.E., Lapps,W. and Brian,D.A. 
TITLE Sequence and expression analysis of potential nonstructural proteins of 4.9, 4.8, 12.7 and 9.5 kilodaltons encoded between the spike and integral membrane protein genes of the bovine coronavirus JOURNAL Unpublished (1989) Univ. TN, Dept. of Microbiology, Knoxville, TN. STANDARD full staff_review COMMENT Draft entry and computer-readable sequence for [1,2 and 3] kindly submitted by D.A.Brian, 11-DEC-1989. FEATURES from to/span description pept 16 1290 hemagglutinin-esterase protein precursor sigp 16 69 hemagglutinin-esterase protein signal peptide matp 70 1287 hemagglutinin-esterase protein pept 1305 5396 spike protein precursor sigp 1305 1355 spike protein signal matp 1356 5393 spike protein pept 5386 5517 nonstructural protein 4.9 kD pept 5553 5690 nonstructural protein 4.8 kD pept 5774 6103 nonstructural protein 12.7 kD pept 6090 6344 nonstructural protein 9.5 kD pept 6359 > 6361 integral membrane protein ORF 107 517 ORF 1 ORF 977 1228 ORF 2 mRNA 5063 > 5517 4.9 kD protein mRNA mRNA 5693 > 6103 12.7 kD protein mRNA mRNA 5961 > 6344 9.5 kD protein mRNA site 2055 2056 internal proteolytic cleavage site BASE COUNT 1742 a 1028 c 1238 g 2353 t ORIGIN 1 ctaaactcag tgaaaatgtt tttgcttctt agatttgttc tagttagctg cataattggt 61 agcctaggtt ttgataaccc tcctaccaat gttgtttcgc atttaaatgg agattggttt 121 ttatttggtg acagtcgttc agattgtaat catgttgtta ataccaaccc ccgtaattat 181 tcttatatgg accttaatcc tgccctgtgt gattctggta aaatatcatc taaagctggc 241 aactccattt ttaggagttt tcactttacc gatttttata attacacagg cgaaggtcaa 301 caaattattt tttatgaggg tgttaatttt acgccttatc atgcctttaa atgcaccact 361 tctggtagta atgatatttg gatgcagaat aaaggcttgt tttacactca ggtttataag 421 aatatggctg tgtatcgcag ccttactttt gttaatgtac catatgttta taatggctct 481 gcacaatcta cagctctttg taaatctggt agtttagttc ttaataaccc tgcatatata 541 gctcgtgaag ctaattttgg ggattattat tataaggttg aagctgactt ttatttgtca 601 ggttgtgacg agtatatcgt accactttgt atttttaacg gcaagttttt gtcgaataca 661 aagtattatg atgatagtca atattatttt aataaagaca ctggtgttat 
ttatggtctc 721 aattctactg aaaccattac cactggtttt gattttaatt gtcattattt agttttaccc 781 tctggtaatt atttagccat ttcaaatgag ctattgttaa ctgttcctac gaaagcaatc 841 tgtcttaaca agcgtaagga ttttacgcct gtacaggttg ttgattcacg gtggaacaat 901 gccaggcagt ctgataacat gacggcggtt gcttgtcaac ccccgtactg ttattttcgt 961 aattctacta ccaactatgt tggtgtttat gatatcaatc atggggatgc tggttttact 1021 agcatactca gtggtttgtt atatgattca ccttgttttt cgcagcaagg tgtttttagg 1081 tatgataatg ttagcagtgt ctggcctctc tattcctatg gcagatgccc tactgctgct 1141 gatattaata cccctgatgt acctatttgt gtgtatgatc cgctaccact tattttgctt 1201 ggcatccttt tgggtgttgc ggtcataatt attgtagttt tgttgttata ttttatggtg 1261 gataatggta ctaggctgca tgatgcttag accataatct aaacatgttt ttgatacttt 1321 taatttcctt accaatggct tttgctgtta taggagattt aaagtgtact acggtttcca 1381 ttaatgatgt tgacaccggt gctccttcta ttagcactga tattgtcgat gttactaatg 1441 gtttaggtac ttattatgtt ttagatcgtg tgtatttaaa tactacgttg ttgcttaatg 1501 gttactaccc tacttcaggt tctacatatc gtaatatggc actgaaggga actttactat 1561 tgagcagact atggtttaaa ccaccttttc tttctgattt tattaatggt atttttgcta 1621 aggtcaaaaa taccaaggtt attaaaaagg gtgtaatgta tagtgagttt cctgctataa 1681 ctataggtag tacttttgta aatacatcct atagtgtggt agtacaacca catactacca 1741 atttggataa taaattacaa ggtctcttag agatctctgt ttgccagtat actatgtgcg 1801 agtacccaca tacgatttgt catcctaatc tgggtaataa acgcgtagaa ctatggcatt 1861 gggatacagg tgttgtttcc tgtttatata agcgtaattt cacatatgat gtgaatgctg 1921 attacttgta tttccatttt tatcaagaag gtggtacttt ttatgcatat tttacagaca 1981 ctggtgttgt tactaagttt ctgtttaatg tttatttagg cacggtgctt tcacattatt 2041 atgtcctgcc tttgacttgt tctagtgcta tgactttaga atattgggtt acacctctca 2101 cttctaaaca atatttacta gctttcaatc aagatggtgt tatttttaat gctgttgatt 2161 gtaagagtga ttttatgagt gagattaagt gtaaaacact atctatagca ccatctactg 2221 gtgtttatga attaaacggt tacactgttc agccaattgc agatgtttac cgacgtatac 2281 ctaatcttcc cgattgtaat atagaggctt ggcttaatga taagtcggtg ccctctccat 2341 taaattggga acgtaagacc ttttcaaatt gtaattttaa tatgagcagc ctgatgtctt 2401 
ttattcaggc agactcattt acttgtaata atattgatgc tgctaagata tatggtatgt 2461 gtttttccag cataactata gataagtttg ctatacccaa tggtaggaag gttgacctac 2521 aattgggcaa tttgggctat ttgcagtctt ttaactatag aattgatact actgctacaa 2581 gttgtcagtt gtattataat ttacctgctg ctaatgtttc tgttagcagg tttaatcctt 2641 ctacttggaa taggagattt ggttttacag aacaatttgt ttttaagcct caacctgtag 2701 gtgtttttac tcatcatgat gttgtttatg cacaacattg ttttaaagct ccctcaaatt 2761 tctgtccgtg taaattggat gggtctttgt gtgtaggtaa tggtcctggt atagatgctg 2821 gttataaaaa tagtggtata ggcacttgtc ctgcaggtac taattattta acttgccata 2881 atgctgccca atgtaattgt ttgtgcactc ccgaccccat tacatctaaa tctacagggc 2941 cttacaagtg cccccaaact aaatacttag ttggcatagg tgagcactgt tcgggtcttg 3001 ctattaaaag tgattattgt ggaggtaatc cttgtacttg ccaaccacaa gcatttttgg 3061 gctggtctgt tgactcttgt ttacaagggg ataggtgtaa tatttttgct aattttattt 3121 tgcatgatgt taatagtggt actacttgtt ctactgattt acaaaaatca aacacagaca 3181 taattcttgg tgtttgtgtt aattatgatc tttatggtat tacaggccaa ggtatttttg 3241 ttgaggttaa tgcgacttat tataatagtt ggcagaacct tttatatgat tctaatggta 3301 atctctatgg ttttagagac tacttaacaa acagaacttt tatgattcgt agttgctata 3361 gcggtcgtgt ttcagcggcc tttcatgcta actcttccga accagcattg ctatttcgga 3421 atattaaatg caattacgtt tttaataata ctctttcacg acagctgcaa cctattaact 3481 attttgatag ttatcttggt tgtgttgtca atgctgataa tagtacttct agtgttgttc 3541 aaacatgtga tctcacagta ggtagtggtt actgtgtgga ttactctaca aaaagacgaa 3601 gtcgtagagc gattaccact ggttatcggt ttactacttt tgagccattt actgttaatt 3661 cagtaaatga tagtttagaa cctgtaggtg gtttgtatga aattcaaata ccttcagagt 3721 ttactatagg taatatggag gagtttattc aaacaagctc tcctaaagtt actattgatt 3781 gttctgcttt tgtctgtggt gattatgcag catgtaaatc acagttggtt gaatatggta 3841 gcttctgtga caatattaat gctatactca cagaagtaaa tgaactactt gacactacac 3901 agttgcaagt agctaatagt ttaatgaatg gtgtcactct tagcactaag cttaaagatg 3961 gcgttaattt caatgtagac gacatcaatt tttcccctgt attaggttgt ttaggaagcg 4021 attgtaataa agtttccagc agatctgcta tagaggattt acttttttct aaagtaaagt 4081 tatctgatgt 
cggtttcgtt gaggcttata ataattgtac tggaggtgcc gaaattaggg 4141 acctcatttg tgtgcaaagt tataatggta tcaaagtgtt gcctccactg ctctcagtaa 4201 atcagatcag tggatacact ttggctgcca cctctgctag tctgtttcct cctttgtcag 4261 cagcagtagg tgtaccattt tatttaaatg ttcagtatcg tattaatggg attggtgtta 4321 ccatggatgt gttaagtcaa aatcaaaagc ttattgctaa tgcatttaac aatgctcttg 4381 atgctattca ggaagggttt gatgctacca attctgcttt agttaaaatt caagctgttg 4441 ttaatgcaaa tgctgaagct cttaataact tattgcaaca actctctaat agatttggtg 4501 ctataagttc ttctttacaa gaaattctat ctagactgga tgctcttgaa gcgcaagctc 4561 agatagacag acttattaat gggcgtctta ccgctcttaa tgtttatgtt tctcaacagc 4621 ttagtgattc tacactagta aaatttagtg cagcacaagc tatggagaag gttaatgaat 4681 gtgtcaaaag ccaatcatct aggataaatt tttgtggtaa tggtaatcat attatatcat 4741 tagtgcagaa tgctccatat ggtttgtatt ttatccactt tagctatgtc cctactaagt 4801 atgtcactgc gaaggttagt cccggtctgt gcattgctgg tgatagaggt atagccccta 4861 agagtggtta ttttgttaat gtaaataata cttggatgtt cactggtagt ggttattact 4921 accctgaacc cataactgga aataatgttg ttgttatgag tacctgtgct gttaactata 4981 ctaaagcgcc ggatgtaatg ctgaacattt caacacccaa cctccatgat tttaaggaag 5041 agttggatca atggtttaaa aaccaaacat cagtggcacc agatttgtca cttgattata 5101 taaatgttac attcttggac ctacaagatg aaatgaatag gttacaggag gcaataaaag 5161 ttttaaatca gagctacatc aatctcaagg acattggtac atatgagtat tatgtaaaat 5221 ggccttggta tgtatggctt ttaattggct ttgctggtgt agctatgctt gttttactat 5281 tcttcatatg ctgttgtaca ggatgtggga ctagttgttt taagatatgt ggtggttgtt 5341 gtgatgatta tactggacac caggagttag taattaaaac atcacatgac gactaagttc 5401 gtctttgatt tattggctcc tgacgatata ttacatccct tcaatcatgt gaagctaatt 5461 ataagaccca ttgaggtcga gcatattata atagctacca caatgcctgc tgtttagtgg 5521 gtactgtgtc ttatataact agtaaacctg taatgccaat ggctacaacc attgacggta 5581 cagattatac taatattatg cctagtactg tttctacaac agtttattta ggctgttcta 5641 taggtattga cactagcacc actggtttta cctgtttttc acggtactag ttccaaacca 5701 tattataatt taggtagacc ttataacttt aagcattatt aattgccaaa gtttctaagg 5761 tcacgcccta gtaatggaca 
tctggagacc tgagattaaa tatctccgtt atactaacgg 5821 ttttaatgtc tcagaattag aagatgcttg ttttaaattt aactataaat ttcctaaagt 5881 aggatattgt agagttccta gtcatgcttg gtgccgtaat caaggtagct tttgtgctac 5941 actcactctt tatggcaaat ccaaacatta tgataaatat tttggagtaa taactggttt 6001 tacagcattc gctaatactg tagaggaggc tgttaacaaa ctggttttct tagctgttga 6061 ctttattact tggcggagac aggagttaaa tgtttatggc tgatgcttat tttgcagaca 6121 ctgtgtggta tgtggggcaa ataattttta tagttgccat ttgtttattg gttataatag 6181 ttgtagtggc atttttggca acttttaaat tgtgtattca actttgcggt atgtgtaata 6241 ccttaggact gtccccttct atttatgtgt ttaatagagg taggcagttt tatgagtttt 6301 acaacgatgt aaaaccacca gttcttgatg tggatgacgt ttagttaatc caaacattat 6361 g // LOCUS DAREXTA 399 bp ss-mRNA PLN 15-MAR-1990 DEFINITION Carrot (D.carota) extensin mRNA, partial cds. ACCESSION M11221 KEYWORDS alternate splicing; extensin; extracellular matrix protein; glycoprotein. SOURCE Carrot (D.carota) root, wounded, cDNA to mRNA, clone pDC11. ORGANISM Daucus carota Eukaryota; Plantae; Embryobionta; Magnoliophyta; Magnoliopsida; Rosidae; Apiales; Apiaceae. REFERENCE 1 (bases 1 to 399) AUTHORS Chen,J. and Varner,J.E. TITLE Isolation and characterization of cDNA clones for carrot extensin and a proline-rich 33-kDa protein JOURNAL Proc. Natl. Acad. Sci. U.S.A. 82, 4399-4403 (1985) STANDARD full staff_review FEATURES from to/span description pept < 1 134 extensin (AA at 3) mRNA < 1 152 extensin mRNA (alt.) mRNA < 1 399 extensin mRNA (alt.) BASE COUNT 120 a 96 c 61 g 122 t ORIGIN Unreported. 
1 caccaccaac accagtttac aagtacaagt ctccgccacc accaatgcac tctcccccac 61 caccagttta ctctccacca ccacccaaac atcactactc ctatacgtca cctcctcctc 121 ctcaccacta ctaataaaaa ctctccctaa aggacactga tggaggccaa cttaagaaga 181 tgaagtaaaa taatggcttg cacgatgatt ttggcgtttt tataaatatg ttgatcgaat 241 atgatatttt tgttgtattc tagtatgggt catttgcttt ttgtttgcga atgaataaca 301 tttacatatg catgtagcat cggggaacct cttgacctct agaaatagag gtttgttatt 361 gtggcattaa agcaattata tgaataagta tttctgtta // LOCUS DRORGMA51 69 bp ds-DNA INV 07-NOV-1984 DEFINITION d.melanogaster 28s rrna gene, clone dmra51. ACCESSION K01579 KEYWORDS 28S ribosomal RNA; ribosomal RNA; transposon. SOURCE drosophila melanogaster dna, clone dmra51. ORGANISM Drosophila melanogaster Eukaryota; Animalia; Metazoa; Arthropoda; Uniramia; Insecta; Pterygota; Neoptera; Holometabola; Diptera; Brachycera; Cyclorrhapha; Schizophora; Drosophiloidea; Drosophilidae. REFERENCE 1 (bases 1 to 69) AUTHORS Rae,P.M.M. TITLE coding region deletions associated with the major form of rdna interruption in drosophila melanogaster JOURNAL Nucleic Acids Res. 9, 4997-5010 (1981) STANDARD full staff_review COMMENT [1] also sequenced additional clones at and around the termini of 5 kb type i insertions. comparisons were made with sequences of d.virilis and of the shorter type i insertions and their flanks. [1] suggests that drosophila rdna interruptions arose as a transposable element, and that divergence has included length alterations generated by unequal crossing over. dmra51 represents an interruption-free 28s rrna coding region. FEATURES from to/span description rRNA < 1 > 69 28s rrna BASE COUNT 21 a 19 c 14 g 15 t ORIGIN fnudii site. 1 cgcgcatgaa tggattaacg agattcctac tgtccctatc tactatctag cgaaaccaca 61 gccaaggga // LOCUS ECOSERB 2011 bp ds-DNA BCT 15-MAR-1990 DEFINITION E.coli phosphoserine phosphatase (serB) and smp protein genes, complete cds., and ORF, 5' end. ACCESSION X03046 M30784 KEYWORDS phosphatase; phosphoserine phosphatase; serB gene. 
SOURCE      E.coli (strain K-12) DNA, clones pGS143-[7,2].
  ORGANISM  Escherichia coli
            Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Facultatively
            anaerobic rods; Enterobacteriaceae; Escherichia.
REFERENCE   1  (bases 1 to 2011)
  AUTHORS   Neuwald,A.F. and Stauffer,G.V.
  TITLE     DNA sequence and characterization of the Escherichia coli serB
            gene
  JOURNAL   Nucleic Acids Res. 13, 7025-7039 (1985)
  STANDARD  simple staff_review
REFERENCE   2  (bases 1062 to 2011)
  AUTHORS   Neuwald,A.F. and Stauffer,G.V.
  TITLE     An Escherichia coli membrane protein with a unique signal sequence
  JOURNAL   Gene 82, 219-228 (1989)
  STANDARD  simple staff_review
COMMENT     EMBL features not translated to GenBank features:
            key         from    to      description
            TRANSCR       38    <1 (c)  primary transcript
            RBS           38    31 (c)  put. ribosome binding site
            PRM           49    45 (c)  pot. -10 region (URF)
            PRM           71    64 (c)  pot. -35 region (URF)
            PRM           61    68      pot. -35 region (serB)
            PRM           86    93      pot. -10 region
            SITE         101   101      transcription start site
            RBS          123   127      put. ribosome binding site
            INVREP      1103  1113      inverted repeat
            INVREP      1120  1129      inverted repeat
            SITE        1103  1129      pot. stem loop structure
            RBS         1137  1144      put. ribosome binding site
FEATURES       from  to/span    description
    pept       1029     61 (c)  phosphoserine phosphatase (EC 3.1.3.3)
    pept       1135   1779      Smp protein
    ORF        1807 > 2011      ORF
    mRNA       1061 <    1 (c)  PSP mRNA
    mRNA       1125 > 2011      Smp mRNA
BASE COUNT      460 a    576 c    536 g    439 t
ORIGIN      About 99.6 min on K12 map.
1 agcttttgcc acggtgtacc tcgttaatgc tgtgccgccc gcaggatggc gggcgagcaa 61 ttacttctga ttcaggctgc ctgagaggat gcagaatacc cccatcaggt cagcgtgacg 121 gatggtgact tccgcctttt cattcacttt tggcttggca tggtaggcaa tccccagccc 181 tgccgctttg atcatcggca ggtcattggc tccatcgcca atcgccacgg tctgcgccag 241 cgggatttca tactcctgcg cgaggcgagt cagagttttc gctttgtact gcgcgtctac 301 gatgtcgccg atcacattgc cggtaaattt accgtccatg atctccagtt cattggctac 361 cacggcggtc aggcgcagct tgtcgcgcag gtattcagca aagaaagtaa agccgccgga 421 ggcaatcgcc actttccagc ccagcgtttc cagcttgagc accagttgcg ttaagcctgg 481 catcagcggc agattttcac gcacctgttg cagaatattg gcgtcagcgc ctttcagcgt 541 cgccacacgg ctgcgcaggc tggcggtaaa atcgagttcg ccgcgcatcg cccgttcggt 601 tacttccgcc accatctcgc ccgttccggc cagtttggca atttcatcaa tacattcaat 661 ctggatggcg gtggagtcca tatccatcac cagcaaaccc ggcgtgcgca ggtgcgggat 721 tttccccagc ggggcgacat ccagctgcgc ttcgtgggcc aggcgtgtag cccgtgcggt 781 gagtgaacct gccagacgaa tcacctgata atcttccacg caccaggcgg caacaatcac 841 catcgccgca cccagtttgc tctggtattg ggtcagacgt tgtttatcca gcccacgacc 901 atacagcagc cagccgctac gacctgcgtg gtaatccagt ggcatcactt catcaccact 961 taatgaaaga ggcagacccg gccataaaga gacatcttca ggcaggtcgc accaggtaat 1021 gttaggcatt aaggctcctg taaaatcgtt cgaagcaggg aaaataacgc atgaggctac 1081 cttgtatcca ttgcttctgg caacattaag tctcaaattt tcaaagggtg gaagatggct 1141 cgcacaaaac tgaaattccg gctgcatcgg gcagtgattg tcctgttctg tcttgccttg 1201 ttagtggcgc tgatgcaggg agcgtcatgg tttagtcaaa accaccagcg acagcgtaat 1261 ccacagctgg aagaactggc ccgcaccctg gcgcgtcagg tgacgctgaa cgttgcaccg 1321 ctgatgcgta ccgactcacc ggatgaaaaa cgcattcagg cgatcctcga tcagttaacg 1381 gatgaaagcc gtatcctcga cgcgggtgtg tatgacgaac aaggcgatct tatcgcacgt 1441 tctggcgaaa gcgtcgaagt gcgcgaccgg ctggcgctcg acggtaaaaa agcaggcggc 1501 tattttaacc agcagattgt cgagccaatt gcgggtaaaa acggaccgct cggctatctg 1561 cgcctgacac tcgacaccca tacgctcgcc accgaagccc aacaggtgga taacaccact 1621 aacattttac gcctgatgtt gctgctctca ctggcaatcg gtgtagtgct gacccgcacg 1681 ctgctacagg gtaaacgcac 
ccgctggcag caatcgccct tcctgttaac cgccagcaaa 1741 ccggtgccgg aagaggaaga aagcgagaaa aaagagtgac ccattactac aagaaaggaa 1801 atcgttatgt ccacattacg cctgctcatc tctgactctt acgacccgtg gtttaacctg 1861 gcggtggaag agtgtatttt tcgccaaatg cccgccacgc agcgcgttct gtttctctgg 1921 cgcaatgccg acacggtagt aattggtcgc gcgcagaacc cgtggaaaga gtgtaatacc 1981 cggcggatgg aagaagataa cgtccgcctg g // LOCUS GCRRSAGAA 164 bp ds-DNA PRI 16-JUN-1986 DEFINITION Galago crassicaudatus Alu repeat type I (GAL 10). ACCESSION X00114 KEYWORDS Alu repetitive sequence; repetitive sequence. SOURCE Galago crassicaudatus DNA. ORGANISM Galago crassicaudatus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Primates; Strepsirhini; Galagidae. REFERENCE 1 (bases 1 to 164) AUTHORS Daniels,G.R., Fox,G.M., Loewensteiner,D., Schmid,C.W. and Deininger,P.L. TITLE Species-specific homogeneity of the primate Alu family of repeated DNA sequences JOURNAL Nucleic Acids Res. 11, 7579-7593 (1983) STANDARD simple staff_review FEATURES from to/span description rpt 1 164 Alu repeat BASE COUNT 54 a 28 c 41 g 41 t ORIGIN 148 bp upstream of AluI site. 1 aattttgaag tcctaggcat ggtggctcat gcatgtaatc ctagcattct gggagaccaa 61 ggtgagtaga ttgcttgagt tcaggagttt gaaaccaacc tgagcaagaa tgagaccctc 121 atctttacta aaaatagaaa aaaaaaagct gggcatggtg gtac // LOCUS HSSATR12 239 bp ds-DNA VRL 15-SEP-1989 DEFINITION Herpesvirus saimiri DNA with an "at"-rich open reading frame, clone 12. ACCESSION M18993 KEYWORDS open reading frame. SEGMENT 12 of 14 SOURCE Herpesvirus saimiri (strain 11) DNA, clone 12. ORGANISM Herpesvirus saimiri Viridae; ds-DNA enveloped viruses; Herpesviridae; Gammaherpesviridae. REFERENCE 1 (bases 1 to 239) AUTHORS Gompels,U.A., Craxton,M.A. and Honess,R.W. TITLE Conservation of gene organization in the lymphotropic herpesviruses herpesvirus saimiri and Epstein-Barr virus JOURNAL J. Virol. 
62, 757-767 (1988) STANDARD simple staff_review FEATURES from to/span description ORF > 239 < 1 (c) ORF1 (AA at 239) BASE COUNT 83 a 35 c 46 g 75 t ORIGIN About 450 bp after segment 11. 1 atggcagctg gtaagcatac tgcataagct aaaagcaata caataggaga ttttataact 61 gacgtctgaa gatatcttac agcatctaaa attctgttat tagaagcaat tagataacac 121 acttgcttta caatgtgatc ttgtggagtt gtttgcatta ggcaatgaca gaaaggtaac 181 atttttatct tatgtacaag ctgagacagt aaagctgaac ttgtgtctaa aattatgtc // LOCUS HUMA1ATR 1346 bp ss-mRNA PRI 16-JUN-1986 DEFINITION Human alpha-1-antitrypsin mRNA, complete cds. ACCESSION X01683 V00496 KEYWORDS antitrypsin. SOURCE Human liver, cDNA to mRNA. ORGANISM Homo sapiens Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae. REFERENCE 1 (bases 1 to 1346) AUTHORS Rosenberg,S., Barr,P.J., Najarian,R.C. and Hallewell,R.A. TITLE Synthesis in yeast of a functional oxidation-resistant mutant of human alpha-1-antitrypsin JOURNAL Nature 312, 77-80 (1984) STANDARD simple staff_review COMMENT [1] created expression vectors for the synthesis of normal and mutant alpha-1-antitrypsin proteins. The mutant protein was resistant to oxidative inactivation. FEATURES from to/span description pept 14 1270 alpha-1-antitrypsin precursor /nomgen="PI" /map="14q32,1" /hgml_locus_uid="LX0001X" sigp 14 85 alpha-1-antitrypsin signal peptide matp 86 1267 alpha-1-antitrypsin mRNA < 1 1346 a-1-at mRNA BASE COUNT 350 a 384 c 321 g 291 t ORIGIN 88 bp upstream of BamHI site. 
1 agggtaatcg acaatgccgt cttctgtctc gtggggcatc ctcctgctgg caggcctgtg 61 ctgcctggtc cctgtctccc tggctgagga tccccaggga gatgctgccc agaagacaga 121 tacatcccac catgatcagg atcacccaac cttcaacaag atcaccccca acctggctga 181 gttcgccttc agcctatacc gccagctggc acaccagtcc aacagcacca atatcttctt 241 ctccccagtg agcatcgcta cagcctttgc aatgctctcc ctggggacca aggctgacac 301 tcacgatgaa atcctggagg gcctgaattt caacctcacg gagattccgg aggctcagat 361 ccatgaaggc ttccaggaac tcctccatac cctcaaccag ccagacagcc agctccagct 421 gaccaccggc aatggcctgt tcctcagcga gggcctgaag ctagtggata agtttttgga 481 ggatgttaaa aagttgtacc actcagaagc cttcactgtc aacttcgggg acaccgaaga 541 ggccaagaaa cagatcaacg attacgtgga gaagggtact caagggaaaa ttgtggattt 601 ggtcaaggag cttgacagag acacagtttt tgctctggtg aattacatct tctttaaagg 661 caaatgggag agaccctttg aagtcaagga caccgaggaa gaggacttcc acgtggacca 721 ggtgaccacc gtgaaggtgc ctatgatgaa gcgtttaggc atgtttaaca tccagcactg 781 taagaagctg tccagctggg tgctgctgat gaaatacctg ggcaatgcca ccgccatctt 841 cttcctgcct gatgagggga aactacagca cctggaaaat gaactcaccc acgatatcat 901 caccaagttc ctggaaaatg aagacagaag gtctgccagc ttacatttac ccaaactgtc 961 cattactgga acctatgatc tgaagagcat cctgggtcaa ctgggcatca ctaaggtctt 1021 cagcaatggg gctgacctct ccggggtcac agaggaggca cccctgaagc tctccaaggc 1081 cgtgcataag gctgtgctga ccatcgacga gaaagggact gaagctgctg gggccatgtt 1141 tttagaggcc atacccatgt ctatcccccc cgaggtcaag ttcaacaaac cctttgtctt 1201 cttaatgatt gaacaaaata ccaagtctcc cctcttcatg ggaaaagtgg tgaatcccac 1261 ccaaaaataa ctgcctctcg ctcctcaacc cctcccctcc atccctggcc ccctccctgg 1321 atgacattaa agaagggttg agctgg // LOCUS HUMCYAR04 998 bp ds-DNA PRI 15-MAR-1990 DEFINITION Human aromatase cytochrome P-450 gene, exon 4. ACCESSION M30798 J05105 KEYWORDS aromatase cytochrome P-450. SEGMENT 4 of 10 SOURCE Human DNA. ORGANISM Homo sapiens Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae. 
REFERENCE 1 (bases 1 to 998) AUTHORS Means,G.D., Mahendroo,M.S., Corbin,C.J., Mathis,J.M., Powell,F.E., Mendelson,C.R. and Simpson,E.R. TITLE Structural analysis of the gene encoding human aromatase cytochrome P-450, the enzyme responsible for estrogen biosynthesis JOURNAL J. Biol. Chem. 264, 19385-19391 (1989) STANDARD simple staff_entry FEATURES from to/span description pept + 448 + 602 aromatase cytochrome P-450, exon 4 pre-msg < 1 > 998 CYAR mRNA and introns IVS < 1 447 CYAR intron C IVS 603 > 998 CYAR intron D BASE COUNT 266 a 209 c 222 g 301 t ORIGIN About 10 kb after segment 3. 1 aaactgaact tagtaatcca caacttgatc ttggctcact cttatgtggc actttgatag 61 ctggatgata cctggaggct gagttcttct ggctctgtgt atggagggtt cagcttctta 121 gcagcctgca gctctgactc catgtagtgg gtagtatttt ggtgaggtta gagaagaaag 181 agaagtaagg aatggtatat ggtgtgtgtg tgtgttttct atctttgtgg atagttgaaa 241 cagggagtca gtaggtagaa atgattcaac tggagaagga tgtcctcatg ctaacagaag 301 tgcttattca acccgaatac tagaaatcat gcacattgca tttggagcaa catgcatttg 361 ctaagagagc tgcctcctag tcaaaatgta ccaccaggag ttctcctgac ccttaaaaat 421 tgacaccaaa gtttcctgtc ttttcaggtc ctcaagtatg ttccacataa tgaagcacaa 481 tcattacagc tctcgattcg gcagcaaact tgggctgcag tgcatcggta tgcatgagaa 541 aggcatcata tttaacaaca atccagagct ctggaaaaca actcgaccct tctttatgaa 601 aggtaagcag gtacttagtt agctacaatc ttcttttttg tctatgaatg tgcctttttt 661 gaaatcatat ttttaaaata ttttatttat ttatttattt atttatttat ttattgagac 721 aggctctgac tctatcaccc aggcaggagt gaccttggct cacgaccttg gctcactgta 781 acctccgcct cccaggactc aggcgattct cccacctcag cctcgcgagt agctgggact 841 acagaatgtg caccaccaat gcctggacaa atttttgtat tttttgtagc gatggggttt 901 cgccatattt gagaccaggc tgggctcaaa cacacggagt caaacgattg acctgcctcg 961 gcctcccaaa gtgttgggat tacagggatg agctatca // LOCUS HUMEL08 226 bp ds-DNA PRI 15-MAR-1990 DEFINITION Human elastin gene, exon 12. ACCESSION M17271 J02948 KEYWORDS alternative splicing; elastin. 
SEGMENT 8 of 20 SOURCE Human fetal aorta, cDNA to mRNA, clones cHEL[2,3,4]; DNA, clones HEL[1,2,3]. ORGANISM Homo sapiens Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae. REFERENCE 1 (bases 1 to 226) AUTHORS Indik,Z., Yeh,H., Ornstein-Goldstein,N., Sheppard,P., Anderson,N., Rosenbloom,J.C., Peltonen,L. and Rosenbloom,J. TITLE Alternative splicing of human elastin mRNA indicated by sequence analysis of cloned genomic and complementary DNA JOURNAL Proc. Natl. Acad. Sci. U.S.A. 84, 5680-5684 (1987) STANDARD full staff_review COMMENT The alternate products elastin A, B, and C given in the Features table correspond to cDNA clones cHEL4, cHEL3 and cHEL2 respectively. Entry awaiting author review 18-JAN-1990. FEATURES from to/span description pept + 55 + 216 elastin A precursor, exon 6 /nomgen="ELN" /map="2q31-qter" /hgml_locus_uid="LC0146D" pep$ + 55 + 216 elastin B precursor, exon 7 pep$ + 55 + 216 elastin C precursor, exon 7 matp + 55 + 216 elastin A matp + 55 + 216 elastin B matp + 55 + 216 elastin C pre-msg < 1 > 216 elastin mRNA and intron IVS < 1 54 elastin A intron E IVS 217 > 226 elastin A intron F IVS < 1 54 elastin B intron F IVS 217 > 226 elastin B intron G IVS < 1 54 elastin C intron F IVS 217 > 226 elastin C intron G BASE COUNT 19 a 51 c 82 g 74 t ORIGIN About 2.1 kb after segment 7. 1 tctgtcctct ttgatcaggt cttggttaat gatcagctct tctcaatctt gcagggttag 61 ttcctggtgt cggcgtggct cctggagttg gcgtggctcc tggtgtcggt gtggctcctg 121 gagttggctt ggctcctgga gttggcgtgg ctcctggagt tggtgtggct cctggcgttg 181 gcgtggctcc cggcattggc cctggtggag ttgcaggtga gtttca // LOCUS HUMERVKCP 290 bp ds-DNA PRI 08-APR-1987 DEFINITION Human endogenous retrovirus HERV-K8 pol and envelope ORF region. ACCESSION K03499 KEYWORDS endogenous retrovirus; env gene; pol protein. SOURCE Human fetal liver DNA, clone HERV-K8. 
  ORGANISM  Homo sapiens
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
            Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae.
REFERENCE   1  (bases 1 to 290)
  AUTHORS   Ono,M., Yasunaga,T., Miyata,T. and Ushikubo,H.
  TITLE     Nucleotide sequence of human endogenous retrovirus genome related
            to the mouse mammary tumor virus genome
  JOURNAL   J. Virol. 60, 589-598 (1986)
  STANDARD  full staff_review
COMMENT     Draft entry and clean copy sequence for [1] kindly provided by
            M.Ono, 22-JAN-1987.
FEATURES       from  to/span    description
    ORF     <     1    191      pol ORF (AA at 3)
    ORF         110 >  290      env ORF
BASE COUNT       89 a     73 c     69 g     59 t
ORIGIN      137 bp upstream of XbaI site.
        1 cgcaatcgag caccgttgac tcacaagatg aacaaaatgg tgacgtcaga agaacagatg
       61 aagttgccat ccaccaagaa agcagagccg ccgacttggg cacaactaaa gagctgacgc
      121 agttagctac aaaatatcta gagaacacaa aggtgacaca aaccccagag agtatgctgc
      181 ttgcagcttg atgattgtat caatggtggt aagtctccct atgcctgcag gagcagctgc
      241 agctaactat acctactggg cctatgtgcc tttcccgccc ttaattcggg
//
LOCUS       HUMMHC4G       81 bp ss-mRNA            PRI       01-JUN-1984
DEFINITION  human complement fourth component (c4) gamma chain (codons 1-27).
ACCESSION   K00830
KEYWORDS    C4 component complement protein; C4 component complement protein
            gamma chain; antigen; complement protein; histocompatibility
            antigen; major histocompatibility complex; serum glycoprotein.
SOURCE      human (white male adult: hla a1,2; cw3,w6; b15,w39; complotype bf
            fs; c2c, c4a4,4; c4b2, qo) cdna to liver mrna, clone pc4al1.
  ORGANISM  Homo sapiens
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
            Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae.
REFERENCE   1  (bases 1 to 81)
  AUTHORS   Whitehead,A.S., Goldberger,G., Woods,D.E., Markham,A.F. and
            Colten,H.R.
  TITLE     use of a cdna clone for the fourth component of human complement
            (c4) for analysis of a genetic deficiency of c4 in guinea pig
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A.
80, 5387-5391 (1983) STANDARD full staff_review COMMENT genetically deficient guinea pigs show the total absence of pro-c4 and c4 peptides and mature c4 mrna. however, there is a higher molecular weight putative precursor, consistent with the idea that the c4 deficiency in guinea pigs is due to a defect in post-transcriptional processing [1]. FEATURES from to/span description pept < 1 > 81 c4 propeptide (aa at 1) matp 19 > 81 c4 gamma chain BASE COUNT 17 a 20 c 35 g 9 t ORIGIN 1 aggaaccgcc gcaggaggga ggcgcccaag gtggtggagg agcaggagtc cagggtgcac 61 tacaccgtgt gcatctggcg g // LOCUS LISHLYA 3550 bp ds-DNA BCT 15-MAR-1990 DEFINITION L.monocytogenes listeriolysin O (hlyA) gene, complete cds. ACCESSION M24199 M29030 KEYWORDS listeriolysin. SOURCE L.monocytogenes (strain L028) DNA, clone pCL102. ORGANISM Listeria monocytogenes Prokaryota; Bacteria; Firmicutes; Regular asporogenous rods; Lactobacillaceae. REFERENCE 1 (bases 1360 to 3078) AUTHORS Mengaud,J., Vicente,M.-F., Chenevert,J., Pereira,J.M., Geoffroy,C., Gicquel-Sanzey,B., Baquero,F., Perez-Diaz,J.-C. and Cossart,P. TITLE Expression in Escherichia coli and sequence analysis of the listeriolysin O determinant of Listeria monocytogenes JOURNAL Infect. Immun. 56, 766-772 (1988) STANDARD full staff_entry REFERENCE 2 (bases 1 to 1591; 3076 to 3550) AUTHORS Mengaud,J., Vicente,M.F. and Cossart,P. TITLE Transcriptional mapping and nucleotide sequence of the Listeria monocytogenes hlyA region reveal structural features that may be involved in regulation JOURNAL Infect. Immun. (1989) In press STANDARD simple staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly provided by J.Mengaud, 24-APR-1989. Draft entry and computer-readable sequence for [2] kindly submitted by J.Mengaud, 14-OCT-1989. FEATURES from to/span description pept 1489 3078 listeriolysin precursor sigp 1489 1563 listeriolysin signal peptide matp 1564 3075 listeriolysin ORF 1040 294 (c) ORFU ORF 3409 > 3550 ORFD (5' end put.) 
mRNA 1272 < 294 (c) ORFU mRNA mRNA 1356 > 3078 hlyA mRNA (alt.) mRNA 1367 > 3078 hlyA mRNA (alt.) mRNA 3259 > 3550 ORFD mRNA binding 1056 1051 (c) put. ORFU ribosome binding site binding 1473 1478 put. hlyA ribosome binding site binding 3395 3401 put. ORFD ribosome binding site signal 1307 1320 region of dyad symmetry (regulation of ORFU and hlyA) signal 3126 3203 put. transcription termination signal for hlyA BASE COUNT 1212 a 565 c 663 g 1110 t ORIGIN 1 bp upstream of EcoRI site. 1 gaattcttct gcttgagcgt tcatgtctca tcccccaatc gttttttatc gccctttttt 61 aaaataccct aaaaacatta ggcagtaaca acaattgtta gctgttgaaa gaaagtcacg 121 ctaaatgatg ttttttacat ataggatttt attatacaaa ttttgattcg caaaagaaat 181 gcatacatat ttaaaaacgg atttatttag atgttaaaat tgaaatagag ttagtatatg 241 gttccgaggt tgctcggaga tatactaacc cttttttgta ggaataatat atgttagttg 301 aatttattgt tttttatgat gtttttaatt gtttgttttt cggggaagtc catgattagt 361 atgcctaatc ctcgaacttt ttccgatgtt aagttgagta cgaattgctc tactttgttg 421 tttaatgctg cagcatactg acgaggtgtg aatgttaatg aagtggcgct aatatggtta 481 agaaaaagtt tattgtccgc tttggaagct tgataagcag tctggacaat ctctttgaat 541 tttgttttct cactcggacc attgtagtca tcttgaatta cttggttagg tgcgccgaac 601 tgcatgccga atttgcgtga gttaatgact aatggctttt ttgtgtggtt ctctgaaagt 661 aataatattt ttccgcggac atcttttaat gtagggattt tattgctcgt gtcagttctg 721 ggagtagtgt aaaaataatc tttataaatg ttgattagtg gttggatccg ataatcaaaa 781 ctatcgttgc tgttttgctc gtcttttaaa cgcataataa tggtttcttt tggatttttc 841 tttaaaaatt gagtaatcgt ttctaataca cctgaaagtg atgcatttaa aaaaattggc 901 ccatggtaaa tgttgagatt gtcttttgct ctaatatcga tgtaccgtat tcctgcttct 961 agttgttggt acaatgacat cgtttgtgtt tgagctagtg gtttggttaa tgtccatgtt 1021 atgtctccgt tatagctcat cgtatcatgt gtacctggta tagagagcgc tgctaggttt 1081 gttgtgtcag gtagagcgga catccattgt tttgtagtta cagagttctt tattggctta 1141 ttccagttat taagcgaata tgcttttccg cctaatggga aagtaaaaaa gtataaaata 1201 aaacagagta ataaaactaa tgtgcgttgc aaataattct tatacaaaat ggccccctcc 1261 tttgattagt atattcctat cttaaagtga 
cttttatgtt gaggcattaa catttgttaa 1321 cgacgataaa gggacagcag gactagaata aagctataaa gcaagcatat aatattgcgt 1381 ttcatcttta gaagcgaatt tcgccaatat tataattatc aaaagagagg ggtggcaaac 1441 ggtatttggc attattaggt taaaaaatgt agaaggagag tgaaacccat gaaaaaaata 1501 atgctagttt ttattacact tatattagtt agtctaccaa ttgcgcaaca aactgaagca 1561 aaggatgcat ctgcattcaa taaagaaaat tcaatttcat ccatggcacc accagcatct 1621 ccgcctgcaa gtcctaagac gccaatcgaa aagaaacacg cggatgaaat cgataagtat 1681 atacaaggat tggattacaa taaaaacaat gtattagtat accacggaga tgcagtgaca 1741 aatgtgccgc caagaaaagg ttacaaagat ggaaatgaat atattgttgt ggagaaaaag 1801 aagaaatcca tcaatcaaaa taatgcagac attcaagttg tgaatgcaat ttcgagccta 1861 acctatccag gtgctctcgt aaaagcgaat tcggaattag tagaaaatca accagatgtt 1921 ctccctgtaa aacgtgattc attaacactc agcattgatt tgccaggtat gactaatcaa 1981 gacaataaaa tcgttgtaaa aaatgccact aaatcaaacg ttaacaacgc agtaaataca 2041 ttagtggaaa gatggaatga aaaatatgct caagcttatc caaatgtaag tgcaaaaatt 2101 gattatgatg acgaaatggc ttacagtgaa tcacaattaa ttgcgaaatt tggtacagca 2161 tttaaagctg taaataatag cttgaatgta aacttcggcg caatcagtga agggaaaatg 2221 caagaagaag tcattagttt taaacaaatt tactataacg tgaatgttaa tgaacctaca 2281 agaccttcca gatttttcgg caaagctgtt actaaagagc agttgcaagc gcttggagtg 2341 aatgcagaaa atcctcctgc atatatctca agtgtggcgt atggccgtca agtttatttg 2401 aaattatcaa ctaattccca tagtactaaa gtaaaagctg cttttgatgc tgccgtaagc 2461 ggaaaatctg tctcaggtga tgtagaacta acaaatatca tcaaaaattc ttccttcaaa 2521 gccgtaattt acggaggttc cgcaaaagat gaagttcaaa tcatcgacgg caacctcgga 2581 gacttacgcg atattttgaa aaaaggcgct acttttaatc gagaaacacc aggagttccc 2641 attgcttata caacaaactt cctaaaagac aatgaattag ctgttattaa aaacaactca 2701 gaatatattg aaacaacttc aaaagcttat acagatggaa aaattaacat cgatcactct 2761 ggaggatacg ttgctcaatt caacatttct tgggatgaag taaattatga tcctgaaggt 2821 aacgaaattg ttcaacataa aaactggagc gaaaacaata aaagcaagct agctcatttc 2881 acatcgtcca tctatttgcc aggtaacgcg agaaatatta atgtttacgc taaagaatgc 2941 actggtttag cttgggaatg gtggagaacg gtaattgatg 
accggaactt accacttgtg 3001 aaaaatagaa atatctccat ctggggcacc acgctttatc cgaaatatag taataaagta 3061 gataatccaa tcgaataatt gtaaaagtaa taaaaaatta agaataaaac cgcttaacac 3121 acacgaaaaa ataagcttgt tttgcactct tcgtaaatta ttttgtgaag aatgtagaaa 3181 caggcttatt ttttaatttt tttagaagaa ttaacaaatg taaaagaata tctgactgtt 3241 tatccatata atataagcat atcccaaagt ttaagccacc tatagtttct actgcaaaac 3301 gtataattta gttcccacat atactaaaaa acgtgtcctt aactctctct gtcagattag 3361 ttgtaggtgg cttaaactta gttttacgaa ttaaaaagga gcggtgaaat gaaaagtaaa 3421 cttatttgta tcatcatggt aatagctttt caggctcatt tcactatgac ggtaaaagca 3481 gattctgtcg gggaagaaaa acttcaaaat aatacacaag ccaaaaagac ccctgctgat 3541 ttaaaagctt // LOCUS MCFGP70 1538 bp ss-RNA VRL 12-MAR-1984 DEFINITION mink cell focus-forming 247 retrovirus gp70 gene. ACCESSION K00526 KEYWORDS env gene. SOURCE murine retrovirus mcf 247 isolated from akr mouse thymic lymphoma. ORGANISM Mink cell focus-forming virus Viridae; ss-RNA enveloped viruses; Positive strand RNA virus; Retroviridae; Oncovirinae; Type C oncovirus group; Mammalian type C oncoviruses; Mink cell focus-forming viruses. REFERENCE 1 (bases 1 to 1538) AUTHORS Holland,C.A., Wozney,J. and Hopkins,N. TITLE nucleotide sequence of the gp70 gene of murine retrovirus mcf 247 JOURNAL J. Virol. 47, 413-420 (1983) STANDARD simple staff_review COMMENT compared with mo-mcf and parental akv env genes. see and . FEATURES from to/span description pept 219 > 1538 gp70 protein sigp 222 314 gp70 signal peptide matp 315 > 1538 gp70 mature peptide BASE COUNT 384 a 472 c 369 g 313 t ORIGIN 289 bp 5' to a hinf i site. 
1 ctggaccagc cagtgatacc acaccccttc cgtgtcggcg acaccgtgtg ggtacgccgg 61 caccagacta agaacttgga acctcgctgg aaaggaccct acaccgtcct gctgaccacc 121 cccaccgctc tcaaagtaga cggcatcgct gcgtggatcc acgccgctca cgtaaaagcg 181 gcgacaaccc ctccggccgg aacagcatca ggaccgacat ggaaggtcca gcgttctcaa 241 aaccccttaa agataagatt aacccgtggg gccccctaat agtcctggga atcttaataa 301 gggcaggagt atcagtacga catgacagcc ctcatcaggt cttcaatgtt acttggagag 361 ttaccaactt aatgacagga caaacagcta atgctacctc cctcctgggg acaatgaccg 421 atgcctttcc taaactgtac tttgacttgt gcgatttaat aggggatgac tgggatgaga 481 ctggactcgg gtgtcgcact cccgggggaa gaaaaagggc aagaacattt gacttctatg 541 tttgccccgg gcatactgta ccaacagggt gtggagggcc gagagagggc tactgtggca 601 aatggggctg tgagaccact ggacaggcat actggaagcc atcatcatca tgggacctaa 661 tttcccttaa gcgaggaaac acccctcaga atcagggccc ctgttatgat tcctcagcgg 721 tctccagtga catcaagggc gccacaccgg ggggtcgatg caatccccta gtcctggaat 781 tcactgacgc gggcaaaaag gccagctggg atggccccaa agtatgggga ctaagactgt 841 accgatccac agggatcgac ccggtgaccc ggttctcttt gacccgccag gtcctcaata 901 tagggccccg cgtccccatt gggcctaatc ccgtgatcac tgaccagtta cccccctccc 961 gacccgtgca gatcatgctc cccaggcctc ctcagcctcc tcctccaggc gcagcctcta 1021 cagtccctga gactgcccca ccttctcaac aacctgggac gggagacagg ctgctaaatc 1081 tagtaaaagg agcctaccaa gccctcaacc tcaccagtcc tgataaaacc caagagtgct 1141 ggttatgcct agtatcggga cccccatact acgagggggt tgccgtccta ggtacctact 1201 ccaaccatac ttctgcccca gctaactgct ctgtggcctc tcaacacaaa ttgaccttgt 1261 ccgaagtgac cggacaggga ctctgcatag gagcggtccc taaaacccat caagccttgt 1321 gtaataccac ccaaaagaca agcgatgggt cctactattt ggccgctccc acaggaacta 1381 cctgggcttg tagtactgga cttactccct gtatctcaac caccatactt gacctcacca 1441 ccgattactg tgtcctggtc gagctttggc caagggtgac ctaccattcc cctagttatg 1501 tttaccacca atttgaaaga cgagccaaat ataaaaga // LOCUS MUSH 4300 bp ss-mRNA ROD 22-SEP-1986 DEFINITION Mouse CFh locus, protein H mRNA, complete cds. ACCESSION M12660 KEYWORDS complement protein H; protein H; serum glycoprotein. 
SOURCE Mouse (male, strain C57B10.WR) liver, cDNA to mRNA, clone MH[4,8]. ORGANISM Mus musculus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Muridae; Murinae. REFERENCE 1 (bases 1 to 3425; 3474 to 4300) AUTHORS Kristensen,T. and Tack,B.F. TITLE Murine protein H is comprised of 20 repeating units, 61 amino acids in length JOURNAL Proc. Natl. Acad. Sci. U.S.A. 83, 3963-3967 (1986) STANDARD full staff_review REFERENCE 2 (bases 1 to 4300) AUTHORS Kristensen,T. and Tack,B.F. JOURNAL Unpublished (1986) Scripps Clinic and Res Found, La Jolla, CA 92037 STANDARD full staff_review COMMENT Draft entry and clean copy sequence for [1],[2] kindly provided by T.Kristensen, 28-JUL-1986. FEATURES from to/span description pept 101 3425 H protein precursor, exon 1 3474 3853 H protein precursor, exon 2 sigp 101 154 H protein signal peptide matp 155 3425 H protein 3474 3850 H protein pre-msg < 1 4300 H mRNA IVS 3426 3473 H cds intron BASE COUNT 1401 a 814 c 902 g 1183 t ORIGIN 554 bp upstream of XhoI site; chromosome 1. 
1 aagtctttcc ctgctgtgac cacagttcat agcagagagg aactggatgg tacagcacag 61 atttctcttg gagtcagttg gtcccagaaa gatccaaatt atgagactgt cagcaagaat 121 tatttggctt atattatgga ctgtttgtgc agcagaagat tgtaaaggtc ctcctccaag 181 agaaaattca gaaattctct caggctcgtg gtcagaacaa ctatatccag aaggcaccca 241 ggctacctac aaatgccgcc ctggataccg aacacttggc actattgtaa aagtatgcaa 301 gaatggaaaa tgggtggcgt ctaacccatc caggatatgt cggaaaaagc cttgtgggca 361 tcccggagac acaccctttg ggtcctttag gctggcagtt ggatctcaat ttgagtttgg 421 tgcaaaggtt gtttatacct gtgatgatgg gtatcaacta ttaggtgaaa ttgattaccg 481 tgaatgtggt gcagatggct ggatcaatga tattccacta tgtgaagttg tgaagtgtct 541 acctgtgaca gaactcgaga atggaagaat tgtgagtggt gcagcagaaa cagaccagga 601 atactatttt ggacaggtgg tgcggtttga atgcaattca ggcttcaaga ttgaaggaca 661 taaggaaatt cattgctcag aaaatggcct ttggagcaat gaaaagccac gatgtgtgga 721 aattctctgc acaccaccgc gagtggaaaa tggagatggt ataaatgtga aaccagttta 781 caaggagaat gaaagatacc actataagtg taagcatggt tatgtgccca aagaaagagg 841 ggatgccgtc tgcacaggct ctggatggag ttctcagcct ttctgtgaag aaaagagatg 901 ctcacctcct tatattctaa atggtatcta cacacctcac aggattatac acagaagtga 961 tgatgaaatc agatatgaat gtaattatgg cttctatcct gtaactggat caactgtttc 1021 aaagtgtaca cccactggct ggatccctgt tccaagatgt accttgaaac catgtgaatt 1081 tccacaattc aaatatggac gtctgtatta tgaagagagc ctgagaccca acttcccagt 1141 atctatagga aataagtaca gctataagtg tgacaacggg ttttcaccac cttctgggta 1201 ttcctgggac taccttcgtt gcacagcaca agggtgggag cctgaagtcc catgcgtcag 1261 gaaatgtgtt ttccattatg tggagaatgg agactctgca tactgggaaa aagtatatgt 1321 gcagggtcag tctttaaaag tccagtgtta caatggctat agtcttcaaa atggtcaaga 1381 cacaatgaca tgtacagaga atggctggtc ccctcctccc aaatgcatcc gtatcaagac 1441 atgttcagca tcagatatac acattgacaa tggatttctt tctgaatctt cttctatata 1501 tgctctaaat agagaaacat cctatagatg taagcaggga tatgtgacaa atactggaga 1561 aatatcagga tcaataactt gccttcaaaa tggatggtca cctcaaccct catgcattaa 1621 gtcttgtgat atgcctgtat ttgagaattc tataactaag aatactagga catggtttaa 1681 gctcaatgac aaattagact 
atgaatgtct cgttggattt gaaaatgaat ataaacatac 1741 caaaggctct ataacatgta cttattatgg atggtctgat acaccctcat gttatgaaag 1801 agaatgcagt gttcccactc tagaccgaaa actagtcgtt tcccccagaa aagaaaaata 1861 cagagttgga gatttgttgg aattctcctg ccattcagga cacagagttg ggccagattc 1921 agtgcaatgc taccactttg gatggtctcc tggtttccct acatgtaaag gtcaagtagc 1981 atcatgtgca ccacctcttg aaattcttaa tggggaaatt aatggagcaa aaaaagttga 2041 atacagccat ggtgaagtgg tgaaatatga ttgcaaacct agattcctac tgaagggacc 2101 caataaaatc cagtgtgttg atgggaattg gacaaccttg cctgtatgta ttgaggagga 2161 gagaacatgt ggagacattc ctgaacttga acatggctct gccaagtgtt ctgttcctcc 2221 ctaccaccat ggagattcag tggagttcat ttgtgaagaa aacttcacaa tgattggaca 2281 tgggtcagtt tcttgcatta gtggaaaatg gacccagctt cctaaatgtg ttgcaacaga 2341 ccaactggag aagtgtagag tgctgaagtc aactggcata gaagcaataa aaccaaaatt 2401 gactgaattt acgcataact ccaccatgga ttacaaatgt agagacaagc aggagtacga 2461 acgctcaatc tgtatcaatg gaaaatggga tcctgaacca aactgtacaa gcaaaacatc 2521 ctgccctcct ccaccgcaga ttccaaatac ccaagtgatt gaaaccaccg tgaaatactt 2581 ggatggagaa aaattatctg ttctttgcca agacaattac ctaactcagg actcagaaga 2641 aatggtgtgc aaagatggaa ggtggcagtc attacctcgc tgcattgaaa aaattccatg 2701 ttcccagccc cctacaatag aacatggatc tattaattta cccagatctt cagaagaaag 2761 gagagattcc attgagtcca gcagtcatga acatggaact acattcagct atgtctgtga 2821 tgatggtttc aggatacctg aagaaaatag gataacctgc tacatgggaa aatggagcac 2881 tccacctcgc tgtgttggac ttccttgtgg acctccacct tcaattcctc ttggtactgt 2941 ttctcttgag ctagagagtt accaacatgg ggaagaggtt acataccatt gttctacagg 3001 ctttggaatt gatggaccag catttattat atgcgaagga ggaaagtggt ctgacccacc 3061 aaaatgcata aaaacggatt gtgacgtttt acccacagtt aaaaatgcca taataagagg 3121 aaagagcaaa aaatcatata ggacaggaga acaagtgaca ttcagatgtc aatctcctta 3181 tcaaatgaat ggctcagaca ctgtgacatg tgttaatagt cggtggattg gacagccagt 3241 atgcaaagat aattcctgtg tggatccacc acatgtgcca aatgctacta tagtaacaag 3301 gaccaagaat aaatatctac atggtgacag agtacgttat gaatgtaata aacctttgga 3361 actatttggg caagtggaag tgatgtgtga 
aaatgggata tggacagaaa aaccaaagtg 3421 ccgaggtctg taattcgact tgagtctcaa accttcaaat gttttctcct tagactcaac 3481 agggaaatgt gggcctcctc cacctattga caatggagac atcacctcct tgtcattacc 3541 agtatatgaa ccattatcat cagttgaata tcaatgccag aagtattatc tccttaaggg 3601 aaagaagaca ataacatgta caaatggaaa gtggtctgag ccaccaacat gcttacatgc 3661 atgtgtaata ccagaaaaca ttatggaatc acacaatata attctcaaat ggagacacac 3721 tgaaaagatt tattcccatt caggggagga tattgaattt ggatgtaaat atggatatta 3781 taaagcaaga gattcaccgc catttcgtac aaagtgcatt aatggcacca tcaattatcc 3841 cacttgtgta taaaatcata atacatttat tagttgattt tattgtttag aaaggcacat 3901 gcatgtgact aatatacttt caatttgcat tgaagtattg tttaactcat gtcttctcat 3961 aaatataaac atttttgtta tatggtgatt aacttgtaac tttaaaaact attgccaaaa 4021 tgcaaaagca gtaattcaaa actcctaatc taaaatatga tatgtccaag gacaaactat 4081 ttcaatcaag aaagtagatg taagttcttc aacatctgtt tctattcaga actttctcag 4141 attttcctgg ataccttttg atgtaaggtc ctgatttaca gtggataaag gatatattga 4201 ctgattcttc aaattaatat gatttcccaa agcatgtaac aaccaaacta tcatatatta 4261 tatgactaat gcatacaatt aattactata taatactttc // LOCUS MUSHOX161 443 bp ds-DNA ROD 15-MAR-1989 DEFINITION Mouse Hox-1.6 gene, exon 2. ACCESSION M20214 M15928 X06024 KEYWORDS homeo box. SEGMENT 1 of 3 SOURCE Mouse DNA, library of H.Lehrach, and embryo, cDNA to mRNA, clones 1 and 3. ORGANISM Mus musculus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Muridae; Murinae. REFERENCE 1 (bases 1 to 443) AUTHORS Baron,A., Featherstone,M.S., Hill,R.E., Hall,A., Galliot,B. and Duboule,D. TITLE Hox-1.6: a mouse homeo-box-containing gene member of the Hox-1 complex JOURNAL EMBO J. 6, 2977-2986 (1987) STANDARD simple automatic FEATURES from to/span description pre-msg 20 > 443 Hox1.6 mRNA and introns IVS < 1 19 Hox1.6 intron A IVS 436 > 443 Hox1.6 intron B BASE COUNT 101 a 138 c 111 g 93 t ORIGIN 57 bp upstream of BamHI site. 
1 cttctctggt cctatggagg aagtgagaaa gttggcacgg tcacccatgc ttcgcaggat 61 ccaatcactc agtgacagat ggacaatgca agaatgaact cctttctgga ataccccatc 121 cttggcagtg gcgactctgg gacctgctcg gcgcgagctt acccctctga ccatgggatt 181 acaactttcc aatcctgcgc ggtcagtgcc aacagctgcg gcggcgacga ccgcttccta 241 gtgggcaggg gggtgcagat cagctcgccc caccaccacc accaccacca ccaccatcac 301 cacccccaga cggctactta ccagacttct ggaaaccttg ggatttctta ttcccactcg 361 agttgtggtc caagctatgg cgcgcagaac ttcagtgcgc cttatggccc ctatggatta 421 aatcaggaag cagacgtaag tgg // LOCUS MUSIGHAU 527 bp ss-mRNA ROD 20-MAY-1987 DEFINITION Mouse Ig active mu-chain mutant C-region mRNA, exons 3 and 4, from mutant 102 derived from hybridoma PC7. ACCESSION M13680 KEYWORDS constant region; immunoglobulin; immunoglobulin heavy chain; mu-immunoglobulin; processed gene. SOURCE Mouse mutant hybridoma 102 (derived from wild type hybridoma PC7) cDNA to mRNA. ORGANISM Mus musculus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Muridae; Murinae. REFERENCE 1 (bases 1 to 527) AUTHORS Baker,M.D., Wu,G.E., Toone,W.M., Murialdo,H., Davis,A.C. and Shulman,M.J. TITLE A region of the immunoglobulin-mu heavy chain necessary for forming pentameric IgM JOURNAL J. Immunol. 137, 1724-1728 (1986) STANDARD full staff_review COMMENT Draft entry and clean copy of the sequence in [1] were kindly provided by M.D.Baker, 04-NOV-1986. An in frame 39 bp deletion occurs between positions 357 and 358 which results in a mu-chain lacking amino acids 550-562, a region spanning the fourth constant domain and the tail of the mu-chain. This deletion blocks pentamer formation thus resulting in monomeric (noncytolytic) IgM. FEATURES from to/span description pept < 1 402 Ig mutant mu-chain, exons 3 and 4 of C-region (AA at 1) mRNA < 1 527 mIgM mRNA (alt.) mRNA < 47 527 mIgM mRNA (alt.) 
BASE COUNT 139 a 146 c 129 g 113 t ORIGIN 1 agggatctgc cttcaccaca gaagaaattc atctcaaaac ccaatgaggt gcacaaacat 61 ccacctgctg tgtacctgct gccaccagct cgtgagcaac tgaacctgag ggagtcagcc 121 acagtcacct gcctggtgaa gggcttctct cctgcagaca tcagtgtgca gtggcttcag 181 agagggcaac tcttgcccca agagaagtat gtgaccagtg ccccgatgcc agagcctggg 241 gccccaggct tctactttac ccacagcatc ctgactgtga cagaggagga atggaactcc 301 ggagagacct atacctgtgt tgtaggccac gaggccctgc cacacctggt gaccgagaat 361 gtctccctga tcatgtctga cacaggcggc acctgctatt gaccatgcta gcgctcaacc 421 aggcaggcct tgggtgtcca gttgctctgt gtatgcaaac taaccatgtc agagtgagat 481 gttgcatttt ataaaaatta gaaataaaaa aaatccattc aaacgtc // LOCUS MUSIGV31 652 bp ds-DNA ROD 30-SEP-1988 DEFINITION Mouse germline Ig H chain (subgroup VIII) pseudogene V31, exons 1 and 2. ACCESSION X03302 KEYWORDS immunoglobulin; immunoglobulin heavy chain; immunoglobulin heavy chain subgroup VIII; pseudogene; variable region. SOURCE Mouse (strain BALB/c) anti-bacterial lipopolysaccharide B-lymphocyte DNA, clone V31-a. ORGANISM Mus musculus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Muridae; Murinae. REFERENCE 1 (bases 1 to 652) AUTHORS Winter,E., Radbruch,A. and Krawinkel,U. TITLE Members of novel V-H gene families are found in VDJ regions of polyclonally activated B-lymphocytes JOURNAL EMBO J. 4, 2861-2867 (1985) STANDARD simple automatic COMMENT library=lambda L47.1; EMBL features not translated to GenBank features: key from to description SITE 440 441 deletion site (1 bp) SITE 487 488 deletion site (1 codon) SITE 611 617 7-mer recombination signal SITE 641 649 9-mer recombination signal FEATURES from to/span description pept.ps 174 222 pseudo-Ig H chain, exon 1 304 / 610 pseudo-Ig H chain, exon 2 IVS 223 303 pseudo-Ig H chain intron A IVS 611 > 652 pseudo-Ig H chain intron B (no splice consensus at 611) BASE COUNT 161 a 185 c 130 g 176 t ORIGIN Chromosome 12. 
1 gatcccttga gggcaccttt gcttctcatc aaagactggc tcttcctgag ttcaaaatct 61 taagacctcc attcaagtca gcacctcatg agacaatgca aatctctcca gactccaccc 121 aaccctacat tgaaatgcta actctgggcc tgagtaaaca ctggtgtgca gtcatgggca 181 gacttacatt ctcattcctg ctgctgctgc ctgtccctgc atgtgagtgc caagactccc 241 taaggaatga acttccatac cctcaccctc tttcaacctt accatctgca tttttctcca 301 cagatgtcct atctcaggtt actctgaaag tgtctggccc tgggatattg cagccatcac 361 agactctcgg cctggcctgt actttctctg ggatttcact gagtacttct ggtatgggtt 421 tgagctggct tcgtaagcct cagggaaggc tttagagtgg ctggcaagca tttggaataa 481 tgataactac tacaacccat ctttgaagag ccggctcaca atctccaagg agacctccaa 541 caaccaagta ttccttaaac tcaccagtgt ggacactgca gattctacca catactactg 601 tgcttggaga cacagtggtg caaccgtgac ccacagctgt gcaatatttc ag // LOCUS MUSSMGPP 568 bp ss-mRNA ROD 01-AUG-1985 DEFINITION Mouse sub-maxillary gland mRNA coding for a putative polypeptide. ACCESSION K02296 KEYWORDS . SOURCE Mouse (strain DBA/2) mature male submaxillary gland, cDNA to mRNA, clones pSMG166, pSMG172, pSMG143, pSMG173. ORGANISM Mus musculus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Muridae; Murinae. REFERENCE 1 (bases 1 to 568) AUTHORS Windass,J.D., Mullins,J.J., Beecroft,L.J., George,H., Meacock,P.A., Williams,B.R.G. and Brammar,W.J. TITLE Molecular cloning of cDNAs from androgen-independent mRNA species of DBA/2 mouse sub-maxillary glands JOURNAL Nucleic Acids Res. 12, 1361-1376 (1984) STANDARD full staff_review FEATURES from to/span description pept 78 467 sub-maxillary polypeptide (putative) mRNA 1 658 sm pp mRNA BASE COUNT 182 a 109 c 109 g 167 t 1 others ORIGIN 91 bp upstream of Sau3A site. 
1 atggtntggt gttctgactt ctccaatatg aagggtctct cattcacatt cagtgctgtg 61 acactcttct tggtcctatg tctgcagctg ggatcattga aagccaggat gatgaaaatg 121 tccgaaagcc acttttgatt gaaatagatg ttccatcaac agcacaggaa aatcaggaga 181 tcactgtgca agttacagtt gaaacacaat atagagaatg tatggtgatc aaagcttacc 241 ttgtaagtaa tgaaccaatg gaaggtgcct tcaactatgt acaaacccgc tgcctctgta 301 atgatcatcc tattagattt ttctgggata taataatcac aagaactgta acatttgcaa 361 cagtgattga tattgttcga gaaaagaata tctgtcctaa tgatatggca gtggtgccca 421 tcacagcaaa ccggtactat acttataata ctgtgcgaat gaattaacgg aggcttctta 481 tgcgtctttc aaaaccatta tcattttctc caagtacaat tgactggaat aaccattgta 541 gtcagtcaaa taaacactta acaagcat // LOCUS PH5I105IM 664 bp ds-DNA PHG 15-MAR-1990 DEFINITION Bacteriophage phi105 immF control region. ACCESSION X06090 M28691 M28692 KEYWORDS immF region; operator; repressor binding site. SOURCE Bacteriophage phi-105 DNA. ORGANISM Bacteriophage phi-105 Viridae; ds-DNA nonenveloped viruses; Siphoviridae. REFERENCE 1 (bases 1 to 664) AUTHORS Van Kaer,L., Van Montagu,M. and Dhaese,P. TITLE Transcriptional control in the EcoRI-F immunity region of Bacillus subtilis Phage phi105: Identification and unusual structure of the operator JOURNAL J. Mol. Biol. 197, 55-67 (1987) STANDARD simple automatic REFERENCE 2 (sites) AUTHORS Van Kaer,L., Van Montagu,M. and Dhaese,P. TITLE Purification and in vitro DNA-binding specificity of the Bacillus subtilis Phage phi105 repressor JOURNAL J. Biol. Chem. 264, 14784-14791 (1989) STANDARD simple staff_review COMMENT Data kindly [1] reviewed (02-SEP-1988) by DHAESE P. 
FEATURES from to/span description pept 7 < 1 (c) C-phi-105 (gtg start) pept 260 532 ORF 3 pre-msg 60 < 1 (c) C-phi-105 mRNA and introns pre-msg 236 > 664 ORF 3 mRNA and introns signal 95 90 (c) C-phi-105 -35 region signal 72 67 (c) C-phi-105 -10 region signal 201 206 ORF 3 -35 region signal 224 229 ORF 3 -10 region binding 247 254 ribosome binding site (ORF 3) binding 83 96 repressor binding site O-R4 binding 106 119 repressor binding site O-R1 binding 136 149 repressor binding site O-R2 binding 177 189 repressor binding site O-R5 binding 211 224 repressor binding site O-R6 binding 460 473 repressor binding site O-R3 BASE COUNT 237 a 111 c 144 g 172 t ORIGIN 1 tgatcaccta tctcctttac aacacatagt gcctcactgt gccactgtgt cttgtggcat 61 gacacaatta tagtatccga atgtcggaaa tacaatacta aaaaagacgg aaatacaagt 121 attttttagt aaattgacgg aaatacaaga taaatactct ctgaatcttt aaaatgcttg 181 aatttcgtca aatttcgact tttacaaaat gtcgtgaata ccatacaatt tagacatacc 241 ttaacgggag gtgataatca tgctggatgg gaaaaagctt ggggctttaa ttaaggacaa 301 aagaaaagaa aagcacttga aacagacaga aatggcgaag gcactgggta tgtccagaac 361 ttatctctct gatatcgaaa acggcagata tctgccgagt acaaaaacac tttccagaat 421 agcgatttta ataaatctgg atttaaatgt gttaaaaatg acggaaatac aagtagttga 481 ggagggtgga tatgatagag ctgccggcac atgtagaaga caggctttat gagattttta 541 tgaaactatc agttccaagg ttgcttgaga aagaagccct ggagaaagga gagaagccga 601 atgcggaaag aaaaggcgct tgacctcgcg gccttcttcg ctgaatttga acaaatgatg 661 atca // LOCUS PSEORICA 651 bp ds-DNA BCT 15-MAR-1990 DEFINITION P.putida initiation protein (dnaA) and ribosomal protein L34 (rpmH-like) genes, partial cds. ACCESSION M30126 KEYWORDS initiation protein; replication origin; replication protein; ribosomal protein; ribosomal protein L34. SOURCE P.putida (strain KT2440 r- m+) DNA. ORGANISM Pseudomonas putida Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci; Pseudomonadaceae. REFERENCE 1 (bases 1 to 651) AUTHORS Yee,T.W. and Smith,D.W. 
TITLE Pseudomonas chromosomal replication origins: A bacterial class distinct from Escherichia coli-type origins JOURNAL Proc. Natl. Acad. Sci. U.S.A. (1989) In press STANDARD full staff_review COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by W.Smith, 21-NOV-1989. FEATURES from to/span description pept 34 < 1 (c) ribosomal protein L34 (gtg start) pept 630 > 651 initiation protein orgrpl 52 536 origin of replication rpt 207 219 13-mer direct repeat (pot.) rpt 231 243 13-mer direct repeat (pot.) rpt 256 268 13-mer direct repeat (pot.) site 275 283 DnaA 9 bp recognition site (pot.) site 289 297 DnaA 9 bp recognition site (pot.) site 388 396 DnaA 9 bp recognition site (pot.) site 406 414 DnaA 9 bp recognition site (pot.) site 421 429 DnaA 9 bp recognition site (pot.) site 468 476 DnaA 9 bp recognition site (pot.) site 481 489 DnaA 9 bp recognition site (pot.) site 552 560 DnaA 9 bp recognition site (pot.) BASE COUNT 158 a 137 c 178 g 178 t ORIGIN 52 bp upstream of SalI site 1 gcttgatggt gcttggttgg aaagtacgtt tcatggcgtg ttacctggtt tgtcgacgac 61 gggccgggat ggcccccttt ttaagagacc ggcgattcta gagaaagcaa gcccataggt 121 caatttccaa ccagtctttc catatagagc atgtgatgga cggtgcctgt tgatcagtgc 181 ccaaggggtg cttgatcgga cacacggatc ggggacaaca tgaaaaaaaa gaagagacat 241 ataaaaagct tttttgaaga acttataact cttaagtgga taaccttctg tggataacct 301 gcgctggccc atgaattacg gggtgtacag agttttacaa ctttgttctg atcccgtgct 361 gcgcttgttc caatcgtgag cgaaagctgt ggatgaaaac acctgttatc cacagcggag 421 ttatcaacag gctaaggggt ggggttgtgc atagccctca tggtcgttta tccacagggc 481 ttattcacag aggcgaaaag ccgttttggt cgataaatgg ctgttttgtc gtggttccta 541 acgtgtccac atgtggataa ctgaacgctc gaccggtaca atggcggttt gtttttgcct 601 catccggctt tcaaactcag gggatatccg tgtcagtgga actttggcag c // LOCUS RABTCAXB 1522 bp ss-mRNA MAM 20-MAY-1987 DEFINITION Rabbit T-cell receptor active alpha-chain (VJC) mRNA, from cell line RL-5, complete cds. 
ACCESSION M12885 M12735 KEYWORDS T-cell receptor; T-cell receptor alpha-chain; constant region; joining exon; processed gene; variable region. SOURCE Rabbit T-cell line RL-5, cDNA to mRNA, clone pRTA3. ORGANISM Oryctolagus cuniculus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Lagomorpha; Leporidae. REFERENCE 1 (bases 1 to 1522) AUTHORS Marche,P.N. and Kindt,T.J. TITLE Two distinct T-cell receptor alpha-chain transcripts in a rabbit T-cell line: Implications for allelic exclusion in T cells JOURNAL Proc. Natl. Acad. Sci. U.S.A. 83, 2190-2194 (1986) STANDARD full staff_review COMMENT Computer-readable copy of the sequence in [1] was kindly provided by T.J.Kindt, 04-AUG-1986. FEATURES from to/span description pept 164 979 T-cell receptor alpha-chain precursor (VJC) sigp 164 223 T-cell receptor alpha-chain signal peptide matp 224 976 T-cell receptor alpha-chain mRNA < 1 1522 TCA mRNA recomb 505 506 V-region end/J-region start recomb 565 566 J-region end/C-region start BASE COUNT 371 a 441 c 337 g 373 t ORIGIN 51 bp upstream of PstI site. 
1 tggtttgaca ggtgcatatt tggttcactc ttcattgttt caattgctgc agggcaggga 61 ggctgtcttt ccaaggaaca aggattttcc aggagttcct acccagaatc caggttctag 121 agctggaaca tcaccctgac atctttcctc cagctcacca gccatgttct ctgcgtcttg 181 ctcagtgact gtggtggtcc tgctgataac tgtgcgacgg acgaatggag cctcagtgac 241 ccagacagag ggacccgtta tcctctccga gggctcatct ctgactctga actgcaacta 301 ccaaaccagc tattcagggt tccttttctg gtacgtccag tatctccatg aaggtcctca 361 gctcctattg caaagcacaa cagaaaacca gagaatggag catcaaggtt ttcacgccac 421 ttttgtcaag aaggacagct ccttccacct gcacaaatcc tcactgcagt tatcagactc 481 agctgtgtac tactgtgctc tgagaagggg agcatcaaac aaactcaccc ttggaacagg 541 aaccctgcta aaagtggaac tgaatatcac agaccctcag cccgcagtct accagctgag 601 aagctcggac tcaaaatcca gcaaactctc catgtgtcta ttaactgatt ttgattctca 661 aaccaatgtg tcacaacgta cagaatccga aacagtcttc acaagccaca ccgtgctgaa 721 catgaagtcc ttagattcca agagctacgg tgccttggcc tggagccaaa acagcagact 781 ttcatgtcaa gatgccttca gtaatcacac attcgtttcc tcctcagatg cgtcctgtga 841 tgccaagttg gtagagaaaa gctttgaaac agacatgaac ctaaactttc aaaacctgtc 901 agtgatcggg ctgcgcctcc tcctcctgaa ggtggctgga tttaacctgc tcatgacact 961 gcggctgtgg tccagctgag gtcagcaaga ctgagagcct cgctccctcc gcaagccgag 1021 gggggctccc caccaccatg gagaggaagg ctcctacccc ttctctgccc tcccctacct 1081 accaatgtgc tggctggatc ctaccagatc tgtgatgaag actgtggaca agcggacaaa 1141 caccgtggcc accccgcgct ccctgtcccc tttctgctgc ttctcactgc ctgaagctca 1201 cagcaagggt tggggtggcc aaagcttctc catgccttga agagactcct tcccctcccc 1261 cagagcccgg ccctactgtc cccgtagatg atggaacaac cccctcccct accccgactc 1321 ccaccatacc ctgtggactc tcttggactc tggctcctga agaatgtttg tatttttttc 1381 aatagtactc ataaagaagc atattgattt tatccaggtg gggggggggg agggcttact 1441 atctagaccc tgccgtgctg tataatctga gccacgttgt cattctgttg cctgtaacat 1501 attaaaaatg atttagaaga cc // LOCUS RATCYPOXM 2401 bp ss-mRNA ROD 04-AUG-1986 DEFINITION Rat NADPH-cytochrome P-450 oxidoreductase mRNA, complete cds. ACCESSION M10068 KEYWORDS oxidoreductase. 
SOURCE Rat (Sprague-Dawley), cDNA to mRNA, clones pOR-7 and p-OR8. ORGANISM Rattus norvegicus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Muridae; Murinae. REFERENCE 1 (bases 1 to 2401) AUTHORS Porter,T.D. and Kasper,C.B. TITLE Coding nucleotide sequence of rat NADPH-cytochrome P-450 oxidoreductase cDNA and identification of flavin-binding domains JOURNAL Proc. Natl. Acad. Sci. U.S.A. 82, 973-977 (1985) STANDARD simple staff_review COMMENT This protein is responsible for electron transfer from NADPH to the cytochromes P-450, as well as other microsomal electron acceptors. It contains 1 mol each of FAD and FMN. The FMN domain (positions 233-687; residues 77-228) is homologous with flavodoxins, while the FAD domain (positions 803-2039; residues 267-678) appears homologous with ferredoxin NADP+ reductase, NADH cytochrome b5 reductase, and glutathione reductase (unpublished data, T.D.Porter 07-OCT-1985). A polyadenylation signal is present at positions 2384-2389. This sequence and a draft entry were kindly provided on diskette by T.D.Porter (07-OCT-1985). FEATURES from to/span description pept 5 2041 NADPH:ferricytochrome oxidoreductase (EC 1.6.2.4) mRNA < 1 2401 oxred mRNA BASE COUNT 553 a 681 c 685 g 482 t ORIGIN 42 bp upstream of MstII site. 
1 caacatgggg gactctcacg aagacaccag tgccaccatg cctgaggccg tggctgaaga 61 agtgtctcta ttcagcacga cggacatggt tctgttttct ctcatcgtgg gggtcctgac 121 ctactggttc atctttagaa agaagaaaga agagataccg gagttcagca agatccaaac 181 aacggcccca cccgtcaaag agagcagctt cgtggaaaag atgaagaaaa cgggaaggaa 241 cattatcgta ttctatggct cccagacggg aaccgctgag gagtttgcca accggctgtc 301 caaggatgcc caccgctacg ggatgcgggg catgtccgca gaccctgaag agtatgactt 361 ggccgacctg agcagcctgc ctgagatcga caagtccctg gtagtcttct gcatggccac 421 atacggagag ggcgacccca cggacaatgc gcaggacttc tatgactggc tgcaggagac 481 tgacgtggac ctcactgggg tcaagtttgc tgtatttggt cttgggaaca agacctatga 541 gcacttcaat gccatgggca agtatgtgga ccagaggctg gagcagcttg gcgcccagcg 601 catctttgag ttgggccttg gtgatgatga cgggaacttg gaagaggatt tcatcacgtg 661 gagggagcag ttctggccag ctgtgtgcga gttctttggg gtagaagcca ctggggagga 721 gtcgagcatt cgccagtatg agctcgtggt ccacgaagac atggacgtag ccaaggtgta 781 cacgggtgag atgggccgtc tgaagagcta cgagaaccag aaacccccct tcgatgctaa 841 gaatccattc ctggctgctg tcaccgccaa ccggaagctg aaccaaggca ctgagcggca 901 tctaatgcac ctggagttgg acatctcaga ctccaagatc aggtatgaat ctggagatca 961 cgtggctgtg tacccagcca atgactcagc cctggtcaac cagattgggg agatcctggg 1021 agctgacctg gatgtcatca tgtctctaaa caatctcgat gaggagtcaa acaagaagca 1081 tccgttcccc tgccccacca cctaccgcac ggccctcacc tactacctgg acatcactaa 1141 cccgccacgc accaatgtgc tctacgaact ggcacagtac gcctcagagc cctcggagca 1201 ggagcacctg cacaagatgg cgtcatcctc aggcgagggc aaggagctgt acctgagctg 1261 ggtggtggaa gcccggaggc acatcctagc catcctccaa gactacccat cactgcggcc 1321 acccatcgac cacctgtgtg agctgctgcc acgcctgcag gcccgatact actccattgc 1381 ctcatcctcc aaggtccacc ccaactccgt gcacatctgt gccgtggccg tggagtacga 1441 agcgaagtct ggccgagtga acaagggggt ggccactagc tggcttcggg ccaaggaacc 1501 agcaggcgag aatggcggcc gcgccctggt acccatgttc gtgcgcaaat ctcagttccg 1561 cttgcctttc aagtccacca cacctgtcat catggtgggc cccggcactg ggattgcccc 1621 tttcatgggc ttcatccagg aacgagcttg gcttcgagag caaggcaagg aggtgggaga 1681 gacgctgcta tactatggct 
gccggcgctc ggatgaggac tatctgtacc gtgaagagct 1741 agcccgcttc cacaaggacg gtgccctcac gcagcttaat gtggcctttt cccgggagca 1801 ggcccacaag gtctatgtcc agcaccttct gaagagagac agggaacacc tgtggaagct 1861 gatccacgag ggcggtgccc acatctatgt gtgcggggat gctcgaaata tggccaaaga 1921 tgtgcaaaac acattctatg acattgtggc tgagttcggg cccatggagc acacccaggc 1981 tgtggactat gttaagaagc tgatgaccaa gggccgctac tcactagatg tgtggagcta 2041 ggagctacca ccctcccacc cctcgctccc tgtaatcacc taacttctgc cgacctccac 2101 ctctggtggt tcctgcctgg cctggacaca gggaggccca gggactgact cctcctggcc 2161 tgagtggtgc cctcctgggc ccctaggcag agcccggtcc attgtatcag gcagcccagc 2221 cccagggcac atggcaagag ggactggacc cacctttggg tgatgggtgc cttaggtcct 2281 ctgcagctgt acagaagggg ctcttctctc cacagagctg gggtgcagcc cccacacgtg 2341 attttgaatg agtgtaaata attttaaata acctggccct tggaataaag ttgttttcag 2401 t // LOCUS RATHPAB 960 bp ss-mRNA ROD 15-MAR-1985 DEFINITION Rat haptoglobin mRNA, partial alpha-, complete beta-subunit and 3' flank. ACCESSION K01933 KEYWORDS glycoprotein; haptoglobin. SOURCE Rat (Sprague-Dawley) liver, cDNA to mRNA, clone pA39. ORGANISM Rattus norvegicus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Muridae; Murinae. REFERENCE 1 (bases 1 to 960) AUTHORS Goldstein,L.A. and Heath,E.C. TITLE Nucleotide sequence of rat haptoglobin cDNA: Characterization of the alpha-beta-subunit junction region of prohaptoglobin JOURNAL J. Biol. Chem. 259, 9212-9217 (1984) STANDARD full staff_review COMMENT Rat haptoglobin's primary translation product (preprohaptoglobin) contains both the alpha- and beta-subunit regions. Preprohaptoglobin is co-translationally N-glycosylated (prohaptoglobin) and dimerized prohaptoglobin is post-translationally cleaved to produce the native tetrameric structure. A single arginine residue is found at the alpha-beta-subunit junction regions of prohaptoglobin. 
The sequence homology of these regions with serine protease precursors (75% for the alpha-chain, 90% for the beta-chain) suggests that post-translational processing of prohaptoglobin involves cleavage of the Arg-Ile bond and extraction of the Arg residue, and that both subunits contribute to haptoglobin's biological activity. FEATURES from to/span description pept < 1 821 preprohaptoglobin (aa 58 at 3) matp < 1 80 haptoglobin alpha-subunit, mature peptide matp 84 818 haptoglobin beta-subunit, mature peptide mRNA < 1 960 Hp mRNA BASE COUNT 244 a 231 c 279 g 206 t ORIGIN 136 bp upstream of Sau3A site. 1 tgaacccagc tgctggcgat aaactcccca agtgtgaggc agtgtgtggg aagcccaagc 61 atcctgtgga ccaggtacag cgcatcatcg gtggttccat ggacgccaaa ggcagctttc 121 cttggcaggc caagatgatc tccagacatg gactcaccac tggggccaca ctgatcagtg 181 accagtggct gctgaccact gcccaaaacc tcttcctgaa tcacagtgag aatgcgacag 241 ccaaggacat tgcccctacc ttaacactct atgtggggaa aaaccagctg gtggagattg 301 agaaggtagt tctccacccc gagcgctctg tggtggatat cgggctgatc aagctcaaac 361 agaaagtgct tgtcactgag aaagtcatgc ctatctgcct gccttccaaa gactacgtag 421 cgccaggccg catgggctat gtgtccggtt gggggcggaa tgtcaacttt agatttactg 481 aacgtctcaa gtatgtcatg ctgcctgtgg ctgaccagga gaagtgtgag ctgcactatg 541 agaaaagcac agtgcctgag aagaaaggcg ctgtaactcc tgttggggta cagcccatct 601 tgaataagca taccttctgt gctggcctta ccaagtatga ggaagacact tgctatggtg 661 acgctggcag tgcctttgcc gtccatgaca cggaggagga cacctggtat gcagctggga 721 tcctgagctt tgacaagagt tgtgccgtag ctgagtatgg tgtgtacgtg agggcaactg 781 atctgaagga ctgggtccag gaaacaatgg ccaagaacta gttcagggct gactagaggg 841 ctgcacacag tggggcaggg caattcaccc tggaagagga agtagaaggg ttggggacat 901 aatctgaggg ctgctagccc tgcattgctc agtcaataat aaaaaacgag ctttggaccc // LOCUS RATRGE4 147 bp ds-DNA ROD 07-NOV-1984 DEFINITION rat 18s rrna gene. ACCESSION K01590 KEYWORDS 18S ribosomal RNA; ribosomal RNA. SOURCE rat dna, subclone pb4-5.1 of clone lambda chr-b4. 
ORGANISM Rattus norvegicus Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Muridae; Murinae. REFERENCE 1 (bases 1 to 147) AUTHORS Cassidy,B.G., Subrahmanyam,C.S. and Rothblum,L.I. TITLE the nucleotide sequence of the 5' region of rat 18s rdna and adjoining spacer JOURNAL Biochem. Biophys. Res. Commun. 107, 1571-1576 (1982) STANDARD simple staff_review COMMENT [1] proposes a secondary structure model depicting possible interactions between the 5' and 3' termini of 18s rrna and the adjacent transcribed spacers of rat and xenopus preribosomal rna. the model suggests a possible configuration for the processing of pre-18s rrna. FEATURES from to/span description rRNA 86 > 147 18s rrna BASE COUNT 17 a 64 c 23 g 43 t ORIGIN 1 cccctaccct ccctccctcc ctcctctcgc tctctctctc tctctctccc gcctcccgcc 61 gcgtctcggc ttcgctcgcg ctccttacct ggttgatcct gccgatagca tatgcttgtc 121 tcaaagatta agccatgcat gtctaag // LOCUS SNDVL 6800 bp ss-RNA VRL 31-AUG-1987 DEFINITION Sendai virus (5' end of the viral genome) L gene, complete cds. ACCESSION M14887 KEYWORDS protein L. SOURCE Sendai virus (Enders strain) RNA. ORGANISM Parainfluenza virus type 1 Viridae; ss-RNA enveloped viruses; Negative strand RNA viruses; Paramyxoviridae; Paramyxovirus. REFERENCE 1 (bases 1 to 6800) AUTHORS Morgan,E.M. and Rakestraw,K.M. TITLE Sequence of the Sendai virus L gene: Open reading frames upstream of the main coding region suggest that the gene may be polycistronic JOURNAL Virology 154, 31-40 (1986) STANDARD full staff_review REFERENCE 2 (bases 1 to 6800; revises [1]) AUTHORS Morgan,E.M. JOURNAL Unpublished (1987) St Jude Children's Research Hosp, Memphis, TN STANDARD full staff_review COMMENT Computer-readable sequence for [1] kindly provided by E.M.Morgan, 11-MAY-1987. 
FEATURES from to/span description pept 569 6715 L protein mRNA 1 6800 L mRNA ORF 29 202 ORF1 cds ORF 33 404 ORF2 cds revision 6754 6756 ggt in [2]; gt in [1] BASE COUNT 2059 a 1374 c 1628 g 1739 t ORIGIN Unreported. 1 agggtgaatg ggaagcttgc cataggtcat ggatgggcag gagtcctccc aaaacccttc 61 tgacatactc tatccagaat gccaccctga actctcccat agtcaggggg aagatagcac 121 agttgcacgt cttgttagat gtgaaccagc cctacagact gaaggacgac agcataataa 181 atattacaaa gcacaaaatt aggaacggag gattgtcccc tcgtcagatt aagatcaggt 241 ctctgggtaa ggctcttcaa cgcacaataa aggatttaga ccgatacacg tttgaaccgt 301 acccaaccta ctctcaggaa ttacttaggc ttgatatacc agagatatgt gacaaaatcc 361 gatccgtctt cgcgtctcgg atcgctgacc agggagttat ctagtgggtt ccaggatctt 421 tggttgaata tcttcaagca actaggcaat atagaaggaa gagaggggta cgatccgttg 481 caggatatca gcaccatccc ggagataact gataagtaca gcaggaatag atggtatagg 541 ccattcctaa cttggttcag catcaaatat gacatgcggt ggatgcagaa gaccagacca 601 gggggacccc atcgatacct ctaattcaca taacctccta gaatgcaaat catacactct 661 agtaacatac ggagatcttg tcatgatact gaacaagttg acattgacag ggtatatcct 721 aacccctgag ctggtcttga tgtattgtga tgttgtagag ggaaggtgga atatgtctgc 781 tgcagggcat ctagataaga ggtccattgg gataacaagc aaaggtgagg aattatggga 841 actagtggat tccctcttct caagtcttgg agaggaaata tacaatgtca tcgcactatt 901 ggagccccta tcacttgctc tcatacaact aaatgatcct gttatacctc tacgtggggc 961 atttatgagg catgtgttga cagagctaca gactgtttta acaagcagag acgtgtacac 1021 agatgctgaa gcagacacta ttgtggagtc gttactcgcc attttccatg gaacctctat 1081 tgatgagaaa gcagagatct tttccttctt taggacattt ggccacccca gcttagaggc 1141 tgtcactgcc gccgacaagg taagggccca tatgtatgca caaaaggcaa taaagcttaa 1201 gaccctatac gagtgtcatg cagttttttg cactatcatc ataaatgggt atagagagag 1261 gcatggcgga cagtggcccc cctgtgactt ccctgatcac gtgtgtctag aactaaggaa 1321 cgctcaaggg tccaatacgg caatctctta tgaatgtgct gtagacaact atacaagttt 1381 cataggcttc aagtttcgga agtttataga accacaacta gatgaagatc tcacaatata 1441 tatgaaagac aaagcactat cccccaggaa ggaggcatgg gactctgtat acccggatag 1501 taatctgtac tataaagccc 
cagagtctga agagacccgg cggcttattg aagtgttcat 1561 aaatgatgag aatttcaacc cagaagaaat tatcaattat gtggagtcag gagattggtt 1621 gaaagacgag aagttcaaca tctcgtacag tctcaaagag aaagagatca agcaagaggg 1681 tcgtctattc gcaaaaatga cttataagat gcgagccgta caggtgctgg cagagacact 1741 actggctaaa ggaataggag agctgttcag cgaaaatggg atggttaaag gagagataga 1801 cctacttaaa agattgacta ctctttctgt ctcaggcgtc cccaggactg attcagtgta 1861 caataactct aaatcatcag agaagagaaa cgaaggcatg aaaaagaaga actctggggg 1921 gtactgggac gaaaagaaga ggtccaggca tgaattcaag gcaacagatt catcaacaga 1981 cggctatgaa acgttaagtt gcttcctcac aacagacctc aagaaatact gcttaaactg 2041 gagatttgag agtactgcat tgtttggtca gagatgcaac gagatatttg gcttcaagac 2101 cttctttaac tggatgcatc cgatccttga aaggtgtaca atatatgttg gagatcctta 2161 ctgtccagtc gccgaccgga tgcatcgaca actccaggat catgcagact ctggcatttt 2221 catacataat cctagggggg gcatagaagg ttactgccag aagctgtgga ccttaatctc 2281 aatcagtgca atccacctag cagctgtgag agtgggtgtc agggtctctg caatggttca 2341 gggtgacaat caagctatag ccgtgacatc aagagtacct gtagctcaga cttacaagca 2401 gaagaaaaat catgtctatg aggaaaccac caaatatttc ggtgctctaa gacacgtcat 2461 gtttgatgta gggcacgagc taaaattgaa cgagaccatc attagtagca agatgtttgt 2521 ctatagtaaa aggatatact atgatgggaa gattttacca cagtgcctga aagccttgac 2581 caggtgtgta ttctggtccg agacactggt agatgaaaac agatctgctt gttcgaacat 2641 ctcaacatcc atagcaaaag ctatcgaaaa tgggtattct cctatactag gctactgcat 2701 tgcgttgtat aagacctgtc agcaggtgtg catatcacta gggatgacta taaatccaac 2761 tatcagcccg accgtaagag atcaatactt taagggtaag aattggctga gatgtgcagt 2821 gttgattcca gcaaatgttg gaggattcaa ctacatgtct acatctagat gctttgttag 2881 aaatattgga gaccccgcag tagcagccct agctgatctc aaaagattca tcagagcgga 2941 tctgttagac aagcaggtat tatacagggt catgaatcaa gaacccggtg actctagctt 3001 tctagattgg gctccagacc cttattcgtg taacctcccg cattctcaga gtataactac 3061 gattataaag aatatcactg ctagatctgt gctgcaggaa tccccgaatc ctctactgtc 3121 tggtctcttc accgagacta gtggcgaaga ggatctcaac ctggcctcgt tccttatgga 3181 ccggaaagtc atcctgccga gagtggctca 
tgagatcctg ggtaattcct taactggagt 3241 tagggaggcg attgcaggga tgcttgatac gaccaagtct ctagtgagag ccagcgttag 3301 gaaaggagga ttatcatatg ggatattgag gaggcttgtc aattatgatc tattgcagta 3361 cgagacactg actagaactc tcaggaaacc ggtgaaagac aacatcgaat atgagtatat 3421 gtgttcagtt gagctagctg tcggtctaag gcagaaaatg tggatccacc tgacttacgg 3481 gagacccata catgggttag aaacaccaga ccctttagag ctcttgaggg gaacatttat 3541 cgaaggttca gaggtgtgca agctttgcag gtctgaagga gcagacccca tctatacatg 3601 gttctatctt cctgacaata tagacctgga cacgcttaca aacggaagtc cggctataag 3661 aatcccctat tttggatcag ccactgatga aaggtcggaa gcccaactcg ggtatgtaag 3721 aaatctaagc aaacccgcaa aggcggccat ccggatagct atggtgtata cgtgggccta 3781 cgggactgat gagatatcgt ggatggaagc cgctcttata gcccaaacaa gagctaatct 3841 gagcttagag aatctaaagc tgctgactcc tgtttcaacc cccactaatc tatctcatag 3901 gttgaaagat acggcaaccc agatgaagtt ctctagtgca acactagtcc gtgcaagtcg 3961 gttcataaca atatcaaatg ataacatggc actcaaagaa gcaggggagt cgaaggatac 4021 taatctcgtg tatcagcaga ttatgctaac tgggctaagc ttgttcgagt tcaatatgag 4081 atataagaaa ggttccttag ggaagccact gatattgcac ttacatctta ataacgggtg 4141 ctgtataatg gagtccccac aggaggcgaa tatcccccca aggtccacat tagatttaga 4201 gattacacaa gagaacaata aattgatcta tgatcctgat ccactcaagg atgtggacct 4261 tgagctattt agcaaggtca gagatgttgt acatacagtt gacatgactt attggtcaga 4321 tgatgaagtt atcagagcaa ccagcatctg tactgcaatg acgatagctg atacaatgtc 4381 tcaattagat agagacaact taaaagagat gatcgcacta gtaaatgacg atgatgtcaa 4441 cagcttgatt actgagttta tggtgattga tgttccttta ttttgctcaa cgttcggggg 4501 tattctagtc aatcagtttg catactcact ctacggctta aacatcagag gaagggaaga 4561 aatatgggga catgtagtcc ggattcttaa agatacctcc cacgcagttc taaaagtctt 4621 atctaatgct ctatcccatc ccaaaatctt caaacgattc tggaatgcag gtgtcgtgga 4681 acctgtgtat gggcctaacc tctcaaatca ggataagata ctcttggccc tctctgtctg 4741 tgaatattct gtggatctat tcatgcacga ctggcaaggg ggtgtaccgc ttgagatctt 4801 tatctgtgac aatgacccag atgtggccga catgaggagg tcctctttct tggcaagaca 4861 tcttgcatac ctatgcagct tggcagagat atctagggat 
gggccaagat tagaatcaat 4921 gaactctcta gagaggctcg agtcactaaa gagttacctg gaactcacat ttcttgatga 4981 cccggtactg aggtacagtc agttgactgg cctagtcatc aaagtattcc catctacttt 5041 gacctatatc cggaagtcat ctataaaagt gttaaggaca agaggtatag gagtccctga 5101 agtcttagaa gattgggatc ccgaggcaga taatgcactg ttagatggta tcgcggcaga 5161 aatacaacag aatattcctt tgggacatca gactagagcc cctttttggg ggttgagagt 5221 atccaagtca caggtactgc gcctccgggg gtacaaggag atcacaagag gtgagatagg 5281 cagatcaggt gttggtctga cgttaccatt cgatggaaga tacctatctc accagctgag 5341 gctctttggc atcaacagta ctagctgctt gaaagcactt gaacttacct acctattgag 5401 ccccttagtt gacaaggata aagataggct atatttaggg gaaggagctg gggccatgct 5461 ttcctgttat gacgctactc ttggcccatg catcaactat tataactcag gggtatactc 5521 ttgtgatgtc aatgggcaga gagagttaaa tatatatcct gctgaggtgg cactagtggg 5581 aaagaaatta aacaatgtta ctagtctggg tcaaagagtt aaagtgttat tcaacgggaa 5641 tcctggctcg acatggattg ggaatgatga gtgtgaggct ttgatttgga atgaattaca 5701 gaatagctcg ataggcctag tccactgtga catggaggga ggagatcata aggatgatca 5761 agttgtactg catgagcatt acagtgtaat ccggatcgcg tatctggtgg gggatcgaga 5821 tgttgtgctt ataagcaaga ttgctcccag gctgggcacg gattggacca ggcagctcag 5881 cctatatctg agatactggg acgaggttaa cctaatagtg cttaaaacat ctaaccctgc 5941 ttccacagag atgtatcttc tatcgaggca ccccaaatct gacattatag aggacagcaa 6001 gacagtgtta gctagtctcc tccctttgtc aaaagaagat agcatcaaga tagaaaagtg 6061 gatcttaata gagaaggcaa aggctcacga atgggttact cgggaattga gagaaggaag 6121 ctcttcatca gggatgctta gaccttacca tcaagcactg cagacgtttg gctttgaacc 6181 aaacttgtat aaattgagca gagatttctt gtccaccatg aacatagctg atacacacaa 6241 ctgtatgata gctttcaaca gggttttgaa ggatacaatc ttcgaatggg ctagaataac 6301 tgagtcagat aaaaggctta aactaactgg taagtatgac ctgtatcctg tgagagattc 6361 aggcaagttg aagacaattt ctagaagact tgtgctatct tggatatctt tatctatgtc 6421 cacaagattg gtaactgggt cattccctga ccagaagttt gaagcaagac ttcaattggg 6481 aatagtttca ttatcatccc gtgagatcag gaacctgagg gttatcacaa aaactttatt 6541 agacaggttt gaggatatta tacatagtat aacgtataga ttcctcacca 
aagaaataaa 6601 gattttgatg aagattttag gggcagtcaa gatgttcggg gccaggcaaa atgaatacac 6661 gaccgtgatt gatgatggat cactgggtga tatcgagcca tatgacagct cgtaataatt 6721 agtccctatc gtgcagaacg atcgaagctc cgcggtacct ggaagtcttg gacttatcca 6781 tatgacaata gtaagaaaaa // LOCUS TRN10TETR 4029 bp ds-DNA BCT 04-AUG-1986 DEFINITION Transposon Tn10 tetracycline resistance and repressor genes tetA and tetR, complete cds. ACCESSION J01830 K00615 K01493 X00694 KEYWORDS drug resistance; insertion sequence; tetracycline repressor protein; tetracycline resistance; tetracycline resistance protein; transposon; unidentified reading frame. SOURCE Tn10 (transposon Tn10) DNA, clone pRT29 [1]; clone lambda-RT301 [2]; clone pRT61 [3],[5]; clone pBT107 [4]; clone pBT402 [6]. ORGANISM Transposon Tn10 Prokaryota; Bacteria. REFERENCE 1 (bases 1458 to 1659) AUTHORS Bertrand,K.P., Postle,K., Wray,L.V.Jr. and Reznikoff,W.S. TITLE Overlapping divergent promoters control expression of Tn10 tetracycline resistance JOURNAL Gene 23, 149-156 (1983) STANDARD simple staff_review REFERENCE 2 (bases 1518 to 1619) AUTHORS Wray,L.V.Jr. and Reznikoff,W.S. TITLE Identification of repressor binding sites controlling expression of tetracycline resistance encoded by Tn10 JOURNAL J. Bacteriol. 156, 1188-1191 (1983) STANDARD simple staff_review REFERENCE 3 (bases 1539 to 3077) AUTHORS Hillen,W. and Schollmeier,K. TITLE Nucleotide sequence of the Tn10 encoded tetracycline resistance gene JOURNAL Nucleic Acids Res. 11, 525-539 (1983) STANDARD simple staff_review REFERENCE 4 (bases 1548 to 3077) AUTHORS Nguyen,T.T., Postle,K. and Bertrand,K.P. TITLE Sequence homology between the tetracycline-resistance determinants of Tn10 and pBR322 JOURNAL Gene 25, 83-92 (1983) STANDARD simple staff_review REFERENCE 5 (bases 3072 to 4029) AUTHORS Schollmeier,K. and Hillen,W. TITLE Transposon Tn10 contains two structural genes with opposite polarity between tetA and IS10R JOURNAL J. Bacteriol. 
160, 499-503 (1984) STANDARD simple staff_review REFERENCE 6 (bases 1 to 1550) AUTHORS Postle,K., Nguyen,T.T. and Bertrand,K.P. TITLE Nucleotide sequence of the repressor gene of the TN10 tetracycline resistance determinant JOURNAL Nucleic Acids Res. 12, 4849-4863 (1984) STANDARD simple staff_review COMMENT Sequence for [3] and [5] in computer readable form kindly provided by W.Hillen, 29-MAR-1985. Open reading frames in the 3' flank of the tetR gene are located at positions 585-442 ("gtg" start codon) and 815-681 [6]. The genes for tetA and tetR are transcribed in opposite directions from promoters in the regulatory region (tetA -35 and -10 regions are located at positions 1546-1551 and 1570-1575 respectively and for tetR at 1579-1574 and 1556-1551) [1]. Their ribosome binding sites are found at 1595-1599 and 1537-1533. The physical overlap between the tetA and tetR promoters may mean that RNA polymerase interaction with these promoters is mutually exclusive [1]. The expression of tetA and tetR is regulated by tetracycline and the tetR repressor [1]. If the repressor binding site at positions 1561-1569 is deleted, a higher level of derepression is noted than is the case when the second repressor binding site (1591-1599) is deleted [2]. ORFL and ORFR also are transcribed in opposite directions and have only 18 bp between their mRNA start sites [5]. -35 and -10 regions for ORFR are located at positions 3499-3504 and 3504-3572 and for ORFL at 3559-3554 and 3539-3534 respectively [5]. The start of ORFL could be the "gtg" codon at position 3387-3385 [5]. Though no function for the two ORFs is presently known, they may be tetracycline-inducible proteins. 
FEATURES from to/span description pept 1526 903 (c) tetracycline repressor protein (tetR) pept 1608 2813 tetracycline resistance protein (tetA) pept 3519 2926 (c) unidentified reading frame L (ORFL) (putative) pept 3607 4023 unidentified reading frame R (ORFR) mRNA 1581 > 2813 tetA mRNA mRNA 3532 > 4023 ORFR mRNA (major alt.) [3],[5] mRNA 3568 > 4023 ORFR mRNA (minor alt.) [5] mRNA 1544 < 903 (c) tetR repressor mRNA mRNA 3549 < 2926 (c) ORFL mRNA [5] binding 1561 1569 repressor binding site A binding 1591 1599 repressor binding site B conflict 2449 2449 a in [3]; g in [4] conflict 2509 2509 a in [3]; t in [4] conflict 2595 2595 g in [3]; c in [4] conflict 2667 2667 a in [3]; g in [4] conflict 3065 3066 ct in [3]; tc in [4] BASE COUNT 1121 a 738 c 844 g 1326 t ORIGIN 3 bp upstream of HincII site. 1 ctaaaatagc gctagagcaa gctggactgt tagtacttgc gatccggaca ccattaattc 61 gtaagttaaa acaggaaaaa ccgggggaac ttggcgaaat agcacgagta ttggcggaga 121 ataacattaa tattttagtg caatacagtg accatgctaa ccaactgata ttaataacgg 181 acaatgatag tatggctgca tctgttacgc tcccttgggc aataaagtga acttgcgatg 241 gctaatttaa tacgaaaaga ggttaccttt gagtcctcaa tagccgcgat agggggctca 301 tgtctgacat ttcacgagtt aaaatactca gtgctttgat ggatgggcga gcttggacgg 361 ccactgagct aagttctgtg gcgaatatat cagcttcaac ggcgagcagt catttatcta 421 aattattaga ttgccagcta atcacagtag tagctcaagg caagcatcgt tattttcggc 481 tagcaggaaa agatattgct gaattgatgg aaagtatgat ggggatctcc ttaaaccatg 541 gcgtacatgc caaagtttcc acgccagtgc atttacgaaa agcacgtact tgctatatga 601 tcatttagct ggcgaagttg ccgttaagat ctatgattcc ctttgtcaac agcaatggat 661 cactgaaaat ggttcaatga tcacattaag tggtattcaa tattttcatg aaatgggaat 721 tgacgttcct tccaaacatt cacgtaaaat ctgttgtgcg tgtttagatt ggagtgaacg 781 ccgtttccat ttaggtgggt acgttggagc cgcattattt tcgctttatg aatctaaagg 841 gtggttaact cgacatcttg gttaccgtga agttaccatc acggaaaaag gttatgctgc 901 ttttaagacc cactttcaca tttaagttgt ttttctaatc cgcatatgat caattcaagg 961 ccgaataaga aggctggctc tgcaccttgg tgatcaaata attcgatagc ttgtcgtaat 1021 
aatggcggca tactatcagt agtaggtgtt tccctttctt ctttagcgac ttgatgctct 1081 tgatcttcca atacgcaacc taaagtaaaa tgccccacag cgctgagtgc atataatgca 1141 ttctctagtg aaaaaccttg ttggcataaa aaggctaatt gattttcgag agtttcatac 1201 tgtttttctg taggccgtgt acctaaatgt acttttgctc catcgcgatg acttagtaaa 1261 gcacatctaa aacttttagc gttattacgt aaaaaatctt gccagctttc cccttctaaa 1321 gggcaaaagt gagtatggtg cctatctaac atctcaatgg ctaaggcgtc gagcaaagcc 1381 cgcttatttt ttacatgcca atacaatgta ggctgctcta cacctagctt ctgggcgagt 1441 ttacgggttg ttaaaccttc gattccgacc tcattaagca gctctaatgc gctgttaatc 1501 actttacttt tatctaatct agacatcatt aattcctaat ttttgttgac actctatcat 1561 tgatagagtt attttaccac tccctatcag tgatagagaa aagtgaaatg aatagttcga 1621 caaagatcgc attggtaatt acgttactcg atgccatggg gattggcctt atcatgccag 1681 tcttgccaac gttattacgt gaatttattg cttcggaaga tatcgctaac cactttggcg 1741 tattgcttgc actttatgcg ttaatgcagg ttatctttgc tccttggctt ggaaaaatgt 1801 ctgaccgatt tggtcggcgc ccagtgctgt tgttgtcatt aataggcgca tcgctggatt 1861 acttattgct ggctttttca agtgcgcttt ggatgctgta tttaggccgt ttgctttcag 1921 ggatcacagg agctactggg gctgtcgcgg catcggtcat tgccgatacc acctcagctt 1981 ctcaacgcgt gaagtggttc ggttggttag gggcaagttt tgggcttggt ttaatagcgg 2041 ggcctattat tggtggtttt gcaggagaga tttcaccgca tagtcccttt tttatcgctg 2101 cgttgctaaa tattgtcact ttccttgtgg ttatgttttg gttccgtgaa accaaaaata 2161 cacgtgataa tacagatacc gaagtagggg ttgagacgca atcgaattcg gtatacatca 2221 ctttatttaa aacgatgccc attttgttga ttatttattt ttcagcgcaa ttgataggcc 2281 aaattcccgc aacggtgtgg gtgctattta ccgaaaatcg ttttggatgg aatagcatga 2341 tggttggctt ttcattagcg ggtcttggtc ttttacactc agtattccaa gcctttgtgg 2401 caggaagaat agccactaaa tggggcgaaa aaacggcagt actgctcgaa tttattgcag 2461 atagtagtgc atttgccttt ttagcgttta tatctgaagg ttggttagat ttccctgttt 2521 taattttatt ggctggtggt gggatcgctt tacctgcatt acagggagtg atgtctatcc 2581 aaacaaagag tcatgagcaa ggtgctttac agggattatt ggtgagcctt accaatgcaa 2641 ccggtgttat tggcccatta ctgtttactg ttatttataa tcattcacta ccaatttggg 2701 atggctggat 
ttggattatt ggtttagcgt tttactgtat tattatcctg ctatcgatga 2761 ccttcatgtt aacccctcaa gctcagggga gtaaacagga gacaagtgct tagttatttc 2821 gtcaccaaat gatgttattc cgcgaaatat aatgaccctc ttgataaccc aagagggcat 2881 tttttacgat aaagaagatt tagcttcaaa taaaacctat ctattttatt tatctttcaa 2941 gctcaataaa aagccgcggt aaatagcaat aaattggcct tttttatcgg caagctcttt 3001 taggtttttc gcatgtattg cgatatgcat aaaccagcca ttgagtaagt ttttaagcac 3061 atcactatca taagctttaa gttggttctc ttggatcaat ttgctgacaa tggcgtttac 3121 cttaccagta atgtattcaa ggctaatttt ttcaagttca ttccaaccaa tgataggcat 3181 cacttcttgg atagggataa ggtttttatt attatcaata atataatcaa gataatgttc 3241 aaatatactt tctaaggcag accaaccatt tgttaaatca gtttttgttg tgatgtaggc 3301 atcaatcata attaattgct gcttataaca ggcactgagt aattgttttt tatttttaaa 3361 gtgatgataa aaggcacctt tggtcaccaa cgcttttccc gagatctcat ctattgaaac 3421 agcttgatag cctttttcaa caaacaatat tcgtgctgag ttaaccagtg attgataggt 3481 actcttaaaa ttttcttgtt gatgattttt attttccatg atagatttaa aataacatac 3541 cgtcagtatg tttatggtat catgatgatg tggtcgtgac aatcttaaga acatttaggt 3601 tattttatgt atattgaaca gcattctcgc tatcaaaata aagctaataa catccaatta 3661 gaatatgatg atagacagtt tcatacaacg gttatcaaag atgttctatt atggattgaa 3721 cataatttag atcagtcttt actgcttgat gatgtggcga ataaagcggg ttataccaag 3781 tggtattttc agcggctgtt caaaaaagta acaggggtca cactggctag ctatattcgt 3841 gctcgtcgtt tgacgaaagc ggctgttgag ttgaggttga cgaaaaaaac tatccttgag 3901 atcgcattaa aatatcaatt tgattcccaa caatctttta cacgtcgatt taagtacatt 3961 tttaaggtta caccaagtta ttatcggcgt aataaattat gggaattgga ggcaatgcac 4021 tgagagatc // LOCUS VSVNS 856 bp ss-RNA VRL 04-AUG-1986 DEFINITION Vesicular stomatitis virus (New Jersey) NS protein mRNA. ACCESSION K03387 KEYWORDS NS gene; NS protein; nonstructural protein. SOURCE Vesicular stomatitis virus (strain New Jersey), cDNA to NS mRNA. ORGANISM Vesicular stomatitis virus Viridae; ss-RNA enveloped viruses; Negative strand RNA viruses; Rhabdoviridae; Vesiculovirus. REFERENCE 1 (bases 1 to 856) AUTHORS Gill,D.S. 
and Banerjee,A.K. TITLE Vesicular stomatitis virus NS proteins: Structural similarity without extensive sequence homology JOURNAL J. Virol. 55, 60-66 (1985) STANDARD simple staff_review COMMENT A printed copy of this sequence was kindly provided by D.S.Gill (12-NOV-1985). FEATURES from to/span description pept 11 835 Non-structural protein mRNA 1 856 NS mRNA [1] BASE COUNT 283 a 162 c 185 g 226 t ORIGIN 132 bp upstream of TaqI site. 1 aacagatatc atggacagtg ttgataggct caagacttac ttagccactt atgataattt 61 ggattctgcc ttgcaggatg ccaatgaatc tgaggaaaga cgagaggata aatatctcca 121 agacctcttc atcgaagatc aaggagataa accaactccg tcatattatc aggaagaaga 181 atcgtcagat tcagatactg attataatgc tgaacatctt acgatgctgt caccggatga 241 aagaatagac aagtgggaag aagatttgcc tgaattagaa aagattgatg atgatatacc 301 ggtgaccttt tctgattgga cacagcctgt aatgaaggaa aatgggggag agaaatcatt 361 gtctctgttc cctccagtcg ggttaacaaa gattcaaaca gaacaatgga aaaaaaccat 421 tgaggcggtt tgtgagagtt caaaatattg gaatttatca gaatgccaaa ttcttaactt 481 ggaagacagc ctcactctca aaggccgatt gatgactcct gattgtagtt cttcagtaaa 541 atctcaaaat tctgtccgga ggtcagaacc tctctactcc tctcattctc caggtccccc 601 actcaaggta tcagagtcca tcaatttatg ggatttaaag tccactgaag tacaattgat 661 ctccaagaga gccggagtta aggacatgac agtcaaattg acagacttct ttggaagtga 721 ggaagagtat tattcagtat gcccagaagg ggcgccagac ttgatgggag ctatcatcat 781 gggactgaag tacaagaaac tcttcaatca ggcaagaatg aaatatcgtc tttaattcct 841 tttcatgatc aatatg // LOCUS XELPYLA 411 bp ss-mRNA VRT 20-MAY-1987 DEFINITION X.laevis PYLa precursor mRNA, complete cds. ACCESSION M12498 KEYWORDS PYLa. SOURCE X.laevis skin, cDNA to mRNA, clone pUF81 [1]. ORGANISM Xenopus laevis Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Amphibia; Lissamphibia; Anura; Archeobatrachia; Pipoidea; Pipidae. REFERENCE 1 (bases 109 to 411) AUTHORS Hoffmann,W., Richter,K. and Kreil,G. TITLE A novel peptide designated PYLa and its precursor as predicted from cloned mRNA of Xenopus laevis skin JOURNAL EMBO J. 
2, 711-714 (1983) STANDARD full staff_review REFERENCE 2 (bases 1 to 411) AUTHORS Hoffmann,W., Richter,K. and Kreil,G. TITLE ; JOURNAL (in) Habener,J.F. (Ed.); Genes encoding hormones and regulatory peptides: Ch. 16; Humana Press (1986) In press STANDARD full staff_review COMMENT Draft entry and computer-readable sequence for [1] were kindly provided by G.Kreil, 05-JUN-1986. FEATURES from to/span description pept 65 259 PYLa precursor sigp 65 124 PYLa signal peptide matp 125 256 PYLa mRNA < 1 411 pyla mRNA BASE COUNT 134 a 80 c 88 g 109 t ORIGIN Unreported. 1 agcacaacaa ttgtacggag cactttgcta cttctagttt tgaagagcta ttacatttgg 61 aaggatgtac aaacaaattt tcctctgtct gatcattgca gcactctgtg caaccataat 121 ggcagaggct tcagcattag cagatgcaga tgatgacgat gacaagcgtt acgtccgagg 181 aatggcatct aaagctggag caattgcggg aaaaattgct aaagttgctc taaaggctct 241 tggacgtcgt gactcgtagg acttcagcgg tctcaatgga atgtaaatga caacacttgg 301 ctggacagat tttgaacatt gatgtattga agaataacat aaaccctcca caatccctca 361 aacctataaa taaatctgtt ggacaggaaa taaaatgacc tatatgcata t // LOCUS YSCCOX5AA 1641 bp ds-DNA PLN 15-SEP-1989 DEFINITION Yeast (S.cerevisiae) cytochrome oxidase subunit 5 (COX5) gene, complete cds. ACCESSION M23780 KEYWORDS cytochrome oxidase. SOURCE Yeast (S.cerevisiae) DNA. ORGANISM Saccharomyces cerevisiae Eukaryota; Plantae; Thallobionta; Eumycota; Hemiascomycetes; Endomycetales; Saccharomycetaceae. REFERENCE 1 (bases 1 to 1641; missing data project) AUTHORS Seraphin,B., Simon,M. and Faye,G. TITLE Primary structure of a gene for subunit V of the cytochrome c oxidase from Saccharomyces cerevisiae JOURNAL Curr. Genet. 9, 435-439 (1985) STANDARD simple staff_entry FEATURES from to/span description pept 544 1005 cytochrome oxidase subunit 5 precursor sigp 544 600 cytochrome oxidase subunit 5 signal peptide matp 601 1002 cytochrome oxidase subunit 5 ORF 3 230 putative protein mRNA 508 > 1002 COX5 mRNA (minor alt.) mRNA 509 > 1002 COX5 mRNA (minor alt.) mRNA 510 > 1002 COX5 mRNA (minor alt.) 
mRNA 513 > 1002 COX5 mRNA (major alt.) mRNA 514 > 1002 COX5 mRNA (major alt.) mRNA 515 > 1002 COX5 mRNA (major alt.) mRNA 516 > 1002 COX5 mRNA (minor alt.) mRNA 517 > 1002 COX5 mRNA (minor alt.) mRNA 524 > 1002 COX5 mRNA (minor alt.) mRNA 525 > 1002 COX5 mRNA (minor alt.) mRNA 526 > 1002 COX5 mRNA (minor alt.) mRNA 527 > 1002 COX5 mRNA (minor alt.) mRNA 528 > 1002 COX5 mRNA (minor alt.) mRNA 529 > 1002 COX5 mRNA (minor alt.) BASE COUNT 520 a 326 c 327 g 468 t ORIGIN 1 bp upstream of NcoI site. 1 ccatggtaac gaatctatca tcgtcgccga atgacagttc tgtcaattct tcggaagtaa 61 cgccaagaac tcctgctacg ttgactggag caaggaccgc actggccaca gaacgcgggg 121 aagatgatga gcactgtaaa agtttgtctc aacccgcaga ttcactggaa gcttctgtgg 181 acaacgaatc aatatctact gccccggaac agatgatgtt tcttccttaa gggacagcat 241 ctcccaaaag aatctttatg actatcagat ctgcaggcag accgcctccc tacgcttcta 301 aatagaatgt acttagcact tctgcctaca tagtaattat tcatcattgt aagtgtcaca 361 ttgcaaaaat cgtccaatag acgttttcct cgtttgtgat tggccctacg ttttgcgccg 421 atacaagcgg cggcgaaaag ttaaaaagaa caatcaatta aataagaacc atcagtcctc 481 gtattttgtc ttctgtttgc aagcaatgca ccaaacacaa gatcaactaa gaacgcatct 541 acaatgttac gtaacacttt tactagagct ggtggactat cacgtattac atccgtaaga 601 ttcgctcaaa cacatgctct ttccaacgct gctgtaatgg atctgcaatc cagatgggag 661 aacatgccct ccactgagca gcaggatatt gtcagtaagt tgagtgaacg tcaaaaatta 721 ccatgggcac agcttactga gcctgaaaag caagctgtgt ggtacatttc ttacggagaa 781 tggggcccaa gaagacctgt attgaataag ggtgattcca gttttattgc caaaggtgtt 841 gctgcaggcc tactattttc agtgggactt tttgctgtcg tcaggatggc gggtggccaa 901 gacgcaaaga ccatgaataa ggagtggcag ctaaagagtg acgaatattt gaagtcgaag 961 aatgctaatc cttggggtgg ttattctcag gtccaatcta aatgaacagg ccagaaattt 1021 tgaaaggaaa tcaatagggg ttaacgattg tcatggtttt ttcagctagt ctgtgacctg 1081 tacgaaagtg aatatcttat tacattataa gtgtatccat gggcatccgc ccaatacaat 1141 gccaacatat caaacataaa attggctgat tgccactctc acattttttc tttatttatt 1201 tactcaaatt ttgtaatttt ttgttagaca tataatttta tatcattatt cttattattc 1261 ttatatttaa gggaaccccc 
ctggaatgaa ataagctaat atagtatgag ggaatcctag 1321 tataaatggg tttaccttgt tattctatgc ctccctttca aagacgtatt tcttaaaaac 1381 ttctccattt ggttgaatac tatgaaaaaa aaaaaatcaa ctagaataga tatggaagat 1441 aaaaatgtgt aacaaaaaag aagaaaagag ctggaggtat gacaatagcg ccaatggcaa 1501 acgatttaga agattttgag tctctgctgg agcctgattt tgatgctaaa caatttggta 1561 atgacttact gaaggctact aataacaatg acacaaccat tttagacctt aacacgcctc 1621 ttaaaaagct aaactatgat c // LOCUS YSCRPO21 6224 bp ds-DNA PLN 22-SEP-1986 DEFINITION Yeast (S.cerevisiae) RPO21 gene encoding RNA polymerase II large subunit, complete. ACCESSION M11190 KEYWORDS RNA polymerase; polymerase. SOURCE S.cerevisiae DNA. ORGANISM Saccharomyces cerevisiae Eukaryota; Plantae; Thallobionta; Eumycota; Hemiascomycetes; Endomycetales; Saccharomycetaceae. REFERENCE 1 (bases 1 to 6224) AUTHORS Allison,L.A., Moyle,M., Shales,M. and Ingles,C.J. TITLE Extensive homology among the largest subunits of eukaryotic and prokaryotic RNA polymerases JOURNAL Cell 42, 599-610 (1985) STANDARD full staff_review REFERENCE 2 (bases 1 to 6224; revises [1]) AUTHORS Ingles,C.J. JOURNAL Unpublished (1986) U. of Toronto, Ontario, Canada STANDARD full staff_review COMMENT A draft entry and computer-readable sequence from [1] were kindly provided by C.J.Ingles 04-FEB-1986. FEATURES from to/span description pept 313 5493 RNA polymerase II large subunit revision 6219 6220 ag in [2]; atg in [1] revision 6224 6224 t in [2]; a in [1] BASE COUNT 1903 a 1230 c 1214 g 1877 t ORIGIN 1 bp upstream of EcoRI site. 
1 gaattccctg atcaactttc aaggaaaaac taaaactact gtattataag aggttttttc 61 acttccagat taattttgaa atacgatatc ctcaagttta tctaccagaa tatttgacta 121 agaaatcaaa ctctgttaat aataatataa ttataaaaac ctcaactaga aactccaaaa 181 aaaaaaattt accatttttt actttctatc cttgttaacc aaatttcaaa aaaattttac 241 cttttctttt tccagaagag ggaccaatca taaagatagt aataacactt taccccaaaa 301 tataaatcag acatggtagg acaacagtat tctagtgctc cactccgtac agtaaaagag 361 gtccaattcg gtcttttctc acctgaagaa gttagagcaa tcagtgtggc caaaattagg 421 tttccagaga caatggatga aacccagacg agagcgaaaa ttggtggtct aaacgaccct 481 aggttaggct ctattgatcg taatctgaag tgtcaaactt gtcaagaggg tatgaacgaa 541 tgtcctggtc attttggtca catagattta gcaaaacctg tatttcatgt tggttttatt 601 gccaaaatta agaaagtatg tgagtgtgtc tgtatgcact gtggtaagct attactggat 661 gaacataatg aattaatgag acaagctcta gcaatcaaag acagtaaaaa aaggtttgct 721 gcaatttgga ctttatgtaa aacaaaaatg gtctgcgaaa cagatgtccc ttctgaagat 781 gaccctactc agctcgtatc aaggggaggt tgtggtaata cacagcctac aattcgtaag 841 gatgggttga aattagttgg tagttggaaa aaagatagag ccacggggga tgcggatgaa 901 ccagaactaa gagttttaag tacggaggaa atcttgaata tttttaagca tatctcagta 961 aaagacttca ctagtttggg tttcaacgaa gttttttctc gtccagaatg gatgatttta 1021 acatgccttc ctgtcccacc accaccggtg cgtccatcca tttccttcaa tgaatctcaa 1081 agaggtgagg atgatttaac ctttaaactt gctgatattt taaaagctaa tattagtttg 1141 gaaacactag agcataacgg tgctccacat catgctattg aagaagcaga gagtttatta 1201 caatttcatg ttgccactta tatggataat gatattgctg gtcaaccaca agctcttcaa 1261 aagtccggcc gtcccgttaa atctattcgt gctcgtttga agggtaaaga gggtcgtatc 1321 agaggtaatt taatgggtaa gcgtgtggat ttttcggcaa gaactgttat ttctggtgat 1381 cctaatttgg aattagacca agtcggtgtt ccaaaatcta ttgccaagac tttaacatac 1441 ccagaagtgg tcacaccata taacatagat cgtctgacgc aacttgttag gaatggacca 1501 aatgaacacc ccggtgccaa atacgtcatt cgtgatagcg gagaccgtat agatttaaga 1561 tacagtaaaa gggcaggtga tattcaatta cagtatgggt ggaaagttga acgtcatatt 1621 atggacaatg atccagtttt attcaaccgt caaccttcgt tgcacaaaat gtccatgatg 1681 gcccacagag taaaagttat 
tccatattct acatttagat tgaatttgtc cgttacatct 1741 ccatacaatg ccgatttcga cggtgacgaa atgaatcttc acgttcctca gtctgaggaa 1801 acaagggcgg aactttctca attatgtgct gttcctctac aaattgtttc accacaatct 1861 aacaaacctt gtatgggtat tgttcaagat actttgtgtg gtattcgtaa actgacatta 1921 agagatacat ttatagaact tgatcaagtt ttgaatatgc tttattgggt tccagattgg 1981 gatggtgtta ttccgacacc tgcaattatc aagcccaaac ctttgtggtc cggtaaacaa 2041 atcttgtctg tggctatccc aaacggtatt catttacaac gttttgatga gggcactact 2101 ctgctttctc caaaggataa tggtatgctt attattgacg gtcaaatcat ttttggtgta 2161 gtagagaaaa aaaccgttgg ttcctccaat ggtggtttaa ttcatgttgt tacaagagaa 2221 aagggacctc aagtttgtgc taagttgttt ggtaacatac agaaagttgt taacttttgg 2281 ttactacata atgggttttc aacaggtatt ggtgatacca ttgcggacgg cccaacaatg 2341 agggaaatta cagagacaat tgcagaggct aaaaagaaag ttttggatgt tacgaaagaa 2401 gcccaggcaa acttattgac tgctaaacat ggtatgactc tccgtgagtc ttttgaggat 2461 aacgttgttc ggttcctaaa tgaagcaaga gataaggcag gtcgtttagc tgaagtcaat 2521 ttgaaagatt tgaacaatgt gaaacaaatg gttatggcag gttccaaggg ttcatttatt 2581 aatatcgcgc aaatgtcagc ttgtgtagga cagcaatctg ttgaaggtaa acgtattgct 2641 tttgggttcg ttgatcgtac cttacctcat ttctctaaag atgattactc cccagagtct 2701 aaaggttttg ttgagaactc atatttgaga ggtttgaccc cacaagaatt ttttttccat 2761 gcaatgggtg gtcgtgaagg tcttatcgat accgccgtca aaacagccga aacaggttat 2821 attcaacgtc gtttagtgaa agctctagaa gatatcatgg ttcattacga taacaccaca 2881 agaaactcat tgggtaacgt tattcagttt atttatggtg aagatggtat ggatgctgcg 2941 catattgaaa agcaatcgct agatactatt ggtggctccg atgcagcttt tgaaaagaga 3001 tacagagttg atttattgaa tacagaccat acccttgatc cctcactatt ggaatccgga 3061 tctgagatac ttggcgattt gaaacttcaa gttctcctgg atgaagaata caaacaatta 3121 gtgaaagatc gtaaattttt gagggaagtt tttgttgatg gtgaagcaaa ctggccatta 3181 ccagtcaaca taagacgtat tattcaaaat gctcaacaaa ctttccacat agatcatacg 3241 aaaccatctg atttaacaat caaagacatc gttcttggtg taaaggattt gcaagaaaac 3301 ttattagtgt tgcgtggtaa gaatgaaatt atacaaaatg cccagcgaga tgcagttaca 3361 ttgttctgct gtttattacg ttcccgtttg 
gccacacgta gagttctgca agagtacaga 3421 ctaacaaaac aggcattcga ttgggtatta agtaatatcg aggcacaatt cctccgttct 3481 gttgttcacc ctggtgaaat ggttggtgtt ctagcagccc aatccattgg tgaaccagcc 3541 acacaaatga cccttaacac cttccatttt gctggtgttg cttccaaaaa agttacttct 3601 ggtgtccccc gtttaaagga aattttgaat gtggccaaaa acatgaaaac gccttccttg 3661 actgtatact tagagcctgg tcatgctgcc gatcaagaac aagcgaagtt gatcagatct 3721 gctatcgagc ataccacttt aaagagtgtc actattgctt cagaaattta ctatgatcct 3781 gatccacgtt ccacagttat tccagaagat gaagaaatta tccaacttca tttctcatta 3841 ttggatgaag aagctgaaca atcttttgac caacaatcac cttggttatt acgtctggaa 3901 ctggatcgtg cagcaatgaa tgataaagac ttaacaatgg gtcaggttgg tgaaagaatc 3961 aagcaaacat tcaaaaatga tttgtttgtt atctggtctg aagacaacga tgagaagttg 4021 atcatccgtt gtcgtgttgt tcgtccaaag tcactagatg ctgagactga agcagaagaa 4081 gatcatatgt tgaagaaaat tgagaacaca atgttagaga atattacatt acgtggtgta 4141 gagaacatcg agcgtgttgt catgatgaaa tatgaccgta aagtaccaag tccaactggt 4201 gaatacgtta aggaacctga atgggtgttg gaaacagatg gtgttaactt atctgaagtt 4261 atgactgttc ctggtatcga cccaaccaga atctatacca actccttcat tgatataatg 4321 gaagttctag gtattgaagc tggtcgtgca gccttgtata aagaagttta caatgttatt 4381 gcttctgatg gttcgtatgt taactaccgt catatggctt tgttagtcga tgttatgaca 4441 acccaaggtg gcttaacttc tgttactcgt catggtttca acagatcaaa tacaggtgcc 4501 ttaatgagat gttcatttga agaaactgtc gaaattttgt ttgaagctgg tgcttcagcc 4561 gaattagatg attgtcgtgg tgtttcggaa aatgtcattc ttggtcaaat ggctccaatt 4621 ggtaccggtg catttgatgt gatgatcgat gaggagtcac tggtaaaata catgccagaa 4681 caaaaaataa ctgagattga agacggacaa gatggtggcg tcacaccata cagtaacgaa 4741 agtggtttgg tcaatgcaga tcttgacgtt aaagatgagc taatgttttc acctctggtt 4801 gattcgggtt caaatgacgc tatggctgga ggatttacag cgtacggtgg tgttgattat 4861 ggtgaagcca cgtctccatt tgctgcttat ggtgaagcac ctacatctcc cggatttgga 4921 gtctcctcac caggcttttc tccaacttcc ccaacatact ctcctacctc tccagcgtac 4981 tcaccaacat caccatcgta ctcgccaaca tcaccatcgt attcaccaac gtcaccatca 5041 tattcgccaa cgtcaccatc atattcgcca acgtcgccat 
cgtattctcc aacgtcacca 5101 tcgtattcgc caatgtcgcc ttcctactct cccacgtcgc caagctacag ccctacgtcg 5161 ccaagctaca gccctacgtc tccttcttat tctcctacat ctccatcata ctctcctacg 5221 tcaccaagtt acagcccaac gtcaccaagt tacagcccaa cgtctccagc ctattcccca 5281 acatcaccaa gttatagtcc tacatcgcct tcatactctc caacgtcacc atcctattcc 5341 ccaacatcac cttcttactc tcccacctct ccaaactata gccctacttc accttcttac 5401 tccccaacat ctccaggcta cagcccagga tctcctgcat attctccaaa gcaagacgaa 5461 caaaagcata atgaaaatga aaattccaga tgatatagta tatcatcctt acgtatttga 5521 cgttattaca ttatatatag tttctcaaat aatatttcta gtttattttt gtatcataat 5581 aaaaacgtat accaaatata ccattatttt tcataacatt atggtaggga tagggaatca 5641 agtaactaat ttatatccgc agagcattgg gaaaaccaac ggcgctagta aatgcattta 5701 aattacgtcc gtccaacttc taagcttcaa tggtagactc ttaactctga cctttttagc 5761 aattaagctc ttgaagatat caaaagtgtt accgtccggc tgtaaattat aaacgtttcc 5821 tgtaaattga gtggaatacc gcttaccatt cttttgcaat cagtaaaccg tagtcttccg 5881 tgataccagt aatcatggct tgcgtatttc cgtgatctgg taatgttact atttggttac 5941 tatgtaacac aactcataat aacttggcaa tatttccgca gctccgtagt taataaactg 6001 ttttaatatg acctcaaggt tattcatata gagtgcctgc agtttttctg cctttattgc 6061 tggcaataaa tcaaggtgta attgttggcg ttcttcattc aggatatcaa tccaagtttg 6121 taatgaagtt gtaggaccat cactagtcaa atttatacca cagccaagta gcaaacaata 6181 tttattgttt atgaagtggg tattaactaa taaaccagag atct //