Path: utzoo!attcan!uunet!snorkelwacker!apple!bionet!life!mia From: mia@life (Mia McLeod) Newsgroups: bionet.molbio.genbank.updates Subject: (none) Message-ID: <9003222314.AA09841@life.LANL.GOV> Date: 22 Mar 90 23:14:47 GMT Sender: daemon@genbank.BIO.NET Lines: 2193 Approved: lear@genbank.bio.net LOCUS ADBCG 35937 bp ds-DNA VRL 15-JUN-1989 DEFINITION Adenovirus type 2, complete genome. ACCESSION J01917 V00007 V00010 V00018 V00019 V00020 J01949 J01953 J01918 V00011 J01919 J01920 J01921 V00012 J01922 V00013 J01923 V00014 J01924 J01925 J01926 J01927 J01928 J01929 V00009 J01930 V00016 J01931 J01932 J01933 J01934 J01935 J01936 J01937 J01938 J01939 J01940 J01941 J01942 J01943 J01944 J01945 J01946 V00024 J01947 J01948 V00015 J01950 J01951 K00086 J01952 K00394 K00395 J01954 V00023 J01955 J01956 V00017 J01957 V00008 K02367 M13004 KEYWORDS DNA binding protein; DNA polymerase; RNA polymerase III; alternate splicing; coat protein; complete genome; genome-linked protein; glycoprotein; overlapping genes; polymerase; terminal repeat; unidentified reading frame; viral associated RNA. SOURCE Adenovirus type 2 DNA, cDNA, RNA and mRNA (when the material is not simply DNA, such is indicated on the reference line). ORGANISM Mastadenovirus 2 Viridae; ds-DNA nonenveloped viruses; Adenoviridae. REFERENCE 1 (bases 10610 to 10766; RNA) AUTHORS Ohe,K. and Weissman,S.M. TITLE The nucleotide sequence of a low molecular weight ribonucleic acid from cells infected with adenovirus 2 JOURNAL J. Biol. Chem. 246, 6991-7009 (1971) STANDARD full staff_review REFERENCE 2 (sites; cds start for the hexon protein) AUTHORS Joernvall,H., Ohlsson,H. and Philipson,L. TITLE An acetylated N-terminus of adenovirus type 2 hexon protein JOURNAL Biochem. Biophys. Res. Commun. 56, 304-310 (1974) STANDARD full staff_review REFERENCE 3 (bases 10681 to 10813) AUTHORS Celma,M.L., Pan,J. and Weissman,S.M. TITLE Studies of low molecular weight RNA from cells infected with adenovirus 2: I. 
The sequences at the 3' end of VA-RNA I JOURNAL J. Biol. Chem. 252, 9032-9042 (1977) STANDARD full staff_review REFERENCE 4 (sites; 5' terminus of VA I RNA) AUTHORS Celma,M.L., Pan,J. and Weissman,S.M. TITLE Studies of low molecular weight RNA from cells infected with adenovirus 2: II. Heterogeneity at the 5' end of VA-RNA I JOURNAL J. Biol. Chem. 252, 9043-9046 (1977) STANDARD full staff_review REFERENCE 5 (bases 10514 to 10680) AUTHORS Pan,J., Celma,M.L. and Weissman,S.M. TITLE Studies of low molecular weight RNA from cells infected with adenovirus 2: III.The sequence of the promoter for VA-RNA I JOURNAL J. Biol. Chem. 252, 9047-9054 (1977) STANDARD full staff_review REFERENCE 6 (bases 18778 to 18918) AUTHORS Akusjaervi,G. and Pettersson,U. TITLE Nucleotide sequence at the junction between the coding region of the adenovirus 2 hexon messenger RNA and its leader sequence JOURNAL Proc. Natl. Acad. Sci. U.S.A. 75, 5822-5826 (1978) STANDARD full staff_review REFERENCE 7 (bases 5986 to 6236; mRNA and DNA) AUTHORS Ziff,E.B. and Evans,R.M. TITLE Coincidence of the promoter and capped 5' terminus of RNA from the adenovirus 2 major late transcription unit JOURNAL Cell 15, 1463-1475 (1978) STANDARD full staff_review REFERENCE 8 (bases 21607 to 21816) AUTHORS Akusjaervi,G. and Pettersson,U. TITLE Sequence analysis of adenovirus DNA: I. Nucleotide sequence at the carboxy-terminal end of the gene for adenovirus type 2 hexon JOURNAL Virology 91, 477-480 (1978) STANDARD full staff_review REFERENCE 9 (bases 25634 to 27376) AUTHORS Galibert,F., Herisse,J. and Courtois,G. TITLE Nucleotide sequence of the EcoRI-F fragment of adenovirus 2 genome JOURNAL Gene 6, 1-22 (1979) STANDARD full staff_review REFERENCE 10 (bases 6039 to 6079; 7101 to 7172; 9634 to 9723; 18802 to 18861; cDNA to hexon mRNA) AUTHORS Akusjaervi,G. and Pettersson,U. 
TITLE Sequence analysis of adenovirus DNA: complete nucleotide sequence of the spliced 5' noncoding region of adenovirus 2 hexon messenger RNA JOURNAL Cell 16, 841-850 (1979) STANDARD full staff_review REFERENCE 11 (bases 6039 to 31095; several fragments over this span; cDNA and DNA) AUTHORS Zain,S., Sambrook,J., Roberts,R.J., Keller,W., Fried,M. and Dunn,A.R. TITLE Nucleotide sequence analysis of the leader segments in a cloned copy of adenovirus 2 fiber mRNA JOURNAL Cell 16, 851-861 (1979) STANDARD full staff_review REFERENCE 12 (bases 5848 to 6578) AUTHORS Baker,C.C. and Ziff,E.B. TITLE Biogenesis, structures, and sites of encoding of the 5' termini of adenovirus-2 mRNAs JOURNAL Cold Spring Harb. Symp. Quant. Biol. 44, 415-428 (1980) STANDARD full staff_review REFERENCE 13 (bases 26977 to 27178; mRNA and DNA) AUTHORS Baker,C.C., Herisse,J., Courtois,G., Galibert,F. and Ziff,E. TITLE Messenger RNA for the ad2 DNA binding protein: DNA sequences encoding the first leader and heterogeneity at the mRNA 5' end JOURNAL Cell 18, 569-580 (1979) STANDARD full staff_review REFERENCE 14 (bases 1 to 110; 35835 to 35937) AUTHORS Shinagawa,M. and Padmanabhan,R.V. TITLE Nucleotide sequence at the inverted terminal repetition of adenovirus type 2 DNA JOURNAL Biochem. Biophys. Res. Commun. 87, 671-678 (1979) STANDARD full staff_review REFERENCE 15 (bases 513 to 1111; 1226 to 1630; cDNA) AUTHORS Perricaudet,M., Akusjaervi,G., Virtanen,A. and Pettersson,U. TITLE Structure of two spliced mRNAs from the transforming region of human subgroup C adenoviruses JOURNAL Nature 281, 694-696 (1979) STANDARD full staff_review REFERENCE 16 (bases 1 to 156; 35804 to 35937) AUTHORS Arrand,J.R. and Roberts,R.J. TITLE The nucleotide sequences at the termini of adenovirus-2 DNA JOURNAL J. Mol. Biol. 128, 577-594 (1979) STANDARD full staff_review REFERENCE 17 (sites; acceptor splice site for fiber mRNA) AUTHORS Zain,B.S. and Roberts,R.J. 
TITLE Sequences from the beginning of the fiber messenger RNA of adenovirus-2 JOURNAL J. Mol. Biol. 131, 341-352 (1979) STANDARD full staff_review REFERENCE 18 (bases 5909 to 6178; 7023 to 7212; 9452 to 9836) AUTHORS Akusjaervi,G. and Pettersson,U. TITLE Sequence analysis of adenovirus DNA: IV. The genomic sequences encoding the common tripartite leader of late adenovirus messenger RNA JOURNAL J. Mol. Biol. 134, 143-158 (1979) STANDARD full staff_review REFERENCE 19 (bases 6039 to 31080; several leader fragments over this span) AUTHORS Zain,B.S., Gingeras,T.R., Bullock,P., Wong,G. and Gelinas,R.E. TITLE Determination and analysis of adenovirus-2 DNA sequences which may include signals for late messenger RNA processing JOURNAL J. Mol. Biol. 135, 413-433 (1979) STANDARD full staff_review REFERENCE 20 (bases 27373 to 30050) AUTHORS Herisse,J., Courtois,G. and Galibert,F. TITLE Nucleotide sequence of the EcoRI D fragment of adenovirus 2 genome JOURNAL Nucleic Acids Res. 8, 2173-2192 (1980) STANDARD full staff_review REFERENCE 21 (bases 35360 to 35937) AUTHORS Shinagawa,M. and Padmanabhan,R.V. TITLE The nucleotide sequence of the right-hand terminal SmaI-K fragment of adenovirus type 2 DNA JOURNAL Gene 9, 99-114 (1980) STANDARD full staff_review REFERENCE 22 (bases 22305 to 22600) AUTHORS Buettner,W. and Veres-Molnar,Z. TITLE Localization of the 3'-terminal end of the EcoRI B fragment-specific early mRNA of adenovirus type 2 JOURNAL FEBS Lett. 122, 317-321 (1980) STANDARD full staff_review REFERENCE 23 (bases 3504 to 4109) AUTHORS Alestroem,P., Akusjaervi,G., Perricaudet,M., Mathews,M.B., Klessig,D.F. and Pettersson,U. TITLE The gene for polypeptide IX of adenovirus type 2 and its unspliced messenger RNA JOURNAL Cell 19, 671-681 (1980) STANDARD full staff_review REFERENCE 24 (bases 10514 to 11065) AUTHORS Akusjaervi,G., Mathews,M.B., Andersson,P., Vennstroem,B. and Pettersson,U. 
TITLE Structure of genes for virus-associated RNA-I and RNA-II of adenovirus type 2 JOURNAL Proc. Natl. Acad. Sci. U.S.A. 77, 2424-2428 (1980) STANDARD full staff_review REFERENCE 25 (sites; splice sites for E1b mRNAs) AUTHORS Perricaudet,M., Le Moullec,J.-M. and Pettersson,U. TITLE Predicted structure of two adenovirus tumor antigens JOURNAL Proc. Natl. Acad. Sci. U.S.A. 77, 3778-3782 (1980) STANDARD full staff_review REFERENCE 26 (sites; cds start for E3 19K glycoprotein) AUTHORS Persson,H., Joernvall,H. and Zabielski,J. TITLE Multiple mRNA species for the precursor to an adenovirus-encoded glycoprotein: Identification and structure of the signal sequence JOURNAL Proc. Natl. Acad. Sci. U.S.A. 77, 6349-6353 (1980) STANDARD full staff_review REFERENCE 27 (sites; cds start for 15K, IX and fiber polypeptides) AUTHORS Anderson,C.W. and Lewis,J.B. TITLE Amino-terminal sequence of adenovirus type 2 proteins: Hexon, fiber, component IX, and early protein 1B-15K JOURNAL Virology 104, 27-41 (1980) STANDARD full staff_review REFERENCE 28 (bases 21607 to 22770) AUTHORS Akusjaervi,G., Zabielski,J., Perricaudet,M. and Pettersson,U. TITLE The sequence of the 3' noncoding region of the hexon mRNA discloses a novel adenovirus gene JOURNAL Nucleic Acids Res. 9, 1-17 (1981) STANDARD full staff_review REFERENCE 29 (bases 30047 to 32268) AUTHORS Herisse,J. and Galibert,F. TITLE Nucleotide sequence of the EcoRI E fragment of adenovirus 2 genome JOURNAL Nucleic Acids Res. 9, 1229-1240 (1981) STANDARD full staff_review REFERENCE 30 (sites; cap site for E4 mrnas) AUTHORS Hashimoto,S., Pursley,M.H. and Green,M. TITLE Nucleotide sequences and mapping of novel heterogenous 5'-termini of adenovirus 2 early region 4 mRNA JOURNAL Nucleic Acids Res. 9, 1675-1689 (1981) STANDARD full staff_review REFERENCE 31 (bases 32263 to 35937) AUTHORS Herisse,J., Rigolet,M., Dupont De Dinechin,S. and Galibert,F. 
TITLE Nucleotide sequence of adenovirus 2 DNA fragment encoding for the carboxylic region of the fiber protein and the entire E4 region JOURNAL Nucleic Acids Res. 9, 4023-4042 (1981) STANDARD full staff_review REFERENCE 32 (sites; splice sites in E2a mRNA) AUTHORS Kruijer,W., Van Schaik,F.M.A. and Sussenbach,J.S. TITLE Structure and organization of the gene coding for the DNA binding protein of adenovirus type 5 JOURNAL Nucleic Acids Res. 9, 4439-4457 (1981) STANDARD full staff_review REFERENCE 33 (bases 5817 to 6051; 35358 to 35707) AUTHORS Baker,C.C. and Ziff,E.B. TITLE Promoters and heterogeneous 5' termini of the messenger RNAs of adenovirus serotype 2 JOURNAL J. Mol. Biol. 149, 189-221 (1981) STANDARD full staff_review REFERENCE 34 (bases 18838 to 21744; fragments over this span) AUTHORS Joernvall,H., Alestroem,P., Akusjaervi,G., Von Bahr-Lindstroem,H., Philipson,L. and Pettersson,U. TITLE Order of the CNBr fragments in the adenovirus hexon protein JOURNAL J. Biol. Chem. 256, 6204-6212 (1981) STANDARD full staff_review REFERENCE 35 (bases 459 to 608) AUTHORS Osborne,T.F., Schell,R.E., Burch-Jaffe,E., Berget,S.J. and Berk,A.J. TITLE Mapping a eukaryotic promoter: a DNA sequence required for in vivo expression of adenovirus pre-early functions JOURNAL Proc. Natl. Acad. Sci. U.S.A. 78, 1381-1385 (1981) STANDARD full staff_review REFERENCE 36 (bases 17878 to 18918) AUTHORS Akusjaervi,G. and Persson,H. TITLE Gene and mRNA for precursor polypeptide VI from adenovirus type 2 JOURNAL J. Virol. 38, 469-482 (1981) STANDARD full staff_review REFERENCE 37 (sites; splice site in 52,55K-pept mRNA) AUTHORS Akusjaervi,G. and Persson,H. TITLE Controls of RNA splicing and termination in the major late adenovirus transcription unit JOURNAL Nature 292, 420-426 (1981) STANDARD full staff_review REFERENCE 38 (sites; splice sites in IVa2 mRNA, Ad5) AUTHORS Van Beveren,C.P., Maat,J., Dekker,B.M.M. and Van Ormondt,H. 
TITLE The nucleotide sequence of the gene for protein IVa2 and of the 5' leader segment of the major late mRNAs of adenovirus type 5 JOURNAL Gene 16, 179-189 (1981) STANDARD full staff_review REFERENCE 39 (bases 7869 to 8420) AUTHORS Virtanen,A., Alestroem,P., Persson,H., Katze,M.G. and Pettersson,U. TITLE An adenovirus agnogene JOURNAL Nucleic Acids Res. 10, 2539-2548 (1982) STANDARD full staff_review REFERENCE 40 (bases 22469 to 24125) AUTHORS Kruijer,W., Van Schaik,F.M.A. and Sussenbach,J.S. TITLE Nucleotide sequence of the gene encoding adenovirus type 2 DNA binding protein JOURNAL Nucleic Acids Res. 10, 4493-4500 (1982) STANDARD full staff_review REFERENCE 41 (bases 27609 to 27980; 28376 to 29792; cDNA and DNA) AUTHORS Ahmed,C.M.I., Chanda,R.S., Stow,N.D. and Zain,B.S. TITLE The nucleotide sequence of mRNA for the M-r 19000 glycoprotein from early gene block III of adenovirus 2 JOURNAL Gene 20, 339-346 (1982) STANDARD full staff_review REFERENCE 42 (bases 1 to 11600; 32092 to 35937) AUTHORS Gingeras,T.R., Sciaky,D., Gelinas,R.E., Bing-Dong,J., Yen,C.E., Kelly,M.M., Bullock,P.A., Parsons,B.L., O'Neill,K.E. and Roberts,R.J. TITLE Nucleotide sequences from the adenovirus-2 genome JOURNAL J. Biol. Chem. 257, 13475-13491 (1982) STANDARD full staff_review REFERENCE 43 (bases 5778 to 11560) AUTHORS Alestroem,P., Akusjaervi,G., Pettersson,M. and Pettersson,U. TITLE DNA sequence analysis of the region encoding the terminal protein and the hypothetical N-gene product of adenovirus type 2 JOURNAL J. Biol. Chem. 257, 13492-13498 (1982) STANDARD full staff_review REFERENCE 44 (sites; splice site for 'i' leader) AUTHORS Uhlen,M., Svensson,C., Josephson,S., Alestroem,P., Chattopadhyaya,J.B., Pettersson,U. and Philipson,L. TITLE Leader arrangement in the adenovirus fiber mRNA JOURNAL EMBO J. 1, 249-254 (1982) STANDARD full staff_review REFERENCE 45 (bases 1517 to 1696; 3932 to 4112; 17880 to 17975; 21142 to 28259; mRNA and DNA) AUTHORS Fraser,N.W., Baker,C.C., Moore,M.A. 
and Ziff,E.B. TITLE Poly(A) sites of adenovirus serotype 2 transcription units JOURNAL J. Mol. Biol. 155, 207-233 (1982) STANDARD full staff_review REFERENCE 46 (sites; E1a mutational analysis) AUTHORS Osborne,T.F., Gaynor,R.B. and Berk,A.J. TITLE The TATA homology and the mRNA 5' untranslated sequence are not required for expression of essential adenovirus E1a functions JOURNAL Cell 29, 139-148 (1982) STANDARD full staff_review REFERENCE 47 (bases 7929 to 8423) AUTHORS Falvey,E. and Ziff,E. TITLE Sequence arrangement and protein coding capacity of the adenovirus type 2 "i" leader JOURNAL J. Virol. 45, 185-191 (1983) STANDARD full staff_review REFERENCE 48 (sites; splice sites for 33K mRNA) AUTHORS Oosterom-Dragon,E.A. and Anderson,C.W. TITLE Polypeptide structure and encoding location of the adenovirus serotype 2 late, nonstructural 33K protein JOURNAL J. Virol. 45, 251-263 (1983) STANDARD full staff_review REFERENCE 49 (sites; cds start for E4 11K-pept, ad5) AUTHORS Downey,J.F., Rowe,D.T., Bacchetti,S., Graham,F.L. and Bayley,S.T. TITLE Mapping of a 14,000-Dalton antigen to early region 4 of the human adenovirus 5 genome JOURNAL J. Virol. 45, 514-523 (1983) STANDARD full staff_review REFERENCE 50 (sites; cds start for the 13.6K-pept) AUTHORS Lewis,J.B. and Anderson,C.W. TITLE Proteins encoded near the adenovirus late messenger RNA leader segments JOURNAL Virology 127, 112-123 (1983) STANDARD full staff_review REFERENCE 51 (sites; splice sites for 72K and 100K mRNAs) AUTHORS Kruijer,W., Van Schaik,F.M.A., Speijer,J.G. and Sussenbach,J.S. TITLE Structure and function of adenovirus DNA binding protein: Comparison of the amino acid sequences of the ad5 and ad12 proteins derived from the nucleotide sequence of the corresponding genes JOURNAL Virology 128, 140-153 (1983) STANDARD full staff_review REFERENCE 52 (sites; splice sites for leaders; poly-A sites) AUTHORS Stalhandske,P., Persson,H., Perricaudet,M., Philipson,L. and Pettersson,U. 
TITLE Structure of three spliced mRNAs from region E3 of adenovirus type 2 JOURNAL Gene 22, 157-165 (1983) STANDARD full staff_review REFERENCE 53 (sites; splice sites for E1a mRNAs) AUTHORS Virtanen,A. and Pettersson,U. TITLE The molecular structure of the 9S mRNA from early region 1A of adenovirus serotype 2 JOURNAL J. Mol. Biol. 165, 496-499 (1983) STANDARD full staff_review REFERENCE 54 (bases 13898 to 14231) AUTHORS Le Moullec,J.-M., Akusjaervi,G., Stalhandske,P., Pettersson,U., Chambraud,B., Gilardi,P., Nasri,M. and Perricaudet,M. TITLE Polyadenylic acid addition sites in the adenovirus type 2 major late transcription unit JOURNAL J. Virol. 48, 127-134 (1983) STANDARD full staff_review REFERENCE 55 (bases 17539 to 18177) AUTHORS Sung,M.T., Cao,T.M., Lischwe,M.A. and Coleman,R.T. TITLE Molecular processing of adenovirus proteins JOURNAL J. Biol. Chem. 258, 8266-8272 (1983) STANDARD full staff_review REFERENCE 56 (bases 15821 to 16495) AUTHORS Sung,M.T., Cao,T.M., Coleman,R.T. and Budelier,K.A. TITLE Gene and protein sequences of adenovirus protein VII, a hybrid basic chromosomal protein JOURNAL Proc. Natl. Acad. Sci. U.S.A. 80, 2902-2906 (1983) STANDARD full staff_review REFERENCE 57 (sites; splice sites in E2 mRNA) AUTHORS Goldenberg,C.J. and Hauser,S.D. TITLE Accurate and efficient in vitro splicing of purified precursor RNAs specified by early region 2 of the adenovirus 2 genome JOURNAL Nucleic Acids Res. 11, 1337-1348 (1983) STANDARD full staff_review REFERENCE 58 (sites; H2ts1 mutation between 57.0% and 69.0%) AUTHORS Yeh-Kai,L., Akusjaervi,G., Alestroem,P., Pettersson,U., Tremblay,M. and Weber,J. TITLE Genetic identification of an endoproteinase encoded by the adenovirus genome JOURNAL J. Mol. Biol. 167, 217-222 (1983) STANDARD full staff_review REFERENCE 59 (bases 18616 to 19233) AUTHORS Mautner,V. and Boursnell,M.E.G. 
TITLE Recombination in adenovirus: DNA sequence analysis of crossover sites in intertypic recombinants JOURNAL Virology 131, 1-10 (1983) STANDARD full staff_review REFERENCE 60 (bases 31030 to 32775; H2ts125 strain) AUTHORS Boudin,M.-L., Rigolet,M., Lemay,P., Galibert,F. and Boulanger,P. TITLE Biochemical and genetical characterization of a fiber-defective temperature-sensitive mutant of type 2 adenovirus JOURNAL EMBO J. 2, 1921-1927 (1983) STANDARD full staff_review REFERENCE 61 (sites; cds start for E1a proteins) AUTHORS Downey,J.F., Evelegh,C.M., Branton,P.E. and Bayley,S.T. TITLE Peptide maps and N-terminal sequences of polypeptides from early region 1A of human adenovirus 5 JOURNAL J. Virol. 50, 30-37 (1984) STANDARD full staff_review REFERENCE 62 (sites; splice sites in E4 region) AUTHORS Tigges,M.A. and Raskas,H.J. TITLE Splice junctions in adenovirus 2 early region 4 mRNAs: Multiple splice sites produce 18 to 24 RNAs JOURNAL J. Virol. 50, 106-117 (1984) STANDARD full staff_review REFERENCE 63 (sites; splice sites in E4 region; poly-A site for E4 mRNAs) AUTHORS Freyer,G.A., Katoh,Y. and Roberts,R.J. TITLE Characterization of the major mRNAs from adenovirus 2 early region 4 by cDNA cloning and sequencing JOURNAL Nucleic Acids Res. 12, 3503-3519 (1984) STANDARD full staff_review REFERENCE 64 (bases 18838 to 21744) AUTHORS Akusjaervi,G., Alestroem,P., Pettersson,M., Lager,M., Joernvall,H. and Pettersson,U. TITLE The gene for the adenovirus 2 hexon polypeptide JOURNAL J. Biol. Chem. 259, 13976-13979 (1984) STANDARD full staff_review REFERENCE 65 (bases 15033 to 18316) AUTHORS Alestroem,P., Akusjaervi,G., Lager,M., Yeh-Kai,L. and Pettersson,U. TITLE Genes encoding the core proteins of adenovirus type 2 JOURNAL J. Biol. Chem. 259, 13980-13985 (1984) STANDARD full staff_review REFERENCE 66 (bases 11601 to 15726; 23924 to 25638) AUTHORS Roberts,R.J., O'Neill,K.E. and Yen,C.T. TITLE DNA sequences from the adenovirus-2 genome JOURNAL J. Biol. Chem. 
259, 13968-13975 (1984) STANDARD full staff_review REFERENCE 67 (sites; cds start for 57K-pept) AUTHORS Anderson,C.W., Schmitt,R.C., Smart,J.E. and Lewis,J.B. TITLE Early region 1B of adenovirus serotype 2 encodes two co-terminal proteins of 495 and 155 amino acid residues JOURNAL Unpublished (1984) STANDARD full staff_review REFERENCE 68 (sites; splice sites in E4 region; poly-A site for E4 mRNAs) AUTHORS Virtanen,A., Alestroem,P., Persson,H., Katze,M.G. and Pettersson,U. JOURNAL Unpublished (1984) STANDARD full staff_review REFERENCE 69 (sites; splice sites in E1b region) AUTHORS Virtanen,A. and Pettersson,U. JOURNAL Unpublished (1984) STANDARD full staff_review REFERENCE 70 (review; bases 1 to 35937) AUTHORS Roberts,R.J., Akusjaervi,G., Alestroem,P., Gelinas,R.E., Gingeras,T.R., Sciaky,D. and Pettersson,U. TITLE A consensus sequence for the adenovirus-2 genome JOURNAL (in) Doerfler,W. (Ed.); Adenovirus DNA, 1-51: Martinus Nijhoff Publishing, Boston (1986). STANDARD full staff_review REFERENCE 71 (sites; recombination analysis of ad2 and ad5) AUTHORS Mautner,V. and Mackay,N. TITLE Recombination in adenovirus: Analysis of crossover sites in intertypic overlap recombinants JOURNAL Virology 139, 43-52 (1984) STANDARD full staff_review REFERENCE 72 (sites; splice sites in major late mRNA) AUTHORS Padgett,R.A., Konarska,M.M., Grabowski,P.J., Hardy,S.F. and Sharp,P.A. TITLE Lariat RNA's as intermediates and products in the splicing of messenger RNA precursors JOURNAL Science 225, 898-903 (1984) STANDARD full staff_review REFERENCE 73 (sites; IVa2 transcription start) AUTHORS Natarajan,V., Madden,M.J. and Salzman,N.P. TITLE Proximal and distal domains that control in vitro transcription of the adenovirus IVa2 gene JOURNAL Proc. Natl. Acad. Sci. U.S.A. 81, 6290-6294 (1984) STANDARD full staff_review REFERENCE 74 (sites; transcription start for EIa mRNAs) AUTHORS Leff,T., Elkaim,R., Goding,C.R., Jalinot,P., Sassone-Corsi,P., Perricaudet,M., Kedinger,C. and Chambon,P. 
TITLE Individual products of the adenovirus 12S and 13S EIa mRNAs stimulate viral EIIa and EIII expression at the transcriptional level JOURNAL Proc. Natl. Acad. Sci. U.S.A. 81, 4381-4385 (1984) STANDARD full staff_review REFERENCE 75 (sites; E3 11.6 -K protein) AUTHORS Wold,W.S.M., Cladaras,C., Magie,S.C. and Yacoub,N. TITLE Mapping a new gene that encodes an 11,600-molecular-weight protein in the E3 transcription unit of adenovirus 2 JOURNAL J. Virol. 52, 307-313 (1984) STANDARD full staff_review REFERENCE 76 (sites; L3 mRNA polyadenylation site) AUTHORS Moore,C.L. and Sharp,P.A. TITLE Accurate cleavage and polyadenylation of exogenous RNA substrate JOURNAL Cell 41, 845-855 (1985) STANDARD full staff_review REFERENCE 77 (bases 10610 to 10711; sites; L3 mRNA polyadenylation site) AUTHORS Cannon,R.E., Wu,G.-J. and Railey,J.F. TITLE Functions of and interactions between the A and B blocks in adenovirus type 2-specific VARNA1 gene JOURNAL Proc. Natl. Acad. Sci. U.S.A. 83, 1285-1289 (1986) STANDARD simple staff_review REFERENCE 78 (bases 30812 to 30900) AUTHORS Zain,B.S. and Roberts,R.J. TITLE Characterization and sequence analysis of a recombination site in the hybrid virus Ad2+ND JOURNAL J. Mol. Biol. 120, 13-31 (1978) STANDARD full staff_entry COMMENT Communicated on tape by R. Roberts. That tape and [70] are the immediate sources of the annotation herein. A consensus sequence for the l-strand of the genome is shown. Population heterogeneity as distinct from strain variation is known (35937 +/- 9 bp) [65]; both are annotated as "variation" below. For site differences with adenovirus type 5, see loci beginning which are arranged in the library according to the map coordinates of where one map unit corresponds to 360 bases throughout (see [42],[65]). For mutational changes in the ad2 sequence, see the appropriate references above. The origin of replication is located in the first fifty bases from each end. 
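The coordinate conventions mentioned in this comment can be illustrated with a short Python sketch. It assumes that the percentages quoted in the Features table (for example 4.47% for the signal at base 1608) are the base position divided by the consensus genome length of 35937 bp; that convention is inferred from the numbers and is not stated in the record. The 360 bases per map unit figure is the one given in the comment. The function names are hypothetical.

```python
GENOME_LENGTH = 35937  # consensus length taken from the LOCUS line


def genome_percent(position, genome_length=GENOME_LENGTH):
    """Return a base position as a percentage of genome length.

    Assumed convention: percent = 100 * position / genome_length.
    """
    if not 1 <= position <= genome_length:
        raise ValueError("position lies outside the genome")
    return 100.0 * position / genome_length


def map_units(position, bases_per_unit=360):
    """Approximate map units, using the 360 bases/unit figure from the comment."""
    return position / bases_per_unit


if __name__ == "__main__":
    # Signal at base 1608 (annotated as 4.47% in the Features table).
    print(round(genome_percent(1608), 2))
    print(round(map_units(1608), 2))
```

The two measures differ slightly because 35937 / 100 is about 359.4 rather than exactly 360, so they should not be used interchangeably for positions far from the left end.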
Transcription is leftward off the l-strand and rightward off the r-strand; in the former case, the annotation shows "(c)" for complementary strand. Complex splicing events give rise to perhaps fifty or more distinct mRNA transcripts at early, intermediate and late times after infection, many of which are still being characterized; in particular, some transcripts are known from electron microscopy which are not yet characterized at the sequence level. To date nine mRNA start sites (cap sites) have been identified, and these define the general units of mRNAs under which all known transcripts are classified. From the r-strand, the early transcripts are E1a, E1b and E3. The 28 kb late transcript called herein "major late mRNA" comprises five families, L1 through L5, of 3' co-terminal mRNAs. L1, and to a lesser extent L2, can be expressed at early and intermediate times [37]. Transcripts from this region contain a common tripartite leader sequence at their 5' ends: the three segments of this leader are encoded at bases 6039-6079, 7101-7172 and 9634-9723. At early and intermediate times, an extra leader segment, the 'i' leader, is frequently present (bases 7942-8381). The IX message, the only unspliced message in ad2, is intermediate, and its termination overlaps that for E1b on the same strand and that for IVa2, and most likely E2b, on the opposite strand. From the l-strand, or the "comp strand", early expression derives from the E2a, E2b and E4 families of mRNAs, although there can be late transcription from E2a. The E2b cap sites, splice sites and termination sites have not been determined at the sequence level. From electron microscopy there is evidence that the E2b mRNAs may originate at the E2a early cap site at 27092 (c) and terminate at the poly-A addition site found for the IVa2 mRNA at 4050 (c) [42]. IVa2 is an intermediate message. 
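As a minimal sketch of the "(c)" convention described above, the following Python reads a span off either strand. It assumes 1-based inclusive coordinates and treats a span whose "from" exceeds its "to" as lying on the complementary strand, which matches how such spans are listed in this record. The function names are hypothetical; the test sequence is the first 60 bases shown under ORIGIN.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def feature_sequence(genome, start, end):
    """Extract a span given 1-based inclusive coordinates.

    Assumption: start > end marks a complementary-strand "(c)" span.
    """
    if start <= end:
        return genome[start - 1:end]
    return reverse_complement(genome[end - 1:start])


def spliced_length(segments):
    """Total length of a set of (from, to) segments, either strand."""
    return sum(abs(end - start) + 1 for start, end in segments)


if __name__ == "__main__":
    first_60 = ("catcatcata" "atatacctta" "ttttggattg"
                "aagccaatat" "gataatgagg" "gggtggagtt")
    print(feature_sequence(first_60, 1, 10))   # catcatcata
    print(feature_sequence(first_60, 10, 1))   # tatgatgatg
    # Tripartite leader segments quoted in the comment.
    leader = [(6039, 6079), (7101, 7172), (9634, 9723)]
    print(spliced_length(leader))              # 203
```

The three leader segments are 41, 72 and 90 bases long, giving a 203-base joined leader under these coordinates.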
The promoters for these nine classes of mRNAs can be localized and characterized to the following extent [33]:

mRNA         cap site     possible promoter region
----------   ----------   ------------------------------------------
E1a            498        tatttata at 468-474
E1b           1699        tatataat at 1669-1676
IX            3576        tatataa at 3545-3551
major late    6039        tataaaa at 6008-6014
E3           27609        tataa at 27580-27584
E4           35609 (c)    tatatata at 35641-35633 (c)
E2a early    27092 (c)    no obvious sequence for 100 bases upstream
E2a late     25956 (c)    tacaaattt at 25985-25977 (c)
IVa2          5826 (c)    no obvious sequence for 100 bases upstream

The mRNA responsible for the 13.6K protein encoded at 7968 has not been identified.

The VA I and VA II transcripts are unique in that they are generated by RNA polymerase III; for a discussion of these low molecular weight RNAs-- the modulation of their start points, their promoters, their heterogeneity and their similarity to tRNA-- see [3],[4],[5],[24] and .

The proteins known to be encoded from these mRNAs are given in the Features table below, though the details of translation and processing have not been fully determined. In cases such as the IIIa peptide or the 11K peptide, the exact span of the coding awaits elucidation of the mRNA splicing. Some of these products share reading frames and therefore manifest partial homologies.
The following table summarizes the unidentified reading frames ('URF') of 100 or more amino acids:

initiator   terminator   frame   protein encoded
---------   ----------   -----   -----------------
   6280        6600        1     11.6K URF
  17284       17763        1     17.4K URF
  23782       24138        1     12.9K URF
  24481       24867        1     14.2K URF
  26044       26826        1     28.6K URF (contains the N-terminus of 33K cds)
  30973       32778        1     63.9K URF (contains the fiber cds)
  10421       10834        2     14.4K URF
  20504       20935        2     15.7K URF
  27899       28222        2     12.4K URF
  30059       30451        2     14.5k URF
  33956       34456        2     18.8K URF
   9294        9800        3     17.7K URF
  23526       26525        3     110.2K URF (contains the 100K-pept cds)
  30444       30830        3     14.7K URF
  34470       34808        3     12.7K URF

complementary strand
---------------------------
  35532       35146        1     14.3K URF
  34077       33193        1     34.1K URF
  11109       10744        1     12.8K URF
   9030        8383        1     22.8K URF
   6780        6442        1     12.8K URF
  31604       31290        2     10.7K URF
  31211       30852        2     13.5K URF
  18707       18159        2     18.9K URF
  14861       14424        2     16.4K URF
  14114       13728        2     13.5K URF
  11618       11250        2     13.6K URF
   1712        1194        2     18.1K URF
  35113       34703        3     15.3K URF
  34342       33998        3     13.3K URF
   5674        5327        3     12.2K URF

Additionally there are numerous unidentified reading frames of less than 100 amino acid residues; and further small modifications of a few of the coding sequences are possible. [78] missing data project.
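The frame column of the URF table appears to follow from the initiator coordinate alone. The Python sketch below shows one convention that reproduces the listed frames: on the l-strand the frame is taken from the initiator position modulo 3, and on the complementary strand the position is first counted from the opposite end of the 35937 bp consensus. This rule is inferred from the table, not stated in the record, and the function names are hypothetical. Whether the terminator coordinate includes the stop codon is likewise not stated, so `codon_count` only counts triplets in the listed span.

```python
GENOME_LENGTH = 35937  # consensus length taken from the LOCUS line


def reading_frame(initiator, complementary=False, genome_length=GENOME_LENGTH):
    """Return frame 1, 2 or 3 for an initiator coordinate.

    Assumed convention: complementary-strand positions are counted
    from the right-hand end of the genome before taking modulo 3.
    """
    pos = genome_length - initiator + 1 if complementary else initiator
    return (pos - 1) % 3 + 1


def codon_count(initiator, terminator):
    """Number of triplets in the inclusive span initiator..terminator."""
    span = abs(terminator - initiator) + 1
    if span % 3:
        raise ValueError("span is not a whole number of codons")
    return span // 3


if __name__ == "__main__":
    print(reading_frame(6280), codon_count(6280, 6600))
    print(reading_frame(35532, complementary=True), codon_count(35532, 35146))
```

For the first l-strand entry (6280 to 6600) this gives frame 1 and 107 triplets, in line with the table's threshold of 100 or more amino acids.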
FEATURES    from to/span    description
pept         559    636     E1a 6K protein from the 9s mRNA
            1226   1315     E1a 6K protein from the 9s mRNA
pept         559    973     E1a 26K protein from the 12s mRNA
            1226   1542     E1a 26K protein from the 12s mRNA
pept         559   1111     E1a 32K protein from the 13s mRNA
            1226   1542     E1a 32K protein from the 13s mRNA
pept        1711   2238     E1b 20.5K protein from the 13s mRNA
pept        2016   3503     E1b 57K protein from the 22s mRNA (transformation)
pept        2016   2249     E1b protein from the 1.31kb mRNA
            3212   3256     E1b protein from the 1.31kb mRNA
pept        2016   2249     E1b protein from the 1.26kb mRNA
            3270   3503     E1b protein from the 1.26kb mRNA
pept        3600   4022     IX protein (hexon-associated protein)
pept        5708   5696 (c) IVa2 protein (virion morphogenesis)
            5417   4081 (c) IVa2 protein (virion morphogenesis) (AA at 5415)
pept        7968   8417     13.6K protein
pept        8357   5187 (c) DNA polymerase
pept       10534   8573 (c) terminal protein (Bellet protein)
pept       11040  12287     52,55K protein
pept       12308  14065     IIIa protein (peripentonal hexon-associated protein; splice sites not sequenced)
pept       14151  15866     penton protein (virion component III)
pept       15873  16469     Pro-VII protein (precursor to major core protein)
pept       16539  17648     pV protein (minor core protein)
pept       18001  18753     pVI protein (hexon-associated precursor)
pept       18838  21744     hexon protein (virion component II)
pept       21778  22392     23K protein (endopeptidase)
pept       24079  22490 (c) DBP protein (DNA binding or 72K protein)
pept       24108  26525     100K protein (hexon assembly)
pept       26239  26551     33K protein (virion morphogenesis)
           26754  27127     33K protein (virion morphogenesis)
pept       27215  27898     pVIII protein (hexon-associated precursor)
pept       28812  29291     E3 19K protein (glycosylated membrane protein)
pept       29468  29773     E3 11.6K protein
pept       31030  32778     fiber protein (virion component IV)
pept       34706  34356 (c) E4 11K protein (nuclear binding protein; splice sites not sequenced)
RNA        10607  10766     VA I RNA (alt.) [4]
RNA        10610  10766     VA I RNA (alt.)
                            [1],[5],[24]
RNA        10866  11023     VA II RNA [25],[24]
pre-msg      498   1630     E1a mRNA [7],[15],[33]
pre-msg     1699   4061     E1b mRNA [23],[33]
mRNA        3576   4061     IX mRNA [23],[33],[42]
mRNA       27092   4050 (c) E2b mRNA [42]
pre-msg     5826   4050 (c) IVa2 mRNA [23],[33]
pre-msg     6039  14113     major late mRNA L1 (alt.) [33],[54]
pre-msg     6039  17969     major late mRNA L2 (alt.) [33],[36],[54]
pre-msg     6039  22443     major late mRNA L3 (alt.) [28],[33],[54]
pre-msg     6039  28223     major late mRNA L4 (alt.) [33],[54]
pre-msg     6039  32798     major late mRNA L5 (alt.) [33],[54]
mRNA       27609  29792     E3-1 mRNA (alt.) [33]
mRNA       27609  29799     E3-1 mRNA (alt.) [41]
mRNA       27609  29801     E3-1 mRNA (alt.) [41]
mRNA       27609  29804     E3-1 mRNA (alt.) [41]
mRNA       27609  30864     E3-2 mRNA; 85.88% [52]
pre-msg    35609  32802 (c) E4 mRNA [30],[33],[63],[68]
pre-msg    25954  22420 (c) E2a late mRNA (alt.) [33]
pre-msg    25956  22420 (c) E2a late mRNA (alt.) [63],[68]
pre-msg    27091  22420 (c) E2a early mRNA (alt.) [33]
pre-msg    27092  22420 (c) E2a early mRNA (alt.) [33]
IVS          637   1225     E1a (9S) intron [15],[53]
IVS          974   1225     E1a (12S) intron [15],[69]
IVS         1112   1225     E1a (13S) intron [15],[69]
IVS         2250   3211     E1b (1.31 kb) intron A [69]
IVS         2250   3269     E1b (1.26 kb) intron A [69]
IVS         2250   3588     E1b (13S) intron A [25]
IVS         3505   3588     E1b (13S) intron A' [25]
IVS         3505   3588     E1b (22S) intron [25]
IVS         3505   3588     E1b (1.31 kb) intron B [25]
IVS         3505   3588     E1b (1.26 kb) intron B [25]
IVS         6080   7100     major late mRNA intron (precedes 2nd leader) [10],[11],[18],[19]
IVS         7173   7941     major late mRNA intron (precedes 'i' leader) [10],[11],[18],[19],[39],[47]
IVS         8382   9633     major late mRNA intron (precedes 3rd leader) [10],[11],[18],[19],[39],[44],[47]
IVS         9724  11039     major late mRNA intron (precedes 52,55K mRNA; 1st L1 mRNA) [10],[11],[18],[19],[37]
IVS         9724  14149     major late mRNA intron (precedes penton mRNA; 1st L2 mRNA) [54]
IVS         9724  16515     major late mRNA intron (precedes pV mRNA; 2nd L2 mRNA) [65]
IVS         9724  17999     major late mRNA intron (precedes pVI mRNA; 1st L3 mRNA) [36]
IVS         9724  18801     major late
                            mRNA intron (precedes hexon mRNA; 2nd L3 mRNA) [6],[10]
IVS         9724  21649     major late mRNA intron (precedes 23K mRNA; 3rd L3 mRNA) [28]
IVS         9724  24094     major late mRNA intron (precedes 100K mRNA; 1st L4 mRNA) [51]
IVS        26552  26753     33K-pept intron [48]
IVS        27981  28375     major late mRNA intron ('x' leader) [52],[65]
IVS        28560  30437     major late mRNA intron ('y' leader) [19],[36],[44],[52]
IVS        30583  31029     major late mRNA intron ('z' leader) [6],[10],[44],[52]
IVS         5695   5418 (c) IVa2 intron [25],[69]
IVS        35547  35108 (c) E4 mRNA intron A [68]
IVS        35547  34736 (c) E4 mRNA intron A' [63]
IVS        34605  34436 (c) E4 mRNA intron B [63]
IVS        34605  34380 (c) E4 mRNA intron B' [62]
IVS        34605  34330 (c) E4 mRNA intron B" [63]
IVS        34288  34242 (c) E4 mRNA intron C [63]
IVS        34288  34083 (c) E4 mRNA intron C' [63]
IVS        33903  33875 (c) E4 mRNA intron D [62]
IVS        33903  33679 (c) E4 mRNA intron D1 [62]
IVS        33903  33610 (c) E4 mRNA intron D2 [62]
IVS        33903  33452 (c) E4 mRNA intron D3 [62]
IVS        33903  33404 (c) E4 mRNA intron D4 [62]
IVS        33903  33377 (c) E4 mRNA intron D5 [62]
IVS        33903  33284 (c) E4 mRNA intron D6 [63]
IVS        33903  33193 (c) E4 mRNA intron D7 [62],[63],[68]
IVS        27024  24792 (c) E2a early mRNA intron A [13]
IVS        25885  24972 (c) E2a late mRNA intron A [51]
IVS        24714  24089 (c) E2a mRNA intron B [32]
rpt            1    102     inverted terminal repetition; 0.28% [14],[16]
rpt        35836  35937     inverted terminal repetition; 99.54% [14],[16]
variant        8      8     a in [42],[70]; aa in other strains, e.g. [13]
variant      460    460     c in [42],[69]; t in [34]
signal      1608   1613     E1a mRNA polyadenylation signal (putative); 4.47%
signal      4029   4034     E1b and IX mRNA polyadenylation signal (putative); 11.21%
signal      4090   4085 (c) IVa2 mRNA polyadenylation signal on comp strand (putative); 11.36%
revision    6443   6443     c in [42],[70]; y in [12]
conflict    6574   6575     cc in [42],[70]; c in [12]
revision    7212   7213     gg in [42],[70]; g in [19]
conflict    9315   9316     cg in [42],[70]; gc in [43]
variant     9382   9382     c is shown; can be cttc due to population heterogeneity [66]
conflict 9633 9634 gg in [42],[70]; g in [11] conflict 10715 10716 gc in [42],[70]; g in [3] variant 11062 11062 t in [41],[69]; c in [24] variant 14064 14080 15 to 19 A residues have been observed in various populations [66] signal 14092 14097 major late mRNA L1 poly-A signal (putative); 39.21% variant 15856 15856 g in [65],[70]; t in [56] variant 15914 15914 c in [62],[67]; t in [56] variant 15998 15998 g in [64],[69]; c in [56] conflict 16205 16208 ccga in [64],[69]; c in [56] variant 16437 16437 g in [64],[69]; c in [56] signal 17949 17954 major late mRNA L2 polyadenylation signal (putative); 49.94% variant 17964 17964 g in [65],[70]; c in [55] used revision 18914 18915 cc in [61],[67]; c in [6],[34],[36] used revision 18919 18919 c in [64],[70]; nn in [34] revision 19617 19617 t in [64],[70]; c in [34] revision 19666 19666 t in [64],[70]; c in [34] revision 19823 19823 a in [64],[70]; g in [34] revision 20427 20427 a in [64],[70]; g in [34] revision 20487 20487 c in [64],[70]; t in [34] signal 22418 22423 major late mRNA L3 polyadenylation signal (putative); 62.38% signal 22444 22439 (c) E2a mRNA polyadenylation signal on comp strand (putative); 62.43% variant 22524 22524 t in [40],[70]; c in [22] signal 28205 28210 major late mRNA L4 polyadenylation signal (putative); 78.48% revision 28339 28339 g in [20],[70]; gc in [19] revision 28350 28350 g in [20],[70]; ga in [19] revision 28359 28359 t in [20],[70]; ta in [19] revision 28465 28466 cc in [20],[70]; c in [11],[19] revision 28495 28497 ttg in [20],[70]; t in [11],[19] signal 29769 29774 E3-1 mRNA polyadenylation signal (putative); 82.69% signal 30842 30847 E3-2 mRNA polyadenylation signal (putative); 85.82% revision 30980 30981 at in [29],[70]; a in [19] signal 32774 32779 major late mRNA L5 polyadenylation signal (putative); 91.19% signal 32826 32821 (c) E4 mRNA polyadenylation signal on comp strand (putative); 91.32% variant 34344 34345 tt in [42],[70]; t in [31] variant 35930 35930 t in [42],[70]; tt in other 
strains BASE COUNT 8342 a 10045 c 9793 g 7757 t ORIGIN 5' end of the l-strand of the genome. 1 catcatcata atatacctta ttttggattg aagccaatat gataatgagg gggtggagtt 61 tgtgacgtgg cgcggggcgt gggaacgggg cgggtgacgt agtagtgtgg cggaagtgtg 121 atgttgcaag tgtggcggaa cacatgtaag cgccggatgt ggtaaaagtg acgtttttgg 181 tgtgcgccgg tgtatacggg aagtgacaat tttcgcgcgg ttttaggcgg atgttgtagt 241 aaatttgggc gtaaccaagt aatgtttggc cattttcgcg ggaaaactga ataagaggaa 301 gtgaaatctg aataattctg tgttactcat agcgcgtaat atttgtctag ggccgcgggg 361 actttgaccg tttacgtgga gactcgccca ggtgtttttc tcaggtgttt tccgcgttcc 421 gggtcaaagt tggcgtttta ttattatagt cagctgacgc gcagtgtatt tatacccggt 481 gagttcctca agaggccact cttgagtgcc agcgagtaga gttttctcct ccgagccgct 541 ccgacaccgg gactgaaaat gagacatatt atctgccacg gaggtgttat taccgaagaa 601 atggccgcca gtcttttgga ccagctgatc gaagaggtac tggctgataa tcttccacct 661 cctagccatt ttgaaccacc tacccttcac gaactgtatg atttagacgt gacggccccc 721 gaagatccca acgaggaggc ggtttcgcag atttttcccg agtctgtaat gttggcggtg 781 caggaaggga ttgacttatt cacttttccg ccggcgcccg gttctccgga gccgcctcac 841 ctttcccggc agcccgagca gccggagcag agagccttgg gtccggtttc tatgccaaac 901 cttgtgccgg aggtgatcga tcttacctgc cacgaggctg gctttccacc cagtgacgac 961 gaggatgaag agggtgagga gtttgtgtta gattatgtgg agcaccccgg gcacggttgc 1021 aggtcttgtc attatcaccg gaggaatacg ggggacccag atattatgtg ttcgctttgc 1081 tatatgagga cctgtggcat gtttgtctac agtaagtgaa aattatgggc agtcggtgat 1141 agagtggtgg gtttggtgtg gtaatttttt tttaattttt acagttttgt ggtttaaaga 1201 attttgtatt gtgatttttt aaaaggtcct gtgtctgaac ctgagcctga gcccgagcca 1261 gaaccggagc ctgcaagacc tacccggcgt cctaaattgg tgcctgctat cctgagacgc 1321 ccgacatcac ctgtgtctag agaatgcaat agtagtacgg atagctgtga ctccggtcct 1381 tctaacacac ctcctgagat acacccggtg gtcccgctgt gccccattaa accagttgcc 1441 gtgagagttg gtgggcgtcg ccaggctgtg gaatgtatcg aggacttgct taacgagtct 1501 gggcaacctt tggacttgag ctgtaaacgc cccaggccat aaggtgtaaa cctgtgattg 1561 cgtgtgtggt taacgccttt gtttgctgaa tgagttgatg taagtttaat aaagggtgag 1621 
ataatgttta acttgcatgg cgtgttaaat ggggcggggc ttaaagggta tataatgcgc 1681 cgtgggctaa tcttggttac atctgacctc atggaggctt gggagtgttt ggaagatttt 1741 tctgctgtgc gtaacttgct ggaacagagc tctaacagta cctcttggtt ttggaggttt 1801 ctgtggggct cctcccaggc aaagttagtc tgcagaatta aggaggatta caagtgggaa 1861 tttgaagagc ttttgaaatc ctgtggtgag ctgtttgatt ctttgaatct gggtcaccag 1921 gcgcttttcc aagagaaggt catcaagact ttggattttt ccacaccggg gcgcgctgcg 1981 gctgctgttg cttttttgag ttttataaag gataaatgga gcgaagaaac ccatctgagc 2041 ggggggtacc tgctggattt tctggccatg catctgtgga gagcggtggt gagacacaag 2101 aatcgcctgc tactgttgtc ttccgtccgc ccggcaataa taccgacgga ggagcaacag 2161 caggaggaag ccaggcggcg gcggcggcag gagcagagcc catggaaccc gagagccggc 2221 ctggaccctc gggaatgaat gttgtacagg tggctgaact gtttccagaa ctgagacgca 2281 ttttaaccat taacgaggat gggcaggggc taaagggggt aaagagggag cggggggctt 2341 ctgaggctac agaggaggct aggaatctaa cttttagctt aatgaccaga caccgtcctg 2401 agtgtgttac ttttcagcag attaaggata attgcgctaa tgagcttgat ctgctggcgc 2461 agaagtattc catagagcag ctgaccactt actggctgca gccaggggat gattttgagg 2521 aggctattag ggtatatgca aaggtggcac ttaggccaga ttgcaagtac aagattagca 2581 aacttgtaaa tatcaggaat tgttgctaca tttctgggaa cggggccgag gtggagatag 2641 atacggagga tagggtggcc tttagatgta gcatgataaa tatgtggccg ggggtgcttg 2701 gcatggacgg ggtggttatt atgaatgtga ggtttactgg tcccaatttt agcggtacgg 2761 ttttcctggc caataccaat cttatcctac acggtgtaag cttctatggg tttaacaata 2821 cctgtgtgga agcctggacc gatgtaaggg ttcggggctg tgccttttac tgctgctgga 2881 agggggtggt gtgtcgcccc aaaagcaggg cttcaattaa gaaatgcctg tttgaaaggt 2941 gtaccttggg tatcctgtct gagggtaact ccagggtgcg ccacaatgtg gcctccgact 3001 gtggttgctt catgctagtg aaaagcgtgg ctgtgattaa gcataacatg gtgtgtggca 3061 actgcgagga cagggcctct cagatgctga cctgctcgga cggcaactgt cacttgctga 3121 agaccattca cgtagccagc cactctcgca aggcctggcc agtgtttgag cacaacatac 3181 tgacccgctg ttccttgcat ttgggtaaca ggaggggggt gttcctacct taccaatgca 3241 atttgagtca cactaagata ttgcttgagc ccgagagcat gtccaaggtg aacctgaacg 3301 gggtgtttga 
catgaccatg aagatctgga aggtgctgag gtacgatgag acccgcacca 3361 ggtgcagacc ctgcgagtgt ggcggtaaac atattaggaa ccagcctgtg atgctggatg 3421 tgaccgagga gctgaggccc gatcacttgg tgctggcctg cacccgcgct gagtttggct 3481 ctagcgatga agatacagat tgaggtactg aaatgtgtgg gcgtggctta agggtgggaa 3541 agaatatata aggtgggggt ctcatgtagt tttgtatctg ttttgcagca gccgccgcca 3601 tgagcgccaa ctcgtttgat ggaagcattg tgagctcata tttgacaacg cgcatgcccc 3661 catgggccgg ggtgcgtcag aatgtgatgg gctccagcat tgatggtcgc cccgtcctgc 3721 ccgcaaactc tactaccttg acctacgaga ccgtgtctgg aacgccgttg gagactgcag 3781 cctccgccgc cgcttcagcc gctgcagcca ccgcccgcgg gattgtgact gactttgctt 3841 tcctgagccc gcttgcaagc agtgcagctt cccgttcatc cgcccgcgat gacaagttga 3901 cggctctttt ggcacaattg gattctttga cccgggaact taatgtcgtt tctcagcagc 3961 tgttggatct gcgccagcag gtttctgccc tgaaggcttc ctcccctccc aatgcggttt 4021 aaaacataaa taaaaaccag actctgtttg gattttgatc aagcaagtgt cttgctgtct 4081 ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt cgttgagggt 4141 cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat acatgggcat 4201 aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg gggtggtgtt 4261 gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt ctttcagtag 4321 caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt taagctggga 4381 tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt tggctatgtt 4441 cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag tgtatccggt 4501 gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact tggagacgcc 4561 cttgtgacct ccgagatttt ccatgcattc gtccataatg atggcaatgg gcccacgggc 4621 ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt ccaggatgag 4681 atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg gtataatggt 4741 tccatccggc ccaggggcgt agttaccctc acagatttgc atttcccacg ctttgagttc 4801 agatgggggg atcatgtcta cctgcggggc gatgaagaaa accgtttccg gggtagggga 4861 gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc cggtgggccc 4921 gtaaatcaca cctattaccg gctgcaactg gtagttaaga gagctgcagc tgccgtcatc 4981 cctgagcagg ggggccactt 
cgttaagcat gtccctgact tgcatgtttt ccctgaccaa 5041 atgcgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag caaagttttt 5101 caacggtttg aggccgtccg ccgtaggcat gcttttgagc gtttgaccaa gcagttccag 5161 gcggtcccac agctcggtca cgtgctctac ggcatctcga tccagcatat ctcctcgttt 5221 cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag acgggccagg 5281 gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac ggtgaagggg 5341 tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct ggtgctgaag 5401 cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt gtcatagtcc 5461 agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc gccgcacgag 5521 gggcagtgca gacttttaag ggcgtagagc ttgggcgcga gaaataccga ttccggggag 5581 taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca ggtgagctct 5641 ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt cttacctctg 5701 gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc cccgtataca 5761 gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag aaactcggac 5821 cactctgaga cgaaggctcg cgtccaggcc agcacgaagg aggctaagtg ggaggggtag 5881 cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat gtcgccctct 5941 tcggcatcaa ggaaggtgat tggtttatag gtgtaggcca cgtgaccggg tgttcctgaa 6001 ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc atcgctgtct 6061 gcgagggcca gctgttgggg tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta 6121 agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc ggtgatgcct 6181 ttgagggtgg ccgcgtccat ctggtcagaa aagacaatct ttttgttgtc aagcttggtg 6241 gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag ggtttggttt 6301 ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc gcgcgcaacg 6361 caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac gcgccaaccg 6421 cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag gcgctcgttg 6481 gtccagcaga ggcggccgcc cttgcgcgaa cagaatggcg gtagtgggtc tagctgcgtc 6541 tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc gtcgaagtag 6601 tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc aagcgcgcgc 6661 tcgtatgggt tgagtggggg accccatggc 
atggggtggg tgagcgcgga ggcgtacatg 6721 ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt agggtagcat 6781 cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg agcgaggagg 6841 tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg cctgaagatg 6901 gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc gtctgtgaga 6961 cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac cagctcggcg 7021 gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc atacttatcc 7081 tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc tttccagtac 7141 tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta gaactggttg 7201 acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg cgcggccttc 7261 cggagcgagg tgtgggtgag cgcaaaggtg tccctaacca tgactttgag gtactggtat 7321 ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt gcgctttttg 7381 gaacgcgggt ttggcagggc gaaggtgaca tcgttgaaaa gtatctttcc cgcgcgaggc 7441 ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt aattacctgg 7501 gcggcgagca cgatctcgtc gaagccgttg atgttgtggc ccacgatgta aagttccaag 7561 aagcgcgggg tgcccttgat ggagggcaat tttttaagtt cctcgtaggt gagctcctca 7621 ggggagctga gcccgtgttc tgacagggcc cagtctgcaa gatgagggtt ggaagcgacg 7681 aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa ggtcctaaac 7741 tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg gtcttgttcc 7801 cagcggtccc atccaaggtc cacggctagg tctcgcgcgg cggtcaccag aggctcatct 7861 ccgccgaact tcataaccag catgaagggc acgagctgct tcccaaaggc ccccatccaa 7921 gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg cgagccgatc 7981 gggaagaact ggatctcccg ccaccagttg gaggagtggc tgttgatgtg gtgaaagtag 8041 aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc gcagtactgg 8101 cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg cacaaggaag 8161 cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc tacttcggct 8221 gcttgtcctt gaccgtctgg ctgctcgagg ggagttatgg tggatcggac caccacgccg 8281 cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac aacatcgcgc 8341 agatgggagc tgtccatggt ctggagctcc cgcggcgaca 
ggtcaggcgg gagctcctgc 8401 aggtttacct cgcatagccg ggtcagggcg cgggctaggt ccaggtgata cctgatttcc 8461 aggggctggt tggtggcggc gtcgatgact tgcaagaggc cgcatccccg cggcgcgact 8521 acggtaccgc gcggcgggcg gtgggccgcg ggggtgtcct tggatgatgc atctaaaagc 8581 ggtgacgcgg gcgggccccc ggaggtaggg ggggctcggg acccgccggg agagggggca 8641 ggggcacgtc ggcgccgcgc gcgggcagga gctggtgctg cgcgcggagg ttgctggcga 8701 acgcgacgac gcggcggttg atctcctgaa tctggcgcct ctgcgtgaag acgacgggcc 8761 cggtgagctt gaacctgaaa gagagttcga cagaatcaat ttcggtgtcg ttgacggcgg 8821 cctggcgcaa aatctcctgc acgtctcctg agttgtcttg ataggcgatt tcggccatga 8881 actgctcgat ctcttcctcc tggagatctc cgcgtccggc tcgctccacg gtggcggcga 8941 ggtcgttgga gatgcgggcc atgagctgcg agaaggcgtt gaggcctccc tcgttccaga 9001 cgcggctgta gaccacgccc ccttcggcat cgcgggcgcg catgaccacc tgcgcgagat 9061 tgagctccac gtgccgggcg aagacggcgt agtttcgcag gcgctgaaag aggtagttga 9121 gggtggtggc ggtgtgttct gccacgaaga agtacataac ccagcgtcgc aacgtggatt 9181 cgttgatatc ccccaaggcc tcaaggcgct ccatggcctc gtagaagtcc acggcgaagt 9241 tgaaaaactg ggagttgcgc gccgacacgg ttaactcctc ctccagaaga cggatgagct 9301 cggcgacagt gtcgcgcacc tcgcgctcaa aggctacagg ggcctcttct tcttcaatct 9361 cctcttccat aagggcctcc ccttcttctt cttcttctgg cggcggtggg ggagggggga 9421 cacggcggcg acgacggcgc accgggaggc ggtcgacaaa gcgctcgatc atctccccgc 9481 ggcgacggcg catggtctcg gtgacggcgc ggccgttctc gcgggggcgc agttggaaga 9541 cgccgcccgt catgtcccgg ttatgggttg gcggggggct gccgtgcggc agggatacgg 9601 cgctaacgat gcatctcaac aattgttgtg taggtactcc gccaccgagg gacctgagcg 9661 agtccgcatc gaccggatcg gaaaacctct cgagaaaggc gtctaaccag tcacagtcgc 9721 aaggtaggct gagcaccgtg gcgggcggca gcgggtggcg gtcggggttg tttctggcgg 9781 aggtgctgct gatgatgtaa ttaaagtagg cggtcttgag acggcggatg gtcgacagaa 9841 gcaccatgtc cttgggtccg gcctgctgaa tgcgcaggcg gtcggccatg ccccaggctt 9901 cgttttgaca tcggcgcagg tctttgtagt agtcttgcat gagcctttct accggcactt 9961 cttcttctcc ttcctcttgt cctgcatctc ttgcatctat cgctacggcg gcggcggagt 10021 ttggccgtag gtggcgccct cttcctccca tgcgtgtgac cccgaagccc 
ctcatcggct 10081 gaagcagggc caggtcggcg acaacgcgct cggctaatat ggcctgctgc acctgcgtga 10141 gggtagactg gaagtcatcc atgtccacaa agcggtggta tgcgcccgtg ttgatggtgt 10201 aagtgcagtt ggccataacg gaccagttaa cggtctggtg acccggctgc gagagctcgg 10261 tgtacctgag acgcgagtaa gcccttgagt caaagacgta gtcgttgcaa gtccgcacca 10321 ggtactgata tcccaccaaa aagtgcggcg gcggctggcg gtagaggggc cagcgtaggg 10381 tggccggggc tccgggggcg aggtcttcca acataaggcg atgatatccg tagatgtacc 10441 tggacatcca ggtgatgccg gcggcggtgg tggaggcgcg cggaaagtcg cggacgcggt 10501 tccagatgtt gcgcagcggc aaaaagtgct ccatggtcgg gacgctctgg ccggtgaggc 10561 gtgcgcagtc gttgacgctc tagaccgtgc aaaaggagag cctgtaagcg ggcactcttc 10621 cgtggtctgg tggataaatt cgcaagggta tcatggcgga cgaccggggt tcgaaccccg 10681 gatccggccg tccgccgtga tccatgcggt taccgcccgc gtgtcgaacc caggtgtgcg 10741 acgtcagaca acgggggagc gctccttttg gcttccttcc aggcgcggcg gctgctgcgc 10801 tagctttttt ggccactggc cgcgcgcggc gtaagcggtt aggctggaaa gcgaaagcat 10861 taagtggctc gctccctgta gccggagggt tattttccaa gggttgagtc gcaggacccc 10921 cggttcgagt ctcgggccgg ccggactgcg gcgaacgggg gtttgcctcc ccgtcatgca 10981 agaccccgct tgcaaattcc tccggaaaca gggacgagcc ccttttttgc ttttcccaga 11041 tgcatccggt gctgcggcag atgcgccccc ctcctcagca gcggcaagag caagagcagc 11101 ggcagacatg cagggcaccc tccccttctc ctaccgcgtc aggaggggca acatccgcgg 11161 ctgacgcggc ggcagatggt gattacgaac ccccgcggcg ccgggcccgg cactacctgg 11221 acttggagga gggcgagggc ctggcgcggc taggagcgcc ctctcctgag cgacacccaa 11281 gggtgcagct gaagcgtgac acgcgcgagg cgtacgtgcc gcggcagaac ctgtttcgcg 11341 accgcgaggg agaggagccc gaggagatgc gggatcgaaa gttccacgca gggcgcgagt 11401 tgcggcatgg cctgaaccgc gagcggttgc tgcgcgagga ggactttgag cccgacgcgc 11461 ggaccgggat tagtcccgcg cgcgcacacg tggcggccgc cgacctggta accgcgtacg 11521 agcagacggt gaaccaggag attaactttc aaaaaagctt taacaaccac gtgcgcacgc 11581 ttgtggcgcg cgaggaggtg gctataggac tgatgcatct gtgggacttt gtaagcgcgc 11641 tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata gtgcagcaca 11701 gcagggacaa cgaggcattc agggatgcgc 
tgctaaacat agtagagccc gagggccgct 11761 ggctgctcga tttgataaac attctgcaga gcatagtggt gcaggagcgc agcttgagcc 11821 tggctgacaa ggtggccgcc attaactatt ccatgctcag tctgggcaag ttttacgccc 11881 gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc gaggggttct 11941 acatgcgcat ggcgttgaag gtgcttacct tgagcgacga cctgggcgtt tatcgcaacg 12001 agcgcatcca caaggccgtg agcgtgagcc ggcggcgcga gctcagcgac cgcgagctga 12061 tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag gccgagtcct 12121 actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg gaggcagctg 12181 gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc ggcgtggagg 12241 aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg gtgatgtttc 12301 tgatcagatg atgcaagacg caacggaccc ggcggtgcgg gcggcgctgc agagccagcc 12361 gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca tgtcgctgac 12421 tgcgcgtaac cctgacgcgt tccggcagca gccgcaggcc aaccggctct ccgcaattct 12481 ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg cgatcgtaaa 12541 cgcgctggcc gaaaacaggg ccatccggcc cgatgaggcc ggcctggtct acgacgcgct 12601 gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg accggctggt 12661 gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg gcaacctggg 12721 ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc cgcggggaca 12781 ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga caccgcaaag 12841 tgaggtgtac cagtccgggc cagactattt tttccagacc agtagacaag gcctgcagac 12901 cgtaaacctg agccaggctt tcaagaactt gcaggggctg tggggggtgc gggctcccac 12961 aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt tgctgctgct 13021 aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag gtcacttgct 13081 gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt tccaggagat 13141 tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg caaccctgaa 13201 ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa acagcgagga 13261 ggagcgcatc ttgcgctatg tgcagcagag cgtgagcctt aacctgatgc gcgacggggt 13321 aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca tgtatgcctc 13381 aaaccggccg 
tttatcaatc gcctaatgga ctacttgcat cgcgcggccg ccgtgaaccc 13441 cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg gtttctacac 13501 cgggggattt gaggtgcccg agggtaacga tggattcctc tgggacgaca tagacgacag 13561 cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc aggcagaggc 13621 ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag gcgctgcggc 13681 cccgcggtca gatgcgagta gcccatttcc aagcttgata gggtctttta ccagcactcg 13741 caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc tgctgcagcc 13801 gcagcgcgaa aagaacctgc ctccggcatt tcccaacaac gggatagaga gcctagtgga 13861 caagatgagt agatggaaga cgtatgcgca ggagcacagg gatgtgcccg gcccgcgccc 13921 gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg acgatgactc 13981 ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg cgcaccttcg 14041 ccccaggctg gggagaatgt tttaaaaaaa aaaaaaaaaa gcatgatgca aaataaaaaa 14101 ctcaccaagg ccatggcacc gagcgttggt tttcttgtat tccccttagt atgcagcgcg 14161 cggcgatgta tgaggaaggt cctcctccct cctacgagag cgtggtgagc gcggcgccag 14221 tggcggcggc gctgggttcc cccttcgatg ctcccctgga cccgccgttt gtgcctccgc 14281 ggtacctgcg gcctaccggg gggagaaaca gcatccgtta ctctgagttg gcacccctat 14341 tcgacaccac ccgtgtgtac cttgtggaca acaagtcaac ggatgtggca tccctgaact 14401 accagaacga ccacagcaac tttctaacca cggtcattca aaacaatgac tacagcccgg 14461 gggaggcaag cacacagacc atcaatcttg acgaccgttc gcactggggc ggcgacctga 14521 aaaccatcct gcataccaac atgccaaatg tgaacgagtt catgtttacc aataagttta 14581 aggcgcgggt gatggtgtcg cgctcgctta ctaaggacaa acaggtggag ctgaaatatg 14641 agtgggtgga gttcacgctg cccgagggca actactccga gaccatgacc atagacctta 14701 tgaacaacgc gatcgtggag cactacttga aagtgggcag gcagaacggg gttctggaaa 14761 gcgacatcgg ggtaaagttt gacacccgca acttcagact ggggtttgac ccagtcactg 14821 gtcttgtcat gcctggggta tatacaaacg aagccttcca tccagacatc attttgctgc 14881 caggatgcgg ggtggacttc acccacagcc gcctgagcaa cttgttgggc atccgcaagc 14941 ggcaaccctt ccaggagggc tttaggatca cctacgatga cctggagggt ggtaacattc 15001 ccgcactgtt ggatgtggac gcctaccagg caagcttaaa agatgacacc gaacagggcg 
15061 gggatggcgc aggcggcggc aacaacagtg gcagcggcgc ggaagagaac tccaacgcgg 15121 cagccgcggc aatgcagccg gtggaggaca tgaacgatca tgccattcgc ggcgacacct 15181 ttgccacacg ggcggaggag aagcgcgctg aggccgaggc agcggcagaa gctgccgccc 15241 ccgctgcgca acccgaggtc gagaagcctc agaagaaacc ggtgatcaaa cccctgacag 15301 aggacagcaa gaaacgcagt tacaacctaa taagcaatga cagcaccttc acccagtacc 15361 gcagctggta ccttgcatac aactacggcg accctcagac cgggatccgc tcatggaccc 15421 tcctttgcac tcctgacgta acctgcggct cggagcaggt ctactggtcg ttgccagaca 15481 tgatgcaaga ccccgtgacc ttccgctcca cgagccagat cagcaacttt ccggtggtgg 15541 gcgccgagct gttgcccgtg cactccaaga gcttctacaa cgaccaggcc gtctactccc 15601 agctcatccg ccagtttacc tctctgaccc acgtgttcaa tcgctttccc gagaaccaga 15661 ttttggcgcg cccgccagcc cccaccatca ccaccgtcag tgaaaacgtt cctgctctca 15721 cagatcacgg gacgctaccg ctgcgcaaca gcatcggagg agtccagcga gtgaccatta 15781 ctgacgccag acgccgcacc tgcccctacg tttacaaggc cctgggcata gtctcgccgc 15841 gcgtcctatc gagccgcact ttttgagcaa acatgtccat ccttatatcg cccagcaata 15901 acacaggctg gggcctgcgc ttcccaagca agatgtttgg cggggcaaag aagcgctccg 15961 accaacaccc agtgcgcgtg cgcgggcact accgcgcgcc ctggggcgcg cacaaacgcg 16021 gccgcactgg gcgcaccacc gtcgatgacg ccattgacgc ggtggtggag gaggcgcgca 16081 actacacgcc cacgccgcca ccagtgtcca cagtggacgc ggccattcag accgtggtgc 16141 gcggagcccg gcgttatgct aaaatgaaga gacggcggag gcgcgtagca cgtcgccacc 16201 gccgccgacc cggcactgcc gcccaacgcg cggcggcggc cctgcttaac cgcgcacgtc 16261 gcaccggccg acgggcggcc atgcgggccg ctcgaaggct ggccgcgggt attgtcactg 16321 tgccccccag gtccaggcga cgagcggccg ccgcagcagc cgcggccatt agtgctatga 16381 ctcagggtcg caggggcaac gtgtactggg tgcgcgactc ggttagcggc ctgcgcgtgc 16441 ccgtgcgcac ccgccccccg cgcaactaga ttgcaagaaa aaactactta gactcgtact 16501 gttgtatgta tccagcggcg gcggcgcgca acgaagctat gtccaagcgc aaaatcaaag 16561 aagagatgct ccaggtcatc gcgccggaga tctatggccc cccgaagaag gaagagcagg 16621 attacaagcc ccgaaagcta aagcgggtca aaaagaaaaa gaaagatgat gatgatgatg 16681 aacttgacga cgaggtggaa ctgctgcacg caaccgcgcc 
caggcggcgg gtacagtgga 16741 aaggtcgacg cgtaagacgt gttttgcgac ccggcaccac cgtagttttt acgcccggtg 16801 agcgctccac ccgcacctac aagcgcgtgt atgatgaggt gtacggcgac gaggacctgc 16861 ttgagcaggc caacgagcgc ctcggggagt ttgcctacgg aaagcggcat aaggacatgt 16921 tggcgttgcc gctggacgag ggcaacccaa cacctagcct aaagcccgtg acactgcagc 16981 aggtgctgcc cacgcttgca ccgtccgaag aaaagcgcgg cctaaagcgc gagtctggtg 17041 acttggcacc caccgtgcag ctgatggtac ccaagcgcca gcgactggaa gatgtcttgg 17101 aaaaaatgac cgtggagcct gggctggagc ccgaggtccg cgtgcggcca atcaagcagg 17161 tggcaccggg actgggcgtg cagaccgtgg acgttcagat acccaccacc agtagcacta 17221 gtattgccac tgccacagag ggcatggaga cacaaacgtc cccggttgcc tcggcggtgg 17281 cagatgccgc ggtgcaggcg gccgctgcgg ccgcgtccaa aacctctacg gaggtgcaaa 17341 cggacccgtg gatgtttcgc gtttcagccc cccggcgccc gcgccgttcc aggaagtacg 17401 gcaccgccag cgcactactg cccgaatatg ccctacatcc ttccatcgcg cctacccccg 17461 gctatcgtgg ctacacctac cgccccagaa gacgagcgac tacccgacgc cgaaccacca 17521 ctggaacccg ccgccgccgt cgccgtcgcc agcccgtgct ggccccgatt tccgtgcgca 17581 gggtggctcg cgaaggaggc aggaccctgg tgctgccaac agcgcgctac caccccagca 17641 tcgtttaaaa gccggtcttt gtggttcttg cagatatggc cctcacctgc cgcctccgtt 17701 tcccggtgcc gggattccga ggaagaatgc accgtaggag gggcatggcc ggccacggcc 17761 tgacgggcgg catgcgtcgt gcgcaccacc ggcggcggcg cgcgtcgcac cgtcgcatgc 17821 gcggcggtat cctgcccctc cttattccac tgatcgccgc ggcgattggc gccgtgcccg 17881 gaattgcatc cgtggccttg caggcgcaga gacactgatt aaaaacaagt tgcatgtgga 17941 aaaatcaaaa taaaaagtct ggagtctcac gctcgcttgg tcctgtaact attttgtaga 18001 atggaagaca tcaactttgc gtctctggcc ccgcgacacg gctcgcgccc gttcatggga 18061 aactggcaag atatcggcac cagcaatatg agcggtggcg ccttcagctg gggctcgctg 18121 tggagcggca ttaaaaattt cggttccacc attaagaact atggcagcaa ggcctggaac 18181 agcagcacag gccagatgct gagggacaag ttgaaagagc aaaatttcca acaaaaggtg 18241 gtagatggcc tggcctctgg cattagcggg gtggtggacc tggccaacca ggcagtgcaa 18301 aataagatta acagtaagct tgatccccgc cctcccgtag aggagcctcc accggccgtg 18361 gagacagtgt ctccagaggg 
gcgtggcgaa aagcgtccgc ggcccgacag ggaagaaact 18421 ctggtgacgc aaatagatga gcctccctcg tacgaggagg cactaaagca aggcctgccc 18481 accacccgtc ccatcgcgcc catggctacc ggagtgctgg gccagcacac acctgtaacg 18541 ctggacctgc ctccccccgc tgacacccag cagaaacctg tgctgccagg gccgtccgcc 18601 gttgttgtaa cccgccctag ccgcgcgtcc ctgcgccgtg ccgccagcgg tccgcgatcg 18661 atgcggcccg tagccagtgg caactggcaa agcacactga acagcatcgt gggtctgggg 18721 gtgcaatccc tgaagcgccg acgatgcttc taaatagcta acgtgtcgta tgtgtcatgt 18781 atgcgtccat gtcgccgcca gaggagctgc tgagccgccg tgcgcccgct ttccaagatg 18841 gctacccctt cgatgatgcc gcagtggtct tacatgcaca tctcgggcca ggacgcctcg 18901 gagtacctga gccccgggct ggtgcagttt gcccgcgcca ccgagacgta cttcagcctg 18961 aataacaagt ttagaaaccc cacggtggca cctacgcacg acgtaaccac agaccggtcc 19021 cagcgtttga cgctgcggtt catccctgtg gaccgcgagg ataccgcgta ctcgtacaaa 19081 gcgcggttca ccctggctgt gggtgacaac cgtgtgcttg atatggcttc cacgtacttt 19141 gacatccgcg gcgtgctgga cagggggcct acttttaagc cctactccgg cactgcctac 19201 aacgctctag ctcccaaggg cgctcctaac tcctgtgagt gggaacaaac cgaagatagc 19261 ggccgggcag ttgccgagga tgaagaagag gaagatgaag atgaagaaga ggaagaagaa 19321 gagcaaaacg ctcgagatca ggctactaag aaaacacatg tctatgccca ggctcctttg 19381 tctggagaaa caattacaaa aagcgggcta caaataggat cagacaatgc agaaacacaa 19441 gctaaacctg tatacgcaga tccttcctat caaccagaac ctcaaattgg cgaatctcag 19501 tggaacgaag ctgatgctaa tgcggcagga gggagagtgc ttaaaaaaac aactcccatg 19561 aaaccatgct atggatctta tgccaggcct acaaatcctt ttggtggtca atccgttctg 19621 gttccggatg aaaaaggggt gcctcttcca aaggttgact tgcaattctt ctcaaatact 19681 acctctttga acgaccggca aggcaatgct actaaaccaa aagtggtttt gtacagtgaa 19741 gatgtaaata tggaaacccc agacacacat ctgtcttaca aacctggaaa aggtgatgaa 19801 aattctaaag ctatgttggg tcaacaatct atgccaaaca gacccaatta cattgctttc 19861 agggacaatt ttattggcct aatgtattat aacagcactg gcaacatggg tgttcttgct 19921 ggtcaggcat cgcagctaaa tgccgtggta gatttgcaag acagaaacac agagctgtcc 19981 tatcaactct tgcttgattc cataggtgat agaaccagat atttttctat gtggaatcag 20041 
gctgtagaca gctatgatcc agatgttaga atcattgaaa accatggaac tgaggatgaa 20101 ttgccaaatt attgttttcc tcttgggggt attggggtaa ctgacaccta tcaagctatt 20161 aaggctaatg gcaatggctc aggcgataat ggagatacta catggacaaa agatgaaact 20221 tttgcaacac gtaatgaaat aggagtgggt aacaactttg ccatggaaat taacctaaat 20281 gccaacctat ggagaaattt cctttactcc aatattgcgc tgtacctgcc agacaagcta 20341 aaatacaacc ccaccaatgt ggaaatatct gacaacccca acacctacga ctacatgaac 20401 aagcgagtgg tggctcccgg gcttgtagac tgctacatta accttggggc gcgctggtct 20461 ctggactaca tggacaacgt taatcccttt aaccaccacc gcaatgcggg cctccgttat 20521 cgctccatgt tgttgggaaa cggccgctac gtgccctttc acattcaggt gccccaaaag 20581 ttttttgcca ttaaaaacct cctcctcctg ccaggctcat atacatatga atggaacttc 20641 aggaaggatg ttaacatggt tctgcagagc tctctgggaa acgatcttag agttgacggg 20701 gctagcatta agtttgacag catttgtctt tacgccacct tcttccccat ggcccacaac 20761 acggcctcca cgctggaagc catgctcaga aatgacacca acgaccagtc ctttaatgac 20821 tacctttccg ccgccaacat gctatacccc atacccgcca acgccaccaa cgtgcccatc 20881 tccatcccat cgcgcaactg ggcagcattt cgcggttggg ccttcacacg cttgaagaca 20941 aaggaaaccc cttccctggg atcaggctac gacccttact acacctactc tggctccata 21001 ccataccttg acggaacctt ctatcttaat cacaccttta agaaggtggc cattaccttt 21061 gactcttctg ttagctggcc gggcaacgac cgcctgctta ctcccaatga gtttgagatt 21121 aaacgctcag ttgacgggga gggctacaac gtagctcagt gcaacatgac caaggactgg 21181 ttcctggtgc agatgttggc caactacaat attggctacc agggcttcta cattccagaa 21241 agctacaagg accgcatgta ctcgttcttc agaaacttcc agcccatgag ccggcaagtg 21301 gttgacgata ctaaatacaa ggagtatcag caggttggaa ttcttcacca gcataacaac 21361 tcaggattcg taggctacct cgctcccacc atgcgcgagg gacaggctta ccccgccaac 21421 gtgccctacc cactaatagg caaaaccgcg gttgacagta ttacccagaa aaagtttctt 21481 tgcgatcgca ccctttggcg catcccattc tccagtaact ttatgtccat gggcgcactc 21541 acagacctgg gccaaaacct tctctacgcc aactccgccc acgcgctaga catgactttt 21601 gaggtggatc ccatggacga gcccaccctt ctttatgttt tgtttgaagt ctttgacgtg 21661 gtccgtgtgc accagccgca ccgcggcgtc atcgagaccg tgtacctgcg 
cacgcccttc 21721 tcggccggca acgccacaac ataaaagaag caagcaacat caacaacagc tgccgccatg 21781 ggctccagtg agcaggaact gaaagccatt gtcaaagatc ttggttgtgg gccatatttt 21841 ttgggcacct atgacaagcg ctttccaggc tttgtttctc cacacaagct cgcctgcgcc 21901 atagtcaata cggccggtcg cgagactggg ggcgtacact ggatggcctt tgcctggaac 21961 ccgcgctcaa aaacatgcta cctctttgag ccctttggct tttctgacca acgactcaag 22021 caggtttacc agtttgagta cgagtcactc ctgcgccgta gcgccattgc ttcttccccc 22081 gaccgctgta taacgctgga aaagtccacc caaagcgtgc aggggcccaa ctcggccgcc 22141 tgtggactat tctgctgcat gtttctccac gcctttgcca actggcccca aactcccatg 22201 gatcacaacc ccaccatgaa ccttattacc ggggtaccca actccatgct taacagtccc 22261 caggtacagc ccaccctgcg tcgcaaccag gaacagctct acagcttcct ggagcgccac 22321 tcgccctact tccgcagcca cagtgcgcag attaggagcg ccacttcttt ttgtcacttg 22381 aaaaacatgt aaaaataatg tactaggaga cactttcaat aaaggcaaat gtttttattt 22441 gtacactctc gggtgattat ttacccccca cccttgccgt ctgcgccgtt taaaaatcaa 22501 aggggttctg ccgcgcatcg ctatgcgcca ctggcaggga cacgttgcga tactggtgtt 22561 tagtgctcca cttaaactca ggcacaacca tccgcggcag ctcggtgaag ttttcactcc 22621 acaggctgcg caccatcacc aacgcgttta gcaggtcggg cgccgatatc ttgaagtcgc 22681 agttggggcc tccgccctgc gcgcgcgagt tgcgatacac agggttgcag cactggaaca 22741 ctatcagcgc cgggtggtgc acgctggcca gcacgctctt gtcggagatc agatccgcgt 22801 ccaggtcctc cgcgttgctc agggcgaacg gagtcaactt tggtagctgc cttcccaaaa 22861 agggtgcatg cccaggcttt gagttgcact cgcaccgtag tggcatcaga aggtgaccgt 22921 gcccggtctg ggcgttagga tacagcgcct gcatgaaagc cttgatctgc ttaaaagcca 22981 cctgagcctt tgcgccttca gagaagaaca tgccgcaaga cttgccggaa aactgattgg 23041 ccggacaggc cgcgtcatgc acgcagcacc ttgcgtcggt gttggagatc tgcaccacat 23101 ttcggcccca ccggttcttc acgatcttgg ccttgctaga ctgctccttc agcgcgcgct 23161 gcccgttttc gctcgtcaca tccatttcaa tcacgtgctc cttatttatc ataatgctcc 23221 cgtgtagaca cttaagctcg ccttcgatct cagcgcagcg gtgcagccac aacgcgcagc 23281 ccgtgggctc gtggtgcttg taggttacct ctgcaaacga ctgcaggtac gcctgcagga 23341 atcgccccat catcgtcaca aaggtcttgt 
tgctggtgaa ggtcagctgc aacccgcggt 23401 gctcctcgtt tagccaggtc ttgcatacgg ccgccagagc ttccacttgg tcaggcagta 23461 gcttgaagtt tgcctttaga tcgttatcca cgtggtactt gtccatcaac gcgcgcgcag 23521 cctccatgcc cttctcccac gcagacacga tcggcaggct cagcgggttt atcaccgtgc 23581 tttcactttc cgcttcactg gactcttcct tttcctcttg cgtccgcata ccccgcgcca 23641 ctgggtcgtc ttcattcagc cgccgcaccg tgcgcttacc tcccttgccg tgcttgatta 23701 gcaccggtgg gttgctgaaa cccaccattt gtagcgccac atcttctctt tcttcctcgc 23761 tgtccacgat cacctctggg gatggcgggc gctcgggctt gggagagggg cgcttctttt 23821 tctttttgga cgcaatggcc aaatccgccg tcgaggtcga tggccgcggg ctgggtgtgc 23881 gcggcaccag cgcatcttgt gacgagtctt cttcgtcctc ggactcgaga cgccgcctca 23941 gccgcttttt tgggggcgcg cggggaggcg gcggcgacgg cgacggggac gacacgtcct 24001 ccatggttgg tggacgtcgc gccgcaccgc gtccgcgctc gggggtggtt tcgcgctgct 24061 cctcttcccg actggccatt tccttctcct ataggcagaa aaagatcatg gagtcagtcg 24121 agaaggagga cagcctaacc gccccctttg agttcgccac caccgcctcc accgatgccg 24181 ccaacgcgcc taccaccttc cccgtcgagg cacccccgct tgaggaggag gaagtgatta 24241 tcgagcagga cccaggtttt gtaagcgaag acgacgagga tcgctcagta ccaacagagg 24301 ataaaaagca agaccaggac gacgcagagg caaacgagga acaagtcggg cggggggacc 24361 aaaggcatgg cgactaccta gatgtgggag acgacgtgct gttgaagcat ctgcagcgcc 24421 agtgcgccat tatctgcgac gcgttgcaag agcgcagcga tgtgcccctc gccatagcgg 24481 atgtcagcct tgcctacgaa cgccacctgt tctcaccgcg cgtacccccc aaacgccaag 24541 aaaacggcac atgcgagccc aacccgcgcc tcaacttcta ccccgtattt gccgtgccag 24601 aggtgcttgc cacctatcac atctttttcc aaaactgcaa gataccccta tcctgccgtg 24661 ccaaccgcag ccgagcggac aagcagctgg ccttgcggca gggcgctgtc atacctgata 24721 tcgcctcgct cgacgaagtg ccaaaaatct ttgagggtct tggacgcgac gagaaacgcg 24781 cggcaaacgc tctgcaacaa gaaaacagcg aaaatgaaag tcactgtgga gtgctggtgg 24841 aacttgaggg tgacaacgcg cgcctagccg tgctgaaacg cagcatcgag gtcacccact 24901 ttgcctaccc ggcacttaac ctacccccca aggttatgag cacagtcatg agcgagctga 24961 tcgtgcgccg tgcacgaccc ctggagaggg atgcaaactt gcaagaacaa accgaggagg 25021 gcctacccgc 
agttggcgat gagcagctgg cgcgctggct tgagacgcgc gagcctgccg 25081 acttggagga gcgacgcaag ctaatgatgg ccgcagtgct tgttaccgtg gagcttgagt 25141 gcatgcagcg gttctttgct gacccggaga tgcagcgcaa gctagaggaa acgttgcact 25201 acacctttcg ccagggctac gtgcgccagg cctgcaaaat ttccaacgtg gagctctgca 25261 acctggtctc ctaccttgga attttgcacg aaaaccgcct cgggcaaaac gtgcttcatt 25321 ccacgctcaa gggcgaggcg cgccgcgact acgtccgcga ctgcgtttac ttatttctgt 25381 gctacacctg gcaaacggcc atgggcgtgt ggcagcaatg cctggaggag cgcaacctaa 25441 aggagctgca gaagctgcta aagcaaaact tgaaggacct atggacggcc ttcaacgagc 25501 gctccgtggc cgcgcacctg gcggacatta tcttccccga acgcctgctt aaaaccctgc 25561 aacagggtct gccagacttc accagtcaaa gcatgttgca aaactttagg aactttatcc 25621 tagagcgttc aggaattctg cccgccacct gctgtgcgct tcctagcgac tttgtgccca 25681 ttaagtaccg tgaatgccct ccgccgcttt ggggtcactg ctaccttctg cagctagcca 25741 actaccttgc ctaccactcc gacatcatgg aagacgtgag cggtgacggc ctactggagt 25801 gtcactgtcg ctgcaaccta tgcaccccgc accgctccct ggtctgcaat tcgcaactgc 25861 ttagcgaaag tcaaattatc ggtacctttg agctgcaggg tccctcgcct gacgaaaagt 25921 ccgcggctcc ggggttgaaa ctcactccgg ggctgtggac gtcggcttac cttcgcaaat 25981 ttgtacctga ggactaccac gcccacgaga ttaggttcta cgaagaccaa tcccgcccgc 26041 caaatgcgga gcttaccgcc tgcgtcatta cccagggcca catccttggc caattgcaag 26101 ccatcaacaa agcccgccaa gagtttctgc tacgaaaggg acggggggtt tacctggacc 26161 cccagtccgg cgaggagctc aacccaatcc ccccgccgcc gcagccctat cagcagccgc 26221 gggcccttgc ttcccaggat ggcacccaaa aagaagctgc agctgccgcc gccgccaccc 26281 acggacgagg aggaatactg ggacagtcag gcagaggagg ttttggacga ggaggaggag 26341 atgatggaag actgggacag cctagacgaa gcttccgagg ccgaagaggt gtcagacgaa 26401 acaccgtcac cctcggtcgc attcccctcg ccggcgcccc agaaattggc aaccgttccc 26461 agcatcgcta caacctccgc tcctcaggcg ccgccggcac tgcctgttcg ccgacccaac 26521 cgtagatggg acaccactgg aaccagggcc ggtaagtcta agcagccgcc gccgttagcc 26581 caagagcaac aacagcgcca aggctaccgc tcgtggcgcg ggcacaagaa cgccatagtt 26641 gcttgcttgc aagactgtgg gggcaacatc tccttcgccc gccgctttct tctctaccat 
26701 cacggcgtgg ccttcccccg taacatcctg cattactacc gtcatctcta cagcccctac 26761 tgcaccggcg gcagcggcag cggcagcaac agcagcggtc acacagaagc aaaggcgacc 26821 ggatagcaag actctgacaa agcccaagaa atccacagcg gcggcagcag caggaggagg 26881 agcgctgcgt ctggcgccca acgaacccgt atcgacccgc gagcttagaa ataggatttt 26941 tcccactctg tatgctatat ttcaacaaag caggggccaa gaacaagagc tgaaaataaa 27001 aaacaggtct ctgcgctccc tcacccgcag ctgcctgtat cacaaaagcg aagatcagct 27061 tcggcgcacg ctggaagacg cggaggctct cttcagcaaa tactgcgcgc tgactcttaa 27121 ggactagttt cgcgcccttt ctcaaattta agcgcgaaaa ctacgtcatc tccagcggcc 27181 acacccggcg ccagcacctg tcgtcagcgc cattatgagc aaggaaattc ccacgcccta 27241 catgtggagt taccagccac aaatgggact tgcggctgga gctgcccaag actactcaac 27301 ccgaataaac tacatgagcg cgggacccca catgatatcc cgggtcaacg gaatccgcgc 27361 ccaccgaaac cgaattctcc tcgaacaggc ggctattacc accacacctc gtaataacct 27421 taatccccgt agttggcccg ctgccctggt gtaccaggaa agtcccgctc ccaccactgt 27481 ggtacttccc agagacgccc aggccgaagt tcagatgact aactcagggg cgcagcttgc 27541 gggcggcttt cgtcacaggg tgcggtcgcc cgggcagggt ataactcacc tgaaaatcag 27601 agggcgaggt attcagctca acgacgagtc ggtgagctcc tctcttggtc tccgtccgga 27661 cgggacattt cagatcggcg gcgctggccg ctcttcattt acgccccgtc aggcgatcct 27721 aactctgcag acctcgtcct cggagccgcg ctccggaggc attggaactc tacaatttat 27781 tgaggagttc gtgccttcgg tttacttcaa ccccttttct ggacctcccg gccactaccc 27841 ggaccagttt attcccaact ttgacgcggt gaaagactcg gcggacggct acgactgaat 27901 gaccagtgga gaggcagagc gactgcgcct gacacacctc gaccactgcc gccgccacaa 27961 gtgctttgcc cgcggctccg gtgagttttg ttactttgaa ttgcccgaag agcatatcga 28021 gggcccggcg cacggcgtcc ggctcaccac ccaggtagag cttacacgta gcctgattcg 28081 ggagtttacc aagcgccccc tgctagtgga gcgggagcgg ggtccctgtg ttctgaccgt 28141 ggtttgcaac tgtcctaacc ctggattaca tcaagatctt tgttgtcatc tctgtgctga 28201 gtataataaa tacagaaatt agaatctact ggggctcctg tcgccatcct gtgaacgcca 28261 ccgtttttac ccacccaaag cagaccaaag caaacctcac ctccggtttg cacaagcggg 28321 ccaataagta ccttacctgg tactttaacg gctcttcatt 
tgtaatttac aacagtttcc 28381 agcgagacga agtaagtttg ccacacaacc ttctcggctt caactacacc gtcaagaaaa 28441 acaccaccac caccaccctc ctcacctgcc gggaacgtac gagtgcgtca ccggttgctg 28501 cgcccacacc tacagcctga gcgtaaccag acattactcc catttttcca aaacaggagg 28561 tgagctcaac tcccggaact caggtcaaaa aagcattttg cggggtgctg ggatttttta 28621 attaagtata tgagcaattc aagtaactct acaagcttgt ctaatttttc tggaattggg 28681 gtcggggtta tccttactct tgtaattctg tttattctta tactagcact tctgtgcctt 28741 agggttgccg cctgctgcac gcacgtttgt acctattgtc agctttttaa acgctggggg 28801 caacatccaa gatgaggtac atgattttag gcttgctcgc ccttgcggca gtctgcagcg 28861 ctgccaaaaa ggttgagttt aaggaaccag cttgcaatgt tacatttaaa tcagaagcta 28921 atgaatgcac tactcttata aaatgcacca cagaacatga aaagcttatt attcgccaca 28981 aagacaaaat tggcaagtat gctgtatatg ctatttggca gccaggtgac actaacgact 29041 ataatgtcac agtcttccaa ggtgaaaatc gtaaaacttt tatgtataaa tttccatttt 29101 atgaaatgtg cgatattacc atgtacatga gcaaacagta caagttgtgg cccccacaaa 29161 agtgtttaga gaacactggc accttttgtt ccaccgctct gcttattaca gcgcttgctt 29221 tggtatgtac cttactttat ctcaaataca aaagcagacg cagttttatt gatgaaaaga 29281 aaatgccttg attttccgct tgcttgtatt cccctggaca atttactcta tgtgggatat 29341 gctccaggcg ggcaagatta tacccacaac cttcaaatca aactttcctg gacgttagcg 29401 cctgatttct gccagcgcct gcactgcaaa tttgatcaaa cccagcttca gcttgcctgc 29461 tccagagatg accggctcaa ccatcgcgcc cacaacggac tatcgcaaca ccactgctac 29521 cggactaaca tctgccctaa atttacccca agttcatgcc tttgtcaatg actgggcgag 29581 cttggacatg tggtggtttt ccatagcgct tatgtttgtt tgccttatta ttatgtggct 29641 tatttgttgc ctaaagcgca gacgcgccag accccccatc tataggccta tcattgtgct 29701 caacccacac aatgaaaaaa ttcatagatt ggacggtctg aaaccatgtt ctcttctttt 29761 acagtatgat taaatgagac atgattcctc gagttcttat attattgacc cttgttgcgc 29821 ttttctgtgc gtgctctaca ttggccgcgg tcgctcacat cgaagtagat tgcatcccac 29881 ctttcacagt ttacctgctt tacggatttg tcacccttat cctcatctgc agcctcgtca 29941 ctgtagtcat cgccttcatt cagttcattg actgggtttg tgtgcgcatt gcgtacctca 30001 ggcaccatcc gcaatacaga 
gacaggacta tagctgatct tctcagaatt ctttaattat 30061 gaaacggagt gtcatttttg ttttgctgat tttttgcgcc ctacctgtgc tttgctccca 30121 aacctcagcg cctcccaaaa gacatatttc ctgcagattc actcaaatat ggaacattcc 30181 cagctgctac aacaaacaga gcgatttgtc agaagcctgg ttatacgcca tcatctctgt 30241 catggttttt tgcagtacca tttttgccct agccatatat ccataccttg acattggctg 30301 gaatgccata gatgccatga accaccctac tttcccagtg cccgctgtca taccactgca 30361 acaggttatt gccccaatca atcagcctcg ccccccttct cccaccccca ctgagattag 30421 ctactttaat ttgacaggtg gagatgactg aatctctaga tctagaattg gatggaatta 30481 acaccgaaca gcgcctacta gaaaggcgca aggcggcgtc cgagcgagaa cgcctaaaac 30541 aagaagttga agacatggtt aacctacacc agtgtaaaag aggtatcttt tgtgtggtca 30601 agcaggccaa acttacctac gaaaaaacca ctaccggcaa ccgcctcagc tacaagctac 30661 ccacccagcg ccaaaaactg gtgcttatgg tgggagaaaa acctatcacc gtcacccagc 30721 actcggcaga aacagagggc tgcctgcact tcccctatca gggtccagag gacctctgca 30781 ctcttattaa aaccatgtgt ggtattagag atcttattcc attcaactaa cataaacaca 30841 caataaatta cttacttaaa atcagtcagc aaatctttgt ccagcttatt cagcatcacc 30901 tcctttcctt cctcccaact ctggtatctc agccgccttt tagctgcaaa ctttctccaa 30961 agtttaaatg ggatgtcaaa ttcctcatgt tcttgtccct ccgcacccac tatcttcata 31021 ttgttgcaga tgaaacgcgc cagaccgtct gaagacacct tcaaccccgt gtatccatat 31081 gacacagaaa ccgggcctcc aactgtgccc tttcttaccc ctccatttgt ttcacccaat 31141 ggtttccaag aaagtccccc tggagttctc tctctacgcg tctccgaacc tttggacacc 31201 tcccacggca tgcttgcgct taaaatgggc agcggtctta ccctagacaa ggccggaaac 31261 ctcacctccc aaaatgtaac cactgttact cagccactta aaaaaacaaa gtcaaacata 31321 agtttggaca cctccgcacc acttacaatt acctcaggcg ccctaacagt ggcaaccacc 31381 gctcctctga tagttactag cggcgctctt agcgtacagt cacaagcccc actgaccgtg 31441 caagactcca aactaagcat tgctactaaa gggcccatta cagtgtcaga tggaaagcta 31501 gccctgcaaa catcagcccc cctctctggc agtgacagcg acacccttac tgtaactgca 31561 tcacccccgc taactactgc cacgggtagc ttgggcatta acatggaaga tcctatttat 31621 gtaaataatg gaaaaatagg aattaaaata agcggtcctt tgcaagtagc acaaaactcc 31681 
gatacactaa cagtagttac tggaccaggt gtcaccgttg aacaaaactc ccttagaacc 31741 aaagttgcag gagctattgg ttatgattca tcaaacaaca tggaaattaa aacgggcggt 31801 ggcatgcgta taaataacaa cttgttaatt ctagatgtgg attacccatt tgatgctcaa 31861 acaaaactac gtcttaaact ggggcaggga cccctgtata ttaatgcatc tcataacttg 31921 gacataaact ataacagagg cctatacctt tttaatgcat caaacaatac taaaaaactg 31981 gaagttagca taaaaaaatc cagtggacta aactttgata atactgccat agctataaat 32041 gcaggaaagg gtctggagtt tgatacaaac acatctgagt ctccagatat caacccaata 32101 aaaactaaaa ttggctctgg cattgattac aatgaaaacg gtgccatgat tactaaactt 32161 ggagcgggtt taagctttga caactcaggg gccattacaa taggaaacaa aaatgatgac 32221 aaacttaccc tgtggacaac cccagaccca tctcctaact gcagaattca ttcagataat 32281 gactgcaaat ttactttggt tcttacaaaa tgtgggagtc aagtactagc tactgtagct 32341 gctttggctg tatctggaga tctttcatcc atgacaggca ccgttgcaag tgttagtata 32401 ttccttagat ttgaccaaaa cggtgttcta atggagaact cctcacttaa aaaacattac 32461 tggaacttta gaaatgggaa ctcaactaat gcaaatccat acacaaatgc agttggattt 32521 atgcctaacc ttctagccta tccaaaaacc caaagtcaaa ctgctaaaaa taacattgtc 32581 agtcaagttt acttgcatgg tgataaaact aaacctatga tacttaccat tacacttaat 32641 ggcactagtg aatccacaga aactagcgag gtaagcactt actctatgtc ttttacatgg 32701 tcctgggaaa gtggaaaata caccactgaa acttttgcta ccaactctta caccttctcc 32761 tacattgccc aggaataaag aatcgtgaac ctgttgcatg ttatgtttca acgtgtttat 32821 ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32881 tagcttatat tgatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32941 acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33001 catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33061 caaacgctca tcagtgatat taataaactc cccgggcagc tcgcttaagt tcatgtcgct 33121 gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgctcaacgg gcggcgaagg 33181 ggaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33241 ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33301 ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc atgagacgcc 
ttgtcctccg 33361 ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33421 aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33481 agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33541 cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33601 tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33661 ctgcccgccg gctatgcact gcagggaacc gggactggaa caatgacagt ggagagccca 33721 ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33781 cacgtgcata cacttcctca ggattacaag ctcctcccgc gtcagaacca tatcccaggg 33841 aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33901 cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33961 agcgcgggtc tctgtctcaa aaggaggtag gcgatcccta ctgtacggag tgcgccgaga 34021 caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34081 tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgtcgct 34141 tagctcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34201 tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34261 cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34321 cgggaagagc tggaagaacc atgttttttt tttttttatt ccaaaagatt atccaaaacc 34381 tcaaaatgaa gatctattaa gtgaacgcgc tcccctccgg tggcgtggtc aaactctaca 34441 gccaaagaac agataatggc atttgtaaga tgttgcacaa tggcttccaa aaggcaaact 34501 gccctcacgt ccaagtggac gtaaaggcta aacccttcag ggtgaatctc ctctataaac 34561 attccagcac cttcaaccat gcccaaataa ttttcatctc gccaccttat caatatgtct 34621 ctaagcaaat cccgaatatt aagtccggcc attgtaaaaa tctgctccag agcgccctcc 34681 accttcagcc tcaagcagcg aatcatgatt gcaaaaattc aggttcctca cagacctgta 34741 taagattcaa aagcggaaca ttaacaaaaa taccgcgatc ccgtaggtcc cttcgcaggg 34801 ccagctgaac ataatcgtgc aggtctgcac ggaccagcgc ggccacttcc ccgccaggaa 34861 ccatgacaaa agaacccaca ctgattatga cacgcatact cggagctatg ctaaccagcg 34921 tagcccctat gtaagcttgt tgcatgggcg gcgatataaa atgcaaggtg ctgctcaaaa 34981 aatcaggcaa agcctcgcgc aaaaaagcaa 
gcacatcgta gtcatgctca tgcagataaa 35041 ggcaggtaag ttccggaacc accacagaaa aagacaccat ttttctctca aacatgtctg 35101 cgggttcctg cattaaacac aaaataaaat aacaaaaaaa aacatttaaa cattagaagc 35161 ctgtcttaca acaggaaaaa caacccttat aagcataaga cggactacgg ccatgccggc 35221 gtgaccgtaa aaaaactggt caccgtgatt aaaaagcacc accgacagtt cctcggtcat 35281 gtccggagtc ataatgtaag actcggtaaa cacatcaggt tggttaacat cggtcagtgc 35341 taaaaagcga ccgaaatagc ccgggggaat acatacccgc aggcgtagag acaacattac 35401 agcccccata ggaggtataa caaaattaat aggagagaaa aacacataaa cacctgaaaa 35461 accctcctgc ctaggcaaaa tagcaccctc ccgctccaga acaacataca gcgcttccac 35521 agcggcagcc ataacagtca gccttaccag taaaaaaacc tattaaaaaa caccactcga 35581 cacggcacca gctcaatcag tcacagtgta aaaagggcca agtacagagc gagtatatat 35641 aggactaaaa aatgacgtaa cggttaaagt ccacaaaaaa cacccagaaa accgcacgcg 35701 aacctacgcc cagaaacgaa agccaaaaaa cccacaactt cctcaaatct tcacttccgt 35761 tttcccacga tacgtcactt cccattttaa aaaaactaca attcccaata catgcaagtt 35821 actccgccct aaaacctacg tcacccgccc cgttcccacg ccccgcgcca cgtcacaaac 35881 tccaccccct cattatcata ttggcttcaa tccaaaataa ggtatattat gatgatg // LOCUS ADBITR 196 bp ds-DNA VRL 15-JUN-1988 DEFINITION Canine adenovirus type 2 5' inverted terminal repeat (ITR). ACCESSION M17111 KEYWORDS inverted terminal repeat. SOURCE Canine adenovirus type 2 (strain Toronto) DNA, passed in canine cell line MDCK. ORGANISM Mastadenovirus c2 Viridae; ds-DNA nonenveloped viruses; Adenoviridae. REFERENCE 1 (bases 1 to 196) AUTHORS Shinagawa,M., Iida,Y., Matsuda,A., Tsukiyama,T. and Sato,G. TITLE Phylogenetic relationships between adenoviruses as inferred from nucleotide sequences of inverted terminal repeats JOURNAL Gene 55, 85-93 (1987) STANDARD full staff_entry COMMENT Draft entry and printed copy of sequence for [1] kindly provided by M.Shinagawa, 29-SEP-1987. FEATURES from to/span description rpt 1 196 Ad2 inverted terminal repeat BASE COUNT 38 a 41 c 69 g 48 t ORIGIN Unreported. 
1 catcatcaat aatatacagg acaaagaggt gtggcttaaa tttgggtgtt gcaaggggcg 61 gggtcatggg acggtcaggt tcaggtcacg ccctggtcag ggtgttccca cgggaatgtc 121 cagtgacgtg aaaggcgtgg ttttacgaca gggcgacttc cgcggacttt tggccggcgc 181 ccggtttttg ggcgtt // LOCUS ADBPHI 84 bp ds-DNA VRL 15-JUN-1988 DEFINITION Adenovirus type 2 DNA, clone p-phi-4-SVA. ACCESSION M12460 KEYWORDS . SOURCE Adenovirus type 2 DNA. ORGANISM Mastadenovirus 2 Viridae; ds-DNA nonenveloped viruses; Adenoviridae. REFERENCE 1 (bases 1 to 84) AUTHORS Lewis,E.D. and Manley,J.L. TITLE Control of adenovirus late promoter expression in two human cell lines JOURNAL Mol. Cell. Biol. 5, 2433-2442 (1985) STANDARD simple staff_entry FEATURES from to/span description mRNA 70 > 84 viral mRNA BASE COUNT 14 a 20 c 30 g 20 t ORIGIN Unreported. 1 ataggtgtag gccacgtgac cgggtgttcc tgaagggggc tataaaaggg ggtgggggcg 61 ttcgtcctca ctctcttccg catc // LOCUS ADBVAI 215 bp ds-DNA VRL 15-MAR-1988 DEFINITION Adenovirus type 2 VAI gene, 5' flank. ACCESSION M10228 KEYWORDS . SOURCE Adenovirus type 2 DNA. ORGANISM Mastadenovirus 2 Viridae; ds-DNA nonenveloped viruses; Adenoviridae. REFERENCE 1 (bases 1 to 215) AUTHORS Bhat,R.A., Domer,P.H. and Thimmappaya,B. TITLE Structural requirements of adenovirus VAI RNA for its translation enhancement function JOURNAL Mol. Cell. Biol. 5, 187-196 (1985) STANDARD full staff_entry COMMENT Mutations in the region between positions 53 and 63, and 117 and the 3' end of the gene, result in a considerable loss of activity in the VAI gene. Because no one mutation completely abolishes function, the VAI RNA may have multiple functional sites for its translation modulation function. FEATURES from to/span description RNA 8 > 215 VAI RNA (alt.) RNA 11 > 215 VAI RNA (alt.) BASE COUNT 30 a 66 c 72 g 47 t ORIGIN Unreported. 
1 gcctgtaagc gggcactctt ccgtggtctg gtggataaat tcgcaagggt atcatggcgg 61 acgaccgggg ttcgaacccc ggatccggcc gtccgccgtg atccatgcgg ttaccgcccg 121 cgtgtcgaac ccaggtgtgc gacgtcagac aacgggggag cgctcctttt ggcttccttc 181 caggcgcggc ggctgctgcg ctagcttttt tggcc // LOCUS ADBVAIPRO 62 bp ds-DNA VRL 12-MAR-1984 DEFINITION Adenovirus 2 VAI RNA gene promoter region. ACCESSION K00523 KEYWORDS promoter. SOURCE Adenovirus type 2 DNA. ORGANISM Mastadenovirus 2 Viridae; ds-DNA nonenveloped viruses; Adenoviridae. REFERENCE 1 (bases 1 to 62) AUTHORS Bhat,R.A., Metz,B. and Thimmappaya,B. TITLE Organization of the noncontiguous promoter components of adenovirus VAI RNA gene is strikingly similar to that of eucaryotic tRNA genes JOURNAL Mol. Cell. Biol. 3, 1996-2005 (1983) STANDARD simple staff_review COMMENT The internal promoter regions were determined by mutants containing deletions, insertions and substitutions. The distance between the A and B regulatory sequences must be longer than 35 base pairs for transcription to take place. FEATURES from to/span description signal 1 9 VAI RNA internal promoter A signal 45 60 VAI RNA internal promoter B BASE COUNT 12 a 15 c 22 g 13 t ORIGIN 1 tccgtggtct ggtggataaa ttcgcaaggg tatcatggcg gacgaccggg gttcgaaccc 61 cg // LOCUS ADZHEX 2849 bp ds-DNA VRL 04-SEP-1984 DEFINITION Bovine adenovirus type 3 (BAV-3) hexon gene. ACCESSION K01264 KEYWORDS hexon. SOURCE Bovine adenovirus type 3 DNA. ORGANISM Mastadenovirus bos3 Viridae; ds-DNA nonenveloped viruses; Adenoviridae. REFERENCE 1 (bases 1 to 2849) AUTHORS Hu,S.-L., Hays,W.W. and Potts,D.E. TITLE Sequence homology between bovine and human adenoviruses JOURNAL J. Virol. 49, 604-608 (1984) STANDARD full staff_review COMMENT [1] compares the predicted amino acid sequences of the bovine adenovirus type 3 with human adenovirus type 2 hexon polypeptides and finds three regions lacking homology. 
FEATURES from to/span description pept 47 2782 BAV-3 hexon BASE COUNT 673 a 858 c 703 g 615 t ORIGIN 626 bp upstream of BamHI site (at 54.0 map units). 1 ccctctgtgt gacacgtcct cgccagagcg tgattgattg accgagatgg ctaccccgtc 61 gatgctgccg caatggtcct acatgcacat cgccggtcag gacgcgtccg agtacctgtc 121 ccccggcttg gtgcaattcg cacaagccac cgaatcctac tttaacattg ggaacaagtt 181 tagaaacccc accgtcgccc cgacgcacga tgtcaccacg gagcgttcgc agcgtctgca 241 gctccgcttc gtgcccgtag accgggagga cacacagtac tcctacaaaa cccgcttcca 301 gctagccgtg ggcgacaacc gggtgctgga catggccagc acgtattttg acatccgcgg 361 tacgctggac aggggcgcca gtttcaagcc ttacagcggc acggcctaca actcctttgc 421 ccccaagagt gcccctaaca atacgcagtt taggcaggcc aacaacggtc atcctgctca 481 gaccatagct caagcttctt acgtggctac catcggcggt gccaacaatg acttgcaaat 541 gggtgtggac gagcgtcagc tgccggtgta tgcgaacact acgtaccagc cggaacctca 601 gctcggcatt gaaggttgga cagctggatc catggcggtc atcgatcaag caggcgggcg 661 ggttctcagg aaccctactc aaactccctg ctacgggtcc tatgctaagc cgactaacga 721 gcacgggggc attactaaag caaacactca ggtggagaaa aagtactaca gaacagggga 781 caacggtaac ccggaaacag tgttttatac tgaagaggct gacgtgctaa cgcccgacac 841 ccaccttgtt cacgcggtac cggccgcgga tcgggcaaag gtggaggggc tatctcagca 901 cgcagctccc aacaggccga actttatcgg ctttcgggac tgctttgtag gcttgatgta 961 ttataacagc gggggcaacc tgggcgtctt agcgggtcaa tcctctcagc tgaatgccgt 1021 ggtagacctg caagaccgca acactgagct ttcctatcag atgcttcttg caaacacgac 1081 ggacagatcc cgctatttta gcatgtggaa ccaagccatg gactcgtacg acccggaggt 1141 cagggtgata gataacgtgg gcgtagagga cgagatgcct aattactgct ttccgttgtc 1201 gggggttcag attggaaacc gtagccacga ggttcaaaga aaccaacaac agtggcaaaa 1261 tgtagctaat agtgacaaca attacatagg caaggggaac ctaccggcca tggagataaa 1321 tctagcggcc aatctctggc gttccttttt gtacagtaat gtggcgttgt acttgccaga 1381 caaccttaaa ttcacccctc acaacattca actcccgcct aacacgaaca cctacgagta 1441 catgaacggg cgaatccccg ttagcggcct tattgatacg tacgtaaata taggcacgcg 1501 gtggtcgccc gatgtgatgg acaacgtgaa tccctttaac caccaccgca actcgggcct 1561 gcgttaccgc 
tcccagctgc tgggcaacgg ccgcttctgc gactttcaca ttcaggtgcc 1621 acaaaagttt tttgctattc gaaacctgct tctcctgccc ggcacgtaca cttacgagtg 1681 gtcctttaga aaggacgtaa acatgatcct tcagagcact ctgggcaatg atctgcgggt 1741 cgatggggcc actgttaata ttaccagcgt caacctctac gccagcttct ttcccatgtc 1801 acataacacc gcttccactt tggaagctat gctccgcaac gacactaatg accagtcttt 1861 taatgactat ctctcggcgg ctaacatgtt gtatcccatt ccgcccaatg ccacccaact 1921 gcccatcccc tcacgcaact gggcagcgtt ccgtggctgg agtctcaccc ggctaaaaca 1981 gagggagaca ccggcgctgg ggtccccgtt cgatccctat ttcacctatt cgggcaccat 2041 cccgtacctg gacggcactt tttacctcag ccacaccttt cgcaaggtgg ccatccagtt 2101 tgactcttct gtgacctggc ccggcaatga caggctttta acccctaacg agttcgaaat 2161 aaaaataagt gtggacggtg aaggctacaa cgtggctcag agcaatatga ctaaggactg 2221 gttcctggtg cagatgctag cgaattacaa cataggctac cagggatatc acctgccccc 2281 ggactacaag gacaggacat tttccttcct gcataacttc atacccatgt gccgacaggt 2341 tcccaaccca gcaaccgagg gctactttgg actaggcata gtgaaccata gaacaactcc 2401 ggcttattgg tttcgattct gccgcgctcc gcgcgagggc cacccctacc cccaactggc 2461 cttaccccct cattgggacc cacgccatgc cctccgtgac ccagagagaa agtttctctg 2521 cgaccgcacc ctctggcgaa tccccttctc ctcgaacttc atgtccatgg ggtccctcac 2581 agatctcgga cagaacctac tgtatgccaa tgccgcgcat gccctagaca tgacttttga 2641 gatggatccc atcaatgagc ccactctgct gtacgttctg tttgaggtgt ttgacgtggc 2701 ccgcgttcac cagccccaca gaggcgtgat cgaagtggtg tacttgagaa cgccattctc 2761 agccggcaac gctaccacat aagtgccggc ttccctctca ggccccgcga tgggttctcg 2821 ggaagaggag ctgagattca tccttcacg // LOCUS ECOLON 3002 bp ds-DNA BCT 22-MAR-1990 DEFINITION E.coli ATP-dependent protease La (lon) gene, complete cds. ACCESSION J03896 KEYWORDS heat shock protein; protease. SOURCE E.coli DNA, clone pJMC40. ORGANISM Escherichia coli Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Facultatively anaerobic rods; Enterobacteriaceae. REFERENCE 1 (bases 1 to 3002) AUTHORS Chin,D.T., Goff,S.A., Webster,T., Smith,T. and Goldberg,A.L. 
TITLE Sequence of the lon gene in Escherichia coli: A heat-shock gene which encodes the ATP-dependent protease La JOURNAL J. Biol. Chem. 263, 11718-11728 (1988) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by D.T.Chin, 22-MAR-1990. FEATURES from to/span description pept 413 2764 protease La (lon) mRNA 340 > 2764 lon mRNA signal 298 301 -35 region signal 322 325 -10 region BASE COUNT 790 a 688 c 846 g 678 t ORIGIN 26 bp upstream of DdeI site. 1 cgtgacgagg cgctggatgc tatcgctaag aaagcgatgg cgcgtaaaac cggtgcccgt 61 ggcctgcgtt ccatcgtaga agccgcactg ctcgatacca tgtacgatct gccgtccatg 121 gaagacgtcg aaaaagtggt tatcgacgag tcggtaattg atggtcaaag caaaccgttg 181 ctgatttatg gcaagccgga agcgcaacag gcatctggtg aataattaac cattcccata 241 caattagtta accaaaaagg ggggatttta tctccccttt aatttttcct ctattctcgg 301 cgttgaatgt gggggaaaca tccccatata ctgacgtaca tgttaataga tggcgtgaag 361 cacagtcgtg tcatctgatt acctggcgga aattaaacta agagagagct ctatgaatcc 421 tgagcgttct gaacgcattg aaatccccgt attgccgctg cgcgatgtgg tggtttatcc 481 gcacatggtc atccccttat ttgtcgggcg ggaaaaatct atccgttgtc tggaagcggc 541 gatggaccat gataaaaaaa ttatgctggt cgcgcagaaa gaagcttcaa cggatgagcc 601 gggtgtaaac gatcttttca ccgtcgggac cgtggcctct atattgcaga tgctgaaact 661 gcctgacggc accgtcaaag tgctggtcga ggggttacag cgcgcgcgta tttctgcgct 721 ctctgacaat ggcgaacact tttctgcgaa ggcggagtat ctggagtcgc cgaccattga 781 tgagcgggaa caggaagtgc tggtgcgtac tgcaatcagc cagttcgaag gctacatcaa 841 gctgaacaaa aaaatcccac cagaagtgct gacgtcgctg aatagcatcg acgatccggc 901 gcgtctggcg gataccattg ctgcacatat gccgctgaaa ctggctgaca aacagtctgt 961 tctggagatg tccgacgtta acgaacgtct ggaatatctg atggcaatga tggaatcgga 1021 aatcgatctg ctgcaggttg agaaacgcat tcgcaaccgc gttaaaaagc agatggagaa 1081 atcccagcgt gagtactatc tgaacgagca aatgaaagct attcagaaag aactcggtga 1141 aatggacgac gcgccggacg aaaacgaagc cctgaagcgc aaaatcgacg cggcgaagat 1201 gcgaaagagg caaaagagaa agcggacagg agttgcagaa gctgaaaatg atgtctccga 1261 tgtcggcaga 
agcgaccgta gtgcgtggtt atatcgactg gatggtacag gtgccgtgga 1321 atgcgcgtac gaaggtcaaa aaagacctgc gtcaggcgca gaaatccttg ataccgacca 1381 ttatggtctg gagcgcgtga aagatcgaat ccttgagtat cttgcggttc aaagccgtgt 1441 caacaaaatc aagggaccga tcctctgcct ggtagggccg ccgggggtag gtaaaacctc 1501 tcttggtcag tccattgcca aagccaccgg gcgtaaatat gtccgtatgg cgctgggcgg 1561 cgtgcgtgat gaagcggaaa tccgtggtca ccgccgtact tacatcggtt ctatgccggg 1621 taaactgatc cagaaaatgg cgaaagtggg cgtgaaaaac ccgctgttcc tgctcgatga 1681 gatcgacaaa atgtcttctg acatgcgtgg cgatccggcc tctgcactgc ttgaagtgct 1741 ggatccagag cagaacgtag cgttcagcga ccactacctg gaagtggatt acgatctcag 1801 cgacgtgatg tttgtcgcga cgtcgaactc catgaacatt ccggcaccgc tgctggatcg 1861 tatggaagtg attcgcctct ccggttatac cgaagatgaa aaactgaaca tcgccaaacg 1921 tcacctgctg ccgaagcaga ttgaacgtaa tgcactgaaa aaaggtgagc tgaccgtcga 1981 cgatagcgcc attatcggca ttattcgtta ctacacccgt gagcgggcgt gcgtggtctg 2041 gagcgtgaaa tctccaaact gtgtcgcaaa gcggttaagc agttactgct cgataacgtc 2101 attaaaacat atcgaaatta acggcgataa cctgcatgac tatctcggtg ttcagcgttt 2161 cgactatggt cgcgcggata acgaaaaccg tgtcggtcag gtaaccggtc tggcgtggac 2221 ggaagtgggc ggtgacttgc tgaccattga aaccgcatgt gttccgggta aaggcaaact 2281 gacctatacc ggttcgctcg gcgaagtgat gcaggagtcc attcaggcgg cgttaacggt 2341 ggttcgtgcg cgtgcggaaa aactggggat caaccctgat ttttacgaaa aacgtgacat 2401 ccacgtccac gtaccggaag gtgcgacgcc gaaagatggt ccgagtgccg gtattgctat 2461 gtgcaccgcg ctggtttctt gcctgaccgg taacccggtt cgtgccgatg tggcaatgac 2521 cggtgagatc actctgcgtg gtcaggtact gccgatcggt ggtttgaaag aaaaactcct 2581 ggcagcgcat cgcggcggga ttaaaacagt gctaattccg ttcgaaaata aacgcgatct 2641 ggaagagatt cctgacaacg taattgccga tctggacatt catcctgtga agcgcattga 2701 ggaagttctg actctggcgc tgcaaaatga accgtctggt atgcaggttg tgactgcaaa 2761 atagtgacct cgcgcaaaat gcactaataa aaacagggct ggcaggctaa ttcgcgttgc 2821 cagccttttt ttgtctcgct aagttagatg gcggatcggg cttgccctta ttaaggggtg 2881 ttgtaagggg atggctggcc tgatataact cgtgcgcgtt cgtaccttga aggattcaag 2941 tgcgatataa attataaaga 
ggaagagaag agtgaataaa tctcaattga tcgacaagat 3001 tg // LOCUS ECOMUTZ 158 bp ds-DNA SYN 12-FEB-1990 DEFINITION E.coli multiple cloning site. ACCESSION M31881 J05270 KEYWORDS . SOURCE E.coli and synthetic DNA. ORGANISM Artificial gene Artificial sequences; Genes. REFERENCE 1 (bases 1 to 158) AUTHORS Chan,P.T. and Lebowitz,J. TITLE Site directed mutagenesis of lacUV5 promoter JOURNAL J. Biol. Chem. (1900) In press STANDARD simple staff_entry COMMENT Draft entry and computer-readable sequence [1] kindly submitted by J. Lebowitz, 29-JAN-1990. BASE COUNT 47 a 44 c 33 g 34 t ORIGIN 1 ccctttcgtc ttcaagaatt cccgggatcc gtcgacctgc agatctctag aagcttctag 61 agatcttcca tacctaccaa ttctgcgcct gcagcaatgc aacaacatta cccggatcaa 121 tcaggggata acgcaggaaa gaacatgtga gcaaaagg // LOCUS HAMADBJN 683 bp ds-DNA ROD 15-JUN-1988 DEFINITION Adenovirus type 2/hamster cell right junction. ACCESSION M12466 KEYWORDS recombinant joint. SOURCE Hamster cell line HE5 derived from Ad2 infected embryo cells DNA. ORGANISM Mesocricetus sp. Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Cricetidae; Cricetinae; Cricetini. REFERENCE 1 (bases 1 to 683) AUTHORS Gahlmann,R., Leisten,R., Vardimon,L. and Doerfler,W. TITLE Patch homologies and the integration of adenovirus DNA in mammalian cells JOURNAL EMBO J. 1, 1101-1104 (1982) STANDARD simple staff_entry FEATURES from to/span description recomb 282 283 Ad2 DNA end/hamster DNA start BASE COUNT 223 a 182 c 100 g 178 t ORIGIN 297 bp upstream of DdeI site. 
1 aaaaatgacg taacggttaa agtccacaaa aaacacccag aaaaccgcac gcgaacctac 61 gcccagaaac gaaagccaaa aaacccacaa cttcctcaaa tcttcacttc cgttttccca 121 cgatacgtca cttcccattt taaaaaaact acaattccca atacatgcaa gttactccgc 181 cctaaaacct acgtcacccg ccccgttccc acgccccgcg ccacgtcaca aactccaccc 241 cctcattatc atattggctt caatccaaaa taaggtatat tactctcatc tattgtctaa 301 gtaaaaacta aattcatgaa gaatattcat ttttaagagc atagatttct gaattagaaa 361 aaagttgttt ttgttctgtt ttggataaaa tcttgctaca taacccaggt taaactcaaa 421 ctcagggtcc tcctgtctca gcctccagct gttataaaat ctaaattcta cccactcact 481 acagcaggga gtgggggcgc acacagggat ggtggaccct aggtagctaa catactatga 541 ccagccacat ttgtacagtg gggtcacagc tgtagtatct taaagaaaac ttgtcaaggg 601 actcatggca aaaaacaccc cacaaacttc aattagttcc ttcgactttt tgaaacagtt 661 actttgtttt tctgacactt taa // LOCUS HAMADBL1 151 bp ds-DNA ROD 22-SEP-1986 DEFINITION Adenovirus type 2/hamster DNA, left junction. ACCESSION M12409 KEYWORDS recombination joint. SEGMENT 1 of 2 SOURCE Hamster LSH embryo cell line HE5 DNA, clone lambda-L1, subclone pXba4. ORGANISM Mesocricetus sp. Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Cricetidae; Cricetinae; Cricetini. REFERENCE 1 (bases 1 to 151) AUTHORS Gahlmann,R. and Doerfler,W. TITLE Integration of viral DNA into the genome of the adenovirus type 2-transformed hamster cell line HE5 without loss or alteration of cellular nucleotides JOURNAL Nucleic Acids Res. 11, 7347-7361 (1983) STANDARD full staff_review COMMENT Draft entry and clean copy sequence for [1] kindly provided by R.Gahlmann and W.Doerfler, 15-JUL-1986. See comment in segment 2. FEATURES from to/span description recomb 131 132 cellular DNA end/Ad2 DNA start BASE COUNT 46 a 29 c 26 g 50 t ORIGIN Unreported. 
1 accttcaaca aatgaacagc acagattaag cataatgctg cctgaccatc attttattac 61 tactaaaatc ccctttgctc tctatttcat ggtggggtag tcattatggg aatggaggta 121 aaacagctta tatatacctt attttggatt g // LOCUS HAMADBL2 141 bp ds-DNA ROD 22-SEP-1986 DEFINITION Hamster/adenovirus type 2 DNA, right junction. ACCESSION M12408 KEYWORDS recombination joint. SEGMENT 2 of 2 SOURCE Hamster LSH embryo cell line HE5 DNA, clone lambda-24, subclone p24. ORGANISM Mesocricetus sp. Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Myomorpha; Cricetidae; Cricetinae; Cricetini. REFERENCE 1 (bases 1 to 141) AUTHORS Gahlmann,R. and Doerfler,W. TITLE Integration of viral DNA into the genome of the adenovirus type 2-transformed hamster cell line HE5 without loss or alteration of cellular nucleotides JOURNAL Nucleic Acids Res. 11, 7347-7361 (1983) STANDARD full staff_review COMMENT Draft entry and clean copy sequence for [1] kindly provided by R.Gahlmann and W.Doerfler, 15-JUL-1986. The terminal ten and eight nucleotides of the Ad2 DNA were deleted at the left and right sites of junction, respectively. The integrated viral DNA had an internal deletion between map units 35 and 82 on the Ad2 genome. FEATURES from to/span description recomb 22 23 Ad2 DNA end/hamster DNA start BASE COUNT 52 a 17 c 18 g 54 t ORIGIN About 55 kb after segment 1. 1 caatccaaaa taaggtatat tactctcatc tattgtctaa gtaaaaacta aattcatgaa 61 gaatattcat ttttaagagc atagatttct gaattagaaa aaagttgttt ttgttctgtt 121 ttggataaaa tcttgctaca t // LOCUS MLCRR16S 1234 bp ss-rRNA RNA 30-OCT-1989 DEFINITION M.capsulatus 16S rRNA. ACCESSION M29023 KEYWORDS 16S ribosomal RNA. SOURCE M.capsulatus (strain BATH) rRNA. ORGANISM Methylococcus capsulatus Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Acidiphilium cryptum; Methylococcaceae. REFERENCE 1 (bases 1 to 1234) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. 
TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. FEATURES from to/span description rRNA 1 1234 16S rRNA BASE COUNT 305 a 275 c 364 g 250 t 40 others ORIGIN 1 accvgagagt cgacctggct cagatcgaac gctggcggca tgcnaacaca tgcaagtcga 61 acggtaacag cgcntccagg gcgctgcgag tggcggacgg gtgagttaat gcgtaggaat 121 ctrccttgtt gtgggggata actcggggaa acccragcta ataccgcata cgcctcacgg 181 aggaaagcgg gggatcttcg gacctcgcgc aataagatga gcctacgtcg gattagctag 241 ntngtggggt naaggccnac caacgcgacg atccnnagct ngtctgagag gacgatcagc 301 cacactncna ctgagagacg atccagnctc tacggaggca gcagtgggaa tattggacaa 361 tgggcgcaag cctgatccag caatgccgcg tgtgtgagag aaggctncgg gttgtnaagc 421 actttaagca gggaagaagg gtggggtgtt aataccatct cacattgacg ttacctgcag 481 aataagcacc ggctagctcg gtgccagcag cgcgtaatac gagngtgcaa gcgttaatcg 541 gaattactgg gcgtaaagcc acgttatcan tttcataagt ctcratgtva acctnggcnn 601 aacmtmnnaa cncatanata ctcgctatct aggtagtgga ttcgagtgta gagtgcaatg 661 cgtagagatc cgagaacacc agtgcgaagg cggctccttg gaccaacact gacgctgagg 721 tgcgaaagcg tggggagcaa acaggataga taccctngta gtccacgctg taaacgatgt 781 caactagccg ttggaggggt tnaaccnntt agtggcgcag ctaacgcgat aagttgacng 841 cctggggagt acggccgcaa ggttaaaact ctaaatgaat tgtacggggc cgcactaagc 901 ggggagtcat gtngatttaa ttcgatnaac gcgaagaacc ttacctggcc ttgacatgct 961 tggaatcctg cagagatgcg gggtgcttcg ggagccaaga cacaggtact gcatggctnt 1021 cgtcagctcg tgtcgtgaga tgttgggtta attagcctta tggcttaggg ctwcgctaca 1081 gagctggtac agtgggtgac gtagccgtca gggttggtag cctaagagct aatctccaga 1141 aagctcttag tccggattgc agtctgcaac tcgactgcat gaagtcggaa tcgctagtaa 1201 tcgcggatca gcatgctgcg gtnaatacgt tccc // LOCUS MTBRR16S 1353 bp ss-rRNA RNA 30-OCT-1989 DEFINITION M.extorquens 16S rRNA. ACCESSION M29027 KEYWORDS 16S ribosomal RNA. SOURCE M.extorquens (strain AM1) rRNA. 
ORGANISM Methylobacterium extorquens Prokaryota; Bacteria; Methylmonadaceae. REFERENCE 1 (bases 1 to 1353) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. FEATURES from to/span description rRNA 1 1353 16S rRNA BASE COUNT 334 a 305 c 410 g 276 t 28 others ORIGIN 1 tcaactcgag agtttgatcc tggctcagag cgaacgctgg cggcatggcn taacacatgc 61 aagtcgaacg ggcacttcgg gtgtcagtgg catagacggg tgagtactaa ctacgtggga 121 acgtaccttc ggttcggaat aactcaggga aacthnagct aataccggat acgccttttg 181 gggaaaggtt tactcggaag gatcggcccg cgtccgatta gctggttcgt gaggtaacgg 241 cctaccaagg cgacgatcag tagctagtct gagaggatga tcagcaacac tgggactgag 301 acacggccca gactcctacg cggggggcng cacgtcgggg aatattggac aatgggcgca 361 agcctgatcc agccatgccg cgtgacgtga tgaaggcctt agggttgtaa agctctttca 421 tcggacgata aggmctgccg gttaataccc ggtagactga cattacctat acaagaagcc 481 cgggctaact ccgtgtcagc tagtcggwta athcnaaggg gnnnagcntt gctcggaatc 541 nctgggcgta aaggggcgth ggcngaanat thagtcgggg gtgaaagcct atactcaacc 601 achgaattgc cttcgatact ggttatcttn agacaggaag aggacagcgg actgcgagtg 661 tagaggtvaa atacgtagat attcgcaaga acaccagtgg cgaaagwggc tctrtgccga 721 nactgacgct gaggtcgtaa agcgtgggga gcaaacagga ttagataccc tggtagtcca 781 cgccgtaaac gatgaatgct caaggttggc tgcttgcagg ttagtggtgg agctaacgca 841 ttaagttgac cgcctgggga gtacggcgca aggcttaaaa ctcaaatgaa ttgacggggg 901 ctccgcacaa gcggtggagc atgtggttta attcgatgna acgcgaagaa ccttacctac 961 ccttgacatc canagaatct tgtagagata gcggagtgcc ttcgggcgct ctgagacagg 1021 tgctgcatgg ctgtcgtcag ctcgtgtcgt gagatgttgg gttaagtccc gcaacgagcg 1081 caacccacgt cnacgtagtt agcatcattc agttgggcac tctagggaga ctgccggtga 1141 taagccgcga ggaaggtgtg gatgacgtga caagtcctca atacttacgg gatgggctac 1201 acacgtgcta 
caatggcggt gacagtggga cgaagccgcg aggtrgagct aatccccaaa 1261 agccgtctca gttcggattg cactctgcaa ctcgagtgca tgaagtcgga atcgctagta 1321 atcgcggatc agcatgccgc ggtnaatacg ttc // LOCUS MTBRR16SA 1052 bp ss-rRNA RNA 30-OCT-1989 DEFINITION Methylobacterium sp. 16S rRNA. ACCESSION M29029 KEYWORDS 16S ribosomal RNA. SOURCE Methylobacterium sp. (strain DM4) rRNA. ORGANISM Methylobacterium sp. Prokaryota; Bacteria; Methylmonadaceae. REFERENCE 1 (bases 1 to 1052) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. FEATURES from to/span description rRNA 1 1052 16S rRNA BASE COUNT 233 a 218 c 324 g 196 t 81 others ORIGIN 1 tcaacttgag agtttgatcc tggctcagag cgaacgctgg cggcaggctt aacacatgca 61 agtcgaacgg gcactncggg nagtggcaga cgggtgagta acacgtggga acgtaccttc 121 ggttcggaat aactcaggga aactnnagct aataccggat acgccntttg gggaaaggtt 181 tatcgcgaag gatcggcnnr cgtctgatta gcttgttggt gagggtaavg gcctaccaag 241 gcgacgatca gtagctngtc tgagaggatg atcagccacn ctgggactga ggacnatggg 301 cgcaagccnn atccagccat gccgngngag tgatgaaggc cnnagggtng taaagcnnnn 361 ttgtccggga cgataatgac ggtaccggaa gaataacccc ggctnacttc gggattgcct 421 tcgatactng tkggcttgag accggaagag gacagcggaa ctgcgagtgt agaggtvaaa 481 ttngtagata ttcgcaagaa caccagtggc gaaggcggcn tactggtccn nnnctgacgc 541 tgaggcgnna aagcgtgggg agcaaacagg atnagatacc ctngtagtcc acgcngtaaa 601 cgatgaatgg ccncggtngg cctgcttgca ggtnagtggc gccgnnacgc attaagcatt 661 rcgrcctggg agtacggtcg caagattraa ncccatccct tantggcatk ttacckkgrg 721 agatgggnnt ctsttcggag gngtgacacn ggtgcngcat ggctntcgtn agnnngtgtc 781 gtgagatgtt gggtnaaagt tgccatcatt cagttgggca ctctagggag actgcnggtg 841 ataagccgcg aggaaggtgt ggatgacgnn aagtcmdcat ggndacggga tnggctacac 901 acgtgctaca atggcggtga 
cagtggganc gagccgcgag gtggagcaaa tcccnaaaaa 961 ccgtctcngt tcggattgcn ctctgcaact cgggtgcatg aaggcggnat cgctngtnat 1021 cgtggatcag cacgccgcgg tnnatacgtt cc // LOCUS MTBRR16SB 1316 bp ss-rRNA RNA 30-OCT-1989 DEFINITION M.organophilum 16S rRNA. ACCESSION M29028 KEYWORDS 16S ribosomal RNA. SOURCE M.organophilum (strain XX) rRNA. ORGANISM Methylobacterium organophilum Prokaryota; Bacteria; Methylmonadaceae. REFERENCE 1 (bases 1 to 1316) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. FEATURES from to/span description rRNA 1 1316 16S rRNA BASE COUNT 313 a 298 c 402 g 258 t 45 others ORIGIN 1 caacttgtga gtttgatcct ggctcagagc gaacgctggc ggcaggcnna acacatgcaa 61 gtngaacgca tctcggnatn agtngcagac nggtgagtaa cacgtggaac gtgacccttt 121 rgttcggaat aactcaggga aactnnagct nataccggct acgcccacaa ggggaaaggt 181 ttactncgga aggatgagcc cgcgtctgat tagctagttg gtggggtaac ggcctaccaa 241 ggcgacgatc agtagctngt ctgagaggat gatcagccac actrggactg agacacggcc 301 cngcctcgta ggggaggcag cagtggggaa tattggacaa tgggcgcaag cctnatccag 361 ccatgccgcg tgagtgatga aggccttagg gttgtaaagc tnttttgtcc gggacgataa 421 tgacggtacc ggaagaataa ccccggcync tcttcgtgcc agcagccgcg gtaatacgaa 481 gggggcttac gttgctcgga atcactgggc gtaaaggncn cgtaggcggc catttaantc 541 gggggtgnna acctagggct caaccacaga attgccttcg atactgggta tctttgagtc 601 cggaagaggt tggtggaact gcgagtgtag aggtgaaatn cgtagatatt cgcaagaaca 661 ccngtggcga aggcggcnaa ctggtccrnn actgagcctg aggcgcnaaa gcgtggggag 721 caaacaggat tagataccct ggtagtccac gccgtaaacg atgaatgcca gccgttggga 781 gctttgctgc tcagtgncgc agccaacgct ttgagcnttc cgcctnggga gtacggtcgc 841 aagattnaaa ctcaaaggaa ttgacggggg cccacaagcg gtggagcatg tggtttaatt 901 cgaannaacg cgcagaacct taccatccct 
tgacatggca tgttacccag agagatttgg 961 gncttcctct tcggaggcgt gcacacaggt gctgcatggc tntcgtcagc tcgtgtcgtg 1021 agatgttggg taaagtcccg caacgagcgc aacccacgtc ctnagttgcn atcattcagt 1081 tgggcactct agggagactg cgggtgataa gccgcgagga aggtgtggat gacgtcataa 1141 gtcctnatgc ttacgggatg ggctacacac gtgctacaat ggcggtgaca gagggaggcg 1201 agagggngac ctngagcaaa tcccgaaaaa ccgtctcagt tcggattgca ctctgcaact 1261 gggtgcatga aggcggaatc gctagtaatc gtggatcagc atgccacggt naatac // LOCUS MTCRR16SA 1314 bp ss-rRNA RNA 30-OCT-1989 DEFINITION M.parvus 16S rRNA. ACCESSION M29026 KEYWORDS 16S ribosomal RNA. SOURCE M.parvus (strain OBBP) rRNA. ORGANISM Methylocystis parvus Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Budding and/or appendaged bacteria. REFERENCE 1 (bases 1 to 1314) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. 
FEATURES from to/span description rRNA 1 1314 16S rRNA BASE COUNT 328 a 287 c 404 g 262 t 33 others ORIGIN 1 ccaacttgag agtttgatcc tggctcagaa cgaacgctgg cggcaggcct aacacatgcg 61 agtcgaacgc tgtagcaata cagagtggca gacgggtgag taacgcgtgg gaacgtgtga 121 ctttcaggtt cggaataact cagggaaact wnagctaata ccggatacgc ctattggggg 181 aaagatttat tcgcgaaaga tcagcccgcg tccgattagc tagttggtgt ggtaatggcg 241 cacnaaggcg acgatcggna gctngtctga gaggatgatc agccacactg ggactgagac 301 acggccnnga ctctaacggg aggcagcagt ggggaagatt ggacagatgg cgcaagcctg 361 atccagccat gccgcgtgag tgatgaaggc aaaagggttg taaagctntt tcgccaggga 421 cgataatrac ggtacctgga taagaagccc cggctaactt cgtgccagca gccgcggtaa 481 tacgaagggg gctagcgttg tacggaatca ctgggcgtaa agcgcacgta ggcggatctt 541 taagtnaggg gtgaaatccg aggctcaacc tcggaactgc ctttgatact ggmggtctcg 601 agttnnggag aggtaagtgg actcgagtgt agaggtgatc gtagatatcg caagacacag 661 tggcgaaggc gntcactgcn ctgacgctga ggtgnnagcg tggggagcaa acaggattag 721 ataccctggt agtccacgcn gtnaacgatg gatgctngcc gttgggagct atgctnttca 781 gtggcgcacg taacgcttta agcatcngcc tggacggaat gatctcaaga ttaaaactnn 841 agctaatact ggatgacggg cctctaccaa aaagagthat ntngtttaat ccvacannnc 901 gcgcagaacc ttacctgctt ttgacatcgc cggtatgatg ccagagatgg acgatcttaa 961 cgcagggggc gagcacacag gtgctgcatg gctgtcgtca gctcgtgtcg tgagatgttg 1021 ggttaagtcc agcaacgagc gcaacctcgc cactaagtat gcatcattca gttgggtaca 1081 ctatggggnc tgccggtgat aagccgagag gaaggtgggg atgacgtcaa gtcctcatgc 1141 gcttacaggc tgggctacac acgtgctaca atggcggtga ccatggggag cgagaagggc 1201 gacctggagc aaatctcaaa aagccgtctc ngttcggatt gcactctgca actcgagtgc 1261 atgaaggtgg aatcgctagt aatcgcagat cagcacgctg cggtgaatac gttc // LOCUS MTERR16S 1306 bp ss-rRNA RNA 30-OCT-1989 DEFINITION M.methanica 16S rRNA. ACCESSION M29025 KEYWORDS 16S ribosomal RNA. SOURCE M.methanica (strain 81Z) rRNA. ORGANISM Methylosporovibrio methanica Prokaryota; Bacteria. REFERENCE 1 (bases 1 to 1306) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. 
TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. FEATURES from to/span description rRNA 1 1306 16S rRNA BASE COUNT 314 a 293 c 397 g 267 t 35 others ORIGIN 1 caacttgaga gtttgatcct ggctcagaac gaacgctggc ggcaggctta acacatgcaa 61 gtcgaacgct gtagcaatac gagtggcnga cgggtgagta acgcgtggga acgtactttc 121 ggttcggaat aactcaggga aactnnagct aataccggat acgcctatcg ggggaaagat 181 ttattcgcga aagatcggcc cgcgtccgat tagctagttg gtgaggtaat ggctcaccaa 241 ggcgacgatc ggnagctggt ctgagaggat gatcagccac actgggactg agacacggcc 301 catgctcgtg cgggaggcag cagtggggaa tattggacga tgggcgcaag cctnatccag 361 ccatgccgcg tgagtgatga aggccttagg gttgtaaagc tctttcgcga gggacgataa 421 tgacggtacc tggataagaa ccccggctna cttcgtgcca gctgccgcgg taatacgaag 481 ggggcaagca tagttcggaa tcgctgggcg taaagcgcac gtaggcgnat ctttaagtca 541 ggggtgaaat ccgaggctca acctcggnac tgcctttgat actgggtgty tagagttgag 601 tatgctagat ctgagnngat cnaggactgt agtgtagaat gatctagata ttcgcaagaa 661 cagcagtgcg aaggcgncta ctgcnnctga cgctgaggtc nnacgtgggg agcaaacagg 721 attagatacc ctngtagtcc acgccgtaaa cgatggatgc tagccgttgg gcacttgctg 781 ttcagtggcg cactaacgct ttaagcatcc gcctggggag tacggtcgca agattaaaac 841 acnagggaat ggactggggc ggcgtgcagc ngtgcagctg tngtnnnatm cgannnccgc 901 gcagaacctt accagctttt gacattctag tatggtcgac agagatgtct tccttccgca 961 aggggctaga acacaggtrc tgcatngctn cgtcagchcg tatcgtgaga tgttgggtta 1021 agtccncaac gagcgcaacc tcacccttag ttgccatcat tcagttgggc actctaggnn 1081 aactgcggtg ataagccgag aggaaggtgg ggatgacgtc aagtcctcat ggcgcttaca 1141 ggctgggcta cacacgtgct acaatggcgg tgacaatggg atgcgaaggg gcgacccgga 1201 gcaaatctcc aaaaagcgtc tcagttcgga ttgcactctg caactcgagt gcatgaaggt 1261 ggaatcgcta gtaatcgcag atcagcacgc tgcggtgaat acgttc // LOCUS MTHRR16S 1504 bp ss-rRNA RNA 30-OCT-1989 DEFINITION M.methylotrophus 16S 
rRNA. ACCESSION M29021 KEYWORDS 16S ribosomal RNA. SOURCE M.methylotrophus (strain AS1) rRNA. ORGANISM Methylophilus methylotrophus Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Acidiphilium cryptum; Methylococcaceae. REFERENCE 1 (bases 1 to 1504) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. FEATURES from to/span description rRNA 1 1504 16S rRNA BASE COUNT 410 a 315 c 453 g 308 t 18 others ORIGIN 1 tgaacttgag agtttgatcc tggctcagat tgaacgctgg cggaatgctt aacacatgct 61 agtcgaacga tcaaccttag cttgctaagg agtggcggac gggcgagtac taatctcgga 121 cgtgccttgt cgtgggggac aactagtcga aagatcagct aataccgcat acgcctgagg 181 gggaaagcgg gggatcttcg gaccaacgtt aataagagcg gccgatgtct gatnagctag 241 ttggtggggt aatggcctac caaggcgacg atcagtagct rgtctgagag gacgaccagc 301 cacactggaa ctragacacg gtccagactc trcgggaggc agcagtgggg aattttggac 361 aatgggcgca agcctgatcc agccvttccg cgtgagtgaa gaaggcttcg ggttgtaaag 421 ctvtttcgcr agggaagaaa acttacattc taataaagtg tgaggctgac ggtacctaga 481 taagaagcac cggctaactt cgtgccagca gccgcaataa tacgaagggt gcaagcgtaa 541 atcggaatta ctgggcgtaa agcgtcgcag gcggtttagt aagtcagatg tgaaatcccc 601 gagctcaact hgggtactgc gtttgaaact acaaagcttg aatatgtcag aggggggtcg 661 aattccacgt gtagagtgaa atgcgtggag atgtggagga ataccagtgg cgaaggcggc 721 ccctgggata atattgacgc tgaggtgcgn aagcgtgggg agcaaacagg attagatacc 781 ctngtagtcc acgccctnaa cgatgtctac tagttgttgg tggagtaaaa tccatgagta 841 acgcacgtaa cgcgtgaagt agaccgcctg gggagtacgg tcgcaagatt aaaactcaaa 901 ggaattgacg ggggctcgca cagagcggtg gattagtggt gattgattcg atgcaacgcg 961 aaaaacctta cctggccttg acatgccact aacgaagcag agatgcatta ggtgccgtaa 1021 gggaaagtgg acacaggtgc tgcatggctg tcgtcagctc gtgtcgtgag atgttgggtt 1081 aagccagcaa 
cgagcgcaac caaagccata aatatagcca tcattagtag ggcactatta 1141 atggactgcc ggtgacaaac cggaggaagg taagggatga cgtcgtaagt cctcatgccc 1201 taatggaggc agggcttcac acgtaataca atggtcggta cagagagttg ccaaccgccg 1261 agggggaagc taatctcaga aagccnatcg tagtccggat tgtactctgc aactcgagag 1321 catgaagtcg gaatcgctag taatcgcgga tcagcatgtc gcggtgaata cgttcccggg 1381 cctagtacac accgcccgtc acaccatggg agtvggtatt accagaagta gttagtctaa 1441 ccgtaagggg gavattacca cggtagtatt catgactggg gtnaagtran nacaaggtag 1501 ccgt // LOCUS MTLRR16S 1456 bp ss-rRNA RNA 30-OCT-1989 DEFINITION M.trichosporium 16S rRNA. ACCESSION M29024 KEYWORDS 16S ribosomal RNA. SOURCE M.trichosporium (strain OB3b) rRNA. ORGANISM Methylosinus trichosporium Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Acidiphilium cryptum; Methylococcaceae. REFERENCE 1 (bases 1 to 1456) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. 
FEATURES from to/span description rRNA 1 1456 16S rRNA BASE COUNT 348 a 330 c 447 g 294 t 37 others ORIGIN 1 tgaacttgag agtttnatcc tggctcagaa cgaacgctgg cggcaggctt aacacatgca 61 agtcgaacgg gcgcagcgca tgcgtagaca gtggcagacg ggtgagtaac gcgtgggaac 121 gtactttcgg ttcggaataa ctcagggaaa ctwaagctaa taccggatac gccttaaggg 181 ggaaagattt attcgcgaaa gatcggcccg cgtccgatta gctagctgtt ggtgaggtna 241 aggctnacca aggcgacgat cggnagctng tctgagagga tgatcagcka cactgggact 301 gagacacggc ccagactcta cggaggcagc agtgggaata ttggacaatg ggcgcaagcc 361 tgatccagcc atgccgcgtg agtgatgaag gccttagggt tgtaaagctc tttcgccagg 421 gacgataatg acggtacctg gataagaacc ccggctaact tcgtgccagc agccgcggta 481 atacgaaggg ggcaagcgta gttcggaatc gctgggcgta aagcgcacgt aggcggattg 541 ttcagtccgg ggtgaaatcc gaggctcaac ctcggaactg cctnnnatmc tgcgtatctt 601 gagtcgggaa gaggtaagta tgactcgagt gtagagtgaa ttcgtagata ttcgcagaac 661 accagtggcg aaggcggcta ctggcccgnn actgacgctg aggtgcgnaa gcgtggggag 721 caaacaggat tagataccct ggtagtccac gccgtaaacg atggatgcta gccgttgggg 781 agcttgctgt tcagtggcgc aggtaacgct ttaagcattc cgcctgggga gtacggtcgc 841 aagattaaaa ccgctaggag ttrttggggg tcccaccgca gcggtggagt nanntngtta 901 atntaanncc gcgcagaacc ttaccagctt ttgacatgtc cggtatggtg gtcagagatg 961 acttccttcc gcaaggggcc ggatacaggt gctgcatngc tntcgtcagn tcgtgtcgtg 1021 agatgttggg ttaagtcccg caacgagcgc aaccatcagc cgacttagtt gccatcattc 1081 agttgggcac tctagggana ctgccggtga taatccgacg aggaaggtag gggatgacgt 1141 caagtcctma tgcgcttacg aggctvggct acacacgtgc tacaatggcg gtgacaatgg 1201 gaacgacgam gtgacgccga gctaatctcc aaaagccgtc tcagttcgga ttgcactctg 1261 caactcgagt gcatgaaggt ggaatcgcta gtaatcgcag atcagcacgc tgcggtgaat 1321 acgttccctg gccttgtaca cacccncgtc acaccatggg agtnggtttt gccagaagta 1381 gttagcctaa ccgcaaggag ggcnattacc acggcagggt tcatgactgg ggtnaagtnn 1441 nnacaaggta gccgta // LOCUS MTMRR16S 1283 bp ss-rRNA RNA 30-OCT-1989 DEFINITION M.methanica 16S rRNA. ACCESSION M29022 KEYWORDS 16S ribosomal RNA. SOURCE M.methanica (strain 81Z) rRNA. 
ORGANISM Methylomonas methanica Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Acidiphilium cryptum; Methylococcaceae. REFERENCE 1 (bases 1 to 1283) AUTHORS Tsuji,K., Tsien,H.C., Hanson,R.S., DePalma,S.R., Scholtz,R. and LaRoche,S. TITLE 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs JOURNAL J. Gen. Microbiol. 136, 1-10 (1989) STANDARD full staff_entry COMMENT Draft entry and computer-readable sequence for [1] kindly submitted by K.Tsuji, 14-OCT-1989. FEATURES from to/span description rRNA 1 1283 16S rRNA BASE COUNT 319 a 284 c 385 g 253 t 42 others ORIGIN 1 caactngaga gtttgatcct ggctcagagc gaacgctggc ggcaggctta acacatgcaa 61 gtcgaacggt aacaggcntc caggcgctga acgagtggcg gacgggtgag ttaactgcgt 121 aggaatctrc cttrttgtgg gggataactm ggggaaaccc nagctaatac cgatacgctc 181 cacggtggaa agcgggggat cttcggacct cgcgcaataa gatgagccta tgttggatta 241 gctagttggt agggtaacgg cctaccaagg cgacgatcca tagctggtct gagaggatga 301 tcagcnacat tgggactgag aggaatattg aacaatgggc gcaaagctga tccagccatg 361 ccgcgtgagt gatgaagggc ttaggnttgn aaaggtcttt crtnaggcaa gaaggaatac 421 trcdgttaat accggcngac tvcgattacc tatacnagaa gcaccggcta acttcgtgcc 481 agcagcngcg gtnatacgga gngtgcnagc gttnatcgga attactgggc gnnaagcgtg 541 cgtaggcggn nvaggtccac gcgatgtaag cctngacctg gacctcggaa ctgcctttga 601 tactgggtat cttgagaccg gaagagacag cggactcgag tgtagagtcg aatacgtaga 661 tatcgcaaga acaccagtgg cgaagcgcct ctctggtcca agactgacgc tgaggtgccn 721 aagcgtgggg agcaaacagg attagatacc ctggtagtcc acgccgtaaa cgatgaatgc 781 tagctgttgg gcttcttgga cttcagtggc ggagctaacg ctttaagtag accgcctggg 841 gagtacggcg gcaaggcnaa aactnaaatn aattgacggg ggcascccca cagcggntgg 901 accttgtgtn naattcgatg naacgcgaag aacctnacct accctngaca tccanagaat 961 ctnttagaga tagcggagtg cttcgggagc tctgaacagg tgctgcatgg ctntcgtcag 1021 ctcgtgttgt gaaatgttgg gttaagggca ctctacgaga ctgccggtgc atacgcggag 1081 gcacaggtcc cacgacgtca agtcataatg gccttatggg tagggctaca acgtgctaca 1141 atggccgtga cagagggaag ggacatcgcg 
agagtaagct aatcccaaaa agcggtctca 1201 gtccggattg cagtctgcaa ctcgactgcg tgaagtcgga atcgctagta atcgcggatc 1261 agcatgccgc ggtnaatacg ttc // LOCUS SIVHTLV4A 312 bp ds-DNA VRL 15-JUN-1988 DEFINITION Simian immunodeficiency proviral DNA long terminal repeat. ACCESSION Y00269 KEYWORDS long terminal repeat; provirus. SOURCE Simian immunodeficiency virus DNA (HUT-78 cells). ORGANISM Simian immunodeficiency virus Viridae; ss-RNA enveloped viruses; Positive strand RNA virus; Retroviridae; Lentivirinae. REFERENCE 1 (bases 1 to 312; enum. 1 to 312) AUTHORS Mullins,J.I. JOURNAL Unpublished (1987) Harvard School of Public Health, Boston, MA 02115 STANDARD simple automatic REFERENCE 2 (bases 1 to 312) AUTHORS Kornfeld,H., Riedel,N., Viglianti,G.A., Hirsch,V. and Mullins,J.I. TITLE Cloning of HTLV-4 and its relation to simian and human immunodeficiency viruses JOURNAL Nature 326, 610-613 (1987) STANDARD simple automatic COMMENT *source: clone=BK.28; cell line=PK-289; EMBL features not translated to GenBank features: key from to description SITE 34 44 motif found in SV40 enhancer, K-3 light chain Ig enhancer and HIV-I SITE 48 57 pot. SPI transcription factor binding sequence SITE 59 68 pot. SPI transcription factor binding site SITE 70 79 pot. SPI transcription factor binding site PRM 96 104 TATA-box SITE 125 126 U3/R junction BASE COUNT 70 a 75 c 89 g 78 t ORIGIN 1 tggctgacaa gagggaaact cgctgagata gcagggactt tccacaaggg gatgttatgg 61 ggaggagccg gtcgggaaca cccactttct tgatgtataa atatcactgc atttcgctct 121 gtattcagtc gctctgcgga gaggctggca gattgagccc tgggaggttc tctccagcac 181 tagcaggtag agcctgggtg ttccctgcta gactctcacc agcacttggc cagtgctggg 241 cagagtggct ccacgcttgc ttgcttaaag acctcttcaa taaagctgcc tattttagaa 301 gtaagccagt gt //