Path: utzoo!utgpu!water!watmath!uunet!ig!daemon
From: BIORELAY@BIO.CAM.AC.UK
Newsgroups: bionet.molbio.seqnet
Subject: SEQNET Bulletin RELAY ONLY: reply to SEQNET@UK.AC.CAM.BIO
Message-ID: <6380@ig.ig.com>
Date: 24 May 88 17:49:28 GMT
Sender: daemon@presto.ig.com
Lines: 314

From: BIORELAY@BIO.CAM.AC.UK

From: SEQNET@UK.AC.CAM.PHX 24-MAY-1988 11:16
To: SEQNET
Subj:

Date: Tue, 24 May 88 11:15:05 BST
From: SEQNET@UK.AC.CAM.PHX
To: seqnet@UK.AC.CAM.BIO
Message-ID: <9E900D0A61660C30@UK.AC.CAM.PHX>

(Message number 2)
Accepted: 11:13:07 24 May 88
Submitted: 16:16:20 21 May 88
IPMessageId: 9E8C8AC803CD4E90
From: MA11
To: seqnet

Drosophila Codon Table Version 4.0

Michael Ashburner,
Department of Genetics,
University of Cambridge,
Cambridge, England.

Telephone 44-(0)223-333969
Electronic mail: ma11@uk.ac.cam.phx

May 18 1988

These Tables are supplied with the understanding that they can be freely used for research, although if quoted in any publication a suitable acknowledgement (e.g. Michael Ashburner, personal communication) would be appreciated.

I will automatically post new versions on the SEQNET and BIONET Bulletin Boards. These will generally be compiled whenever enough new data warrants the work. I am very happy to include new sequences that have not yet reached the Sequence Data Banks, if these can be sent to me by electronic mail with sufficient data for the coding sequences to be extracted.

If anyone should need the files of coding sequences that have been used to generate these tables, please send me a message.

Two series of Tables are included, one for "host" genes and one for orfs carried by transposable elements. For each series you have a codon table, a base composition and the names of the sequences used to compile these.

By and large these sequences are taken from the EMBL, GENBANK or DAYHOFF Libraries. However, some have been privately communicated to me. All sequences have been checked to confirm that they translate, but many are incomplete.
Hence, for example, the number of sequences is greater than the number of TER codons.

The latest versions of the databanks used are EMBL V15.0 and GENBANK V55.0.

//

Table 1A: Codons of "host" genes:

TTT  477  TCT  341  TAT  521  TGT  323
TTC 1270  TCC 1071  TAC 1187  TGC  899
TTA  147  TCA  291  TAA   61  TGA   18
TTG  687  TCG  831  TAG   28  TGG  560
CTT  362  CCT  371  CAT  521  CGT  552
CTC  605  CCC 1201  CAC  910  CGC  927
CTA  298  CCA  631  CAA  598  CGA  344
CTG 1986  CCG  814  CAG 1877  CGG  316
ATT  787  ACT  437  AAT  912  AGT  404
ATC 1417  ACC 1418  AAC 1442  AGC  909
ATA  282  ACA  399  AAA  593  AGA  209
ATG 1304  ACG  672  AAG 2304  AGG  259
GTT  577  GCT  864  GAT 1414  GGT 1034
GTC  897  GCC 2164  GAC 1333  GGC 1805
GTA  226  GCA  522  GAA  770  GGA 1233
GTG 1527  GCG  660  GAG 2491  GGG  205

Total = 52495

//

Table 1A in Staden format:

===========================================
F TTT 477. S TCT 341. Y TAT 521. C TGT 323.
F TTC1270. S TCC1071. Y TAC1187. C TGC 899.
L TTA 147. S TCA 291. * TAA  61. * TGA  18.
L TTG 687. S TCG 831. * TAG  28. W TGG 560.
===========================================
L CTT 362. P CCT 371. H CAT 521. R CGT 552.
L CTC 605. P CCC1201. H CAC 910. R CGC 927.
L CTA 298. P CCA 631. Q CAA 598. R CGA 344.
L CTG1986. P CCG 814. Q CAG1877. R CGG 316.
===========================================
I ATT 787. T ACT 437. N AAT 912. S AGT 404.
I ATC1417. T ACC1418. N AAC1442. S AGC 909.
I ATA 282. T ACA 399. K AAA 593. R AGA 209.
M ATG1304. T ACG 672. K AAG2304. R AGG 259.
===========================================
V GTT 577. A GCT 864. D GAT1414. G GGT1034.
V GTC 897. A GCC2164. D GAC1333. G GGC1805.
V GTA 226. A GCA 522. E GAA 770. G GGA1233.
V GTG1527. A GCG 660. E GAG2491. G GGG 205.
===========================================
TOTAL CODONS= 52495.
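The codon counts in Table 1A can be checked against the quoted total with a short script. What follows is a minimal Python sketch and not part of the original posting; the names `TABLE_1A` and `parse_codon_table` are hypothetical, and the counts are simply copied from the table.

```python
import re

# Table 1A ("host" genes), counts copied from the posting.
TABLE_1A = """
TTT  477  TCT  341  TAT  521  TGT  323
TTC 1270  TCC 1071  TAC 1187  TGC  899
TTA  147  TCA  291  TAA   61  TGA   18
TTG  687  TCG  831  TAG   28  TGG  560
CTT  362  CCT  371  CAT  521  CGT  552
CTC  605  CCC 1201  CAC  910  CGC  927
CTA  298  CCA  631  CAA  598  CGA  344
CTG 1986  CCG  814  CAG 1877  CGG  316
ATT  787  ACT  437  AAT  912  AGT  404
ATC 1417  ACC 1418  AAC 1442  AGC  909
ATA  282  ACA  399  AAA  593  AGA  209
ATG 1304  ACG  672  AAG 2304  AGG  259
GTT  577  GCT  864  GAT 1414  GGT 1034
GTC  897  GCC 2164  GAC 1333  GGC 1805
GTA  226  GCA  522  GAA  770  GGA 1233
GTG 1527  GCG  660  GAG 2491  GGG  205
"""


def parse_codon_table(text):
    """Map each codon to its count, given whitespace-separated CODON COUNT pairs."""
    pairs = re.findall(r"\b([ACGT]{3})\s+(\d+)\b", text)
    return {codon: int(count) for codon, count in pairs}


counts = parse_codon_table(TABLE_1A)
total = sum(counts.values())
# TER codons are the three stop codons.
stops = sum(counts[codon] for codon in ("TAA", "TAG", "TGA"))

print(len(counts), "codons parsed")
print("total =", total)        # the posting quotes Total = 52495
print("TER codons =", stops)
```

The same function should work on Table 2A if its counts are pasted in place of `TABLE_1A`. It does not handle the Staden layout, where a four-digit count abuts the codon (for example `TTC1270.`).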
//

Table 1B: Base composition of "host" genes:

T = 31458  C = 44456  Y = 0  Pyrimidine = 75914
A = 37333  G = 44242  R = 0  Purine = 81575
N = 8  Nucleotides = 157497

//

Table 1C: "Host" gene sequences used for Tables 1A and 1B

The numbers after the names indicate the number of codons (excluding ter but including N-terminal met); if this number is bracketed then the coding sequence is incomplete.

[EMBL/GENBANK Accession numbers]

M14643; alpha-tubulin-1, 450
M14644; alpha-tubulin-2, 449
M14645; alpha-tubulin-3, 450
M14646; alpha-tubulin-4, 462
M16922; beta-tubulin-2, 446
M16922; beta-tubulin-3, 448
X05893; acetyl cholinesterase, 649
X06384;Y00212; actin 5C, [137]
K00670;K00671; actin 42A, [308]
J01064; actin 79B, 376
K00674;K00675; actin 87E, [93]
J01065; actin 88F, 376
Z00030; alcohol dehydrogenase, 256
Z00030; 3' orf to Adh, [145]
X04695; a-methyl-dopa resistant (amd), 510
X04569; amylase-1, 494
X03788-X03791; Antp, 378
M14549; bicoid, [71]
X04896; bsg25D, 741
M14131; C1A9 nuclear protein, 161
K01042; c-ash, [275]
X05939; c-myb (13E), 697
K01960; c-ras1 (85D), 189
M10759;M10803;M10804; c-ras2 (64B), 195
X02200; c-ras3 (62B), 182
M11917; c-src (64B), 552
M16599; c-src4 (28C), 590
Y00133; calmodulin, [128]
M16534;J03452; casein-hydrolase-alpha-chain, 336
M16534;J03452; casein-hydrolase-beta-chain, 215
X03062; caudal, [197]
M13219; choline acetyl transferase, [728]
X02947; chorion gene s15-1, 115
X02497; chorion gene s18-1, 272
X02947; chorion gene s19-1, 373
X05245; chorion gene s36, 286
X05245; chorion gene s38, 306
V00200; collagen-like gene fragments, [589]
J02727; collagen-IV, [711]
X05144; crumbs (EGF-like at 95F), [293]
X01761; cytochrome c gene DC3, 105
X01760; cytochrome c gene DC4, 108
X05136; Deformed, 590
X05140; Delta, [200]
X04426; dopa decarboxylase, 511
M14978-14982; dunce, 362
X04521; eip28/29, 255
X04024; eip40, 393
M11744; elongation factor (48D), 463
M10017; engrailed, 552
M15961; esterase-6, 548
X05138; even-skipped, 376
X00854;K01951; fushi tarazu, 413
M11254; Gapdh-1, 332
M11255; Gapdh-2, 332
J02527;K02461; glycinimide ribotide transformylase (GART), 1353
M13786; Gpdh [exon 3], [40]
J01085; heat shock cognate 70C [exon 1], [68]
K01296;K01297; heat shock cognate 87D [exons 1 & 2], [70]
J02569; heat shock cognate 88E, [104]
X04073; Histone H1, 256
Dayhoff; Histone H2A, [122]
Dayhoff; Histone H2B, [118]
Dayhoff; Histone H3, [122]
Dayhoff; Histone H4, [72]
V00209; hsp22, 174
V00210; hsp23, 186
V00211; hsp26, 208
V00212; hsp27, 213
V00213;V00214; hsp70 [87A], [347]
J01104;J01105; hsp70 [87C], 641
X03810; hsp82, 717
Y00274; hunchback, 757
M13568; Insulin-like receptor protein-1, [1095]
M14778; Insulin-like receptor protein-2, [300]
X05273; invected, 576
X03414; Kruppel, 466
X04227; l(2)37Cc, 326
X05426; lethal(2)giant larva, 1160
V00202; larval cuticle protein-1 [44D], 130
V00203; larval cuticle protein-2 [44D], 126
V00203; larval cuticle protein-3 [44D], 112
V00204; larval visceral protein-D [44D], 508
V00204; larval visceral protein-H [44D], 522
V00204; larval visceral protein-L [44D], 505
X03872; LSP1-alpha, [70]
X03873; LSP1-beta, [100]
X03874; LSP1-gamma, [105]
X03758; metallothionein (Mtn), 40
Y00831; mst(3)gl-9 sperm protein, 56
J02788; myosin-heavy chain, 269
M10125; myosin-light chain, 155
X04016; nicotinic acetylcholine receptor (AChR), 521
M11664; Notch, 2703
Y00043; opsin R7 specific, 383
K02315; opsin, ninaE, 373
M12896; opsin at 91D, 373
M15762; pen#9b, 365
M11969; period, 1127
Y00402; Phosphoenolpyruvate carboxykinase, 647
M14548; paired, 613
X05076;Y00042; protein kinase C, 639
J02527;K02461; pupal cuticle protein (Gart), 184
X05016; ribosomal protein rpA1, 113
X00848; ribosomal protein rp49, 133
X05709; RNA polymerase II-140, 1123
M11798; RNA polymerase II-215, [470]
Y00308; rosy, 1335
X04813; rudimentary, 2356
X01918; Sgs3, 307
J01135;J01136; Sgs4, [141]
X04269; Sgs5, 163
X01918; Sgs7, 74
X01918; Sgs8, 75
Y00288; snail, 390
X04513; snake, 430
X03121; serendipity-alpha, 530
X03121; serendipity-beta, 351
X03121; serendipity-delta, 430
Y00367; superoxide dismutase, 213
K03277; tropomyosin I, T-isoform, [198]
M15466; tropomyosin II, 285
X02989; trypsin-like enzyme, alpha-chain, 256
X05723;Y00206; Ubx, 389
X01802; vitelline membrane protein, [96]
X02974; white, 541
Chia; yellow, 696
V00248; yolk protein-1, 459
J01157; yolk protein-2, 459
M15898; yolk protein-3, 420
Y00049; zeste, 575

//

Table 2A: Codon table TE genes:

TTT  366  TCT  129  TAT  264  TGT  108
TTC  200  TCC  120  TAC  230  TGC  107
TTA  351  TCA  197  TAA    1  TGA    1
TTG  195  TCG   74  TAG    0  TGG  108
CTT  216  CCT  112  CAT  187  CGT   64
CTC  104  CCC  104  CAC  165  CGC   38
CTA  199  CCA  271  CAA  396  CGA   99
CTG  105  CCG   52  CAG  160  CGG   22
ATT  463  ACT  205  AAT  620  AGT  180
ATC  175  ACC  171  AAC  403  AGC  145
ATA  447  ACA  374  AAA  888  AGA  260
ATG  199  ACG   64  AAG  282  AGG   83
GTT  181  GCT  160  GAT  330  GGT  130
GTC  106  GCC  129  GAC  305  GGC  107
GTA  188  GCA  222  GAA  566  GGA  148
GTG  113  GCG   63  GAG  227  GGG   39

Total = 12718

//

Table 2A in Staden format:

===========================================
F TTT 366. S TCT 129. Y TAT 264. C TGT 108.
F TTC 200. S TCC 120. Y TAC 230. C TGC 107.
L TTA 351. S TCA 197. * TAA   1. * TGA   1.
L TTG 195. S TCG  74. * TAG   0. W TGG 108.
===========================================
L CTT 216. P CCT 112. H CAT 187. R CGT  64.
L CTC 104. P CCC 104. H CAC 165. R CGC  38.
L CTA 199. P CCA 271. Q CAA 396. R CGA  99.
L CTG 105. P CCG  52. Q CAG 160. R CGG  22.
===========================================
I ATT 463. T ACT 205. N AAT 620. S AGT 180.
I ATC 175. T ACC 171. N AAC 403. S AGC 145.
I ATA 447. T ACA 374. K AAA 888. R AGA 260.
M ATG 199. T ACG  64. K AAG 282. R AGG  83.
===========================================
V GTT 181. A GCT 160. D GAT 330. G GGT 130.
V GTC 106. A GCC 129. D GAC 305. G GGC 107.
V GTA 188. A GCA 222. E GAA 566. G GGA 148.
V GTG 113. A GCG  63. E GAG 227. G GGG  39.
===========================================
TOTAL CODONS= 12718.
//

Table 2B: Base composition TE genes:

T = 9774  C = 7350  Y = 0  Pyrimidine = 17124
A = 14591  G = 6439  R = 0  Purine = 21030
N = 0  Nucleotides = 38154

//

Table 2C: TE genes used for Tables 2A and 2B:

[EMBL/GENBANK Accession numbers]

X01472; 17.6 element
X03431; 297 element
X04132;X03733; 412 element
X02599; copia element [Saigo]
V00246; FB4
X03734; gypsy element
X01748; HB1
X04705; hobo
Finnegan; I element
O'Hare; P element
X01747; transposon HB2
X02600; virus like particle RNA (VLP H-RNA)

//
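The base-composition figures in Tables 1B and 2B lend themselves to a quick consistency check and a G+C comparison between the two gene sets. The sketch below is an illustration only and not part of the original posting; the names `HOST`, `TE`, `CODON_TOTALS`, `gc_fraction` and `surplus_nucleotides` are hypothetical, and excluding N from the G+C denominator is an assumption made here.

```python
# Base counts copied from Tables 1B and 2B; N counts undetermined bases.
HOST = {"T": 31458, "C": 44456, "A": 37333, "G": 44242, "N": 8}
TE = {"T": 9774, "C": 7350, "A": 14591, "G": 6439, "N": 0}

# Codon totals quoted under Tables 1A and 2A.
CODON_TOTALS = {"host": 52495, "TE": 12718}


def gc_fraction(comp):
    """Return G+C as a fraction of the determined bases (N excluded)."""
    determined = sum(comp[base] for base in "ACGT")
    return (comp["G"] + comp["C"]) / determined


def surplus_nucleotides(comp, codons):
    """Return the nucleotide total minus three times the codon total."""
    return sum(comp.values()) - 3 * codons


for name, comp in (("host", HOST), ("TE", TE)):
    print(name,
          "nucleotides =", sum(comp.values()),
          "G+C = %.3f" % gc_fraction(comp),
          "surplus =", surplus_nucleotides(comp, CODON_TOTALS[name]))
```

With the figures as posted, the nucleotide totals come out at 157497 and 38154, as quoted. The TE total is exactly three times the Table 2A codon total, while the host total exceeds three times the Table 1A total by 12 nucleotides; the posting does not explain the difference.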