Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!watmath!clyde!rutgers!ames!ucbcad!ucbvax!hplabs!hao!boulder!eddy
From: eddy@boulder.UUCP
Newsgroups: sci.bio
Subject: Re: information content of DNA
Message-ID: <918@sigi.Colorado.EDU>
Date: Sun, 12-Apr-87 16:17:54 EST
Article-I.D.: sigi.918
Posted: Sun Apr 12 16:17:54 1987
Date-Received: Mon, 13-Apr-87 23:43:59 EST
References: <2840@ecsvax.UUCP> <11189@teknowledge-vaxc.ARPA>
Sender: news@sigi.Colorado.EDU
Reply-To: eddy@boulder.Colorado.EDU (Sean Eddy)
Organization: University of Colorado, Boulder
Lines: 55

In article <569@cpocd2.UUCP> howard@cpocd2.UUCP (Howard A. Landman) writes:
>In article <891@sigi.Colorado.EDU> eddy@beagle.Colorado.EDU (Sean Eddy) writes:
>>No, John, the point Dizzy was making was that while cases of overlap exist,
>>they are 1)very rare 2)very short and 3)only in two of the three reading
>>frames.
>
>I'm sure there was an article in Sci Am recently about a virus which had a
>very short segment of triple overlap. This makes point 3 false.

Sorry, I stand corrected. Thanks; I didn't know about the polyoma
example, though I should have.

I also take back number two. Did a little reading on phiX174
(Nature 264: 34-41, 1976), which is a bacteriophage of E. coli.
Apparently gene E which codes for a host cell lysis protein is
located completely within the coding sequence for gene D, which is
necessary for replication. No small overlap there, we're talking
two complete genes. Pardon me while I extract my foot from my mouth.

>>And remember that your original point was that any given sequence
>>potentially represents 6 (!) codings, not just two. Dizzy rightly
>>replied that this is a good approximation to impossible.
>
>Any sequence not containing a terminator does, in some sense, code for 6
>proteins. It would perhaps be more accurate to say that the probability of
>all 6 of these proteins being at all functional (or, less likely, actually
>produced by an organism) is very close to zero. The exact probability
>is a negative exponential of the sequence length, which we could approximate
>via information theory and statistics about protein mutability vs. function.
>Anyone have any relevant statistics?

But this I won't buy yet. What is meant by a terminator here?
To me, 'terminator' refers to a transcriptional terminator. The
regulatory signals for protein translation are different. Having
no transcription stop site should, to my mind, make little difference
to protein translation.

Also, something to keep in mind is that translation is a very
controlled system, for good reason. Protein synthesis costs a
hell of a lot of energy. A cell that wantonly made all 6 possible
proteins from a sequence would quickly be selected against in
favor of a cell that only produced the functional one.

- Sean Eddy
- Dept. of Molecular, Cellular, Developmental Biology
- Univ. of Colorado, Boulder; Boulder, CO 80309
-
- "Science has done some wonderful things, but I'd rather be happy
-  than right."
- "Are you?"
- "Well, I'm afraid that's where it all falls down."
-
-     from Hitchhiker's Guide to the Galaxy