Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!watmath!clyde!rutgers!husc6!seismo!mcnc!ecsvax!emigh
From: emigh@ecsvax.UUCP
Newsgroups: sci.bio
Subject: Re: question
Message-ID: <2840@ecsvax.UUCP>
Date: Mon, 30-Mar-87 13:27:50 EST
Article-I.D.: ecsvax.2840
Posted: Mon Mar 30 13:27:50 1987
Date-Received: Wed, 1-Apr-87 01:15:29 EST
References: <11189@teknowledge-vaxc.ARPA> <978@aecom.UUCP> <3310@udenva.UUCP>
Reply-To: emigh@ecsvax.UUCP (Ted Emigh)
Organization: UNC Educational Computing Service
Lines: 35

In article <3310@udenva.UUCP> agranok@udenva.UUCP (Alexander Granok) writes:
>(Craig Werner) writes:
>>(Randy Burns) writes:
>>> I was wondering roughly how many 'bytes' of information are contained
>>> within human chromosomes?
>> Hence, if a byte is a base pair, that's your answer, although
>>only two bits are required to specify a base, ergo a 'byte' could
>>actually be a tetranucleotide, but most sequences are stored as
>>letters (ATCG).
>The whole argument gets caught up in definitions, here. I would consider a
>bit to be a base pair, and a byte to be the set of three that encodes for one
>amino acid. Instead of eight bits to a byte, there are three. After all, one
>base pair by itself doesn't do much good. But, if a base pair is a bit, then

In the same way, a byte doesn't do much good in floating point arithmetic. :-)

The problem, of course, is that not all the genome is used for the message
in polypeptide chains. There are noncoding regions (particularly in
eukaryotic organisms); rRNAs (often many thousands of copies of each gene);
tRNAs (again with lots of copies except in the prokaryotes and organelles);
etc. In humans, it is estimated that only 1-2% of all the DNA actually
codes for amino acids.

This is mostly a problem of semantics. If we wish to use "byte" as the
smallest unit of meaningful information, then the nucleotide is the byte.
The complementary base (which completes the base pair) adds no further
information, so the base pair could be considered the "byte" as well.

-- 
Ted H. Emigh     Genetics and Statistics, North Carolina State U, Raleigh NC
USENET: emigh@ecsvax.uucp     DOMAIN: emigh%ecsvax.ncecs.edu
ARPA:   ecsvax!emigh@mcnc.org BITNET: NEMIGH@TUCC
Distribution to monotremes and flightless waterfowl **RESTRICTED**
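
The counting argument in this thread (two bits per base, so four bases per 8-bit byte, with the complementary strand adding nothing new) can be sketched in a few lines of Python. This is only an illustration, not anything posted in the thread: the A/C/G/T-to-number mapping, the function names, and the figure of roughly 3 billion base pairs for a haploid human genome are all assumptions brought in for the example.

```python
# Hypothetical 2-bit code; the A/C/G/T ordering is an arbitrary choice.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pack(seq):
    """Pack a string of A/C/G/T into bytes, four bases (a tetranucleotide) per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        value = 0
        for base in chunk:
            value = (value << 2) | CODE[base]
        # Left-align a short final chunk so unpack() can read every byte the same way.
        value <<= 2 * (4 - len(chunk))
        out.append(value)
    return bytes(out)


def unpack(data, n_bases):
    """Recover the first n_bases bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:n_bases])


def complement_strand(seq):
    """Derive the second strand from the first; it carries no new information."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def packed_size(n_bases):
    """Bytes needed at two bits per base, rounded up."""
    return (2 * n_bases + 7) // 8


# Assumed genome length (not from the thread): about 3 billion base pairs.
ASSUMED_HUMAN_BASE_PAIRS = 3_000_000_000
print(packed_size(ASSUMED_HUMAN_BASE_PAIRS))  # 750000000
```

Under that assumed length the whole sequence fits in about 750 million 8-bit bytes, and by the 1-2% estimate above only a small fraction of those bytes would correspond to protein-coding sequence.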