Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!elroy.jpl.nasa.gov!sdd.hp.com!hplabs!hpl-opus!hpcc05!hpcuhb!hpcuhe!pi
From: pi@hpcuhe.cup.hp.com (Paul Ilgenfritz)
Newsgroups: comp.arch
Subject: Re: Animal architecture (was: Re: small instructions)
Message-ID: <32580035@hpcuhe.cup.hp.com>
Date: 19 Jun 91 16:18:34 GMT
References: <1991Jun17.133846.29956@ns.network.com>
Organization: Hewlett Packard, Cupertino
Lines: 39

> It's notable that the fundamental encoding scheme for the genetic code
> is extremely simple: four instructions (ATCG). All else is accomplished
> in software.

I think it would be more accurate to say that there are 22 or 23
instructions in the genetic code. The actual encoding of DNA is
extremely fascinating and I'll give a simple description here for
those who are interested.

First of all, ATCG are not the instruction codes but are more like the
basic bits of information, much like "1" and "0" in a computer. For
years, scientists did not think the nucleus was important to cell
function because it contained only these four simple molecules along
with a few others. How could such simple chemistry direct the complex
function of a cell?

The answer came when the code was cracked. It directs the building of
proteins from amino-acids. This occurs when a section of DNA "un-zips"
and an RNA chain copies one side of the DNA strand. The RNA travels out
of the nucleus to ribosomes which read it. A protein is assembled
serially from the amino-acid sequence which is encoded in the strand.

This encoding is based on triplets in the DNA sequence. Since there are
four possible values for each position in a triplet, there are 64
possible triplets. Each triplet, except for three, corresponds to an
amino-acid. There are only 20 amino-acids, so many triplets are
redundant. The three triplets with no associated amino-acid serve as
stop markers for a sequence; the start marker is a triplet (ATG) that
also codes for the amino-acid methionine.

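The triplet arithmetic can be checked with a short Python sketch. It
assumes the standard genetic code written with DNA letters; the names
BASES, AMINO and TABLE are my own, and the one-letter amino-acid string
is the usual compact listing in T, C, A, G order with "*" marking a
triplet that has no amino-acid. In the standard code those three
triplets are the stop markers.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per triplet in TCAG order.
# "*" marks a triplet with no amino-acid (a stop marker).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every possible triplet (4 * 4 * 4 of them) with its meaning.
TABLE = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO)}

usage = Counter(TABLE.values())                      # triplets per meaning
stops = sorted(c for c, aa in TABLE.items() if aa == "*")
amino_acids = set(TABLE.values()) - {"*"}

print(len(TABLE))               # 64
print(len(amino_acids))         # 20
print(stops)                    # ['TAA', 'TAG', 'TGA']
print(usage["L"], usage["M"])   # 6 1  (leucine has six triplets, methionine one)
```

The redundancy shows up in the last line: 61 triplets are shared among
only 20 amino-acids, so most amino-acids have several spellings.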
Hundreds of triplet codes (called codons) in an exact sequence are
required to make a given protein. These 22 or 23 distinct codon
meanings are analogous to computer instructions, since a complex
sequence of them produces a useful product. The raw ATCG sequence makes
no sense unless grouped into codons, much like a stream of bits would
make no sense unless grouped into instruction opcodes. (By the way,
the penalty for a bug in the DNA sequence can be severe--it takes just
one amino-acid switch in the hemoglobin molecule to cause sickle cell
anemia.)
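
To make the opcode analogy concrete, here is a rough Python sketch of
"decoding" a base stream. The 15-base fragment is made up for
illustration and is not a real gene, the function names are my own, and
the table holds only the five codons the example needs. The single-base
change shown (GAG to GTG, glutamic acid to valine) is the kind of
substitution behind sickle cell anemia.

```python
# Partial codon table (standard genetic code, DNA letters).
# Only the triplets used by this sketch are listed.
CODONS = {"ATG": "Met", "CCT": "Pro", "GAG": "Glu", "GTG": "Val", "TAA": "STOP"}

def codons(dna, frame=0):
    """Group a base string into consecutive triplets starting at `frame`.

    Any trailing partial triplet is dropped.
    """
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]

def translate(dna):
    """Decode triplets into amino-acid names until a stop marker is read."""
    protein = []
    for codon in codons(dna):
        residue = CODONS[codon]
        if residue == "STOP":
            break
        protein.append(residue)
    return protein

normal = "ATGCCTGAGGAGTAA"               # hypothetical fragment
mutant = normal[:7] + "T" + normal[8:]   # one base changed: GAG -> GTG

print(codons(normal))      # ['ATG', 'CCT', 'GAG', 'GAG', 'TAA']
print(translate(normal))   # ['Met', 'Pro', 'Glu', 'Glu']
print(translate(mutant))   # ['Met', 'Pro', 'Val', 'Glu']
print(codons(normal, 1))   # ['TGC', 'CTG', 'AGG', 'AGT']
```

The last line is the "stream of bits" point: shift the grouping by one
base and the same letters spell entirely different codons.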