Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!elroy.jpl.nasa.gov!sdd.hp.com!hplabs!hpl-opus!hpcc05!hpcuhb!hpcuhe!pi
From: pi@hpcuhe.cup.hp.com (Paul Ilgenfritz)
Newsgroups: comp.arch
Subject: Re: Animal architecture (was: Re: small instructions)
Message-ID: <32580035@hpcuhe.cup.hp.com>
Date: 19 Jun 91 16:18:34 GMT
References: <1991Jun17.133846.29956@ns.network.com>
Organization: Hewlett Packard, Cupertino
Lines: 39


> It's notable that the fundamental encoding scheme for the genetic code
> is extremely simple: four instructions (ATCG).  All else is accomplished
> in software.

I think it would be more accurate to say that there are 22 or 23
instructions in the genetic code.  The actual encoding of DNA is extremely
fascinating and I'll give a simple description here for those who are
interested.

First of all, ATCG are not the instruction codes but are more like the
basic bits of information, much like "1" and "0" in a computer.  For years,
many scientists doubted that DNA could direct cell function because it is
built from only these four simple molecules.  How could such simple
chemistry direct the complex function of a cell?

The answer came when the code was cracked.  DNA directs the building of
proteins from amino-acids.  This occurs when a section of DNA "un-zips"
and an RNA chain copies one side of the DNA strand.  The RNA travels
out of the nucleus to ribosomes, which read it.  A protein is assembled
serially, one amino-acid at a time, in the order encoded in the strand.

This encoding is based on triplets in the DNA sequence.  Since there are
four possible values for each position in a triplet, there are
4 x 4 x 4 = 64 possible triplets.  Each triplet, except for three,
corresponds to an amino-acid.  There are only 20 amino-acids, so most of
them are coded by more than one triplet.  The three triplets with no
associated amino-acid serve as stop markers for a sequence; the usual
start marker (ATG) also codes for the amino-acid methionine.  Hundreds of
triplets (called codons) in an exact sequence are required to make a
given protein.
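
To make the triplet arithmetic concrete, here is a minimal Python sketch.
It assumes the standard genetic code written with DNA letters; the names
BASES, AMINO, CODON_TABLE and translate are made up for illustration, and
"*" stands for a stop codon.

```python
from itertools import product

# Bases in the conventional T, C, A, G order used for codon tables.
BASES = "TCAG"

# One-letter amino-acid codes for all 64 codons, in the order
# TTT, TTC, TTA, TTG, TCT, ... GGG.  "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each triplet (codon) to its amino-acid letter.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Read dna three letters at a time and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

The table has 64 entries, 3 of which are stops, and the remaining 61
codons spell only 20 distinct amino-acids.  As a small usage example,
translate("ATGGTGCACTGA") reads ATG, GTG, CAC and then hits the stop
codon TGA, giving "MVH".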

These 22 or 23 distinct codon meanings (the 20 amino-acids plus the start
and stop markers) are analogous to computer instructions since a complex
sequence produces a useful product.  The raw ATCG sequence makes no sense
unless grouped into the codons, much like a stream of bits would make no
sense unless grouped into the instruction opcodes.
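
The grouping point can be sketched the same way.  This is only an
illustration, and split_codons and its frame parameter are hypothetical
names: the same letters give entirely different codons depending on where
the grouping starts, just as a bit stream decodes differently if the
instruction boundaries are misaligned.

```python
def split_codons(dna, frame=0):
    """Group dna into complete triplets, starting at offset frame."""
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]


sequence = "ATGGTGCACTGA"

# Starting at the first letter gives four complete codons.
in_frame = split_codons(sequence, frame=0)

# Starting one letter later gives unrelated codons; the two
# leftover letters at the end do not form a complete triplet.
shifted = split_codons(sequence, frame=1)
```

Here in_frame is ['ATG', 'GTG', 'CAC', 'TGA'] while shifted is
['TGG', 'TGC', 'ACT'].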

(By the way, the penalty for a bug in the DNA sequence can be severe--it
takes just one amino-acid switch in the hemoglobin molecule to cause
sickle cell anemia.)