Path: utzoo!attcan!uunet!ncrlnk!ncrcae!hubcap!gatech!rutgers!apple!bionet!agate!ucbvax!ucsfcgl!srp
From: srp@cgl.ucsf.edu (Scott R. Presnell%Langridge)
Newsgroups: comp.ai
Subject: Re: Information Capacity of Human Genome
Keywords: Intelligence
Message-ID: <11234@cgl.ucsf.EDU>
Date: 11 Nov 88 17:04:56 GMT
References: <1651@ndsuvax.UUCP> <349@uceng.UC.EDU> <42136@yale-celray.yale.UUCP> <393@uceng.UC.EDU> <40841@aero.ARPA>
Sender: daemon@cgl.ucsf.edu
Reply-To: srp@cgl.ucsf.edu (Scott R. Presnell)
Organization: UC San Francisco, Pharmaceutical Chemistry
Lines: 49

In article josh@klaatu.rutgers.edu (J Storrs Hall) writes:
>In article <393@uceng.UC.EDU> dmocsny@uceng.UC.EDU (daniel mocsny) writes:
>>The information content of the human genome is ~750 MB, of which
>>a sizable fraction determines our basic brain structure.
>
>... of which a fairly *small* fraction determines brain structure.
>I have read estimates on the order of a few megabytes (smaller than
>Common Lisp!). Of course, this is very compressed, like a fractal
>description of an image...
>
>--JoSH

I hesitate to redefine a byte, so I think the best way to quantify the
situation is to use the original units. The human genome includes 2.3e+9
base pairs. For the sake of simplicity, let's treat the number as roughly
5.0e+9 bases, as there may be situations where the two strands perform
different functions. It has been estimated, through arguments about the
relative complexity of organisms, that the human genome probably contains
about 100e+3 genes. If we make a big assumption and say that the average
gene is 1000 bases after being appropriately processed, that leads us to
100e+6 bases required for genes. Therefore ~5% of the genome is used for
genetic information in the form of genes (or proteins). This is probably
an underestimate of the actual information required for an organism to
function.

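For anyone who wants to check the arithmetic, here is a minimal Python sketch of the gene estimate above. The constants are the figures quoted in the post; the variable and function names are made up for illustration. Note that the ~5% figure is closest to what you get when dividing by the 2.3e+9 base-pair count (about 4%); dividing by the doubled 5.0e+9 figure gives 2%.

```python
# Back-of-the-envelope check of the gene fraction.
# All constants are the post's estimates; the names are illustrative only.
BASE_PAIRS = 2.3e9            # base pairs in the human genome
BASES_BOTH_STRANDS = 5.0e9    # rounded count with both strands counted separately
GENES = 100e3                 # estimated number of genes
BASES_PER_GENE = 1000         # assumed average processed gene length


def gene_bases(genes=GENES, bases_per_gene=BASES_PER_GENE):
    """Total number of bases taken up by genes."""
    return genes * bases_per_gene


def fraction(part, whole):
    """Share of `whole` accounted for by `part`, as a plain ratio."""
    return part / whole


if __name__ == "__main__":
    g = gene_bases()
    print(f"bases in genes:              {g:.3g}")
    print(f"share of base pairs:         {fraction(g, BASE_PAIRS):.1%}")
    print(f"share of both-strand bases:  {fraction(g, BASES_BOTH_STRANDS):.1%}")
```

Which denominator is the right one depends on whether the two strands are counted as carrying separate information, which is exactly the open question raised above.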
Stepping out onto a limb: As for what fraction actually determines the
structure of our brain, well, let's just say that the brain is only one
organ or cell type (out of say 20?) that our cells must differentiate
into. So maybe 0.3% of the genome determines the structure of the brain?
You get the idea...

As for the amount of memory required to store the genome: The four bases
can be represented by 2 bits. Furthermore, only one strand need be stored,
since the other strand can be calculated from it. One strand is half of
the 5.0e+9 bases counted above, i.e. 2.5e+9 bases at 2 bits each. That
means we need 5.0e+9 bits or 625e+6 bytes to store the sequence.

	Cheers,

Scott Presnell				+1 415 476 5326
Dept. of Pharmaceutical Chemistry
Univ. of Calif. at San Francisco (UCSF), San Francisco, CA. 94143
Internet: srp@cgl.ucsf.edu
UUCP: ucbvax!ucsfcgl!srp
Bitnet: srp@ucsfcgl.bitnet
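The storage estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the particular 2-bit code assigned to each base is arbitrary, the function names are hypothetical, and the second strand is reconstructed as the reverse complement of the first.

```python
# Sketch of 2-bit storage for one DNA strand.
# ASSUMPTION: the bit pattern given to each base is arbitrary; any
# one-to-one mapping of the four bases onto 00, 01, 10, 11 would do.
BITS_PER_BASE = 2
ENCODING = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def storage_bytes(bases):
    """Bytes needed to hold `bases` bases at 2 bits each."""
    return bases * BITS_PER_BASE / 8


def pack(seq):
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | ENCODING[base]
        # Left-align a short final chunk so every base keeps its position.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def other_strand(seq):
    """Recover the second strand as the reverse complement of the first."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    one_strand = 5.0e9 / 2  # half of the rounded both-strand count
    print(f"bytes for one strand: {storage_bytes(one_strand):.3g}")
    print(pack("ACGT").hex())
    print(other_strand("AAGC"))
```

A real format would also need to record the sequence length, since a final partial byte is padded with zero bits that are indistinguishable from "A" under this encoding.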