Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!ccu.umanitoba.ca!frist
From: frist@ccu.umanitoba.ca
Newsgroups: bionet.molbio.bio-matrix
Subject: Re: Oh foolish supporters of genome sequencing
Message-ID: <1991Feb12.180803.1695@ccu.umanitoba.ca>
Date: 12 Feb 91 18:08:03 GMT
References: <9102111908.AA07006@genbank.bio.net> <5695@husc6.harvard.edu>
Organization: University of Manitoba, Winnipeg, Canada
Lines: 133

In article <5695@husc6.harvard.edu> Ellington@Frodo.MGH.Harvard.EDU (Deaddog) writes:
>
>I am sure that I will see justification for the project at some point,
>but so far none has appeared. I am truly mystified: given the ability to
>clone genes by a variety of methods (probing with oligos based on protein
>sequences, complementation, hybridization with homologous sequences,
>panning, subtraction cloning, etc.) I do not see a need to sequence
>the human genome. I do not see what information will be gained that cannot
>be garnered by other, more directed means. In terms of genetic disease,
>for example, the search for a given gene would still seem to be a directed
>search.
>
....deleted stuff about genome project competing for funds w/other science
... deleted stuff about sequencing at random
>
>do, and we don't have to worry about the 95% of the genome that is essentially
>junk;therefore, the sequence of X's genome is just a way to get at a glut of
>information that we can immediately interpret"), these arguments do not seem
>to apply to humans.

But how do you KNOW it's junk? There are still lots of chromosomal roles that repetitive DNA could play, but that we wouldn't have been able to detect for lack of a global picture. Such functions might include higher-order packaging of chromatin during chromosome condensation, facilitation of chiasma formation, attachment to the nuclear matrix or the definition of functional "domains" [Bodnar, 1988].
It should be obvious now that gene expression is _not_ just a question of having the right promoter or enhancer. These are clearly only one component of a much more complex mechanism for genome function that is made possible by the deliberate organization of the chromatin in the interphase nucleus.

The importance of repetitive DNA is further underscored by evolutionary studies such as those done by Narayan with the plant genera Clarkia, Nicotiana, Lathyrus and Allium [Narayan (1985), Narayan (1982)]. Basically, Narayan has demonstrated that, in a wide range of genera examined, interspecific differences in genome size occur in discrete increments rather than as continuous variation. Similarly, Flavell and colleagues have used hybridization kinetics to demonstrate that specific subsets of interspersed repetitive sequences can be selectively lost or gained as speciation occurs [Flavell et al. 1977].

I am a firm believer in "junk DNA" and I do think that a lot of the genome is junk. But not all of that 95%, or whatever figure you wish to use.

The Genome Project is an exploratory mission, like Darwin's voyage on the H.M.S. Beagle. Although Darwin was perhaps perceived as an unimportant crewman on this exploratory trip, the wealth of data obtained by Darwin served as his 'database' for thinking about biology from a global perspective. At first, his work was merely 'bookkeeping'. Who really cared how many types of finch were on each island? Nonetheless, it was these years of observation that made it possible for Darwin to obtain the global perspective necessary for the formulation of the most fundamental theory in biology: the theory of evolution.

>
>Even in this form, though, it is not sequencing "the genome" that is
>important, but sequencing "all the genes that we know are important." It
>seems as though anthrocentricity is driving this project, rather than a true
>quest for scientific knowledge.

Again, how do you KNOW which genes are important?
At the present time, the sequence databases are largely biased towards highly-expressed genes, because those are the ones most likely to be detected and cloned. These are important, in the sense that the cell needs thousands of copies of their transcripts. But it is often the case that <25% of the mass of the mRNA represents >95% of the sequence diversity. There are literally thousands of genes for which only 1-10 copies of the transcript are present in the cell [Okamuro and Goldberg, 1989]. It is only by obtaining a clear view of what the cell does with these rare transcripts that we will have a complete understanding of how gene expression results in a differentiated organism.

>Consider: I remember when the sequence of all of SV40 was first
>determined; I also remember when the chloroplast genome was completed. What
>new lines of research have been opened up by this information? New lines
>mind you, that wouldn't have been available had not the sequence of the whole
>genome been available. For example, I am grateful for the sequences of
>additional Group I introns from the chloroplast genome, but a more diverse
>range of sequences has become available from directed searches of many
>different genomes.

It took Darwin more than 20 years after his voyage to understand the meaning of the data he had collected, and to publish Origin of Species. Getting the data is the easy part, and will take a finite time. Understanding it will take generations.

I like to compare molecular biology to planetary astronomy. There is so much data from space probes (e.g. Ranger, Surveyor, Mariner, Viking, Voyager, Pioneer) that it is reasonable to do an entire PhD thesis without ever touching a telescope. Sadly, much of this data is virtually inaccessible because NASA has not been able to do anything with it other than store the tapes in vaults.
The advances that our database managers are making with biological sequence data should serve as a source of optimism that the same thing will not happen to newly-obtained sequence data.

>
>And if you defer to thinking about the sequence of the human genome as a
>tool, then it is a very, very expensive tool. And again I would suggest
>that finding the mechanism of one gene which causes MS is worth much more
>than knowing the sequences of all the genes together (when you don't know
>what the genes do and still have to go back and find out).

Yes, it is cheaper to clone one gene than to sequence the human genome. But there are several hundred known genetic diseases in humans. With the entire genome sequenced, we will have the genes for all of them. It seems likely to me that sequencing the genome will be cost effective, as compared to hundreds of separate projects to clone individual genes.

>Non-woof
>

References:

Flavell, R.B., Rimpau, J. and Smith, D.B. (1977) Repeated sequence DNA relationships in four cereal genomes. Chromosoma 63:205-222.

Narayan, R.K.J. (1985) Discontinuous DNA variation in the evolution of plant species. (Indian) J. Genet. 64:101-109.

Narayan, R.K.J. (1982) Discontinuous DNA variation in the evolution of plant species: the genus Lathyrus. Evolution 36:877-891.

Okamuro, J.K. and Goldberg, R.B. (1989) Regulation of plant gene expression. In: Stumpf, P.K. and Conn, E.E. (eds.) The Biochemistry of Plants, Vol. 15, "Molecular Biology". Academic Press.

Query: Is Deaddog (Non-woof) really playing devil's advocate here, challenging supporters of the genome project to justify it in a better way than has been done up to now?

Further query: Was Darwin doing 'real science' during his voyage on the Beagle, or did that come only later, as he tilled the 'vegetable mould' in his garden, pondering the meaning of 'certain facts in the distribution of organic beings inhabiting South America'?
===============================================================================
Brian Fristensky              | What can literature do against the pitiless
Department of Plant Science   | onslaught of naked violence? Let us not for-
University of Manitoba        | get that violence does not and cannot flourish
Winnipeg, MB R3T 2N2 CANADA   | by itself; it is inevitably intertwined with
frist@ccu.umanitoba.ca        | LYING... Lies can stand up against much in the
Office phone: 204-474-6085    | world, but not against art.
FAX: 204-275-5128             | Alexander Solzhenitsyn, NOBEL LECTURE
===============================================================================