Path: utzoo!utgpu!watserv1!watmath!uunet!bionet!lhc!lhc!hunter
From: hunter@work.nlm.nih.gov (Larry Hunter)
Newsgroups: bionet.molbio.genome-program
Subject: Re: General Reference
Message-ID:
Date: 10 Dec 90 18:03:43 GMT
References: <1990Dec10.005756.2694@agate.berkeley.edu> <39971@ucbvax.BERKELEY.EDU>
Sender: usenet@nlm.nih.gov (usenet news poster)
Organization: National Library of Medicine
Lines: 53
In-Reply-To: aoki@postgres.Berkeley.EDU's message of 10 Dec 90 08:19:39 GMT

In response to Tzi-cker Chiueh's query about what a computer scientist
can do for/with genomic data, Paul Aoki writes:

   For some reason, AI techniques aren't very popular -- people like
   brute-force, optimal-cost methods. Parallel programming is popular,
   since the dynamic programming computations are easily parallelized
   (one group is using a Connection Machine, another uses a Sequent,
   yet another uses the ICL DAP array processor). Most database
   technology flies right out the window because the databases are
   still small enough that a system that goes to disk a lot will have
   horrible performance relative to more ad-hoc, main-memory-oriented
   search software.

Although Aoki's opinions are a helpful beginning, I have to take issue
with a couple of points. There is actually quite a bit of AI being done
in genome-related areas, and the requirements of genome-related
databases (not only sequence, but protein structure, coarser grained
genetic maps, etc.) place significant pressure on existing database
technologies.

As for AI, I can point to more than 100 people listed in a database of
AI & molecular biology researchers that I maintain, doing work in very
diverse areas. I have an article which surveys some of this work (based
on the talks given at the 1990 AAAI Spring Symposium on AI & Molecular
Biology) which will appear in the next issue of AI Magazine.
You may note that the predicted secondary structure of the principal
neutralization determinant of HIV-1 on the cover of the 24 August 1990
issue of Science was generated by a neural network.

BTW, the AI/MB database, which contains information on research
interests and current projects of many people from around the world, is
publicly available. It can be obtained by anonymous ftp from the host
lhc.nlm.nih.gov in the directory /pub/aimb-db, or by request to the
University of Houston email server.

Finally, although I am not an expert in database issues, I would suggest
contacting the National Center for Biotechnology Information to find out
about work in biosequence and other databases. You can download
information from ncbi.nlm.nih.gov using anonymous ftp, or send mail to
federhen@ncbi.nlm.nih.gov.

Good luck. There are many good computer science problems involved in
genome work; we need good computer scientists to attack them.

--
Lawrence Hunter, PhD.
National Library of Medicine
Bldg. 38A, MS-54
Bethesda, MD 20894
(301) 496-9300
(301) 496-0673 (fax)
hunter@nlm.nih.gov (internet)
hunter%nlm.nih.gov@nihcu (bitnet/earn)