Path: utzoo!attcan!uunet!snorkelwacker!bionet!CU.NIH.GOV!CZJ
From: CZJ@CU.NIH.GOV
Newsgroups: bionet.molbio.genbank
Subject: GenBank software
Message-ID: <9008081410.AA02507@alw.nih.gov>
Date: 8 Aug 90 14:12:16 GMT
Sender: daemon@genbank.BIO.NET
Lines: 53

As one of the Project Officers of the GenBank Contract, I thought it would be appropriate to respond to some of the issues raised by Dan Davison in his "I must have been out of the room" messages.

The history of the new features table goes back several years. I remember the first GenBank Advisors' meeting I attended, in October 1985. The subject that came up then, and at every advisors' meeting since, was the difficulty of translating an entry from EMBL to GenBank format and vice versa. This difficulty had an obvious impact in delaying the incorporation of EMBL data into GenBank. The major impediments to automatically translating an EMBL entry into a GenBank entry were incompatibilities in the features table formats. Neither the GenBank nor the EMBL staff felt that their own format represented features adequately. Therefore, a series of meetings, which began with a workshop involving members of the scientific community, was held to design a features table format that could better represent the complexities of biological features. The result is the new features table, effective with release 64. I would point out that there was plenty of advance warning. This warning included several example entries.

I am sure that David Benton can comment better on the advantages of the new format. Rather, I would like to discuss briefly Dan's wish to have GenBank personnel write code that parses the new features table and distribute it free.

First, I would begin by reminding the community that the purpose of GenBank has been to collect and distribute data. The development of software was left to the community. The success of this policy can be seen in the development of several excellent packages, both by commercial firms and by the public.
If you are privy to the problems of the Cambridge Small Molecule Crystallographic database, you know the problems of linking software development to database distribution. Thus it is deliberately beyond the scope of the GenBank contract to develop software to parse the features table.

Despite this position, it is obvious that some software, e.g. Jim Fickett's program to translate GenBank, is in use by GenBank and will have to be modified to parse the new features table. Although it is perfectly feasible to distribute this program, the problem for GenBank is one of support. That is, someone has to answer the many questions about installation, why the program will not work on a specific machine, modifications, etc. There is also the question of updates: code used internally is often modified to meet specific purposes, often associated with new releases. The bottom line is that the responsible distribution of "free" software can be an expensive proposition. Right now the only way for GenBank to meet this obligation would be to cut down on services elsewhere.

I hope these comments have been helpful. One of the gratifying things to me has been the community spirit among GenBank users and the willingness to distribute software that takes advantage of the GenBank resource. I trust this will continue in the future.

Jim Cassatt