Path: utzoo!utgpu!watserv1!watmath!att!emory!sol.ctr.columbia.edu!bronze!cricket.bio.indiana.edu!gilbertd
From: gilbertd@cricket.bio.indiana.edu (Don Gilbert)
Newsgroups: bionet.molbio.genbank
Subject: how to submit discontinuous sequence?
Message-ID: <1991Feb1.013922.26504@bronze.ucs.indiana.edu>
Date: 1 Feb 91 01:39:22 GMT
Sender: news@bronze.ucs.indiana.edu (USENET News System)
Organization: Biology, Indiana University - Bloomington
Lines: 21

Is there a consensus view on the proper way to enter discontinuous
sequences to GenBank?  An otherwise continuous length of molecule
contains regions which were not sequenced, of many bases in length.
Options seem to be

a) enter in databank under one accession number, with feature notations
   indicated where regions with no data exist.  Drawback: users can miss
   feature info and incorrectly use such data as a continuous sequence.

b) enter in databank under separate accession numbers for each
   continuous region.  Drawback: sequential nature of data is obscured
   by separate entries.

c) enter as one accession, with unsequenced regions (whose size is
   known, I believe, by alignment with related sequences) indicated
   with "N" or other symbol.  Drawback: the N symbol may not be
   appropriate.

 -- Don
--
Don Gilbert                     gilbertd@cricket.bio.indiana.edu
biocomputing office, biology dept., indiana univ., bloomington, in 47405
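
To make the trade-off between the three options concrete, here is a minimal Python sketch of option (c), and of how the same gap coordinates could serve as the feature notations of option (a) or be used to recover the separate regions of option (b). The function names and the sample data are hypothetical, and this is not a GenBank submission format; it only assumes the gap sizes are known, as in the alignment case mentioned above.

```python
# Hypothetical sketch: function names and sample data are illustrative only.

def join_with_gaps(segments, gap_lengths, gap_symbol="N"):
    """Join sequenced segments into one string, padding each unsequenced
    region with gap_symbol. Also return the gap positions as
    (first, last) pairs, 1-based and inclusive."""
    if len(gap_lengths) != len(segments) - 1:
        raise ValueError("need one gap length between each pair of segments")
    parts = []
    gaps = []
    pos = 0  # number of bases emitted so far
    for i, seg in enumerate(segments):
        parts.append(seg)
        pos += len(seg)
        if i < len(gap_lengths):
            n = gap_lengths[i]
            parts.append(gap_symbol * n)
            gaps.append((pos + 1, pos + n))
            pos += n
    return "".join(parts), gaps


def split_on_gaps(sequence, gaps):
    """Recover the continuous regions from a padded sequence, using the
    gap coordinates rather than searching for the gap symbol (a real N
    inside a sequenced region would otherwise be mistaken for a gap)."""
    regions = []
    start = 0
    for first, last in gaps:
        regions.append(sequence[start:first - 1])
        start = last
    regions.append(sequence[start:])
    return regions


segments = ["ACGTAC", "GGATC", "TTGA"]           # three sequenced regions
padded, gaps = join_with_gaps(segments, [4, 3])  # two gaps of known size
print(padded)
print(gaps)
print(split_on_gaps(padded, gaps))
```

The sketch also shows why the "N" drawback matters: without the separate gap coordinates, a padded run is indistinguishable from a stretch of genuinely ambiguous bases.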