Path: utzoo!utgpu!watserv1!watmath!iuvax!bionet!LANL.GOV!pgil%histone
From: pgil%histone@LANL.GOV (Paul Gilna)
Newsgroups: bionet.molbio.genbank
Subject: Time lag for sequence appearence
Message-ID: <9001171536.AA02450@histone.lanl.gov>
Date: 17 Jan 90 15:36:17 GMT
Sender: daemon@genbank.BIO.NET
Lines: 80


Rupert de Wachter (RRNA@ccv.uia.ac.be) from the University of 
Antwerp (Belgium) writes:

<text deleted>

     I would like to have some more information about a few things:
- can sequences automatically be retrieved using this same e-mail number or is
  there another access to the file server?
- can we ask on-line help?
- how would a retrieval using the accession number of a particular sequence,
  for example M22441, look like?
- does a sequence appear on the server as soon as it is mentioned in a
  publication or is there any delay?


Dear Dr. de Wachter,

Our colleagues at IntelliGenetics will handle your inquiries regarding the 
on-line system; I should like to address the final question in your list:

"- does a sequence appear on the server as soon as it is mentioned in a
  publication or is there any delay?"

There are three principal sources of nucleotide sequence data that are handled
by the data entry and annotation staff here at LANL: (1) the printed 
publication, where data are manually entered by our data entry crew; 
(2) direct author submission, where the sequence data and associated 
bibliographic and biological information are provided directly to us by the 
scientist; and (3) incorporation of data from EMBL and DDBJ releases.

In the first case (data extracted from publication), the time taken for
the data to appear on the on-line system is a function of the time taken to
process a particular article through our data entry and annotation
staff.  As soon as our staff here have completed an entry, 
it is immediately passed to the servers both at IntelliGenetics 
and at EMBL (as well as Houston).  Currently we are averaging a six-week
turnaround from the date of publication to the appearance 
of a fully annotated "entry" on the on-line system.  This is in contrast
to the 13-month average for this source of data two years ago.

In regard to the second source of data, i.e., from the author, if the
data are received in computer-readable form, they should appear on the
servers in fully annotated form within two weeks.  If received
in hard-copy form, they go through the process described above.  The fact
that we receive the bulk of our direct submissions AHEAD of publication
means that the data appear on the on-line systems and servers before
or close to the date of publication.  We often have the data in our hands
far enough in advance of publication for the author to correct, before
publication, errors that we spot in our routine integrity checking
procedures; in a sense we provide a peer-review function for the sequence
data itself, a review not often carried out in the conventional editorial
review process.

If the data submitted to us are associated with a manuscript that has 
yet to be accepted through the journal's editorial process, they will be 
classified as "unpublished" (this removes complications
which might occur if the journal chose not to accept the manuscript);
the entry will be updated with the correct citation once we spot or are 
notified of publication.

We now receive about 65-70% of our data direct from the community.  About
70-80% of those are in electronic form, whether by e-mail or on floppy disc.
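
Taken together, those two ranges imply that roughly 45.5 to 56 percent of
all data arrive as electronic direct submissions.  A minimal sketch of the
arithmetic, assuming the two ranges can simply be multiplied end to end
(the variable and function names are illustrative only):

```python
# Figures quoted in the paragraph above, expressed as (low, high) fractions.
direct_share = (0.65, 0.70)      # share of all data received direct from the community
electronic_share = (0.70, 0.80)  # share of direct submissions in electronic form


def combined_range(outer, inner):
    """Multiply two (low, high) fraction ranges to get the overall range."""
    return (outer[0] * inner[0], outer[1] * inner[1])


low, high = combined_range(direct_share, electronic_share)
print(f"Electronic direct submissions: {low:.1%} to {high:.1%} of all data")
```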

While we here at Los Alamos currently incorporate data from EMBL and DDBJ
releases within two weeks of receipt of the tapes, EMBL in addition 
supplies new data to the GenBank on-line server on a similar, daily basis.

Finally, for all submissions, we offer the author the choice of holding
the data confidential until such time as we are given permission to
release them or they are published.  In some cases, there is a time
lag before we spot the appearance of data in the literature and link
this to data we are holding as confidential, but this should not normally 
exceed two weeks if the data appear in journals that we regularly scan.

I hope this answers your question.

Regards,

Paul Gilna, Ph.D.,
Biology Domain Leader
GenBank, Los Alamos.