Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sdd.hp.com!spool.mu.edu!uunet!bionet!ACADVM1.UOTTAWA.CA!SBAIRD%UOTTAWA
From: SBAIRD%UOTTAWA@ACADVM1.UOTTAWA.CA (Stephen Baird)
Newsgroups: bionet.software
Subject: Sequence Reading
Message-ID: <9104042142.AA16745@genbank.bio.net>
Date: 4 Apr 91 21:42:45 GMT
Article-I.D.: genbank.9104042142.AA16745
Sender: daemon@genbank.bio.net
Lines: 29

>From: Bruce Roe
>Regarding the question regarding reading sequences in various formats.
>--------------------------- cut here -----------------------------------
>I've updated the sequence reformatter of mine called ReadSeq. This
>program comes as C source code that is suitable for Unix, VMS, MS-DOS, or
>other command-line systems.
>Readseq reads and writes nucleic/protein sequence in these formats:
> Stanford/IG, Genbank, NBRF, EMBL, UWGCG, DNA Strider, Fitch,
> Pearson, Zuker, Olsen, Phylip v3.2, Phylip v3.3, and Plain text
>Data files may have multiple sequences. Software developers are
>encouraged to use these routines rather than devise their own obscure
>formats. The pascal version of readseq is now out-of-date.

I'd prefer not to reformat sequences as I use them. I would rather have
a program translate the sequence only for the program doing the analysis,
and then hand back the resulting sequence in the format I started with.
If one uses several programs that expect different formats, one ends up
with a hodgepodge of files holding the same sequence, each with various
changes or additions.

A modular program (like readseq, I believe) that filters the format and
leaves the comments intact would, I think, be useful. Modules could be
added for the different programs, to handle the different ways they open
a sequence file.

Is this asking too much?

Stephen Baird
Molecular Genetics
Children's Hospital of Eastern Ontario
sbaird@acadvm1.uottawa.ca
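
The filter idea described above could be sketched roughly as follows. This is only an illustration in Python, not readseq's code or interface: every name here (`read_pearson`, `write_plain`, `run_through_filter`, the `READERS`/`WRITERS` tables) is hypothetical, and the "Pearson" reader assumes a simplified single-record file in which lines starting with ";" are comments and a ">" line carries the name. The wrapper converts the sequence to the format the analysis program wants, runs the analysis, and writes the result back in the original format with the original comments and name.

```python
# Hypothetical sketch of a round-trip format filter; not readseq's API.

def read_pearson(text):
    """Parse one simplified Pearson-style record into (comments, name, sequence)."""
    comments, name, chunks = [], "", []
    for line in text.splitlines():
        if line.startswith(";"):        # assumed comment convention
            comments.append(line)
        elif line.startswith(">"):      # name/description line
            name = line[1:].strip()
        elif line.strip():              # sequence data
            chunks.append(line.strip())
    return comments, name, "".join(chunks)

def write_pearson(comments, name, sequence, width=60):
    """Write comments, name line, then the sequence wrapped at `width`."""
    lines = list(comments)
    lines.append(">" + name)
    lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"

def read_plain(text):
    """Plain text: sequence data only, no comments and no name."""
    seq = "".join(line.strip() for line in text.splitlines() if line.strip())
    return [], "", seq

def write_plain(comments, name, sequence, width=60):
    """Plain text cannot hold comments or a name, so both are dropped."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"

# One "module" per format; supporting another program means adding an entry.
READERS = {"pearson": read_pearson, "plain": read_plain}
WRITERS = {"pearson": write_pearson, "plain": write_plain}

def run_through_filter(text, home_format, tool_format, analysis):
    """Translate for the analysis only, then return the result in home_format."""
    comments, name, seq = READERS[home_format](text)
    tool_input = WRITERS[tool_format]([], name, seq)
    tool_output = analysis(tool_input)          # the program doing the analysis
    _, _, new_seq = READERS[tool_format](tool_output)
    # Comments and name come from the original file, not from the tool.
    return WRITERS[home_format](comments, name, new_seq)

if __name__ == "__main__":
    original = "; my note\n>seq1\nacgt\nacgt\n"
    # Stand-in "analysis program" that only understands plain text.
    print(run_through_filter(original, "pearson", "plain", lambda s: s.upper()), end="")
```

Keeping the comments and name outside the translated copy means the analysis program never sees, and so cannot lose, anything its own format has no place for.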