Path: utzoo!mnetor!uunet!yale!cmcl2!phri!roy
From: roy@phri.UUCP (Roy Smith)
Newsgroups: sci.bio
Subject: Re: Abbreviations for ambiguous bases
Message-ID: <3200@phri.UUCP>
Date: 18 Mar 88 18:37:52 GMT
References: <3193@phri.UUCP> <16606@beta.UUCP>
Reply-To: roy@phri.UUCP (Roy Smith)
Organization: Public Health Research Inst. (NY, NY)
Lines: 35


	In response to a query of mine about Rich Roberts's ambiguous base
notation, dd@beta.UUCP (Dan Davison) writes:

> It's the ambiguous base code developed by the MOLGEN project at SU
> SUMEX-AIM.STANFORD.EDU back in the dawn of time, 1979-1980.  It bears no
> resemblance to the Staden or IUPAC codes.

	I did a bit more research on this topic and came up with the
following paper:

%A Athel Cornish-Bowden
%T Nomenclature for incompletely specified bases in nucleic acid sequences:
recommendations 1984
%J Nucleic Acids Research
%D 1985
%V 13
%P 3021-3030

	This paper includes a longish list of references to other attempts
at standardizing the code, and provides some arguments as to why the scheme
he presents (the IUPAC scheme) is more mnemonic than any other.  For
example, W={A,T} and S={C,G} because A-T pairs are Weak and C-G pairs are
Strong; M={A,C} and K={G,T} because A and C have aMino groups in chemically
similar positions while G and T have Keto groups in those positions.
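
	For anyone who wants to play with these, here is a rough Python
sketch of the four two-base codes mentioned above, plus a check of whether
a concrete sequence fits an ambiguous pattern.  The names AMBIGUITY and
matches are my own invention for illustration, not anything from the
paper, and the rest of the IUPAC table is deliberately left out.

```python
# Two-base ambiguity codes from the examples above. This is only a
# sketch: the remaining IUPAC codes are intentionally omitted.
AMBIGUITY = {
    "W": frozenset("AT"),  # Weak pairing
    "S": frozenset("CG"),  # Strong pairing
    "M": frozenset("AC"),  # aMino
    "K": frozenset("GT"),  # Keto
}

# Unambiguous bases simply stand for themselves.
for base in "ACGT":
    AMBIGUITY[base] = frozenset(base)


def matches(pattern, sequence):
    """Return True if each base of sequence is allowed by the code
    at the same position in pattern (hypothetical helper)."""
    if len(pattern) != len(sequence):
        return False
    return all(b in AMBIGUITY[p] for p, b in zip(pattern, sequence))
```

	With this, matches("WSMK", "ACAG") holds, since A is in W, C is in
S, A is in M and G is in K, while matches("WSMK", "CCAG") does not, because
C is not a Weak base.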

	I'm fully aware how hard it is to change over from one standard to
another, especially after using the old one for so many years.  On the
other hand, I think it's pretty much agreed that IUPAC is the final
authority when it comes to chemical nomenclature; to insist on using some
other naming system just doesn't make sense.
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016