Path: utzoo!mnetor!uunet!yale!cmcl2!phri!roy
From: roy@phri.UUCP (Roy Smith)
Newsgroups: sci.bio
Subject: Re: Abbreviations for ambigious bases
Message-ID: <3200@phri.UUCP>
Date: 18 Mar 88 18:37:52 GMT
References: <3193@phri.UUCP> <16606@beta.UUCP>
Reply-To: roy@phri.UUCP (Roy Smith)
Organization: Public Health Research Inst. (NY, NY)
Lines: 35

In response to a query of mine about Rich Roberts's ambiguous base
notation, dd@beta.UUCP (Dan Davison) writes:
> It's the ambiguous base code developed by the MOLGEN project at SU
> SUMEX-AIM.STANFORD.EDU back in the dawn of time, 1979-1980. It bears no
> resemblance to the Staden or IUPAC codes.

	I did a bit more research on this topic and came up with the
following paper:

%A Athel Cornish-Bowden
%T Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984
%J Nucleic Acids Research
%D 1985
%V 13
%P 3021-3030

	This paper includes a longish list of references to other attempts
at standardizing the code, and provides some arguments as to why the scheme
it presents (the IUPAC scheme) is more mnemonic than any other. For
example, W={A,T} and S={C,G} because A-T pairs are Weak and C-G pairs are
Strong; M={A,C} and K={G,T} because A and C have aMino groups in chemically
similar positions while G and T have Keto groups in those positions.

	I'm fully aware how hard it is to change over from one standard to
another, especially after using the old one for so many years. On the
other hand, I think it's pretty much agreed that IUPAC is the final
authority when it comes to chemical nomenclature; to insist on using some
other naming system just doesn't make sense.
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
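
P.S. For anyone who wants to play with the four codes mentioned above, here is a rough Python sketch of how W, S, M and K might be used to match a pattern against a sequence. The names AMBIGUITY and matches are made up for illustration and do not come from the paper. The table holds only the four two-base codes quoted in this post plus the plain bases; the full IUPAC table has more entries, which are left out here.

```python
# Hypothetical lookup table: only the codes quoted in the post, plus the
# four unambiguous bases. The full IUPAC table is larger.
AMBIGUITY = {
    "A": {"A"},
    "C": {"C"},
    "G": {"G"},
    "T": {"T"},
    "W": {"A", "T"},  # Weak pairing
    "S": {"C", "G"},  # Strong pairing
    "M": {"A", "C"},  # aMino
    "K": {"G", "T"},  # Keto
}


def matches(pattern, sequence):
    """Return True if every base in `sequence` is allowed by the code
    at the same position in `pattern`. Lengths must be equal."""
    if len(pattern) != len(sequence):
        return False
    return all(
        base in AMBIGUITY[code]
        for code, base in zip(pattern.upper(), sequence.upper())
    )
```

With this table, matches("WSMK", "ACAG") holds because each base falls inside the set named by its code, while matches("WS", "GC") fails because G is not one of the Weak bases. A code outside the table raises KeyError rather than being silently accepted.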