Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!olivea!apple!bionet!FRODO.MGH.HARVARD.EDU!KELLOGG
From: KELLOGG@FRODO.MGH.HARVARD.EDU
Newsgroups: bionet.genome.arabidopsis
Subject: Arabidopsis evolution
Message-ID: <910501120128.45d@Frodo.MGH.Harvard.EDU>
Date: 1 May 91 16:01:28 GMT
Sender: daemon@genbank.bio.net
Lines: 62


    I'm not sure I'd be all that worried about the weird placement of
Arabidopsis on the cytochrome c tree, and I certainly wouldn't question 
the utility of Arabidopsis as a model system.  There are several reasons:  
    1.  Trees based on cytochrome c tend to be peculiar in general.  For 
example, if you check the trees published by Syvanen et al. (JME, 1989, 
ref. in the Kemmerer paper) you can find all sorts of oddities.  In their 
fig. 5 they show a minimal eukaryotic cytochrome c tree - the two fish 
don't come out together; on the vertebrate clade, the frog is the most 
basal group, more primitive even than the carp; sesame (a dicot) comes out 
with rice (a monocot); the grasses (wheat, maize and rice) form a 
paraphyletic rather than monophyletic group.  
    2.  The tree published by Kemmerer et al. is not particularly robust.  
The numbers beside the branches are counts of bootstrap replicates (out of 
50) that support each grouping, not numbers of mutations.  Thus the only 
groupings that appear in at least half of the bootstrap samples are the 
fungus + Arabidopsis clade and the higher plant clade.  
Those who choose to interpret bootstrap samples in terms of statistical 
significance would say those groupings are significant at the 50% level - 
i.e. not very significant.  If you want to use a 95% level, then only the 
Neurospora + Arabidopsis grouping is significant and the rest of the tree 
is phylogenetically meaningless.  The statistical interpretation of 
bootstraps is under heavy fire at the moment, so I wouldn't push it too 
far; however, if the animal grouping only appears in 19 out of 50 
replicates, it doesn't seem overly convincing.
    3.  Based on the phenograms, the Arabidopsis sequence is not 
particularly similar to that of Neurospora - the similarity is much less 
than that among the higher plants.  So it may not be much like a higher 
plant cytochrome, but it isn't all that much like a Neurospora cytochrome 
either.  Arabidopsis is on a long branch, and it has been extensively 
documented that "long branches attract" in phylogenetic analyses (the 
so-called Felsenstein zone).  The best way to correct that problem is to 
increase the sampling density around the problematic taxon.
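
    For anyone who hasn't run a bootstrap, here is a minimal sketch of 
what the numbers in point 2 count.  Alignment columns are resampled with 
replacement, a grouping is looked for in each resampled data set, and the 
support is the number of replicates in which it turns up.  Everything here 
is an assumption for illustration: the toy sequences are invented (they are 
not the cytochrome c data), and "mutual nearest neighbours by Hamming 
distance" is only a crude stand-in for building a real tree in each 
replicate.

```python
import random

# Hypothetical toy alignment, invented for illustration only.
ALIGNMENT = {
    "taxonA": "ACGTACGTACGTACGTACGT",
    "taxonB": "ACGTACGTACGAACGTACGT",
    "taxonC": "TCGAACTTACGTTCGTAGGT",
    "taxonD": "TCGAACTTACGTTCGAAGGA",
}

def hamming(a, b):
    """Number of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbour(aln, taxon):
    """Closest other taxon by Hamming distance; ties are broken by name."""
    others = [t for t in aln if t != taxon]
    return min(others, key=lambda t: (hamming(aln[taxon], aln[t]), t))

def is_mutual_pair(aln, a, b):
    """Crude stand-in for 'a and b group together on the tree'."""
    return nearest_neighbour(aln, a) == b and nearest_neighbour(aln, b) == a

def bootstrap_support(aln, a, b, replicates=50, seed=0):
    """Return (hits, replicates): how often a and b pair up after resampling."""
    rng = random.Random(seed)
    n_sites = len(next(iter(aln.values())))
    hits = 0
    for _ in range(replicates):
        # Draw n_sites column indices with replacement.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {t: "".join(s[c] for c in cols) for t, s in aln.items()}
        if is_mutual_pair(resampled, a, b):
            hits += 1
    return hits, replicates

if __name__ == "__main__":
    hits, n = bootstrap_support(ALIGNMENT, "taxonA", "taxonB")
    print(f"taxonA + taxonB recovered in {hits} of {n} replicates")
```

    On this reading, a grouping found in 19 of 50 replicates has 38% 
support, which is why I don't find the animal grouping convincing.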
    
    The implication of 1-3 is that cytochrome c may not be much use as an 
indicator of relationship.  On the other hand, the fact that similarities 
in the molecules do not appear to be determined primarily by phylogeny 
means that there must be some really interesting molecular biology going 
on - this gets back to the fitness landscape described by Ulrich Melcher.  
Possibly selection has been strong enough to effectively wipe out much of 
the phylogenetic information in the molecule.
    The other possibility is one mentioned by Kemmerer et al. in their 
companion paper in Mol. Biol. and Evol. in which they suggest that the 
problem is comparison of paralogous genes.  If there are several copies of 
the cytochrome c gene, then each copy will have its own phylogeny.  If the 
sequence for Arabidopsis is from one copy of the gene, and at least some 
of the other plant sequences are from another copy (or, worse yet, from 
several other copies), then you are comparing apples and oranges - you 
get a funny tree; the tree won't correspond either to a gene phylogeny or 
to an organismic phylogeny.  I saw a tree a few days ago on sequences of 
glucanases in which that was clearly what had happened. 
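
    The paralogy problem can be sketched with a toy calculation.  Assume, 
purely for illustration, three species X, Y and Z (Y and Z being sisters), 
a gene duplication older than any of the species splits, and clock-like 
divergence so that expected distance is twice the age of the common 
ancestor.  The species names, copy names and times below are all 
hypothetical.

```python
# Hypothetical divergence times in arbitrary units; illustrative only.
DUPLICATION_TIME = 10.0  # the duplication predates every species split
SPECIES_SPLIT = {
    frozenset(["X", "Y"]): 4.0,
    frozenset(["X", "Z"]): 4.0,
    frozenset(["Y", "Z"]): 2.0,  # Y and Z are sister species
}

def expected_distance(sp1, copy1, sp2, copy2):
    """Twice the age of the common ancestor of the two sequences."""
    if copy1 == copy2:
        # Same gene copy: the sequences diverged when the species did.
        age = SPECIES_SPLIT[frozenset([sp1, sp2])]
    else:
        # Different copies: the sequences diverged at the duplication.
        age = DUPLICATION_TIME
    return 2.0 * age

def closest_pair(sample):
    """sample maps species -> copy sequenced; return the closest species pair."""
    species = sorted(sample)
    pairs = [(a, b) for i, a in enumerate(species) for b in species[i + 1:]]
    return min(
        pairs,
        key=lambda p: expected_distance(p[0], sample[p[0]], p[1], sample[p[1]]),
    )

if __name__ == "__main__":
    same_copy = {"X": "alpha", "Y": "alpha", "Z": "alpha"}
    mixed = {"X": "alpha", "Y": "alpha", "Z": "beta"}
    print("same copy everywhere:", closest_pair(same_copy))
    print("mixed copies:        ", closest_pair(mixed))
```

    With the same copy sampled everywhere, Y and Z come out closest, as 
the species relationships say they should.  Swap in the other copy for Z 
and X pairs with Y instead, simply because Z's sequence dates back to the 
duplication.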
    Given all this, I suppose that lateral gene transfer can't be ruled 
out, but I'm not sure it's the most compelling explanation of the pattern.

    My conclusion is that Arabidopsis is probably a fine model for higher 
plants - just that attempts to generalize results from Arabidopsis need to 
be checked.  This is hardly a radical suggestion.  The stronger conclusion, 
though, is that cytochrome c is probably not a good molecule for exploring 
organismic phylogeny.  Other people have reached this conclusion before.  
Maybe what we need now is some exploration of why it isn't much use.

Elizabeth Kellogg, Dept. of Molecular Biology, Mass. General Hospital, 
Boston, MA; and Arnold Arboretum of Harvard University, 22 Divinity Ave., 
Cambridge, MA 02138   kellogg@frodo.mgh.harvard.edu