Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!olivea!apple!bionet!FRODO.MGH.HARVARD.EDU!KELLOGG
From: KELLOGG@FRODO.MGH.HARVARD.EDU
Newsgroups: bionet.genome.arabidopsis
Subject: Arabidopsis evolution
Message-ID: <910501120128.45d@Frodo.MGH.Harvard.EDU>
Date: 1 May 91 16:01:28 GMT
Sender: daemon@genbank.bio.net
Lines: 62

I'm not sure I'd be all that worried about the weird placement of Arabidopsis on the cytochrome c tree, and I certainly wouldn't question the utility of Arabidopsis as a model system. There are several reasons:

1. Trees based on cytochrome c tend to be peculiar in general. For example, if you check the trees published by Syvanen et al. (JME, 1989, ref. in the Kemmerer paper) you can find all sorts of oddities. In their fig. 5 they show a minimal eukaryotic cytochrome c tree - the two fish don't come out together; in the vertebrate clade, the frog is the most basal group, more primitive even than the carp; sesame (a dicot) comes out with rice (a monocot); the grasses (wheat, maize and rice) form a paraphyletic rather than monophyletic group.

2. The tree published by Kemmerer et al. is not particularly robust. The numbers beside the branches are counts of bootstrap replicates, not numbers of mutations. Thus the only groupings that appear in at least half of the bootstrap samples are the fungus + Arabidopsis clade and the higher plant clade. Those who choose to interpret bootstrap samples in terms of statistical significance would say those groupings are significant at the 50% level - i.e. not very significant. If you want to use a 95% level, then only the Neurospora + Arabidopsis grouping is significant and the rest of the tree is phylogenetically meaningless. The statistical interpretation of bootstraps is under heavy fire at the moment, so I wouldn't push it too far; however, if the animal grouping only appears in 19 out of 50 replicates, it doesn't seem overly convincing.

3.
Based on the phenograms, the Arabidopsis sequence is not particularly similar to that of Neurospora - the similarity is much less than that among the higher plants. So it may not be much like a higher plant cytochrome, but it isn't all that much like a Neurospora cytochrome either. Arabidopsis is on a long branch, and it has been extensively documented that "long branches attract" in phylogenetic analyses (the so-called Felsenstein zone). The best way to correct that problem is to increase the sampling density around the problematic taxon.

The implication of points 1-3 is that cytochrome c may not be much use as an indicator of relationship. On the other hand, the fact that similarities in the molecules do not appear to be determined primarily by phylogeny means that there must be some really interesting molecular biology going on - this gets back to the fitness landscape described by Ulrich Melcher. Possibly selection has been strong enough to effectively wipe out much of the phylogenetic information in the molecule.

The other possibility is one mentioned by Kemmerer et al. in their companion paper in Mol. Biol. and Evol., in which they suggest that the problem is comparison of paralogous genes. If there are several copies of the cytochrome c gene, then each copy will have its own phylogeny. If the sequence for Arabidopsis is from one copy of the gene, and at least some of the other plant sequences are from another copy (or, worse yet, from several other copies), then you are comparing apples and oranges - you get a funny tree; the tree won't correspond either to a gene phylogeny or to an organismic phylogeny. I saw a tree a few days ago on sequences of glucanases in which that was clearly what had happened.

Given all this, I suppose that lateral gene transfer can't be ruled out, but I'm not sure it's the most compelling explanation of the pattern.
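
To make the bootstrap arithmetic in point 2 concrete, here is a minimal sketch in Python. It only converts a replicate count into a support percentage and compares it with a cutoff; the function names are hypothetical, and the only figures taken from the discussion above are the 19 out of 50 replicates for the animal grouping and the 50% and 95% levels. It says nothing about whether reading bootstrap support as statistical significance is justified.

```python
def support_percent(count, replicates):
    """Percentage of bootstrap replicates in which a grouping appears."""
    if replicates <= 0 or not 0 <= count <= replicates:
        raise ValueError("count must lie between 0 and replicates")
    return 100.0 * count / replicates


def min_count(replicates, cutoff_percent):
    """Smallest replicate count that reaches the cutoff (integer ceiling)."""
    return -(-cutoff_percent * replicates // 100)


def passes(count, replicates, cutoff_percent):
    """True if the grouping appears in at least cutoff_percent of replicates."""
    return count >= min_count(replicates, cutoff_percent)


REPLICATES = 50      # from "19 out of 50 replicates"
ANIMAL_COUNT = 19    # animal grouping, as quoted in point 2

print(f"animal grouping support: {support_percent(ANIMAL_COUNT, REPLICATES):.0f}%")
for cutoff in (50, 95):
    print(f"{cutoff}% level needs {min_count(REPLICATES, cutoff)} of {REPLICATES}; "
          f"animal grouping passes: {passes(ANIMAL_COUNT, REPLICATES, cutoff)}")
```

With 50 replicates, a grouping has to turn up in 25 of them to reach the 50% level and in 48 to reach the 95% level, so the animal grouping falls well short of both.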
My conclusion is that Arabidopsis is probably a fine model for higher plants - just that attempts to generalize results from Arabidopsis need to be checked. This is hardly a radical suggestion. The stronger conclusion, though, is that cytochrome c is probably not a good molecule for exploring organismic phylogeny. Other people have reached this conclusion before. Maybe what we need now is some exploration of why it isn't much use.

Elizabeth Kellogg
Dept. of Molecular Biology, Mass. General Hospital, Boston, MA; and
Arnold Arboretum of Harvard University, 22 Divinity Ave., Cambridge, MA 02138
kellogg@frodo.mgh.harvard.edu