Systematic Zoology Advance Access published online on June 4, 2009
Systematic Zoology, doi:10.1093/sysbio/syp008
© Society of Systematic Biologists
Properties of Consensus Methods for Inferring Species Trees from Gene Trees
1 Department of Human Genetics, 1241 East Catherine Street, University of Michigan, Ann Arbor, MI 48109-0618, USA
2 Center for Computational Medicine and Biology, 2017 Palmer Commons, 100 Washtenaw Avenue, University of Michigan, Ann Arbor, MI 48109-2218, USA
3 Department of Mathematics, University of Auckland, Private Bag 29019, Auckland, New Zealand
4 Present address: Department of Mathematics and Statistics, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand
* Correspondence to be sent to: Department of Mathematics and Statistics, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand; E-mail: J.Degnan{at}math.canterbury.ac.nz.
Abstract
Consensus methods provide a useful strategy for summarizing information from a collection of gene trees. An important application of consensus methods is to combine gene trees to estimate a species tree. To investigate the theoretical properties of consensus trees that would be obtained from large numbers of loci evolving according to a basic evolutionary model, we construct consensus trees from rooted gene trees that occur in proportion to gene-tree probabilities derived from coalescent theory. We consider majority-rule, rooted triple (R*), and greedy consensus trees obtained from known, rooted gene trees, both in the asymptotic case as numbers of gene trees approach infinity and for finite numbers of genes. Our results show that for some combinations of species-tree branch lengths, increasing the number of independent loci can make the rooted majority-rule consensus tree more likely to be at least partially unresolved. However, the probability that the R* consensus tree has the species-tree topology approaches 1 as the number of gene trees approaches infinity. Although the greedy consensus algorithm can be the quickest to converge on the correct species-tree topology when increasing the number of gene trees, it can also be positively misleading. The majority-rule consensus tree is not a misleading estimator of the species-tree topology, and the R* consensus tree is a statistically consistent estimator of the species-tree topology. Our results therefore suggest a method for using multiple loci to infer the species-tree topology, even when it is discordant with the most likely gene tree.
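As an informal illustration of the three summaries named in the abstract, the following Python sketch computes majority-rule clades, greedily accepted clades, and the most frequent rooted triple for each set of three taxa. It is a minimal sketch under stated assumptions, not the procedure used in the article: rooted gene trees are assumed to be given as nested tuples of taxon names over a shared taxon set, all function names are hypothetical, ties in the greedy ordering are broken alphabetically (an arbitrary choice), and the last function covers only the triple-counting step of R*, not the construction of the R* tree from those triples.

```python
from collections import Counter
from itertools import combinations


def clades(tree):
    """Return (taxon set, nontrivial clades) of a rooted tree of nested tuples."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    taxa = leaves(tree)
    found.discard(taxa)  # the root clade carries no information
    return taxa, found


def clade_counts(trees):
    """Count how many gene trees display each clade."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree)[1])
    return counts


def majority_rule(trees):
    """Clades displayed by more than half of the gene trees."""
    return {c for c, n in clade_counts(trees).items() if n > len(trees) / 2}


def compatible(a, b):
    """Two clades can share a tree if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def greedy(trees):
    """Accept clades in decreasing frequency if compatible with those kept."""
    ranked = sorted(clade_counts(trees).items(),
                    key=lambda kv: (-kv[1], sorted(kv[0])))  # assumed tie-break
    kept = []
    for clade, _ in ranked:
        if all(compatible(clade, other) for other in kept):
            kept.append(clade)
    return set(kept)


def triple_of(tree_clades, x, y, z):
    """Pair of taxa grouped together among x, y, z, or None if unresolved."""
    for a, b, c in ((x, y, z), (x, z, y), (y, z, x)):
        if any(a in cl and b in cl and c not in cl for cl in tree_clades):
            return frozenset([a, b])
    return None


def plurality_triples(trees):
    """For each three taxa, the strictly most frequent rooted triple, if any."""
    taxa = clades(trees[0])[0]
    per_tree = [clades(t)[1] for t in trees]
    winners = {}
    for x, y, z in combinations(sorted(taxa), 3):
        counts = Counter(triple_of(cs, x, y, z) for cs in per_tree)
        counts.pop(None, None)  # ignore trees that leave the triple unresolved
        top = counts.most_common(2)
        if top and (len(top) == 1 or top[0][1] > top[1][1]):
            winners[(x, y, z)] = top[0][0]
    return winners


# Toy input (invented for illustration, not data from the article).
gene_trees = (
    [(("A", "B"), ("C", "D"))] * 2
    + [((("A", "B"), "C"), "D")] * 2
    + [((("A", "C"), "B"), "D")]
    + [(("A", "C"), ("B", "D"))]
)
print(majority_rule(gene_trees))
print(greedy(gene_trees))
print(plurality_triples(gene_trees))
```

On this toy input the majority-rule summary is less resolved than the greedy one, which matches the general relationship between the two methods: greedy consensus keeps every majority clade and then continues adding compatible clades of lower frequency.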
Keywords: Anomalous gene tree; coalescence; discordance; lineage sorting; phylogenetics; statistical consistency
Received April 17, 2008; Revised July 7, 2008; Accepted October 22, 2008