© 2007 Society of Systematic Biologists
Detecting and Overcoming Systematic Errors in Genome-Scale Phylogenies
1 Canadian Institute for Advanced Research, Centre Robert Cedergren, Département de Biochimie, Université de Montréal, 2900 Boulevard Édouard-Montpetit, Montréal, Québec, H3T 1J4, Canada; E-mail: Herve.Philippe{at}UMontreal.CA (H.P.)
2 Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, UMR 5506, CNRS-Université de Montpellier 2, 161, rue Ada, 34392, Montpellier Cedex 5, France
Edited by Frank Anderson: Associate Editor
Abstract
Genome-scale data sets result in an enhanced resolution of the phylogenetic inference by reducing stochastic errors. However, there is also an increase of systematic errors due to model violations, which can lead to erroneous phylogenies. Here, we explore the impact of systematic errors on the resolution of the eukaryotic phylogeny using a data set of 143 nuclear-encoded proteins from 37 species. The initial observation was that, despite the impressive amount of data, some branches had no significant statistical support. To demonstrate that this lack of resolution is due to a mutual annihilation of phylogenetic and nonphylogenetic signals, we created a series of data sets with slightly different taxon sampling. As expected, these data sets yielded strongly supported but mutually exclusive trees, thus confirming the presence of conflicting phylogenetic and nonphylogenetic signals in the original data set. To decide on the correct tree, we applied several methods expected to reduce the impact of some kinds of systematic error. Briefly, we show that (i) removing fast-evolving positions, (ii) recoding amino acids into functional categories, and (iii) using a site-heterogeneous mixture model (CAT) are three effective means of increasing the ratio of phylogenetic to nonphylogenetic signal. Finally, our results allow us to formulate guidelines for detecting and overcoming phylogenetic artefacts in genome-scale phylogenetic analyses.
Keywords: Compositional heterogeneity; data removal; eukaryotic phylogeny; inconsistency; long-branch attraction; nonphylogenetic signal; phylogenomics; systematic error
Received July 21, 2006; Revised October 17, 2006; Accepted November 28, 2006
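As a rough illustration of strategy (ii) from the abstract, recoding amino acids into functional categories, the following minimal Python sketch maps each residue to one of six classes. The six Dayhoff-style groups used here (AGPST, DENQ, HKR, ILMV, FWY, C) are a commonly used scheme and are an assumption; the abstract does not state which categories the authors used. The names `DAYHOFF_GROUPS`, `recode_sequence`, and `recode_alignment` are hypothetical and exist only for this example.

```python
# Illustrative sketch only: the six Dayhoff-style groups below are an
# assumption, not necessarily the categories used in the article.
DAYHOFF_GROUPS = {
    "AGPST": "1",  # small residues
    "DENQ": "2",   # acidic residues and their amides
    "HKR": "3",    # basic residues
    "ILMV": "4",   # aliphatic hydrophobic residues
    "FWY": "5",    # aromatic residues
    "C": "6",      # cysteine
}

# Flatten the groups into a per-residue lookup table.
RECODE = {aa: code for group, code in DAYHOFF_GROUPS.items() for aa in group}


def recode_sequence(seq, unknown="-"):
    """Replace each amino acid with its category code.

    Gaps and any character outside the 20 standard residues are
    written as `unknown`.
    """
    return "".join(RECODE.get(aa, unknown) for aa in seq.upper())


def recode_alignment(alignment):
    """Recode every sequence in a {taxon: sequence} dictionary."""
    return {taxon: recode_sequence(seq) for taxon, seq in alignment.items()}


if __name__ == "__main__":
    toy = {"taxon_A": "ACDHIF", "taxon_B": "GCEKLY"}
    for taxon, seq in recode_alignment(toy).items():
        print(taxon, seq)
```

In this toy alignment the two sequences differ at five of six positions as amino acids but become identical after recoding, because every difference is a substitution within a category. Collapsing such within-category substitutions is the general idea behind recoding: it discards the fastest, most saturation-prone changes, at the cost of some genuine signal.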