Systematic Biology Advance Access originally published online on July 15, 2009
Systematic Biology 2009 58(4):381-394; doi:10.1093/sysbio/syp037
© Society of Systematic Biologists
Nonstationary Evolution and Compositional Heterogeneity in Beetle Mitochondrial Phylogenomics
1 Department of Biology, Brigham Young University, Provo, UT 84602, USA
2 Program in Computational Biology & Bioinformatics, Institute for Genome Sciences and Policy, Duke University, Box 90090, Durham, NC 27708, USA
3 Australian National Insect Collection, Commonwealth Scientific and Industrial Research Organisation, Entomology, PO Box 1700, Canberra, Australian Capital Territory, 2601, Australia
* Correspondence to be sent to: Program in Computational Biology & Bioinformatics, Institute for Genome Sciences and Policy, Duke University, Box 90090, Durham, NC 27708, USA; E-mail: nathan.sheffield{at}duke.edu.
Abstract
Many published phylogenies are based on methods that assume equal nucleotide composition among taxa. Studies have shown, however, that this assumption is often inaccurate, particularly in divergent lineages. Nonstationary sequence evolution, in which taxa in different lineages evolve in different ways, can lead to unequal nucleotide composition. This can cause inference methods to fail and phylogenies to be inaccurate. Recent advances in phylogenetic theory have produced new models of nonstationary sequence evolution; these models often outperform equivalent stationary models. A variety of new phylogenetic software implementing such models has been developed, but studies employing the new methodology are still few. We discovered convergence of nucleotide composition within mitochondrial genomes of the insect order Coleoptera (beetles). We found variation in base content both among species and among genes in the genome. We applied a broad range of phylogenetic methods to this data set, including some traditional stationary models of evolution and all the more recent nonstationary models. We compare 8 inference methods applied to the same data set. Although the more commonly used methods universally fail to recover established clades, we find that some of the newer software packages are more appropriate for data of this nature. The software packages p4, PHASE, and nhPhyML were able to overcome the systematic bias in our data set, but parsimony, MrBayes, neighbor joining (NJ), LogDet, and PhyloBayes were not.
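As an illustration of what unequal nucleotide composition among taxa means in practice, the following minimal Python sketch computes per-taxon base frequencies and a Pearson chi-square statistic for homogeneity of composition across an alignment. It is not the procedure used in this study. The function names and the toy sequences are hypothetical, and the statistic is only a rough diagnostic because it treats sites as independent and ignores phylogenetic relatedness.

```python
from collections import Counter

BASES = "ACGT"


def base_frequencies(seq):
    """Return the proportions of A, C, G, and T among unambiguous bases."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    return {b: counts[b] / total for b in BASES}


def composition_chi_square(alignment):
    """Pearson chi-square statistic for equal base composition across taxa.

    `alignment` maps a taxon name to its sequence. Returns the statistic and
    the nominal degrees of freedom, (taxa - 1) * (bases - 1).
    """
    table = {
        name: Counter(b for b in seq.upper() if b in BASES)
        for name, seq in alignment.items()
    }
    row_totals = {name: sum(c.values()) for name, c in table.items()}
    col_totals = {b: sum(c[b] for c in table.values()) for b in BASES}
    grand_total = sum(row_totals.values())

    statistic = 0.0
    for name, counts in table.items():
        for b in BASES:
            expected = row_totals[name] * col_totals[b] / grand_total
            if expected > 0:  # skip bases absent from every taxon
                statistic += (counts[b] - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(BASES) - 1)
    return statistic, dof


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the study.
    toy_alignment = {
        "taxon_AT_rich": "ATTATAATTTAAATTA",
        "taxon_balanced": "ACGTACGTACGTACGT",
        "taxon_GC_rich": "GCCGGCGCATGCCGGC",
    }
    for name, seq in toy_alignment.items():
        print(name, base_frequencies(seq))
    print(composition_chi_square(toy_alignment))
```

A large statistic relative to its degrees of freedom suggests that composition differs among taxa, which is the situation in which stationary models may be misled.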
Keywords: Base compositional heterogeneity; Coleoptera; LogDet; model of evolution; nonstationary evolution; nucleotide composition; phylogeny
Received October 17, 2008; Revised January 7, 2009; Accepted May 21, 2009
Nathan C. Sheffield and Hojun Song have contributed equally to this work.