© 2007 Society of Systematic Biologists
The Importance of Data Partitioning and the Utility of Bayes Factors in Bayesian Phylogenetics
1 Section of Integrative Biology, The University of Texas–Austin, 1 University Station C0930, Austin, TX 78712, USA; E-mail: jembrown{at}mail.utexas.edu (J.M.B.); alemmon{at}evotutor.org (A.R.L.)
Edited by Elizabeth Jockusch: Associate Editor
Abstract
As larger, more complex data sets are being used to infer phylogenies, accuracy of these phylogenies increasingly requires models of evolution that accommodate heterogeneity in the processes of molecular evolution. We investigated the effect of improper data partitioning on phylogenetic accuracy, as well as the type I error rate and sensitivity of Bayes factors, a commonly used method for choosing among different partitioning strategies in Bayesian analyses. We also used Bayes factors to test empirical data for the need to divide data in a manner that has no expected biological meaning. Posterior probability estimates are misleading when an incorrect partitioning strategy is assumed. The error was greatest when the assumed model was underpartitioned. These results suggest that model partitioning is important for large data sets. Bayes factors performed well, giving a 5% type I error rate, which is remarkably consistent with standard frequentist hypothesis tests. The sensitivity of Bayes factors was found to be quite high when the across-class model heterogeneity reflected that of empirical data. These results suggest that Bayes factors represent a robust method of choosing among partitioning strategies. Lastly, results of tests for the inclusion of unexpected divisions in empirical data mirrored the simulation results, although the outcome of such tests is highly dependent on accounting for rate variation among classes. We conclude by discussing other approaches for partitioning data, as well as other applications of Bayes factors.
Keywords: Bayes factors; Bayesian phylogenetic inference; data partitioning; model choice; posterior probabilities
Received October 17, 2006; Revised January 4, 2007; Accepted May 1, 2007
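The Bayes factor comparison between partitioning strategies described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the harmonic-mean estimator of the marginal likelihood, the cutoff of 2 ln BF > 10 (a conventional "very strong evidence" threshold), and all function names and numbers are assumptions made for the example.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Harmonic-mean estimate of the log marginal likelihood.

    Takes post-burn-in MCMC log-likelihood samples and works in log
    space (log-sum-exp) to avoid numerical underflow.
    Assumption: this estimator is used here for illustration only.
    """
    n = len(log_likelihoods)
    negated = [-x for x in log_likelihoods]
    m = max(negated)
    log_sum = m + math.log(sum(math.exp(v - m) for v in negated))
    return math.log(n) - log_sum


def two_ln_bayes_factor(log_ml_complex, log_ml_simple):
    """2 ln BF in favor of the more heavily partitioned model."""
    return 2.0 * (log_ml_complex - log_ml_simple)


def prefer_complex(log_ml_complex, log_ml_simple, threshold=10.0):
    """Accept the extra partition only if 2 ln BF exceeds the cutoff.

    Assumption: threshold=10.0 is a conventional cutoff, chosen here
    as a hypothetical default.
    """
    return two_ln_bayes_factor(log_ml_complex, log_ml_simple) > threshold


if __name__ == "__main__":
    # Hypothetical log marginal likelihoods for two partitioning strategies.
    partitioned = -5000.0
    unpartitioned = -5010.0
    print(two_ln_bayes_factor(partitioned, unpartitioned))
    print(prefer_complex(partitioned, unpartitioned))
```

Under this sketch, a small difference in log marginal likelihood (for example 2 log units, giving 2 ln BF = 4) would not justify the additional partition, whereas a difference of 10 log units would.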


