Systematic Biology Advance Access published online on May 22, 2009

Systematic Biology, doi:10.1093/sysbio/syp017
Copyright © Society of Systematic Biologists

The Effect of Ambiguous Data on Phylogenetic Estimates Obtained by Maximum Likelihood and Bayesian Inference

Alan R. Lemmon1,2,3,*, Jeremy M. Brown1, Kathrin Stanger-Hall4 and Emily Moriarty Lemmon1,3

1 Section of Integrative Biology, University of Texas at Austin, 1 University Station C0930, Austin, TX 78712, USA
2 Present address: Department of Scientific Computing, Florida State University, Dirac Science Library, Tallahassee, FL 32306-4120, USA
3 Present address: Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
4 Plant Biology Department, University of Georgia, 403 Biosciences Building, Athens, GA 30602, USA

* Correspondence to be sent to: Department of Scientific Computing, Florida State University, Dirac Science Library, Tallahassee, FL 32306-4120, USA; E-mail: alemmon{at}evotutor.org.


Abstract

Although an increasing number of phylogenetic data sets are incomplete, the effect of ambiguous data on phylogenetic accuracy is not well understood. We use 4-taxon simulations to study the effects of ambiguous data (i.e., missing characters or gaps) in maximum likelihood (ML) and Bayesian frameworks. By introducing ambiguous data in a way that removes confounding factors, we provide the first clear understanding of one mechanism by which ambiguous data can mislead phylogenetic analyses. We find that in both ML and Bayesian frameworks, among-site rate variation can interact with ambiguous data to produce misleading estimates of topology and branch lengths. Furthermore, within a Bayesian framework, priors on branch lengths and rate heterogeneity parameters can exacerbate the effects of ambiguous data, resulting in strongly misleading bipartition posterior probabilities. The magnitude and direction of the ambiguous data bias are a function of the number and taxonomic distribution of ambiguous characters, the strength of topological support, and whether or not the model is correctly specified. The results of this study have major implications for all analyses that rely on accurate estimates of topology or branch lengths, including divergence time estimation, ancestral state reconstruction, tree-dependent comparative methods, rate variation analysis, phylogenetic hypothesis testing, and phylogeographic analysis.

Keywords: Ambiguous characters; ambiguous data; Bayesian; bias; maximum likelihood; missing data; model misspecification; phylogenetics; posterior probabilities; prior

Received October 8, 2007; Revised January 10, 2008; Accepted December 30, 2008


Associate Editor: Lars Jermiin






