Systematic Biology Advance Access originally published online on September 21, 2009
Systematic Biology 2009 58(6):560-572; doi:10.1093/sysbio/syp056

© The Author(s) 2009. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

The Use and Validity of Composite Taxa in Phylogenetic Analysis

Véronique Campbell* and François-Joseph Lapointe

Département de sciences biologiques, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec H3C 3J7, Canada

* Correspondence to be sent to: Département de sciences biologiques, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec H3C 3J7, Canada; E-mail: veronique.campbell@umontreal.ca.


Abstract

In phylogenetic analysis, one approach to minimizing missing data in DNA supermatrices is to sample sequences from different species so as to obtain a complete sequence for every gene included in the study. We refer to those complete sequences as composite taxa because the combined DNA sequences belong to different species. An alternative approach is to analyze incomplete supermatrices by coding unavailable DNA sequences as missing. The accuracy of phylogenetic trees estimated from matrices that include composite taxa has recently been questioned, and the best approach for analyzing incomplete supermatrices is highly debated. Through computer simulations, we compared the phylogenetic accuracy of the 2 competing approaches. We explored the effect of composite taxa when inferring higher level relationships, that is, relationships between monophyletic groups. DNA sequences were simulated on a 42-taxon model tree, and incomplete supermatrices containing different percentages of missing data were generated. These incomplete supermatrices were analyzed either by coding the missing data with "?" or by reducing the amount of missing data through the combination of 2 or more taxa to generate composite taxa. Of 180 comparisons (18 simulation cases with 2 different inference methods and 5 levels of incompleteness), we observed significantly higher phylogenetic accuracy for composite matrices in 46 comparisons, whereas missing data matrices outperformed composites in 8 comparisons. In all other cases, the phylogenetic accuracy obtained with composite matrices was not significantly different from that of missing data matrices. This study demonstrates that composite taxa represent an interesting approach to minimizing the amount of missing data in supermatrices, and we suggest that it is the optimal approach to use in phylogenomic studies to reduce computing time.
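The two treatments of an incomplete supermatrix compared in the abstract can be illustrated with a toy sketch in Python. It is not the authors' simulation procedure: the data layout, the function names (`code_missing`, `build_composites`, `fraction_missing`), the species and gene labels, and the rule of taking the first available sequence within a group are all assumptions made for illustration. The sketch also assumes that the species being combined belong to the same presumed monophyletic group.

```python
from typing import Dict, List, Optional

# Hypothetical layout: taxon -> gene -> aligned sequence, or None if unavailable.
Supermatrix = Dict[str, Dict[str, Optional[str]]]


def code_missing(matrix: Supermatrix, gene_lengths: Dict[str, int]) -> Dict[str, str]:
    """Concatenate genes per taxon, coding unavailable genes as runs of '?'."""
    return {
        taxon: "".join(genes.get(gene) or "?" * length
                       for gene, length in gene_lengths.items())
        for taxon, genes in matrix.items()
    }


def build_composites(matrix: Supermatrix, groups: Dict[str, List[str]]) -> Supermatrix:
    """Merge the member species of each group into one composite taxon.

    For every gene, the first member with an available sequence supplies it
    (an assumed rule; other selection rules are possible).
    """
    composites: Supermatrix = {}
    for name, members in groups.items():
        merged: Dict[str, Optional[str]] = {}
        for member in members:
            for gene, seq in matrix[member].items():
                if seq is not None and merged.get(gene) is None:
                    merged[gene] = seq
                else:
                    merged.setdefault(gene, None)
        composites[name] = merged
    return composites


def fraction_missing(aligned: Dict[str, str]) -> float:
    """Proportion of '?' characters in a concatenated alignment."""
    total = sum(len(seq) for seq in aligned.values())
    return sum(seq.count("?") for seq in aligned.values()) / total


# Toy data: sp1 and sp2 are assumed to belong to the same monophyletic group.
matrix: Supermatrix = {
    "sp1": {"geneA": "ACGT", "geneB": None},
    "sp2": {"geneA": None, "geneB": "GGCC"},
    "sp3": {"geneA": "ACGA", "geneB": "GGCA"},
}
gene_lengths = {"geneA": 4, "geneB": 4}
groups = {"clade1": ["sp1", "sp2"], "clade2": ["sp3"]}

missing_coded = code_missing(matrix, gene_lengths)
composite_coded = code_missing(build_composites(matrix, groups), gene_lengths)
```

In this toy case the missing-data matrix keeps three taxa with a third of its cells coded as "?", whereas the composite matrix has two taxa and no missing cells. The composite matrix is smaller, which is consistent with the reduction in computing time mentioned in the abstract.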

Keywords: Composite sequences; computer simulations; DNA sequences; missing data; phylogenetic accuracy; phylogenomics; supermatrices

Received June 10, 2008; Revised September 19, 2008; Accepted August 17, 2009


Associate Editor: Mark Hafner





