© 2006 Society of Systematic Biologists
Multiple Sequence Alignment Accuracy and Phylogenetic Inference
Center for Evolutionary Functional Genomics, The Biodesign Institute, and the School of Life Sciences, Arizona State University, Tempe, Arizona 85287–4501, USA; E-mail: heath_ogden{at}asu.edu; msr{at}asu.edu
Edited by Rod Page: Associate Editor
| Abstract |
|---|
Phylogenies are often thought to depend more on the specifics of the sequence alignment than on the method of reconstruction. Sequences containing insertion and deletion events were simulated to determine the role that alignment accuracy plays in phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy: that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends across many analyses under different conditions; any one specific analysis, regardless of its alignment accuracy, may recover a very accurate or a very inaccurate topology. Maximum likelihood and Bayesian methods, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the lengths of a branch and of its neighboring branches increase, alignment accuracy decreases, and that the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction.
Keywords: Bayesian; maximum likelihood; maximum parsimony; multiple sequence alignment; neighbor joining; phylogenetics; simulation; tree reconstruction
Received July 14, 2005; Revised October 17, 2005; Accepted November 25, 2005
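The abstract describes comparing hypothesized alignments against true alignments to score alignment accuracy, but it does not state the exact formula. As an illustration only, the sketch below uses one common choice, a sum-of-pairs style score: the fraction of residue pairs aligned together in the true alignment that are also aligned together in the hypothesized alignment. The function names (`residue_pairs`, `alignment_accuracy`) and the toy sequences are hypothetical and are not taken from the article. The sketch covers only the whole-data-set measure, not the per-branch measure.

```python
from itertools import combinations


def residue_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    `alignment` maps a sequence name to its gapped string ('-' is a gap).
    Each residue is identified by (name, index in the ungapped sequence),
    so pairs can be compared between two alignments of the same sequences.
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")

    next_index = {name: 0 for name in names}  # ungapped position per sequence
    pairs = set()
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, next_index[name]))
                next_index[name] += 1
        # Every two residues sharing a column form one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def alignment_accuracy(true_aln, hyp_aln):
    """Fraction of true aligned residue pairs recovered by hyp_aln."""
    true_pairs = residue_pairs(true_aln)
    if not true_pairs:
        raise ValueError("true alignment contains no aligned residue pairs")
    return len(true_pairs & residue_pairs(hyp_aln)) / len(true_pairs)


# Toy data (assumed for illustration): the hypothesized alignment misplaces
# one gap in sequences A and C.
true_aln = {"A": "AC-GT", "B": "ACTGT", "C": "A--GT"}
hyp_aln = {"A": "ACG-T", "B": "ACTGT", "C": "A-G-T"}

print(alignment_accuracy(true_aln, hyp_aln))  # 8 of 10 true pairs recovered: 0.8
```

Scoring residue pairs rather than whole columns means a single misplaced gap lowers the score in proportion to the homologies it breaks, instead of invalidating an entire column.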