© 2007 Society of Systematic Biologists
Alignment and Topological Accuracy of the Direct Optimization approach via POY and Traditional Phylogenetics via ClustalW + PAUP*
Edited by Karl Kjer: Associate Editor
1 Idaho State University, Department of Biological Sciences Pocatello, Idaho 83209, USA E-mail: ogdet{at}isu.edu
2 Center for Evolutionary Functional Genomics, The Biodesign Institute, and the School of Life Sciences, Arizona State University Tempe, Arizona 85287-4501, USA E-mail: msr{at}asu.edu (M.S.R.)
Abstract
Direct optimization frameworks for simultaneously estimating alignments and phylogenies have recently been developed. One such method, implemented in the program POY, is becoming more common for analyses of variable length sequences (e.g., analyses using ribosomal genes) and for combined evidence analyses (morphology + multiple genes). Simulation of sequences containing insertion and deletion events was performed in order to directly compare a widely used method of multiple sequence alignment (ClustalW) and subsequent parsimony analysis in PAUP* with direct optimization via POY. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (clocklike, non-clocklike, and ultrametric). Alignment accuracy scores for the implied alignments from POY and the multiple sequence alignments from ClustalW were calculated and compared. In almost all cases (99.95%), ClustalW produced more accurate alignments than POY-implied alignments, judged by the proportion of correctly identified homologous sites. Topological accuracy (distance to the true tree) for POY topologies and topologies generated under parsimony in PAUP* from the ClustalW alignments were also compared. In 44.94% of the cases, Clustal alignment tree reconstructions via PAUP* were more accurate than POY, whereas in 16.71% of the cases POY reconstructions were more topologically accurate (38.36% of the time they were equally accurate). Comparisons between POY hypothesized alignments and the true alignments indicated that, on average, as alignment error increased, topological accuracy decreased.
Keywords: ClustalW; direct optimization; multiple sequence alignment; parsimony; phylogenetics; POY; sensitivity analysis; simulation; tree reconstruction
Received November 17, 2005; Revised February 26, 2006; Accepted October 19, 2006
There are many approaches available for phylogenetic analysis of DNA sequence information. Traditionally, most analyses proceed as a two-step process, first moving from raw sequence data to a multiple sequence alignment and then, secondarily, using the multiple sequence alignment to estimate a phylogeny. For the first step, there are dozens of programs and algorithms available that will perform multiple sequence alignment. Once the multiple sequence alignment (data matrix) is created, the data are analyzed by tree building methodologies resulting in hypothesized trees that are used to infer the evolution of the sequences (and ultimately the organisms themselves). Among the most widely used tree reconstruction approaches are neighbor joining, parsimony, maximum likelihood, and Bayesian approaches. Recently, however, other approaches have surfaced that can analyze DNA sequence data in a different framework.
Direct optimization (DO) is an alternative approach for phylogenetic analysis in which no prior multiple sequence alignment is required. This idea has attracted much attention due to the fact that empirically derived phylogenies may be more dependent upon the alignment method than on the mode of phylogenetic reconstruction (Cammarano et al., 1999; Hwang et al., 1998; Kjer, 1995, 2004; Lake, 1991; Morrison and Ellis, 1997; Mugridge et al., 2000; Ogden and Whiting, 2003; Thorne and Kishino, 1992; Titus and Frost, 1996; Xia et al., 2003). Although some related ideas concerning optimization heuristics for raw sequence data existed prior to DO (Hein, 1989a, 1989b; Hogeweg and Hesper, 1984; Sankoff, 1975; Thorne et al., 1991), Wheeler (1996) developed an automated cladistic or parsimony approach. DO, formerly known as optimization alignment, was devised to counter the lack of interaction between topology and putative homology (Wheeler, 2001) and to assess directly the transformations, indels, or other evolutionary events simultaneously in a topological framework without the use of multiple sequence alignment (Wheeler, 1996). The strategy is implemented in the program POY (Wheeler et al., 2003), which can invoke both parsimony and likelihood as optimality criteria (Wheeler, 2006). Recently, a number of similar approaches to DO that implement Bayesian or likelihood models in a combined analysis framework have also been proposed (Fleissner et al., 2005; Lunter et al., 2005; Redelings and Suchard, 2005). DO is a novel theoretical approach to phylogenetic estimation that attempts to avoid the problems of alignment by generalizing phylogenetic character analysis to include insertion/deletion events.
Although theoretical arguments for and against DO, and POY in particular, exist (Giribet, 2001, 2005; Kjer, 2004; Ogden et al., 2005; Phillips et al., 2000; Simmons, 2004; Simmons and Ochoterena, 2000; Wheeler, 2001, 2003), no comparative accuracy analyses have been performed to evaluate "alternative phylogenetic methods or models" (de Queiroz and Poe, 2001), although some studies have used other forms of comparison (for example, ILD and likelihood scores) to examine the performance of POY versus other alignment methods (Terry and Whiting, 2005; Whiting et al., 2006). During revision of this paper we became aware of an accepted paper that compares POY with structural alignment (Kjer et al., 2007). We were interested in using simulation to directly compare ClustalW and subsequent parsimony analysis in PAUP* with DO via POY. Many studies have used simulated fixed data sets (usually with no insertions or deletions) to examine topological accuracy of phylogenetic reconstruction methods (e.g., Hillis, 1995; Huelsenbeck and Rannala, 2004; Nei, 1996; Rosenberg and Kumar, 2003; Takahashi and Nei, 2000, to name just a few). Only recently have alignments been simulated that include indels (Blanchette et al., 2004a, 2004b; Fleissner et al., 2005; Hall, 2005; Keightley and Johnson, 2004; Pollard et al., 2004; Rosenberg, 2005a, 2005b; Stoye et al., 1998). However, none of these studies (except for Fleissner et al., 2005) compares traditional two-step phylogenetic analysis with DO or combined analysis approaches.
The main objective of this paper is to directly compare, within a parsimony framework, the performance of DO (via POY) to phylogenetic methods that first generate a multiple sequence alignment with subsequent tree reconstruction (via ClustalW and PAUP* parsimony). This will be accomplished by (1) calculating and comparing the alignment accuracy score for the implied alignment from POY and the multiple sequence alignment from ClustalW; (2) calculating and comparing the topological accuracy (distance to the true tree) for POY topologies and topologies generated under parsimony in PAUP* from the ClustalW alignments; and (3) investigating the interaction of tree shape (length and branching pattern) with alignment and topological accuracy in relation to the two approaches of sequence analysis.
Material and Methods
Data Simulation
We used the simulated data sets, consisting of seven 16-taxon topologies, which we have previously analyzed (Ogden and Rosenberg, 2006). The simulations were done under a variety of different conditions in order to cover a reasonable amount of the error space representing alignment inaccuracy. We believe that 16 terminals are sufficient to provide reasonable tree shape diversity and complexity in order to investigate the effects of alignment inaccuracies and tree reconstruction, while at the same time not requiring enormous amounts of computational time to perform reasonable searches across the thousands of data sets. The seven base topologies (Fig. 1) consist of a balanced tree, a pectinate tree, and five random trees (A to E) generated under a Yule model in Mesquite (Maddison and Maddison, 2004), where the probability for each speciation event is equal for all tips. The relative branch lengths of each topology were set under 11 different conditions: ultrametric equal branch lengths, clocklike random branch lengths (5 sets), and non-clocklike random branch lengths (5 sets). Each of these 11 conditions was scaled such that the maximum evolutionary distance between a pair of sequences was equal to 1.0 or 2.0. Thus, each of the 7 topologies was used to create 22 model trees (Fig. 2). All simulations were conducted under identical conditions in the program MySSP (Rosenberg, 2005c). For this study, many potentially variable parameters were held constant for simplicity. Thus, aside from the different conditions explained above, the initial sequence length was set to 2000 base pairs and noncoding DNA evolution was simulated under the Hasegawa-Kishino-Yano (HKY) model (Hasegawa et al., 1985), with a transition-transversion bias of 3.6 (Rosenberg and Kumar, 2003) and initial and expected base frequencies of A and T = 0.2 and of G and C = 0.3.
Insertion and deletion events were modeled as a Poisson process, following Rosenberg (2005a, 2005b). Expected numbers of insertions and deletions (modeled separately) for a given branch were determined as a function of the realized number of substitutions (itself a Poisson process) that occurred on that branch. Expected rates were based on observed values from primates and rodents, with one insertion event for every 100 substitutions and one deletion event for every 40 substitutions (Ophir and Graur, 1997). As an aside, it is important to point out that Ophir and Graur's (1997) paper was based on indels in pseudogenes and that our conclusions may be limited to evolutionary patterns more similar to these types of data. The intent of associating our indel model to these types of data was to initiate this line of research with a more general case. We are currently working on methods for more specific cases, such as rDNA. The realized number of insertion and deletion events was drawn from a Poisson distribution with mean equal to the expectation. The actual size of each insertion and deletion event was independently determined from a truncated (so as not to include zero) Poisson distribution with mean equal to four bases (as observed in primates and rodents; Ophir and Graur, 1997; Sundstrom et al., 2003). Although the underlying mechanisms and frequencies of indels is not understood as well as base substitution processes (Hall, 2005), efforts to remedy this lack of models and methods are underway (Holmes, 2003, 2004, 2005; Holmes and Bruno, 2001; Knudsen and Miyamoto, 2003; Mitchison and Durbin, 1995; Mitchison, 1999).
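The indel model described above can be illustrated with a minimal sketch. It is not the MySSP implementation; the function names are hypothetical, the number of substitutions on a branch is taken as an input rather than simulated, and the value of four bases is assumed to be the Poisson parameter before truncation (so the realized mean size is slightly above four).

```python
import math
import random

INSERTIONS_PER_SUBSTITUTION = 1 / 100  # one insertion event per 100 substitutions
DELETIONS_PER_SUBSTITUTION = 1 / 40    # one deletion event per 40 substitutions
INDEL_SIZE_PARAMETER = 4.0             # assumed to be the pre-truncation Poisson parameter


def poisson(rng, lam):
    """Draw a Poisson variate by Knuth's multiplication method (suitable for small lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def truncated_poisson(rng, lam):
    """Zero-truncated Poisson by rejection: redraw until the value is at least 1."""
    while True:
        k = poisson(rng, lam)
        if k > 0:
            return k


def sample_indels(rng, n_substitutions):
    """Return (insertion_sizes, deletion_sizes) for one branch.

    n_substitutions is the realized number of substitutions on the branch;
    insertions and deletions are modeled separately, as in the text.
    """
    n_ins = poisson(rng, n_substitutions * INSERTIONS_PER_SUBSTITUTION)
    n_del = poisson(rng, n_substitutions * DELETIONS_PER_SUBSTITUTION)
    ins = [truncated_poisson(rng, INDEL_SIZE_PARAMETER) for _ in range(n_ins)]
    dels = [truncated_poisson(rng, INDEL_SIZE_PARAMETER) for _ in range(n_del)]
    return ins, dels
```

For a branch with 400 realized substitutions, `sample_indels(random.Random(1), 400)` would draw around four insertion events and ten deletion events on average, each at least one base long.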
Each simulation was replicated 100 times. The fate of every insertion and deletion event was tracked throughout the simulations, such that the columns in the final alignment represented the true homologies (Rosenberg, 2005a, 2005b).
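The size of the simulation design follows directly from the factors listed above. The short sketch below enumerates them; the labels are hypothetical placeholders for the topologies and branch length conditions, not names used by MySSP.

```python
from itertools import product

# Seven base topologies: balanced, pectinate, and five random (Yule) trees A to E.
topologies = ["balanced", "pectinate",
              "random_A", "random_B", "random_C", "random_D", "random_E"]

# Eleven relative branch length conditions per topology.
branch_conditions = (
    ["ultrametric_equal"]
    + [f"clocklike_random_{i}" for i in range(1, 6)]
    + [f"nonclocklike_random_{i}" for i in range(1, 6)]
)

# Each condition is scaled to a maximum pairwise distance of 1.0 or 2.0.
max_distances = [1.0, 2.0]
replicates = 100

model_trees = list(product(topologies, branch_conditions, max_distances))
n_data_sets = len(model_trees) * replicates

print(len(model_trees) // len(topologies))  # model trees per topology
print(len(model_trees))                     # model trees overall
print(n_data_sets)                          # simulated data sets
```

This gives 22 model trees per topology, 154 model trees overall, and 15,400 simulated data sets, of which 2200 come from the pectinate topology.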
Alignment
These simulations resulted in 15,400 unique data sets (alignments) containing gaps representing either insertion or deletion events during the simulation process, which will be referred to as the True Alignments (TA). Each TA was then stripped of its gaps and realigned via ClustalW version 1.83 (Thompson et al., 1994) using default parameters. The default parameters in Clustal for DNA sequence alignment are gap opening = 15, gap extension = 6.66, delay divergent % = 30, DNA transition weight = 0.50, and DNA weight matrix = IUB. These parameters were chosen because they are the settings most commonly used by investigators who implement Clustal. Our main purpose was to compare ClustalW to POY using the most common approach employed for each program and not to try to match exact parameters between the programs, which may not even be possible. We will refer to these alignments as the Clustal Hypothesized Alignments (Clustal HA). Although POY does not require an alignment, we a posteriori generated the implied alignment from the resultant most parsimonious tree, and these alignments will be referred to as POY Hypothesized Alignments (POY HA). The gap:transversion:transition ratio in POY was 1:1:1 for the majority of the analyses. This particular parameter set was chosen because it has been the most commonly selected in phylogenetic studies in the top systematic journals, based on a literature review from January 2004 to February 2006, where 53% of the studies selected the 1:1:1 parameter set, 31% of the studies did not select the 1:1:1 ratio, and 16% did not report the specific parameter settings used in POY. Notwithstanding, we also performed analyses under varying cost schemes for the pectinate tree (1, 2, 4, and 10 for the gap:tv and tv:ts ratios) in order to examine the possibility that other commonly used cost ratios in POY may recover more accurate topologies and alignments.
We chose to do this on the pectinate tree because it was the tree shape where POY performed the best, in terms of topological accuracy, and because the outgroup was fixed (see below for discussion on how POY must search using one outgroup). Even though we examined the resulting topological effects of varying parameters in POY, similar to sensitivity analysis approaches (Aagesen et al., 2005; Laamanen et al., 2005; Ogden and Whiting, 2003; Terry and Whiting, 2005), the focus of this study was not to try to estimate the optimal parameter settings that would generate the most accurate alignment and reconstructed topology. Rather, we wanted to compare the performance of the most common settings that are used in phylogenetic analyses for POY and Clustal.
Alignment Accuracy
Alignment accuracy, calculated as the proportion of aligned sites that are truly homologous (Rosenberg, 2005a), was summarized as the Total Alignment Accuracy (TAA) score. The TAA for a data set was calculated as the average accuracy of all pairwise sequence comparisons in the multiple alignments as judged against the corresponding homologous sites of the true alignments. Thus, we did not compare the Clustal-generated hypothesized alignment directly to the POY-implied alignment, as some have suggested these may not be comparable (Giribet, 2005; Wheeler, 2003). Rather, we directly compared the Clustal-generated hypothesized alignments (HA) to the true alignments; likewise, we independently compared the POY-implied alignments (POY HA) to the true alignments. In other words, we compared the primary homology sites of the Clustal hypothesis to the true secondary homology sites of the simulated data set; and we compared the secondary homology sites from the POY-implied alignment hypotheses to the true secondary homology sites of the simulated data set (de Pinna, 1991; Giribet, 2005). Finally, an indirect comparison of the two hypothesized alignment TAA scores was matched up, head-to-head (Clustal HA versus POY HA), for every simulation replicate for all tree shapes.
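A pairwise accuracy score of this kind can be sketched as follows. This is an illustration rather than the scoring code used in the study: the function names are hypothetical, and the denominator is assumed to be the number of truly homologous residue pairs in the true alignment (the text could also be read as dividing by the number of aligned pairs in the hypothesized alignment).

```python
from itertools import combinations

GAP = "-"


def residue_pairs(row_a, row_b):
    """Set of (i, j) residue index pairs that share a column in two aligned rows."""
    pairs = set()
    i = j = 0
    for a, b in zip(row_a, row_b):
        if a != GAP and b != GAP:
            pairs.add((i, j))
        if a != GAP:
            i += 1
        if b != GAP:
            j += 1
    return pairs


def total_alignment_accuracy(true_aln, hyp_aln):
    """Average pairwise accuracy of hyp_aln judged against true_aln.

    Both arguments map sequence name -> aligned string (gaps as '-').
    Assumption: each pairwise score is the fraction of true homologous
    residue pairs that the hypothesized alignment also places in one column.
    """
    scores = []
    for x, y in combinations(sorted(true_aln), 2):
        true_pairs = residue_pairs(true_aln[x], true_aln[y])
        hyp_pairs = residue_pairs(hyp_aln[x], hyp_aln[y])
        if true_pairs:  # skip sequence pairs that share no homologous site
            scores.append(len(true_pairs & hyp_pairs) / len(true_pairs))
    return sum(scores) / len(scores) if scores else 0.0


# Toy example (made-up sequences, not data from the study).
true_alignment = {"s1": "ACG-T", "s2": "AC-GT", "s3": "ACGGT"}
hypothesized = {"s1": "ACG-T", "s2": "ACG-T", "s3": "ACGGT"}
taa = total_alignment_accuracy(true_alignment, hypothesized)
```

Because each hypothesized alignment is scored only against the true alignment, the Clustal and POY scores for the same replicate can then be compared head-to-head without aligning the two hypotheses to each other.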
Tree Reconstruction
The Clustal HA and TA were analyzed under parsimony using PAUP* version 4.0b10 for Windows (Swofford, 2002), with searches consisting of 100 random additions with TBR swapping and all other default settings. In order to compare the performance of POY HA to POY TA (which are the same data sets as Clustal TA), we additionally analyzed all of the TA with gaps as a fifth state character (GapMode = NewState) in PAUP*. This was done because POY HA analyses treated gaps as such, using the 1:1:1 gap:tv:ts ratio. These four different approaches to treating the sequences (HA and TA with gaps as missing in Clustal and PAUP, and POY HA and TA with gaps treated as a fifth state) allowed direct comparisons of topological accuracy of the resulting trees (consensus in some cases).
We looked at the effects of alignment error on reconstruction accuracy by comparing the TA tree reconstructions to the HA tree reconstructions. Each reconstructed tree was compared to the true model tree using the Robinson-Foulds (1981) measure to estimate topological accuracy; these are referenced as TAdist and HAdist, respectively, for the TA and HA data sets. The difference between these values (HAdist – TAdist) therefore represents the difference in topological accuracy of trees reconstructed from the true and hypothesized alignments. When the TA tree is topologically more accurate than the HA tree, (HAdist – TAdist) will be a positive number; if (HAdist – TAdist) is negative, the HA tree is more accurate than the TA tree. Note that (HAdist – TAdist) itself is not a measure of topological accuracy, but rather a comparison of the accuracies of the TA tree and HA tree reconstructions. Hence, TA and HA trees could both be completely accurate, with a distance to the true tree of 0, and thus a (HAdist – TAdist) equal to 0. Alternatively, they could both be equally inaccurate, with large distances relative to the true tree, and again (HAdist – TAdist) may also be 0 (the reconstructed trees could be completely different, but also completely wrong).
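The comparison can be made concrete with a small sketch of the distance calculation. It assumes trees are written as nested tuples of taxon names and that the Robinson-Foulds measure is taken as the unnormalized count of bipartitions found in one tree but not the other; the function names and the five-taxon trees are hypothetical and are not taken from the study.

```python
def splits(tree):
    """Nontrivial bipartitions of a tree given as nested tuples of leaf names."""
    found = set()

    def collect(node):
        if isinstance(node, str):
            return frozenset([node])
        return frozenset().union(*(collect(child) for child in node))

    all_leaves = collect(tree)
    anchor = min(all_leaves)  # reference leaf used to give each split one canonical form

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        if 1 < len(clade) < len(all_leaves) - 1:
            # Store the side of the split that excludes the anchor leaf.
            found.add(all_leaves - clade if anchor in clade else clade)
        return clade

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


# Toy example: the TA tree matches the model tree, the HA tree misplaces one taxon.
model_tree = ((("A", "B"), "C"), ("D", "E"))
ta_tree = ((("A", "B"), "C"), ("D", "E"))
ha_tree = ((("A", "C"), "B"), ("D", "E"))

ta_dist = rf_distance(ta_tree, model_tree)
ha_dist = rf_distance(ha_tree, model_tree)
difference = ha_dist - ta_dist  # positive: the TA tree is the more accurate one
```

Polytomies, as in a consensus tree, can be written as tuples with more than two children; they simply contribute fewer bipartitions.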
In addition to analyzing the data for a cost ratio of 1:1:1 (gap:transversion:transition) in POY, a sensitivity analysis was performed across the pectinate cases to examine if other parameter sets may produce more accurate hypotheses of homology and more accurate topologies. For the 1:1:1 analyses across all tree shapes (all the data sets) and for the sensitivity analysis data sets (pectinate tree shape only), the following search strategy was used in POY: -nooneasis -noleading -norandomizeoutgroup -quick -staticapprox -notbr -replicates 4. Although this strategy does not represent a very thorough search, it was necessary in order to be able to analyze the thousands of data sets in a reasonable amount of time. However, much more extensive analyses (-nooneasis -noleading -norandomizeoutgroup -quick -staticapprox -notbr -replicates 4 -buildmaxtrees 2 -sprmaxtrees 1 -nospr -tbr -tbrmaxtrees 5 -maxtrees 5 -holdmaxtrees 50 -slop 5 -checkslop 10 -stopat 25 -treefusefuselimit 10 -fusemingroup 5 -fusemaxtrees 50 -ratchetspr 2 -ratchettbr 2 -checkslop 10) on a subset of the data (200 data sets) showed that there was no significant difference in topological accuracy; in fact, the less extensive analyses recovered, on average, more accurate topologies by 0.14 (HAdist – TAdist) distance. Furthermore, alignment accuracy actually decreased from an average of 70% down to 68% for the more extensive searches. Both of these values are much less than the Clustal alignment accuracy average of 86%. Therefore, further searching, at least on these data, does not change the results and conclusions, and we are justified in using the results from the less extensive searches to make general comparisons. Again, however, it should be pointed out that the results from any one single data set might be drastically changed with different efforts of searching.
It is appropriate to summarize the results from our previous publication (Ogden and Rosenberg, 2006) in order to better understand the current study. In general, as alignment error increased, topological accuracy decreased. This trend was much more pronounced for data sets derived from more pectinate topologies. On the other hand, for balanced, clocklike, and equal-branch-length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on averages across the different simulation conditions pooled together. However, it should be pointed out that any one specific analysis, independent of the alignment accuracy, may have recovered a topology that was more accurate than, less accurate than, or as accurate as the topology recovered from the true alignment. Maximum likelihood and Bayesian methods, in general, outperformed neighbor joining and parsimony in terms of tree reconstruction accuracy. The results also indicated that as the length of the branch and of the neighboring branches increased, alignment accuracy decreased, indicating that neighboring branches may be a major factor in topological accuracy. These basic conclusions will be drawn upon throughout this paper; for further details see Ogden and Rosenberg (2006).
Results
Of the 15,400 head-to-head comparisons of Clustal HA and POY HA alignments, in only 7 cases were POY alignments more accurate than Clustal alignments, as measured by TAA scores (Table 1 and Fig. 3); in the remaining 15,393 cases, the Clustal HA were more accurate. In other words, 99.95% of the time, ClustalW generated alignments that were more accurate, as judged by the comparison to the truly homologous sites of the true alignment, than POY implied alignments. Clustal alignments ranged from 97.79% to 17.33% accurate with an average of 72.19%, whereas POY-implied alignments ranged from 87.78% to 5.05% accurate with an average of 50.76%. Of the few POY HA cases that outperformed the associated Clustal HA, the maximum TAA difference was 0.07. However, in the numerous cases where Clustal HA outperformed POY HA, the maximum TAA difference was 0.74.
Although not as drastic as the alignment accuracy comparisons, Clustal HA topological accuracy also outperformed POY tree reconstructions in 6920 cases (44.94%). POY did better in 2573 cases (16.71%), and both performed equally well (or poorly) in 5907 cases (38.36%) (Table 2). Therefore, across all data sets pooled together, Clustal HA reconstructions were, on average, more accurate (in terms of distance to the true tree) than POY reconstructions. As POY TAA decreased, this average difference became larger (Fig. 4). In the 2573 cases where POY reconstructed more accurate topologies than Clustal HA reconstructions, nearly half (1009 cases) were observed in the pectinate tree shape data sets (Table 2). For the pectinate tree shape data, Clustal HA reconstructions recovered only 744 trees that were more accurate than the associated POY HA reconstructions (in 447 cases no difference was seen). Therefore, for the pectinate topologies, over 45% of the time POY reconstructed trees more accurately as compared to 33% where Clustal recovered more accurate trees. (In 20% of the pectinate cases, both approaches recovered equally accurate topologies.) The moving average of all the pectinate tree shape cases showed an interesting trend. There was a noticeable change in the curve for POY TAA accuracies between 0.4 and 0.6 (Fig. 5). In this span of POY TAA, Clustal HA reconstructions were more accurate than POY HA reconstruction (on average). By breaking down the different types of pectinate trees, the results show that the region of POY TAA between 0.4 and 0.6 is predominantly represented by the pectinate, random branch length, and non-clocklike condition. For these cases, the Clustal HA reconstructions vastly outperform the POY reconstructions (on average). 
Contrastingly, most of the cases where POY HA tree reconstructions outperform Clustal HA tree reconstructions are for data sets derived from pectinate, nonrandom branch length, and more clocklike tree shapes that fall within the POY TAA range less than 0.4 and greater than 0.6 (Fig. 6). Except for these few cases, Clustal+PAUP generally outperformed POY in topological accuracy across all other tree shapes and conditions. Finally, it should be reemphasized that these conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy and methodology, may recover very accurate or inaccurate topologies.
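The percentages reported for the topological comparison can be recomputed from the counts given in the text. The sketch below does only that arithmetic; the dictionary keys are hypothetical labels.

```python
def shares(counts):
    """Convert outcome counts to percentages of their total, rounded to two decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Counts reported in the text for all data sets and for the pectinate subset.
all_data_sets = {"clustal_better": 6920, "poy_better": 2573, "equal": 5907}
pectinate_only = {"clustal_better": 744, "poy_better": 1009, "equal": 447}

print(sum(all_data_sets.values()), shares(all_data_sets))
print(sum(pectinate_only.values()), shares(pectinate_only))
```

The counts sum to 15,400 and 2200, respectively, and reproduce the reported shares (44.94%, 16.71%, and 38.36% overall; roughly 34%, 46%, and 20% for the pectinate data sets).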
The results across the pectinate tree data sets from the varied POY cost ratios (sensitivity analysis) indicate a couple of interesting points (Table 3). First, only 3 of the 35,200 cases of the POY parameter sets investigated recovered more accurate hypotheses of secondary homology than the primary homology hypotheses of Clustal. In other words, Clustal-hypothesized alignments essentially always outperformed POY-implied alignments, regardless of the parameter set used in POY. Second, our results show that, on average, the 2:2:1 cost ratio was the most accurate of the POY parameter sets investigated. Four other parameter sets (4:4:1, 4:2:1, 10:10:1, and 2:1:1) also recovered more accurate alignments than the 1:1:1 set. Therefore, although 1:1:1 is the most commonly used (often selected from an ILD test), for our data it did not produce the most accurate homologies. Related to this is the result that the 1:1:1 parameter set did not recover the most accurate topologies either (Table 4). Once again, 2:2:1 recovered more accurate topologies more often than any other cost ratio, followed by 4:4:1 and then 1:1:1. There is a fairly good correlation (r2 = –0.675) between the parameter sets that recover accurate topologies more often than other parameter sets as compared to the parameter sets from which more accurate implied alignments are produced. As an aside, 2:1:1 (gap cost of two relative to nucleotide changes) is the default setting in POY.
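The parameter space of the sensitivity analysis can be enumerated with a short sketch. It assumes that the transition cost is fixed at 1, so that the transversion cost equals the tv:ts ratio and the gap cost equals the gap:tv ratio times the transversion cost; this normalization is an assumption made for illustration and the names are hypothetical.

```python
from itertools import product

RATIOS = [1, 2, 4, 10]     # values used for both the gap:tv and tv:ts ratios
PECTINATE_DATA_SETS = 2200  # 22 pectinate model trees x 100 replicates


def cost_set(gap_tv, tv_ts):
    """Return (gap, tv, ts) costs for a pair of ratios, with ts normalized to 1."""
    ts = 1
    tv = tv_ts * ts
    gap = gap_tv * tv
    return (gap, tv, ts)


# All combinations of the two ratios, expressed as gap:tv:ts cost triples.
parameter_sets = [cost_set(g, t) for g, t in product(RATIOS, RATIOS)]
n_analyses = len(parameter_sets) * PECTINATE_DATA_SETS

print(len(parameter_sets), n_analyses)
```

Under this reading there are 16 parameter sets, including the 1:1:1, 2:1:1, 2:2:1, 4:2:1, 4:4:1, and 10:10:1 sets named in the text, and 16 × 2200 matches the 35,200 cases reported.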
We also examined the effect of alignment accuracy on topological accuracy for the POY analyses. Across all data sets, as POY TAA decreased, (HAdist – TAdist) increased from around 0 (no difference between the POY TA and POY HA) for very accurate alignments to more than 7 for very inaccurate alignments (Fig. 4, blue points and line). This trend was consistent across all tree shapes, except for the pectinate trees, where the maximum (HAdist – TAdist) values were found at POY TAA values between 0.40 and 0.60 (Fig. 5).
Discussion and Conclusion
The results clearly indicate that, under the data simulation conditions of this study, Clustal alignment (under default parameters) followed by parsimony tree building is superior to DO parsimony in POY (under a wide range of parameter sets). This is especially true for the comparisons of Clustal alignments and POY-implied alignments in relation to the true alignments, where over 99% of the time Clustal produces more accurate alignments, i.e., alignments with a higher proportion of truly homologous sites. Similar results have recently been reported for comparisons of Clustal versus the newer likelihood DO-like methods (for example, Fleissner et al., 2005). Therefore, traditional multiple sequence alignment approaches appear to vastly outperform direct optimization-like approaches in terms of alignment accuracy, at least for the data sets and parameter settings that have been examined thus far.
Even though Clustal outperforms POY in terms of alignment accuracy, we know that there are many cases where less accurate alignments will recover more accurate topologies. These results confirm that alignment accuracy is not directly tied to topological accuracy for any one specific data set, but that on average more accurate alignments do lead to more accurate topologies. Furthermore, these results indicate that for these data simulation conditions, Clustal, and most likely other multiple sequence alignment programs, with subsequent phylogenetic analysis (at least in a parsimony framework, but most likely true for others as well) will often lead to more accurate topologies than POY and possibly other direct optimization approaches (Fleissner et al., 2005). This is evident from the result that POY, on average, recovered less accurate topologies than Clustal+PAUP across nearly the entire spread of alignment accuracy (Fig. 4). Only for a small span of the most highly accurate alignments did POY outperform Clustal+PAUP. This area where POY apparently does better may just be an artifact of gap treatment, as POY treated gaps using a 1:1:1 (gap:tv:ts) ratio, which is the PAUP equivalent of treating gaps as a new or fifth state character. Our Clustal+PAUP approach treated gaps as missing. Some of our other work (Ogden and Rosenberg, 2007) indicates that treating gaps as a fifth state character is generally a better approach than treating gaps as missing data. Thus, the small area where POY recovers more accurate topologies turns out to be essentially nonexistent when we compared POY to Clustal+PAUP with gaps treated as a fifth state character instead of missing. The difference, across all tree shapes, in topological accuracy between the Clustal+PAUP cases (more accurate) and POY cases (less accurate) was even more extreme when we compared POY to fifth state gap treatment in PAUP.
Therefore, although treating gaps as missing was a more conservative approach (as well as representing the more commonly used approach to parsimony-based phylogenetic analysis), it still recovered more accurate topologies on average across all tree shapes and conditions pooled together.
Although Clustal+PAUP recovered, on average, more accurate topologies than POY across all tree shapes, POY outperformed Clustal+PAUP in some single–data set cases (as indicated by points with positive y-axis values in Fig. 4) and numerous pectinate tree shape cases (Table 2). Specifically, our data showed that the pectinate, nonrandom branch length tree shapes were, on average, more accurately reconstructed topologically in POY, even though none of the POY implied alignments were more accurate than the Clustal alignments (Tables 1 and 2, and Fig. 6). It is interesting that the pectinate, nonrandom branch length topologies were essentially the only tree shape type that, on average, resulted in more accurate topologies under POY. This may have something to do with the rooting issue described below, but it may also mean that direct optimization approaches are better at reconstructing topologies with more pectinate shapes and clocklike evolution. However, because we do not know the tree shape for empirical data beforehand, it would generally be better to use approaches that recover relationships more accurately under a variety of tree shapes instead of being biased toward a specific tree shape.
The cost ratios of the sensitivity analysis, on average, did not perform as well as the Clustal+PAUP analyses. To reiterate the example, only 3 out of the 35,200 implied alignments identified more truly homologous sites than the Clustal hypotheses of homology. So even though POY was permitted to explore the parameter cost space, it still did not produce more accurate alignments. Although only 9584 POY reconstructions across the parameter space were more accurate than Clustal HA reconstructions, the 2:2:1 was more accurate in 1113 of the possible 2200 cases. In other words, for the pectinate tree shape cases, more often than not, using a 2:2:1 parameter set in POY recovered more accurate topologies. This is in spite of the fact that there was only one case where the implied alignment actually had more homologous sites hypothesized. Another interesting trend that is apparent from the sensitivity analysis is that cases where the ratio of the gap and transversion cost is equal (2:2:1, 4:4:1, 1:1:1, and 10:10:1) resulted in the most accurate topologies and implied alignments within the POY analyses. As many research studies use congruence to select among parameters, it would be interesting to see if the ILD test actually selects the best parameter set (as measured by the set that gives the highest accuracy) in a future simulation study.
Critics may have concerns with our experimental design and conclusions that we make based on the results. One possible criticism of the current study could be that with an increased taxon sampling, POY may outperform traditional methods. In order to address this issue, we also performed 100 simulations on a 64-taxon clocklike random branch length topology. Analyses of these simulations showed that while in the 16-taxon trees only around 44% of the time Clustal+PAUP recovers more accurate topologies, in the 64-taxon cases, 93% of the time Clustal+PAUP recovers more accurate topologies. Apparently, the benefit of adding more taxa is greater in Clustal+PAUP (alignment, branch swapping, tree searching, etc.) than in POY.
Another possible criticism could be that we do not use tree length to compare among competing topologies from POY and Clustal+PAUP. The tree length measure is not reported because we do not think that tree length is an appropriate optimality criterion for different data sets (different alignments). Once you have different hypotheses of homology (different alignments), you have different data sets. By the same reasoning, you cannot compare distance, likelihood, or Bayesian measures for noncongruent or nonhomologous data sets. This is true even if only one column in the matrix is different. If the matrix is not the same, the resulting topologies should not be compared based on a tree length optimality score. POY claims to circumvent this issue because it goes directly from the sequences to the topology. However, just as it may not be appropriate to take a POY-implied alignment and then run other analyses like bootstraps or likelihood, it likewise may not be appropriate to take a Clustal alignment + PAUP tree reconstruction and run it through POY to output a tree length. In a simulation framework such as ours, what one can do is compare the hypotheses of homologous sites (alignments) to the true homologous sites (simulated data set) and one can compare the hypothesis of branching pattern (tree) to the true branching pattern (simulated tree). But one should not compare tree lengths that have been derived from different hypotheses of homologous sites; it can be done, but in our view it should not be done.
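The comparison of hypothesized homologous sites to the true homologous sites can be sketched as follows. This is a minimal Python illustration that assumes accuracy is scored as the fraction of truly homologous residue pairs that a hypothesized alignment also places in the same column; the exact scoring used in this study is not restated here, and the function names are hypothetical.

```python
# Illustrative only: fraction of true homologous residue pairs recovered by a
# hypothesized alignment. Alignments are lists of equal-length gapped strings,
# with sequences in the same order in both alignments and "-" as the gap symbol.
from itertools import combinations


def homologous_pairs(alignment):
    """Return the set of (seq_i, pos_i, seq_j, pos_j) residue pairs that share a column."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    counters = [0] * len(alignment)  # ungapped position reached in each sequence
    pairs = set()
    for column in zip(*alignment):
        present = []
        for idx, ch in enumerate(column):
            if ch != "-":
                present.append((idx, counters[idx]))
                counters[idx] += 1
        for (i, pos_i), (j, pos_j) in combinations(present, 2):
            pairs.add((i, pos_i, j, pos_j))
    return pairs


def alignment_accuracy(true_alignment, test_alignment):
    """Proportion of true homologous pairs that the test alignment also recovers."""
    true_pairs = homologous_pairs(true_alignment)
    if not true_pairs:
        raise ValueError("true alignment contains no homologous pairs")
    test_pairs = homologous_pairs(test_alignment)
    return len(true_pairs & test_pairs) / len(true_pairs)


true_aln = ["AC-GT", "ACAGT"]
test_aln = ["ACG-T", "ACAGT"]
score = alignment_accuracy(true_aln, test_aln)  # 3 of the 4 true pairs are recovered
```

Because the score is defined against the simulated (true) alignment rather than against a tree length, it can be applied equally to a Clustal alignment and to a POY-implied alignment.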
These results support the use of Clustal+PAUP or other two-step approaches as opposed to DO and POY for data sets similar to the ones we simulated. However, it is possible that for alternate types of data sets, the putatively positive aspects of POY may override the problems identified in this study. For example, our data do not include morphology (simulation of morphology is problematic in practice and theory) and we included only one partition or "gene." DO through POY may perform better on multiple partitioned data sets, particularly when simulated under differing evolutionary rates and models (Giribet, 2001; Terry and Whiting, 2005). Moreover, it would seem that the large majority of empirical analyses that have used POY contained at least one ribosomal gene (12S, 16S, 18S, or 28S). These genes are made up of conserved (stem) and unconserved (loop or expansion) regions, and it remains unclear whether POY or other combined approaches perform more accurately by combining these data in a simultaneous analysis than by aligning each individual region independently. This study can only suggest that, for the examined conditions, the demonstrated superiority of traditional methods of phylogenetic analysis over direct optimization methods is very convincing.
Aside from any biases that may exist due to the particular nature of the data sets, there may be alternative explanations for these disparate results. It is possible that the program POY, and not necessarily the theoretical framework of direct optimization-like approaches (but see Fleissner et al., 2005), has implementation limitations that bias the results. One issue that appears to be problematic is that POY searches across rooted topologies and thus requires a single specified outgroup. Most other methods of tree reconstruction search across unrooted topologies, and rooting is not required until searching has completed (if at all). For our data sets, only the pectinate tree and the random C tree contained the same single outgroup rooting in POY and in the rooted Clustal+PAUP reconstructions (Fig. 1), which is one of the reasons we chose to do the additional extensive analyses and the sensitivity analysis on the pectinate tree shape. Thus, there may be a bias in tree distance to the true topology for the other tree shapes. If we consider only the pectinate and the random C trees, Clustal alignments were always more accurate than POY-implied alignments, but POY HA reconstructions were topologically more accurate 31% of the time (1373 out of 4400 cases), whereas Clustal+PAUP reconstructions were more topologically accurate in 40% of the cases (1761 out of 4400). Still, the outgroup, by definition, is the taxon least closely related to any other taxon in the data set and most likely will also be the most evolutionarily distant. Therefore, the most basal hypothetical taxon unit (the hypothesized ancestor of the ingroup) may be difficult to align and optimize correctly against the outgroup. Of course, the evolutionary distance between the ingroup and the outgroup can be very small, but as it increases, the error in each of the hypothetical taxon units (hypothesized ancestors on each internal node) will most likely also increase. Evolutionary distance also biases Clustal alignment accuracy (Pollard et al., 2004; Rosenberg, 2005a, 2005b), but perhaps in a different way than seen in POY.
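The role of rooting in a tree-to-tree distance can be illustrated with a short Python sketch. It assumes a symmetric-difference distance on unrooted bipartitions in the spirit of Robinson and Foulds (1981); the exact distance measure used in this study is not restated here, and the nested-tuple tree representation and function names are hypothetical.

```python
# Illustrative only: symmetric-difference distance between two trees, computed
# on unrooted bipartitions. Trees are nested tuples whose leaves are taxon names.


def leaf_set(tree):
    """All taxon names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Nontrivial bipartitions, each stored as the side lacking a fixed anchor taxon."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # fixed reference taxon makes each split canonical
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if anchor in clade else clade
        # keep only splits with at least two taxa on each side
        if 2 <= len(side) <= len(all_leaves) - 2:
            found.add(side)
        return clade

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Number of bipartitions found in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))


pectinate = (((("A", "B"), "C"), "D"), "E")
rerooted = ("A", ("B", ("C", ("D", "E"))))    # same unrooted tree, different root
rearranged = (((("A", "C"), "B"), "D"), "E")  # B and C exchanged

same = rf_distance(pectinate, rerooted)
different = rf_distance(pectinate, rearranged)
```

In this sketch the rerooted tree is at distance zero from the original because bipartitions ignore root placement, whereas a distance computed on rooted clades would depend on where the root is placed.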
Despite the above issues and results, the framework of direct optimization may still be a useful way to analyze data under different data sets and/or implementations (e.g., modified parsimony, likelihood, Bayesian), but further development, exploration, and testing are required. Although our data represent a fairly simple case, for data sets similar to these the traditional two-step approach will almost always give a more accurate alignment and will most likely recover equally or more accurate phylogenetic relationships than direct optimization as implemented in POY.
There are many issues that remain to be studied concerning the performance of direct optimization methods, such as more complicated data sets and alternative frameworks (e.g., likelihood and Bayesian). This study represents the first analysis to directly compare traditional two-step phylogenetic analysis (via Clustal+PAUP) to direct optimization (via POY) in order to analyze both alignment and topological accuracy. In almost all cases (99.95%), ClustalW produced more accurate alignments than POY-implied alignments. Similarly, in 45% of the cases, Clustal alignment tree reconstructions in PAUP* were more topologically accurate than the POY tree reconstructions, which were more accurate than Clustal+PAUP in only 17% of the cases. Varied cost ratios (sensitivity analysis) in POY also performed worse, on average, than Clustal+PAUP, although within POY 2:2:1 was consistently the most accurate parameter set. Finally, the same trend (on average, as alignment accuracy increases, topological accuracy increases) found for multiple sequence alignment with tree reconstruction via neighbor joining, parsimony, likelihood, and Bayesian methods held true for direct optimization via POY. All these conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies.
| Acknowledgments |
|---|
We wish to thank G. Giribet, K. Kjer, and an anonymous reviewer for comments and suggestions on an earlier version of this manuscript. Parts of this work were supported by NIH grant R03-LM008637 (M.S.R.) and Arizona State University.
| References |
|---|
Aagesen L., Petersen G., Seberg O. Sequence length variation, indel costs, and congruence in sensitivity analysis. Cladistics (2005) 21:15–30.
Blanchette M., Green E. D., Miller W., Haussler D. Reconstructing large regions of an ancestral mammalian genome in silico. Genome Res. (2004a) 14:2412–2423.
Blanchette M., Kent W. J., Riemer C., Elnitski L., Smit A. F. A., Roskin K. M., Baertsch R., Rosenbloom K., Clawson H., Green E. D., Haussler D., Miller W. Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner. Genome Res. (2004b) 14:708–715.
Cammarano P., Creti R., Sanangelantoni A. M., Palm P. The Archaea monophyly issue: A phylogeny of translational elongation factor G(2) sequences inferred from an optimized selection of alignment positions. J. Mol. Evol. (1999) 49:524–537.
de Pinna M. C. C. Concepts and tests of homology in the cladistic paradigm. Cladistics (1991) 7:367–394.
de Queiroz K., Poe S. Philosophy and phylogenetic inference: A comparison of likelihood and parsimony methods in the context of Karl Popper's writings on corroboration. Syst. Biol. (2001) 50:305–321.
Fleissner R., Metzler D., Haeseler A. Simultaneous statistical multiple alignment and phylogeny reconstruction. Syst. Biol. (2005) 54:548–561.
Giribet G. Exploring the behavior of POY, a program for direct optimization of molecular data. Cladistics (2001) 17:S60–S70.
Giribet G. Generating implied alignments under direct optimization using POY. Cladistics (2005) 21:396–402.
Hall B. G. Comparison of the accuracies of several phylogenetic methods using protein and DNA sequences. Mol. Biol. Evol. (2005) 22:792–802.
Hasegawa M., Kishino H., Yano T. Dating the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. (1985) 22:160–174.
Hein J. A new method that simultaneously aligns and reconstructs ancestral sequences for any number of homologous sequences, when the phylogeny is given. Mol. Biol. Evol. (1989a) 6:649–668.
Hein J. A tree reconstruction method that is economical in the number of pairwise comparisons used. Mol. Biol. Evol. (1989b) 6:669–684.
Hillis D. M. Approaches for assessing phylogenetic accuracy. Syst. Biol. (1995) 44:3–16.
Hogeweg P., Hesper B. The alignment of sets of sequences and the construction of phyletic trees: An integrated method. J. Mol. Evol. (1984) 20:175–186.
Holmes I. Using guide trees to construct multiple-sequence evolutionary HMMs. Bioinformatics (2003) 19:147i–157i.
Holmes I. A probabilistic model for the evolution of RNA structure. BMC Bioinformat. (2004) 5:166.
Holmes I. Using evolutionary expectation maximization to estimate indel rates. Bioinformatics (2005) 21:2294–2300.
Holmes I., Bruno W. J. Evolutionary HMMs: A Bayesian approach to multiple alignment. Bioinformatics (2001) 17:803–820.
Huelsenbeck J. P., Rannala B. Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Syst. Biol. (2004) 53:904–913.
Hwang U. W., Kim W., Tautz D., Friedrich M. Molecular phylogenetics at the Felsenstein zone: Approaching the Strepsiptera problem using 5.8S and 28S rDNA sequences. Mol. Phylogenet. Evol. (1998) 9:470–480.
Keightley P. D., Johnson T. MCALIGN: Stochastic alignment of noncoding DNA sequences based on an evolutionary model of sequence evolution. Genome Res. (2004) 14:442–450.
Kjer K. Aligned 18S and insect phylogeny. Syst. Biol. (2004) 53:506–514.
Kjer K., Gillespie J. J., Ober K. A. Opinions on multiple sequence alignment, and an empirical comparison of repeatability and accuracy between POY and structural alignment. Syst. Biol. (2007) 56:133–156.
Kjer K. M. Use of rRNA secondary structure in phylogenetic studies to identify homologous positions: An example of alignment and data presentation from the frogs. Mol. Phylogenet. Evol. (1995) 4:314–330.
Knudsen B., Miyamoto M. M. Sequence alignments and pair hidden Markov models using evolutionary history. J. Mol. Biol. (2003) 333:453–460.
Laamanen T. R., Meier R., Miller M. A., Hille A., Wiegmann B. M. Phylogenetic analysis of Themira (Sepsidae: Diptera): Sensitivity analysis, alignment, and indel treatment in a multigene study. Cladistics (2005) 21:258–271.
Lake J. A. The order of sequence alignment can bias the selection of tree topology. Mol. Biol. Evol. (1991) 8:378–385.
Lunter G., Miklos I., Drummond A., Jensen J., Hein J. Bayesian coestimation of phylogeny and sequence alignment. BMC Bioinformat. (2005) 6:83.
Maddison W. P., Maddison D. R. Mesquite: A modular system for evolutionary analysis, version 1.05. (2004).
Mitchison G., Durbin R. Tree-based maximal likelihood substitution matrices and hidden Markov models. J. Mol. Evol. (1995) 41:1139–1151.
Mitchison G. J. A probabilistic treatment of phylogeny and sequence alignment. J. Mol. Evol. (1999) 49:11–22.
Morrison D., Ellis J. Effects of nucleotide sequence alignment on phylogeny estimation: A case study of 18S rDNAs of apicomplexa. Mol. Biol. Evol. (1997) 14:428–441.
Mugridge N. B., Morrison D. A., Jakel T., Heckeroth A. R., Tenter A. M., Johnson A. M. Effects of sequence alignment and structural domains of ribosomal DNA on phylogeny reconstruction for the protozoan family Sarcocystidae. Mol. Biol. Evol. (2000) 17:1842–1853.
Nei M. Phylogenetic analysis in molecular evolutionary genetics. Annu. Rev. Genet. (1996) 30:371–403.
Ogden T. H., Rosenberg M. S. Multiple sequence alignment accuracy and phylogenetic inference. Syst. Biol. (2006) 55:314–328.
Ogden T. H., Rosenberg M. S. How should gaps be treated in parsimony? A comparison of approaches using simulation. Mol. Phylogenet. Evol. (2007) 42:817–826.
Ogden T. H., Whiting M. The problem with "the Paleoptera Problem": Sense and sensitivity. Cladistics (2003) 19:432–442.
Ogden T. H., Whiting M. F., Wheeler W. C. Poor taxon sampling, poor character sampling, and non-repeatable analyses of a contrived dataset do not provide a more credible estimate of insect phylogeny: A reply to Kjer. Cladistics (2005) 21:295–302.
Ophir R., Graur D. Patterns and rates of indel evolution in processed pseudogenes from humans and murids. Gene (1997) 205:191–202.
Phillips A., Janies D., Wheeler W. Multiple sequence alignment in phylogenetic analysis. Mol. Phylogenet. Evol. (2000) 16:317–330.
Pollard D., Bergman C., Stoye J., Celniker S., Eisen M. Benchmarking tools for the alignment of functional noncoding DNA. BMC Bioinformat. (2004) 5:6.
Redelings B., Suchard M. Joint Bayesian estimation of alignment and phylogeny. Syst. Biol. (2005) 54:401–418.
Robinson D. F., Foulds L. R. Comparison of phylogenetic trees. Math. Biosci. (1981) 53:131–147.
Rosenberg M. S. Evolutionary distance estimation and fidelity of pair wise sequence alignment. BMC Bioinformat. (2005a) 6:102.
Rosenberg M. S. Multiple sequence alignment accuracy and evolutionary distance estimation. BMC Bioinformat. (2005b) 6:278.
Rosenberg M. S. MySSP: Non-stationary evolutionary sequence simulation, including indels. Evol. Bioinformat. Online (2005c) 1:51–53.
Rosenberg M. S., Kumar S. Heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference. Mol. Biol. Evol. (2003) 20:610–621.
Sankoff D. Minimal mutation trees of sequences. SIAM J. Appl. Math. (1975) 28:35–42.
Simmons M. P. Independence of alignment and tree search. Mol. Phylogenet. Evol. (2004) 31:874–879.
Simmons M. P., Ochoterena H. Gaps as characters in sequence-based phylogenetic analyses. Syst. Biol. (2000) 49:369–381.
Stoye J., Evers D., Meyer F. Rose: Generating sequence families. Bioinformatics (1998) 14:157–163.
Sundstrom H., Webster M. T., Ellegren H. Is the rate of insertion and deletion mutation male biased?: Molecular evolutionary analysis of avian and primate sex chromosome sequences. Genetics (2003) 164:259–268.
Swofford D. L. PAUP*: Phylogenetic analysis using parsimony (*and other methods), version 4.0b10 (2002) Sunderland, Massachusetts: Sinauer Associates.
Takahashi K., Nei M. Efficiencies of fast algorithms of phylogenetic inference under the criteria of maximum parsimony, minimum evolution, and maximum likelihood when a large number of sequences are used. Mol. Biol. Evol. (2000) 17:1251–1258.
Terry M. D., Whiting M. F. Comparison of two alignment techniques within a single complex data set: POY versus Clustal. Cladistics (2005) 21:272–281.
Thompson J. D., Higgins D. G., Gibson T. J. Clustal W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. (1994) 22:4673–4680.
Thorne J. L., Kishino H. Freeing phylogenies from artifacts of alignment. Mol. Biol. Evol. (1992) 9:1148–1162.
Thorne J. L., Kishino H., Felsenstein J. An evolutionary model for the maximum likelihood alignment of sequence evolution. J. Mol. Evol. (1991) 33:114–124.
Titus T. A., Frost D. R. Molecular homology assessment and phylogeny in the lizard family Opluridae (Squamata: Iguania). Mol. Phylogenet. Evol. (1996) 6:49–62.
Wheeler W. Optimization alignment: The end of multiple sequence alignment in phylogenetics? Cladistics (1996) 12:1–9.
Wheeler W. Homology and the optimization of DNA sequence data. Cladistics (2001) 17:S3–S11.
Wheeler W. C. Implied alignment: a synapomorphy-based multiple-sequence alignment method and its use in cladogram search. Cladistics (2003) 19:261–268.
Wheeler W. C. Dynamic homology and the likelihood criterion. Cladistics (2006) 22:157–170.
Wheeler W. C., Gladstein D., De Laet J. POY, version 3.0.11 (2003) American Museum of Natural History.
Whiting A. S., Sites J. J. W., Pellegrino K. C. M., Rodrigues M. T. Comparing alignment methods for inferring the history of the new world lizard genus Mabuya (Squamata: Scincidae). Mol. Phylogenet. Evol. (2006) 38:719–730.
Xia X., Xie Z., Kjer K. M. 18S ribosomal RNA and Tetrapod phylogeny. Syst. Biol. (2003) 52:283–295.
This article has been cited by other articles:
D. A. Morrison. Why Would Phylogeneticists Ignore Computerized Sequence Alignment? Syst Biol, March 25, 2009; (2009) syp009v1.
S. Lehtonen. Phylogeny Estimation and Alignment via POY versus Clustal + PAUP*: A Response to Ogden and Rosenberg (2007). Syst Biol, August 1, 2008; 57(4): 653–657.