
Systematic Biology 2008 57(1):160-166; doi:10.1080/10635150701884640

© 2008 Society of Systematic Biologists

Taxon Sampling Affects Inferences of Macroevolutionary Processes from Phylogenetic Trees

Edited by Rod Page, Jack Sullivan

Tracy A. Heath1, Derrick J. Zwickl1,3, Junhyong Kim2 and David M. Hillis1

1 Section of Integrative Biology and Center for Computational Biology and Bioinformatics, University of Texas at Austin, Austin, Texas 78712, USA; E-mail: tracyh{at}mail.utexas.edu (T.A.H.)
2 Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
3 National Evolutionary Synthesis Center, Durham, North Carolina 27705, USA

Received May 7, 2007; Revised July 9, 2007; Accepted September 11, 2007

Phylogenetic relationships across the Tree of Life form the basis for comparing and organizing the Earth's biodiversity. In addition to providing information about the evolution of individual genes, populations, or species, phylogenetic trees are often used to study broader evolutionary patterns. In particular, the shape of phylogenetic trees (e.g., the distribution of cladogenic events across the tree) has been used to understand broad speciation and extinction patterns (Raup et al., 1973; Gould et al., 1977; Rosen, 1978; Savage, 1983; Mitter et al., 1988; Heard, 1992; Guyer and Slowinski, 1993; Mooers and Heard, 1997; Dodd et al., 1999; Good-Avila et al., 2006; Ricklefs, 2006). The results of many studies on phylogenetic tree shape suggest that variation in the rates of speciation and extinction has played an important role in shaping the Tree of Life. However, it remains to be determined to what extent we can detect the patterns resulting from the evolutionary processes that shape trees. These patterns can be obscured by nonbiological factors that can bias tree shape, such as incomplete taxon sampling (Mooers, 1995; Rannala et al., 1998; Pybus and Harvey, 2000; Purvis and Agapow, 2002; Huelsenbeck and Lander, 2003), phylogenetic reconstruction methods (Heard and Mooers, 1996; Huelsenbeck and Kirkpatrick, 1996), or phylogenetic noise (Mooers et al., 1995; Heard and Mooers, 1996; Stam, 2002). Therefore, it is important to understand how estimates of tree shapes might be biased as a result of nonbiological factors.

Tree shape often refers to either the distribution of branching times over the tree (using measures such as the γ-statistic; Pybus and Harvey, 2000) or tree imbalance (Shao and Sokal, 1990; Kirkpatrick and Slatkin, 1993; Agapow and Purvis, 2002). Measures of tree imbalance (the focus of this study) assess the distribution of lineages over a tree topology and quantify the degree of asymmetry among the branches. These measures are often compared to the values expected under a null model of equal speciation/extinction rates over all lineages (the equal-rates Markov model or ERM model). Using a wide range of tree imbalance measures, many studies have found that published phylogenies reconstructed from empirical data are more imbalanced than predicted under the ERM model (Guyer and Slowinski, 1991; Heard, 1992; Mooers, 1995; Purvis and Agapow, 2002; Holman, 2005; Blum and François, 2006). An alternative to the ERM null model is the proportional-to-distinguishable arrangements (PDA) model (or uniform model). Under this model, every labeled tree topology is equally likely (Rosen, 1978). Trees generated under this model are on average more imbalanced than those generated under the ERM model, and studies have shown that the PDA model predicts more tree imbalance than what is observed in empirical phylogenies (Cunningham, 1995; Holman, 2005; Blum and François, 2006).
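The contrast between the two null models can be made concrete by computing, for a root with n descendant taxa, the probability of each split {i, n − i}. The following is a minimal sketch, not code from this study: it assumes the standard results that ordered split sizes are uniform under the ERM model and that the number of rooted, binary, labeled topologies on n taxa is (2n − 3)!!; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def num_rooted_trees(n):
    """Number of rooted, binary, leaf-labeled topologies on n taxa: (2n-3)!!."""
    count = 1
    for k in range(3, 2 * n - 2, 2):  # 3 * 5 * ... * (2n - 3)
        count *= k
    return count


def erm_split_probs(n):
    """P(root splits n taxa into {i, n-i}), i <= n-i, under the ERM model."""
    probs = {}
    for i in range(1, n // 2 + 1):
        # Ordered split sizes 1..n-1 are equally likely; merge i and n-i.
        ways = 1 if 2 * i == n else 2
        probs[i] = Fraction(ways, n - 1)
    return probs


def pda_split_probs(n):
    """Same split probabilities when every labeled topology is equally likely."""
    total = num_rooted_trees(n)
    probs = {}
    for i in range(1, n // 2 + 1):
        ways = comb(n, i) * num_rooted_trees(i) * num_rooted_trees(n - i)
        if 2 * i == n:
            ways //= 2  # each unordered even split was counted twice
        probs[i] = Fraction(ways, total)
    return probs


if __name__ == "__main__":
    print("ERM:", erm_split_probs(10))
    print("PDA:", pda_split_probs(10))
```

For n = 10, the most unbalanced split {1, 9} has probability 2/9 under the ERM model but 10/17 under the PDA model, which illustrates why PDA trees are on average more imbalanced.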

Numerous researchers have found that taxon sampling has a strong influence on the accuracy of phylogenetic reconstruction methods (Hendy and Penny, 1989; Hillis, 1996, 1998; Graybeal, 1998; Kim, 1998; Rannala et al., 1998; Poe and Swofford, 1999; Pollock et al., 2002; Zwickl and Hillis, 2002; Hillis et al., 2003; Poe, 2003; DeBry, 2005; Hedtke et al., 2006). Taxon sampling also has an impact on the distribution of branching times and phylogenetic tree imbalance. Removing ingroup taxa creates longer terminal and/or internal branches compared to a phylogeny containing all extant lineages (Rannala et al., 1998; Huelsenbeck and Lander, 2003). In addition to the problems this effect produces for phylogenetic inference, it also can confound estimates of diversification rates, divergence times, rates of molecular evolution, and ancestral state reconstruction (Nee et al., 1994; Robinson et al., 1998; Ackerly, 2000; Pybus and Harvey, 2000; Salisbury and Kim, 2001; Pybus et al., 2002).

Studies investigating the influence of taxon sampling on tree imbalance have primarily surveyed published phylogenies. Mooers (1995) compiled 39 "full" phylogenies (i.e., trees missing no more than one taxon, where the taxa could be species or higher taxonomic groups), each consisting of 8 to 14 terminal taxa. He compared the imbalance of the full trees to the imbalance in a collection of 82 incomplete phylogenies obtained from a study by Heard (1992). This comparison showed that incomplete trees are more imbalanced than trees composed of almost all of the members of the group in question. In another study, Purvis and Agapow (2002) collected 61 phylogenies of superspecific taxa and showed that tree imbalance is, on average, greater when the terminal taxa are higher level taxonomic units than when they are species. It has been suggested that the change in tree imbalance that results from sparse taxon sampling might be due in part to the nonrandom way in which systematists sample taxa, and that a truly random selection of taxa may not bias tree imbalance (Guyer and Slowinski, 1991; Kirkpatrick and Slatkin, 1993; Mooers, 1995; Purvis and Agapow, 2002). Heard and Mooers (2002), however, used simulated tree topologies to show that random mass extinctions caused an increase in tree imbalance after a period of recovery if the speciation and extinction rates were allowed to vary.

In this study, we investigated the influence of varying levels of random taxon sampling on phylogenetic tree imbalance. We compared the patterns of imbalance found in recently published phylogenies with very low taxon sampling to the expectations of tree imbalance under different branching models and sampling levels. We show that the observed levels of tree imbalance in empirical studies are consistent with the expectations from simulations that include variable and autocorrelated rates of speciation and extinction combined with low levels of taxon sampling.


    Methods
 
Simulations
We simulated non-ERM trees under a simple model of exponential waiting time for speciation/extinction events with variable lineage-specific speciation and extinction rates. Each tree started with a single root lineage and initial values for speciation and extinction rates. The time to the next event (lineage splitting or extinction) was drawn from an exponential distribution based on the sum of the rates for all extant lineages. The type and location of each event were chosen in proportion to the speciation and extinction rates for each of the extant lineages. When the next event resulted in extinction, the lineage was removed and a new waiting time was drawn. At a speciation event, the parent lineage bifurcated into two daughter lineages. The speciation/extinction rates of each daughter lineage were obtained by multiplying the parent rate by a random number (m). The value of m was drawn from a gamma distribution with a shape parameter (α) and rate (inverse-scale) parameter (β), where β = α so that E(m) = α/β = 1 and the rates were autocorrelated. We then enforced a gamma-distributed prior on speciation and extinction rates to discourage the rates from going to infinity or zero. Therefore, when the rate of a new daughter lineage was drawn, that rate was accepted in proportion to the gamma-distributed prior. The prior distributions on the rates were also assigned shape and scale parameters. These parameters were responsible for regulating much of the rate variation. We show by simulation that increasing the shape parameters results in a decrease in the diversification rate variation and produces more balanced topologies (Fig. 1). This model is a biologically motivated method for generating variable and autocorrelated speciation/extinction rates. 
Trees generated under this model should produce more biologically realistic tree topologies than the ERM or PDA models, because it is an empirical observation that speciation and extinction rates do vary across groups, and these rates are correlated among related species (Dial and Marzluff, 1989; Guyer and Slowinski, 1991; Heard, 1992; Sanderson and Donoghue, 1994; Savolainen et al., 2002; Holman, 2005). Our model for generating variable speciation/extinction rates is analogous to probabilistic models of the rate of molecular evolution implemented in methods used to estimate divergence times (e.g., Thorne et al., 1998; Huelsenbeck et al., 2000; Kishino et al., 2001).
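The simulation procedure described above can be sketched as follows. This is an illustrative reconstruction, not the modified Phyl-o-gen code: the acceptance step is assumed to be a simple rejection rule (accept with probability equal to the prior density divided by its maximum, redraw otherwise), the prior shape is assumed to be at least 1, and all function names, default parameter values, and the stopping rule (a target number of extant tips) are hypothetical.

```python
import math
import random


def gamma_pdf(x, shape, scale):
    """Density of a gamma distribution with the given shape and scale."""
    return (x ** (shape - 1) * math.exp(-x / scale)
            / (math.gamma(shape) * scale ** shape))


def draw_daughter_rate(parent_rate, alpha, prior_shape, prior_scale, rng):
    """Propose parent_rate * m with E(m) = 1, accept in proportion to the prior.

    Assumption: rejected proposals are simply redrawn (prior_shape >= 1).
    """
    mode = (prior_shape - 1) * prior_scale          # where the prior peaks
    max_density = gamma_pdf(mode, prior_shape, prior_scale)
    while True:
        # random.gammavariate takes (shape, scale); scale 1/alpha gives mean 1.
        m = rng.gammavariate(alpha, 1.0 / alpha)
        rate = parent_rate * m
        if rng.random() < gamma_pdf(rate, prior_shape, prior_scale) / max_density:
            return rate


def simulate_tree(n_tips, alpha=2.0, prior_shape=2.0,
                  birth_prior_scale=0.5, death_prior_scale=0.1,
                  init_birth=1.0, init_death=0.2, seed=None):
    """Grow a tree until n_tips lineages are extant or all have gone extinct.

    Returns (parent map, sorted ids of extant lineages, elapsed time).
    """
    rng = random.Random(seed)
    extant = {0: (init_birth, init_death)}   # id -> (speciation, extinction)
    parent = {0: None}
    next_id, elapsed = 1, 0.0
    while 0 < len(extant) < n_tips:
        ids = list(extant)
        totals = [sum(extant[i]) for i in ids]
        # Waiting time depends on the summed rates of all extant lineages.
        elapsed += rng.expovariate(sum(totals))
        # Pick the lineage, then the event type, in proportion to the rates.
        lineage = rng.choices(ids, weights=totals)[0]
        birth, death = extant.pop(lineage)
        if rng.random() < birth / (birth + death):
            for _ in range(2):               # speciation: two daughters
                extant[next_id] = (
                    draw_daughter_rate(birth, alpha, prior_shape,
                                       birth_prior_scale, rng),
                    draw_daughter_rate(death, alpha, prior_shape,
                                       death_prior_scale, rng),
                )
                parent[next_id] = lineage
                next_id += 1
        # Otherwise the lineage went extinct and stays removed.
    return parent, sorted(extant), elapsed


if __name__ == "__main__":
    parent, tips, elapsed = simulate_tree(50, seed=1)
    print(len(tips), "extant tips after time", round(elapsed, 3))
```

A run can end with zero extant lineages when the whole clade goes extinct; a simulation study would discard such runs and start again.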


Figure 1 The functional relationship between weighted mean imbalance and ln(node size) for four sets of trees simulated under a range of variance parameters. The parameter, alpha, of the gamma-distributed rate prior was changed for each set of simulations to 1, 3, 5, and 10 (for both the speciation rate and extinction rate). Increasing alpha decreases the amount of rate variation and, as a result, it also decreases the amount of nodal imbalance. In the case where alpha = infinity, the tree shapes should be identical to what is expected under the ERM model (equal rates Markov model; dashed line).

 
The above method for simulating tree topologies was implemented by modifying code from the program Phyl-o-gen (Rambaut, 2002; the modified code is available from the authors). We simulated sets of 500 trees each consisting of 10,000 terminal taxa under a range of parameters for the amount of rate variation. Sets of trees simulated across the range of parameter values showed very similar patterns of imbalance (Fig. 1). We also generated trees under constant speciation and extinction rates (ERM model) and the proportional-to-distinguishable arrangements (PDA) model.

Empirical Phylogenies
We collected trees from recently published studies of empirical data (online Appendix 1; http://systematicbiology.org). When surveying the literature, we selected trees from studies if their analyses included molecular data and used maximum likelihood, Bayesian, and/or maximum parsimony methods to infer the tree. When a study presented trees estimated using more than one data partition we selected the tree based on the combined analysis. When we encountered more than one study on a particular taxonomic group, we selected the most recently published tree. The trees in our collection of published phylogenies were then pruned of redundant species, and outgroups were removed so as not to increase the tree imbalance but retain the root position. Unlike previous studies using published phylogenies (Mooers, 1995; Purvis and Agapow, 2002; Holman, 2005), we only used trees with species as terminal taxa so that we could directly calculate the amount of species-level sampling and avoid subjective aspects of higher level taxonomic grouping. We determined the proportion of taxon sampling based on the number of described species in the group. Our estimates of the proportions of taxon sampling are necessarily dependent on the monophyly of the sampled groups and undiscovered biodiversity, but the overall results do not depend on the exact value of the sampled proportions. We then sorted the empirical phylogenies based on the proportion of taxon sampling and the method used to reconstruct the tree. In this study, we only present the imbalance of phylogenies with sampling densities lower than 10% because our collection of published studies contained relatively few trees with more complete species sampling.

Measure of Imbalance
We calculated the imbalance of simulated and empirical topologies using the imbalance measure first introduced by Fusco and Cronk (1995) and later modified by Purvis et al. (2002). Fusco and Cronk (1995) imbalance is calculated for an individual node such that


\[ I = \frac{B - m}{S - m - 1} \]

where for a given node with S extant descendants, B is the number of terminal taxa descended from the larger daughter lineage and m = S/2 (rounded up to the next integer if S is odd). For any node with more than three descendants, I has a maximum value of 1 for a node that is completely imbalanced (B = S − 1), and a minimum value of 0 for a node where each daughter lineage has the same number of descendants (or differing by 1 if S is odd). One property of this imbalance measure is that the expected value of I under the ERM model depends on whether S is even or odd (Purvis et al., 2002). Therefore, Purvis et al. (2002) introduced a set of weights (w) to calculate an expected weighted mean of I (Iw) so that the measure has an expected value of 0.5 for all node sizes under equal rates:
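As a small worked example, the nodal imbalance can be computed directly from S and B. The sketch below assumes the Fusco and Cronk (1995) form I = (B − m)/(S − m − 1), which is consistent with the stated maximum (B = S − 1 gives I = 1) and minimum (B = m gives I = 0); the function name is illustrative.

```python
def node_imbalance(S, B):
    """Fusco and Cronk imbalance for a bifurcating node.

    S: number of extant descendants of the node (must exceed 3).
    B: number of descendants of the larger daughter lineage.
    """
    if S <= 3:
        raise ValueError("I is only defined for nodes with more than three descendants")
    m = (S + 1) // 2                      # S/2 rounded up when S is odd
    if not m <= B <= S - 1:
        raise ValueError("B must lie between ceil(S/2) and S - 1")
    return (B - m) / (S - m - 1)


if __name__ == "__main__":
    print(node_imbalance(10, 9))   # completely imbalanced node
    print(node_imbalance(10, 5))   # perfectly balanced node
    print(node_imbalance(10, 7))   # intermediate split 7:3
```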


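\[ w = \begin{cases} 1 & \text{if } S \text{ is odd} \\ (S - 1)/S & \text{if } S \text{ is even and } I > 0 \\ 2(S - 1)/S & \text{if } S \text{ is even and } I = 0 \end{cases} \]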

For a single node, Iw is the product of I and w divided by the mean of the node weights across the entire tree (Purvis et al., 2002; Purvis and Agapow, 2002). Using these weights, the imbalance for a collection of nodes can also be measured by calculating the weighted mean of I (Purvis et al., 2002; Holman, 2005).
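The weighting can be checked numerically. The sketch below assumes the weights take the form w = 1 for odd S, w = (S − 1)/S for even S with I > 0, and w = 2(S − 1)/S for even S with I = 0 (our reading of Purvis et al., 2002), and it assumes that ordered split sizes are uniform under the ERM model. With exact rational arithmetic, the weighted expectation of I is then 1/2 for every node size tested; function names are illustrative.

```python
from fractions import Fraction


def imbalance(S, B):
    """Fusco and Cronk I = (B - m) / (S - m - 1), with m = ceil(S / 2)."""
    m = (S + 1) // 2
    return Fraction(B - m, S - m - 1)


def weight(S, B):
    """Node weight w (assumed form of the Purvis et al. correction)."""
    if S % 2 == 1:
        return Fraction(1)
    if imbalance(S, B) > 0:
        return Fraction(S - 1, S)
    return Fraction(2 * (S - 1), S)


def weighted_mean_imbalance(nodes):
    """Weighted mean of I over an iterable of (S, B) pairs."""
    nodes = list(nodes)
    total_weight = sum(weight(S, B) for S, B in nodes)
    return sum(weight(S, B) * imbalance(S, B) for S, B in nodes) / total_weight


def erm_expected_weighted_imbalance(S):
    """Exact weighted expectation of I for a node of size S under ERM."""
    # Under ERM every ordered split (left, S - left) is equally likely.
    splits = [(S, max(left, S - left)) for left in range(1, S)]
    return weighted_mean_imbalance(splits)


if __name__ == "__main__":
    for S in (4, 5, 10, 11):
        print(S, erm_expected_weighted_imbalance(S))
```

Dividing the summed products w·I by the summed weights is equivalent to averaging the single-node values Iw, because each Iw is w·I divided by the mean weight.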

Unlike many other measures of tree imbalance (for examples see Agapow and Purvis, 2002), Iw does not require fully resolved topologies (because the imbalance at multifurcating nodes is not measured), nor is it dependent on the size of the tree. Additionally, Iw can be used to evaluate the imbalance of a collection of trees to assess the relationship between imbalance and node size (Holman, 2005) and compare unique sets of trees to detect differences in macroevolutionary patterns (assuming that there is homogeneity across a set of trees). For each set of trees, the bifurcating nodes with more than three descendants were binned according to the natural log of node size, ln(S) in intervals of 0.5, and the weighted mean imbalance for the nodes in each bin was calculated (see Holman, 2005). Although this measure of imbalance was developed for complete trees, or phylogenies of higher level taxonomic groups incorporating species richness data, in this study, we use Iw to determine the impact of reduced species sampling by comparing the imbalance of complete trees with that of incomplete trees.
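The binning step can be sketched as below. This is an illustration rather than the code used in the study: it assumes bin edges fall at multiples of 0.5 on the ln(S) axis (the text specifies only the interval width), labels each bin by its lower edge, and uses the same assumed weights as Purvis et al. (2002); the function name and the (S, B) input format are hypothetical.

```python
import math
from collections import defaultdict


def binned_weighted_imbalance(nodes, width=0.5):
    """Weighted mean imbalance per ln(S) bin.

    nodes: iterable of (S, B) pairs for bifurcating nodes, where S is the
           node size and B the size of the larger daughter clade.
    Returns {lower edge of ln(S) bin: weighted mean of I}.
    """
    sums = defaultdict(lambda: [0.0, 0.0])     # bin index -> [sum w*I, sum w]
    for S, B in nodes:
        if S <= 3:
            continue                           # I is undefined for small nodes
        m = (S + 1) // 2
        I = (B - m) / (S - m - 1)
        if S % 2 == 1:
            w = 1.0
        elif I > 0:
            w = (S - 1) / S
        else:
            w = 2 * (S - 1) / S
        index = math.floor(math.log(S) / width)
        sums[index][0] += w * I
        sums[index][1] += w
    return {index * width: wi / w for index, (wi, w) in sorted(sums.items())}


if __name__ == "__main__":
    example = [(4, 3), (4, 2), (10, 9), (10, 5), (3, 2)]
    print(binned_weighted_imbalance(example))
```

Pooling the (S, B) pairs from every tree in a set before binning gives one imbalance curve per set of trees, which is how curves for different sets can be compared.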


    Results and Discussion
 
The Effect of Node Size on Tree Imbalance
The nodal weighted mean imbalance for the empirical trees is summarized in Fig. 2. We observed a pattern of imbalance in empirical trees similar to that reported by Holman (2005), with imbalance increasing as node size increases. A recent study by McPeek and Brown (2007) offers a plausible biological explanation for this positive correlation between node size and imbalance. They observed that clade size increases with clade age; therefore, larger nodes are typically older nodes and their descendant lineages have had more time to experience the pressures that may cause shifts in diversification rates. This implies that there is also a positive association between node age and imbalance.


Figure 2 The weighted mean imbalance of empirical trees plotted as a function of the natural log of the node size (S). The dashed line at 0.5 indicates the imbalance expected under the ERM model. One hundred and twenty-four trees reconstructed using maximum parsimony (MP) are indicated by the dotted line with black triangles and 107 trees reconstructed by maximum likelihood or Bayesian methods (ML/B) are represented by the solid line and white triangles.

 
For nodes with fewer than 140 descendants, we did not detect a significant difference in the pattern of imbalance between trees reconstructed under maximum parsimony versus those reconstructed using parametric methods (Fig. 2). Although there appear to be somewhat greater differences in the imbalance at larger nodes, these differences are largely attributable to the smaller number of observations in those categories. Therefore, we combined the trees into a single set of empirical phylogenies for our subsequent analyses. When combining the trees, if a single paper presented both a parsimony tree and a maximum likelihood or Bayesian tree, we selected the tree at random. This combined collection of trees consisted of 77 parsimony trees and 78 maximum likelihood/Bayesian trees.

Figure 3 shows the weighted mean imbalance of our combined collection of empirical trees and a set of trees simulated under our model of varying speciation and extinction rates (where α = 2 for the gamma-distributed rate priors for both speciation and extinction rates). We also show the imbalance expected under the ERM and PDA models. Although we used a different collection of empirical trees than used in previous studies (Purvis and Agapow, 2002; Holman, 2005; Blum and François, 2006), our results are similar to those found by Holman (2005) and Blum and François (2006). Specifically, the PDA and ERM models do not adequately represent the imbalance found in empirical phylogenies (Fig. 3). The trees simulated under our model of speciation and extinction rate variation, however, have nodal imbalance that is more representative of empirical phylogenies than the ERM model and are much less imbalanced than trees generated under the PDA model. As with the empirical observations of McPeek and Brown (2007), trees generated under our model show a positive association between node size and node age, as well as a positive correlation between node age and imbalance.


Figure 3 The nodal imbalance for the combined collection of empirical trees (triangles; 157 total trees) and the collection of trees simulated under varying rates of speciation and extinction (circles). The upper dotted line represents the imbalance expected for trees generated under the PDA model (proportional-to-distinguishable arrangements model) and the dashed line at 0.5 indicates the imbalance expected under the ERM model.

 
The Effect of Reduced Taxon Sampling on Tree Imbalance
Unlike some of the previous surveys of tree imbalance (Mooers, 1995; Purvis and Agapow, 2002; Holman, 2005), our collection of empirical trees all had low percentages of sampled taxa because we treated the tips as individual species instead of considering higher taxonomic rank with species richness information. The empirical trees presented in this study all had less than 10% of the described species represented in the phylogeny (with a median of ~ 2%). When we randomly pruned taxa from the trees simulated with variable and autocorrelated speciation/extinction rates, we observed an increase in nodal imbalance and a very good approximation of the imbalance found in the empirical trees (Fig. 4). In contrast, we show that for trees simulated under the ERM and PDA models, random taxon sampling does not alter the functional relationship between imbalance and node size (Fig. 5). This result was also demonstrated by Heard and Mooers (2002), who showed that random mass extinctions of ERM topologies did not affect tree imbalance after a period of recovery under constant diversification rates.


Figure 4 Weighted mean imbalance for empirical trees (dotted line/triangles) and trees simulated under varying rates with different levels of taxon sampling (solid line/circles). The simulated trees were reduced to 3% and 1% taxon sampling. The dashed line at 0.5 indicates the imbalance expected for trees generated under the ERM model.

 


Figure 5 Weighted mean imbalance as a function of the natural log of the node size for trees simulated under the PDA model (black), the ERM model (black), and variable rates model (gray). The sets of trees with 100% taxon sampling are indicated by dashed lines. Sets of trees with 3% taxon sampling are represented by the solid lines. These simulations indicate that random taxon sampling of trees generated either by the PDA model or the ERM model does not result in a change in the relationship between imbalance and node size, whereas there is a strong taxon-sampling effect for the variable rates model.

 
We randomly pruned 50% of the taxa from trees in our combined set of empirical phylogenies to determine whether or not an additional reduction in taxon sampling would increase the imbalance in empirical phylogenies (Fig. 6). The results shown in Fig. 6 are from 100 replicates of randomized pruning and suggest that, on average, random reduced taxon sampling does indeed increase the imbalance in these trees.
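Random pruning of this kind can be sketched on a simple tree representation. The example below is illustrative only: it assumes trees are stored as nested tuples with string leaf labels, suppresses any internal node left with a single child after pruning, and returns the (S, B) pairs needed for the imbalance calculation; all names are hypothetical.

```python
import random


def leaves(tree):
    """List the leaf labels of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def prune(tree, keep):
    """Return the tree induced by the leaves in `keep` (None if none remain)."""
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    kept = [c for c in (prune(child, keep) for child in tree) if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]          # suppress a node left with a single child
    return tuple(kept)


def random_prune(tree, fraction_removed, rng):
    """Remove a random fraction of the terminal taxa."""
    tips = leaves(tree)
    n_keep = len(tips) - round(fraction_removed * len(tips))
    return prune(tree, set(rng.sample(tips, n_keep)))


def node_sizes(tree):
    """(S, B) for every bifurcating node with more than three descendants."""
    out = []

    def visit(node):
        if not isinstance(node, tuple):
            return 1
        sizes = [visit(child) for child in node]
        S = sum(sizes)
        if len(sizes) == 2 and S > 3:
            out.append((S, max(sizes)))
        return S

    visit(tree)
    return out


if __name__ == "__main__":
    tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
    rng = random.Random(0)
    # Repeat the pruning to average over replicates, as in the text.
    replicates = [node_sizes(random_prune(tree, 0.5, rng)) for _ in range(100)]
    print(replicates[:3])
```

The recursive helpers are adequate for small examples; very deep, highly imbalanced trees would need an iterative traversal or a larger recursion limit.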


Figure 6 The weighted mean imbalance of empirical trees with reduced taxon sampling. The imbalance of the published phylogenies without a reduction in taxon sampling is represented by the solid line. The dotted line indicates the same set of trees with a 50% reduction in taxa averaged over 100 replicates with standard error bars. The dashed line at 0.5 indicates the imbalance expected under the ERM model.

 
Our results indicate that incomplete taxon sampling in the presence of diversification rate variation may be sufficient to explain much of the imbalance observed in our collection of empirical trees, because as species are removed from a phylogeny, the apparent variation in the rates of diversification is increased. Our simulations show that older nodes are, on average, more imbalanced than younger nodes. Therefore, pruning taxa from these trees results in an increase in the average age of the internal nodes and, additionally, removal of terminal branches increases the average imbalance for nodes of a given size. However, it remains unclear exactly how much reduced taxon sampling biases tree imbalance. The published phylogenies used in this study most likely do not contain random samples of taxa, so it is difficult to determine the relative influence of biased taxon sampling versus random sampling on tree imbalance. Because so many factors influence whether or not a species is included, it is difficult to emulate the way in which systematists sample taxa. Using a simple model of biased taxon sampling, however, Mooers (1995) was able to show that nonrandom exclusion of terminal lineages can increase the imbalance of ERM trees. More investigation into the impact of biased taxon omission on phylogenetic tree shape and tree reconstruction is required.

When incomplete species sampling is taken into account, the model for varying speciation and extinction rates presented in this paper is a better representation of the tree shapes observed in published phylogenies than the ERM model or the PDA model. However, it is a parametric, stochastic model and not based on detailed biological processes. Our model does not attempt to capture all of the biological and environmental factors by which diversification rates vary over the course of evolution. Although the specific values of parameters in our model can be adjusted to produce varying levels of tree imbalance (Fig. 1), the general conclusions of our simulations remain consistent across a wide range of parameter values. Our simulations demonstrate that it is important to consider the interaction between diversification rate variation and reduced taxon sampling when assessing the shapes of empirical phylogenies (Fig. 4). Inferences of macroevolutionary processes based on incomplete phylogenies should be interpreted with caution and, when available, information on species diversity should be included in the calculation of Iw (Fusco and Cronk, 1995). This may result in a less biased estimate of tree imbalance even without relatively complete taxon sampling.


    Conclusions
 
Variation in the relative rates of speciation and extinction produces tree topologies with greater imbalance than trees generated under the equal rates model (Fig. 3). Removal of taxa from trees generated under variable and autocorrelated rates results in a disproportionate representation of older divergences and increases the apparent variation in diversification rates among the lineages on the tree. Consequently, reduced taxon sampling causes an increase in tree imbalance (Fig. 4), which, in turn, may mislead analyses using tree shape to detect shifts in diversification rates.

It is also important to note that there are other non-biological factors that can contribute to imbalance in empirical phylogenies. Methods of phylogenetic reconstruction have been shown to be biased toward imbalanced trees (Huelsenbeck and Kirkpatrick, 1996), at least for trees of few taxa. Additionally, incorrect rooting of the tree can result in a more imbalanced topology. These factors may make it very difficult to tease apart the biological processes that contribute to tree imbalance.

It will be important to understand and account for these nonbiological contributors to tree imbalance if tree shape is to be used to study large-scale patterns of diversification. However, it is clear that in addition to producing more accurate estimates of phylogenetic relationships, increased taxon sampling also improves inferences about macroevolutionary events based on phylogenetic tree shape. As more complex and realistic models of diversification rate variation are developed, we will improve our understanding of the macroevolutionary forces that shape the Tree of Life. In addition, as phylogenetic reconstruction programs become capable of handling larger data sets (e.g., Stamatakis, 2006; Zwickl, 2006), models of complex branching processes can be used to generate model tree topologies for large-scale simulation studies on these new algorithms.


    Acknowledgments
 
We thank Vincent Savolainen, Rod Page, Arne Mooers, and an anonymous reviewer as well as Mike Steel, members of the CIPRES project, the UT-IGERT discussion group, and members of the Hillis/Bull/Cannatella lab groups for helpful comments and advice. Financial support for this study was provided by the National Science Foundation (NSF EF 0331453 to the University of Texas and NSF EF 0331654 to the University of New Mexico). T.A.H. was funded by a graduate research traineeship provided by an NSF IGERT grant in Computational Phylogenetics and Applications to Biology awarded to the University of Texas, Austin. Computational resources were provided by the Texas Advanced Computing Center (TACC) at the University of Texas at Austin (http://www.tacc.utexas.edu).


    References
 

    Ackerly D. D. Taxon sampling, correlated evolution, and independent contrasts. Evolution (2000) 54:1480–1492.[Web of Science][Medline]

    Agapow P. M., Purvis A. Power of eight tree shape statistics to detect nonrandom diversification: A comparison by simulation of two models of cladogenesis. Syst. Biol. (2002) 51:866–872.[Abstract/Free Full Text]

    Blum M. G. B., Francios O. Which random processes describe the Tree of Life? A large-scale study of phylogenetic tree imbalance. Syst. Biol. (2006) 55:685–691.[Free Full Text]

    Cunningham S. A. Problems with null models in the study of phylogenetic radiation. Evolution (1995) 49:1292–1294.[CrossRef][Web of Science]

    DeBry R. W. The systematic component of phylogenetic error as a function of taxonomic sampling under parsimony. Syst. Biol. (2005) 54:432–440.[Abstract/Free Full Text]

    Dial K. P., Marzluff J. M. Nonrandom diversification within taxonomic assemblages. Syst. Zool. (1989) 38:26–37.[Abstract/Free Full Text]

    Dodd M. E., Silvertown J., Chase M. W. Phylogenetic analysis of trait evolution and species diversity variation among angiosperm families. Evolution (1999) 53:732–744.[CrossRef][Web of Science]

    Fusco G., Cronk Q. C. B. A new method for evaluating the shape of large phylogenies. J. Theor. Biol. (1995) 175:235–243.[CrossRef][Web of Science]

    Good-Avila S. V., Souza V., Gaut B. S., Eguiarte L. E. Timing and rate of speciation in Agave (Agavaceae). Proc. Natl. Acad. Sci. USA (2006) 103:9124–9129.[Abstract/Free Full Text]

    Gould S. J., Raup D. M., Sepowski J. J., Schopf T. J. M. The shape of evolution: A comparison of real and random clades. Paleobiology (1977) 3:23–40.[Abstract]

    Graybeal A. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst. Biol. (1998) 47:9–17.

    Guyer C., Slowinski J. B. Comparisons of observed phylogenetic topologies with null expectations among 3 monophyletic lineages. Evolution (1991) 45:340–350.[CrossRef][Web of Science]

    Guyer C., Slowinski J. B. Adaptive radiation and the topology of large phylogenies. Evolution (1993) 47:253–263.[CrossRef][Web of Science]

    Heard S. B. Patterns in tree balance among cladistic, phenetic, and randomly generated phylogenetic trees. Evolution (1992) 46:1818–1826.[CrossRef][Web of Science]

    Heard S. B., Mooers A. O. Imperfect information and the balance of cladograms and phenograms. Syst. Biol. (1996) 45:115–118.[Free Full Text]

    Heard S. B., Mooers A. O. Signatures of random and selective mass extinctions in phylogenetic tree balance. Syst. Biol. (2002) 51:889–897.[Abstract/Free Full Text]

    Hedtke S. M., Townsend T. M., Hillis D. M. Resolution of phylogenetic conflict in large data sets by increased taxon sampling. Syst. Biol. (2006) 55:522–529.[Free Full Text]

    Hendy M. D., Penny D. A framework for the quantitative study of evolutionary trees. Syst. Zool. (1989) 38:297–309.[Abstract/Free Full Text]

    Hillis D. M. Inferring complex phylogenies. Nature (1996) 383:130–131.[CrossRef][Medline]

    Hillis D. M. Taxonomic sampling, phylogenetic accuracy, and investigator bias. Syst. Biol. (1998) 47:3–8.[Free Full Text]

    Hillis D. M., Pollock D. D., McGuire J. A., Zwickl D. J. Is sparse taxon sampling a problem for phylogenetic inference? Syst. Biol. (2003) 52:124–126.

    Holman E. W. Nodes in phylogenetic trees: The relation between imbalance and number of descendent species. Syst. Biol. (2005) 54:895–899.

    Huelsenbeck J. P., Kirkpatrick M. Do phylogenetic methods produce trees with biased shapes? Evolution (1996) 50:1418–1424.

    Huelsenbeck J. P., Lander K. M. Frequent inconsistency of parsimony under a simple model of cladogenesis. Syst. Biol. (2003) 52:641–648.

    Huelsenbeck J. P., Larget B., Swofford D. A compound Poisson process for relaxing the molecular clock. Genetics (2000) 154:1879–1892.

    Kim J. Large-scale phylogenies and measuring the performance of phylogenetic estimators. Syst. Biol. (1998) 47:43–60.

    Kirkpatrick M., Slatkin M. Searching for evolutionary patterns in the shape of a phylogenetic tree. Evolution (1993) 47:1171–1181.

    Kishino H., Thorne J. L., Bruno W. J. Performance of a divergence time estimation method under a probabilistic model of rate evolution. Mol. Biol. Evol. (2001) 18:352–361.

    McPeek M. A., Brown J. M. Clade age and not diversification rate explains species richness among animal taxa. Am. Nat. (2007) 169:E97–E106.

    Mitter C., Farrell B., Wiegmann B. The phylogenetic study of adaptive zones: Has phytophagy promoted insect diversification? Am. Nat. (1988) 132:107–128.

    Mooers A. O. Tree balance and tree completeness. Evolution (1995) 49:379–384.

    Mooers A. O., Heard S. B. Evolutionary process from phylogenetic tree shape. Q. Rev. Biol. (1997) 72:31–54.

    Mooers A. O., Page R. D. M., Purvis A., Harvey P. H. Phylogenetic noise leads to unbalanced cladistic tree reconstructions. Syst. Biol. (1995) 44:332–342.

    Nee S., Holmes E. C., May R. M., Harvey P. H. Extinction rates can be estimated from molecular phylogenies. Philos. Trans. R. Soc. Lond. B Biol. Sci. (1994) 344:77–82.

    Poe S. Evaluation of the strategy of long-branch subdivision to improve the accuracy of phylogenetic methods. Syst. Biol. (2003) 52:423–428.

    Poe S., Swofford D. L. Taxon sampling revisited. Nature (1999) 398:299–300.

    Pollock D. D., Zwickl D. J., McGuire J. A., Hillis D. M. Increased taxon sampling is advantageous for phylogenetic inference. Syst. Biol. (2002) 51:664–671.

    Purvis A., Agapow P. M. Phylogeny imbalance: Taxonomic level matters. Syst. Biol. (2002) 51:844–854.

    Purvis A., Katzourakis A., Agapow P. M. Evaluating phylogenetic tree shape: Two modifications to Fusco & Cronk's method. J. Theor. Biol. (2002) 214:99–103.

    Pybus O. G., Harvey P. H. Testing macro-evolutionary models using incomplete molecular phylogenies. Proc. Roy. Soc. B (2000) 267:2267–2272.

    Pybus O. G., Rambaut A., Holmes E. C., Harvey P. H. New inferences from tree shape: Numbers of missing taxa and population growth rates. Syst. Biol. (2002) 51:881–888.

    Rambaut A. Phyl-o-gen: Phylogenetic tree simulator package v1.1 (2002) http://evolve.zoo.ox.ac.uk/software.html?id=phylogen.

    Rannala B., Huelsenbeck J. P., Yang Z., Nielsen R. Taxon sampling and the accuracy of large phylogenies. Syst. Biol. (1998) 47:702–710.

    Raup D. M., Gould S. J., Schopf T. J. M., Simberloff D. S. Stochastic models of phylogeny and the evolution of diversity. J. Geol. (1973) 81:525–542.

    Ricklefs R. E. Global variation in the diversification rate of passerine birds. Ecology (2006) 87:2468–2478.

    Robinson M., Gouy M., Gautier C., Mouchiroud D. Sensitivity of the relative-rate test to taxonomic sampling. Mol. Biol. Evol. (1998) 15:1091–1098.

    Rosen D. E. Vicariant patterns and historical explanation in biogeography. Syst. Zool. (1978) 27:159–188.

    Salisbury B. A., Kim J. Ancestral state estimation and taxon sampling density. Syst. Biol. (2001) 50:557–564.

    Sanderson M. J., Donoghue M. J. Shifts in diversification rate with the origin of angiosperms. Science (1994) 264:1590–1593.

    Savage H. M. The shape of evolution: Systematic tree topology. Biol. J. Linn. Soc. (1983) 20:225–244.

    Savolainen V., Heard S. B., Powell M. P., Davies T. J., Mooers A. O. Is cladogenesis heritable? Syst. Biol. (2002) 51:835–843.

    Shao K. T., Sokal R. R. Tree balance. Syst. Zool. (1990) 39:266–276.

    Stam E. Does imbalance in phylogenies reflect only bias? Evolution (2002) 56:1292–1295.

    Stamatakis A. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics (2006) 22:2688–2690.

    Thorne J. L., Kishino H., Painter I. S. Estimating the rate of evolution of the rate of molecular evolution. Mol. Biol. Evol. (1998) 15:1647–1657.

    Zwickl D. J. Genetic algorithm approaches for the phylogenetic analyses of large biological sequence datasets under the maximum likelihood criterion (2006) The University of Texas at Austin. Ph.D. dissertation. Available at www.bio.utexas.edu/faculty/antisense/garli/Garli.html.

    Zwickl D. J., Hillis D. M. Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. (2002) 51:588–598.




