© 2008 Society of Systematic Biologists
The Taming of the Skew: Estimating Proper Confidence Intervals for Divergence Dates
Edited by Tim Collins
1 The College of Staten Island/CUNY, Department of Biology 6S-143, 2800 Victory Boulevard, Staten Island, NY 10314, USA; Email: burbrink{at}mail.csi.cuny.edu (F.T.B.); rpyron{at}gc.cuny.edu (R.A.P.)
2 The Graduate Center, CUNY, Biology Program 365 Fifth Avenue, New York, NY 10016
Received September 12, 2007; Revised November 9, 2007; Accepted December 20, 2007 Estimating divergence dates from molecular phylogenies permits researchers to test evolutionary hypotheses that cannot be explored using the fossil record alone. For example, estimating the age of nodes on a tree aids in testing the possibility of dispersal versus vicariance, assessing the origin and survival of groups across major geological time periods, and inferring times of divergence and rates of molecular evolution (Barker et al., 2004; Simmons, 2005; Teeling et al., 2005; Bossuyt et al., 2006), hypotheses that could not be credibly examined without statistically sound methods of date inference using DNA sequences and phylogenetic trees. The formal and strict definition of rate constancy and a molecular clock as proposed by Zuckerkandl and Pauling (1962) may not be valid in many situations and across a broad spectrum of taxa and genes (Wu and Li, 1985; Britten, 1986; Sanderson, 2002; Thorne and Kishino, 2006). Therefore, methods that can infer divergence dates utilizing relaxed-clock algorithms have grown in popularity. Two leading methods of statistical inference are commonly applied that do not require clock-like behavior from the data; relaxed clocks using Bayesian inference (BI) or penalized likelihood (PL; Sanderson, 2002, 2003; Thorne et al., 1998; Thorne and Kishino, 2002). Though BI methods have some measures of error estimation built into the date inference, PL methods lack a simple protocol for assessing uncertainty in the estimation of divergence dates.
For any method of statistical inference, it is necessary to have some measure of error in order to assess confidence. This is particularly important for divergence date estimation, where calculated uncertainty may be very large, and thus include the null hypothesis. There are three primary methods of calculating confidence intervals for a likelihood function: (i) the likelihood profile, which summarizes the likelihood shape in terms of the curvature of the likelihood surface at the maximum likelihood estimate (MLE) using Fisher Information; (ii) the bootstrap, which produces a frequency distribution of samples with the shape of the likelihood function; and (iii) the Bayesian credible interval, which combines maximum likelihood estimation with prior information on parameters (Clark, 2007). These methods may all be used to produce error estimates for divergence dates using PL (first two methods) or BI (last method).
The first of the two widely used methods of divergence dating is Bayesian inference with a relaxed clock, implemented in the programs MultiDivTime (Thorne and Kishino, 2002) and BEAST (Drummond and Rambaut, 2003). Bayesian methods without an expectation of constant substitution rates use a stochastic model for inferring rates of change along a branch while applying calibration reference priors that are external to the topology and sequence data (see Thorne and Kishino, 2006, for a detailed description of these methods). All Bayesian methods have error and credibility assessment built into the Markov chain Monte Carlo function. Therefore, the estimation of credible intervals is built into the method.
The other commonly used method is penalized likelihood (as implemented in the program r8s; Sanderson, 2003, 2004), which maximizes the sequence data set (X) on a combination of average rates (R) and time (T) with a penalty function to discourage rate change. This approximation of divergence dates in the absence of a molecular clock with minimum and maximum calibration references obtained from fossil or geological information follows the form:
|
| (1) |
is a smoothing parameter intended to control among-lineage rate variation and
(R) is a roughness penalty penalizing rate heterogeneity. Together, these terms comprise the penalty criterion term, which maximizes the penalized likelihood. To assess uncertainty in these divergence dates, branch lengths (BLs) are nonparametrically bootstrapped and a distribution of dates and rates of evolution is produced using Equation (1) for all BL pseudoreplicates (Clement et al., 2004; Yesson and Culham, 2006; Thorne and Kishino, 2006). Using a method introduced by Efron (1979), Felsenstein (1985) developed an application of the nonparametric bootstrap to systematics for estimating statistical confidence for nodes on a phylogenetic tree or, conversely, the uncertainty associated with a given phylogenetic inference. In the spirit of Efron (1979) and Felsenstein (1985), Sanderson (2004) suggests a method by which nonparametric bootstrapping can be employed to examine error in divergence time estimation. As in Felsenstein (1985), this methodology requires that a large number of pseudoreplicated data sets be constructed, with replacement, from the original DNA data matrix. However, the topology of interest and the model of evolution used to infer the tree are enforced as inviolable constraints, leaving only variation in branch length among all pseudoreplicated topologies. In this manner, a distribution of branch lengths is obtained from these topologies, which theoretically represents the uncertainty in the estimation of expected branch length of the tree given the topology and original sequence alignment. These pseudoreplicates with variable branch lengths are used to obtain a distribution of inferred dates calculated by penalized likelihood (Sanderson 2002, 2003) from which estimates of uncertainty and confidence intervals may be inferred (but see Thorne and Kishino, 2006). In the case of divergence date estimation, the bootstrap distribution may not actually contain the true parameter of interest (the date) because incorrect inferences can be made at several stages in the processes of producing a date, including inference of tree topology, branch length, the placement of calibration references, and fossil dating (Sanderson and Doyle, 2001; Magallón and Sanderson, 2001; Pereira and Baker, 2006). In spite of these possible errors, statistics using measures of central tendency are typically applied to produce confidence intervals from the distribution of nonparametric bootstrap dates.
Since Efron (1979) developed the nonparametric bootstrap to address uncertainty in the estimation of a parameter to avoid making incorrect inferences, a number of methods have been produced to examine confidence intervals for bootstraps. These methods are particularly crucial when the true distribution of a value is unknown and provide confidence intervals around a MLE. It is expected given these confidence intervals from a bootstrap distribution that the parameter of interest is found in the interval between two numbers with an accompanying probability. Again, this may or may not contain the true parameter given putatively incorrect inferences (see above) made to produce the pseudoreplicates. Two major categories have been developed to infer confidence intervals from a bootstrap distribution: pivotal and nonpivotal methods. The pivotal methods calculate quantiles (usually 95%) on bootstrap distributions that replace those of Gaussian or Student's t distributions (Efron, 1981; Hall, 1992; Efron and Tibshirani, 1993; Carpenter and Bithell, 2000). Therefore, estimation of confidence intervals using asymptotical pivotal methods does not require knowledge of µ or
2. The family of nonpivotal confidence intervals are more complicated and require knowledge of these parameters from the original sample.
Estimation of confidence intervals for PL divergence dates in most studies use the normal standard deviation of the bootstrap samples, which is in fact the standard error of the census population (Barker et al., 2004; Clement et al., 2004). This measure of population mean error is known with certainty and consists only of the distribution of bootstrap replicates (DiCiccio and Efron, 1996). Standard error of a bootstrap distribution for divergence dating can be estimated using:
|
| (2) |
Bi represents an estimate of time for a single node from a single bootstrap pseudoreplicate and
represents the mean date for the same node from all bootstrap pseudoreplicates.
The simplest expression of a confidence interval around a point estimate or mean estimate of a date (
) for the same node from the original data set and tree can be generated using the basic or normal percentile confidence interval
|
| (3) |
) represents the level of confidence obtained from a standard normal table (usually 95% = 1.96) and
B represents the standard deviation from the distribution of nonparametric bootstrap pseudoreplicates. Presumably, the majority of papers have used the standard normal interval to calculate the 95% CI (Barker et al., 2004; Clement et al., 2004; Near, 2004). A few authors have incorrectly used the method of standard error calculation representing the standard deviation of means, by dividing Equation (2) byEquation (3) works well if the distribution is not skewed (Davison and Hinkley, 1997) and is aymptotically approximated (DiCiccio and Efron, 1996). It is also possible to estimate the central 95% of the bootstrap distribution by discarding upper and lower quantiles (i.e., 2.5% and 97.5%). This method is implemented in the "Profile" command in the PL program r8s (Sanderson, 2004). However, this method falls victim to problems similar to those found when using the standard normal interval, as will be seen. If the bootstrap distribution is skewed and not asymptotically approximated, then it is recommended that another pivotal statistic, bootstrap t, or two nonpivotal statistics, bias corrected and accelerated (BCa) or approximate bootstrap confidence quadratic (ABCq) methods, be used. The BCa method is in fact a quantile method (DiCiccio and Efron, 1996), which incorporates the information about skew and bias inherent in the bootstrap distribution to correct the quantiles chosen for accuracy. The ABCq method is the analytical approximation of the BCa method, allowing for quick and accurate approximations of the bias-corrected quantile estimates. In many instances, bootstrap estimation of divergence dates using PL will be skewed with an asymptotic approximation that difficult to determine. In the following section we give a brief overview of these methods used to produce confidence intervals from a bootstrap distribution of branch lengths and then provide an example using these methods.
The bootstrap t method, uses, by analogy, a Student's t statistic:
|
| (4) |
|
| (5) |
) is a percentage of the t distribution (DiCiccio and Efron, 1996). In this case, the actual t distribution is not known, but rather estimated from the percentiles of T in Equation (4) by bootstrapping. Hall (1992) and DiCiccio and Efron (1992) demonstrated that the limits of these estimates are second-order correct; this indicates that the coverage probability (the
coverage level includes
) is on the order of
Bi to determine
Bi and one from the original data set to produce each
Bi (DiCiccio and Efron, 1996; Carpenter and Bithell, 2000). For estimation of the t distribution of 1000 pseudoreplicates, it would be necessary to create over one million nonparametric boostrap pseudoreplicates. Here, we approximate the bootstrap t distribution via the following equation:
|
| (6) |
Bi (see example).
The first of the two nonpivotal quantile methods, BCa, corrects median bias by adjusting the percentile interval to accomodate bias and skewness. This method requires an estimation of acceleration, which is similar to skewness. The upper end-point for a one-sided BCa interval for
B is
|
| (7) |
, z0 is the bias correction constant defined by
) is the standard cumulative distribution function of the probability for the end point of interest. In Equation (7) the acceleration constant (a) measures how quickly the standard error changes on the normalized scale and is asymptotically approximated here by:
|
| (8) |
|
| (9) |
|
| (10) |
|
| (11) |
|
| (12) |
B, as a function of µ and is defined by the derivative:
|
| (13) |
| Example using Bootstrap Confidence Intervals for Divergence Dates |
|---|
|
|
|---|
Here we provide two examples of the importance of properly calculating confidence intervals when testing hypotheses relating to origins by evaluating the impact of skewing on two major divergence dates within the Alethinophidian snakes. We provide an example illustrating that incorrect measures of confidence intervals when the nonparametric bootstrap distribution is skewed will influence the acceptance of a null hypothesis specific to estimating divergence dates. From the chronogram in Fig. 1, it is clear that several lineages of snakes originated near major geological boundaries including the Cretaceous-Tertiary (K-T; 65.5 Ma) and the transition between the Jurassic and Cretaceous (145.5 Ma), which marked major shifts in faunal composition (Raup and Sepkoski, 1984; Sheehan et al., 1991; Ocampo et al., 2006). We would like to infer, with a 95% confidence interval, whether certain groups had their origin prior to or directly after the K-T or Jurassic-Cretaceous boundaries.
Hypothesis 1. Did the Colubroidea originate before the K-T boundary? The snake superfamily Colubroidea is by far the most diverse snake group in the world containing more than 2500 species, 85% of all extant serpents (Lawson et al., 2005), including all venomous species. The ecological, evolutionary, and medical significance of this group cannot be overstated. It is suggested from the fossil record that this group originated after the Cretaceous-Tertiary (K-T) boundary, when it diverged from the Acrochordoidea (Rage, 1987; Nagy et al., 2003). This has yet to be substantiated with any margin of error from fossil calibrated phylogenies based on DNA sequences, even though some authors have suggested a late Cretaceous origin (Dowling et al., 1983) for the Colubroidea. If the 95% confidence intervals used to assess these dates overlap the K-T boundary, then the possibility of a Mesozoic divergence must be considered.
|
Hypothesis 2. Did the Alethinophidia originate during the Jurassic Period? The infraorder Alethinophidia comprises the vast majority of the living snakes, approximately 3000 species, including all venomous snakes and large constrictors. The divergence of the Alethinophidia from the blind snakes (infraorder Scolecophidia) is generally hypothesized to have occurred in the early Cretaceous, when the first indisputable snake, Lapparentophis defrennei, appears in the fossil record (Rage, 1987). We test whether a Jurassic (prior to 145.5 Ma) origin is plausible based on molecular evidence.
Note here that origin refers to the divergence of the named monophyletic ingroup from its sister taxon, in contrast to the diversification of a group (the most recent common ancestor of the extant representatives). In this manner, the nodes labeled "Colubroidea" and "Alethinophidia" (Fig. 1) indicate the divergence of those two groups from their sister taxa, the Acrochordidae and Scolecophidia, respectively.
The data used to infer trees and branch lengths here for divergence estimation are derived from GenBank sequences (online appendix, available at www.systematicbiology.org). A data set was constructed using the nuclear gene c-mos (573 bp) and the mitochondrial gene cytochrome b (1154bp) for 32 terminal taxa representing nearly all of the known families and subfamilies of snakes (Table 1). Two outgroups were used, a lizard as a sister lineage to the serpents and a sphenodontid as a sister group to the squamates (snakes and lizards; Pough et al., 2003). It should be noted that this data set is not comprehensive with regard to taxon sampling or genes utilized and is thus used merely for illustrative purposes to demonstrate the statistical techniques used in this study. To infer trees and assess tree support using models incorporating evolutionary information specific to codon position, we performed a partitioned analysis using Bayesian inference (BI) with MrBayes (version 3.1.2; Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003). Prior to tree inference, three evolutionary models were evaluated for each data set: (i) 6(GTR+
+I) accounts for different rates of evolution rates in each of the three codon positions of the cytochrome b and c-mos genes using the GTR+
+I model with estimated base pair frequencies (BF); (ii) 2(GTR+
+I) used only two partitions, one for each gene with estimated BF; and (iii) GTR+
+I simply applies this model with estimated BF across all positions with no partitioning among codon positions. The 6(GTR+
+I) model was chosen for the primary analysis over other models (PBF at least > 1467.92).
|
For each model, two independent searches were executed to ensure convergence of all parameters by comparing the variance across chains within a search to the chain variance among searches using Gelman and Rubin's "r" statistic (Gelman and Rubin, 1992). Searches were considered burned-in when the values for "r" have reached
1.00. All searches consisted of three "heated" and one "cold" Markov chain estimated for 20 million generations, with every 1000th sample being retained. Default priors were applied to all parameters, except branch length, which was drawn from an exponential distribution. A split standard deviation less than 0.005 for –lnL tree values among chains indicated that parameter stationarity was achieved. Trees prior to stationarity were discarded. The harmonic mean of the model likelihood, f(X|Mi), taken from the stationarity phase, was compared among different models using posterior Bayes factors (PBFs) for the equation 2 logeB10 (Newton and Raftery, 1994). A Bayes factor value greater than ten was considered as strong evidence favoring the more parameter-rich model (Kass and Raftery, 1995). The posterior probability tree was tested using Bayes factors and a likelihood-ratio test against a constrained tree forced to evolve in a clock-like manner (df = 30,
= 0.05,
2 = 43.773. Clock-like rates of evolution were rejected by both methods (LRT = 247.592; PBF = 1140.74). The semiparametric approach to estimating divergence using the penalized likelihood (PL) method with the truncated Newton (TN) algorithm of Sanderson (2002) as implemented in r8S version 1.70 (Sanderson, 2004) was used. Calibration references (Table 1) were placed on the BI tree and date estimates were derived from this tree. We determined the appropriate PL smoothing parameter using the cross-validation procedure among five values differing in a magnitude of 10 and ranging from 1 to 10,000. A log penalty function was also selected to penalize squared log differences between neighboring branches. The likelihood of solutions for different rates was examined using the CheckGradient feature. To obtain error estimates for divergence date inferences, we used the BI tree topology with 1000 nonparametric bootstrap pseudoreplicate estimates branch lengths. All branch length information using 1000 nonparametric pseudoreplicates were obtained from PAUP* v. 4.10 (Swofford, 2002) by the method of Sanderson (2004).
The program PAUP* v.4.10 does not allow partitioned analyses, so the unpartitioned GTR+
+ I model was used. It should be noted that all three models used in the BI analysis produced the same tree topology and equivalent branch lengths. Additionally, only the topology of the BI tree is enforced as a constraint and the actual distribution of divergence dates are based on reestimated branch lengths on this tree from PAUP* v.4.10 using the GTR+
+ I model. Likelihood scores for the given topology and branch lengths were estimated on the original sequence alignment, and the BI topology and ML parameters were enforced as constraints for bootstrapping. As maximum likelihood methods allowing for mixed and partitioned model analysis become available (i.e., the program RAxML; Stamatakis, 2006), the bias inherent in the underparameterized branch lengths should decrease, allowing for even more accurate estimation of confidence intervals.
We used seven fossil calibration references from the paleontological literature as constraints on the preferred BI topology for the PL analyses (Table 1). For these fossil dates, all but the root time represent the minimum age of these specific nodes. Early squamates are poorly represented in the fossil record. Lizards are known from the late Triassic, approximately 220 Ma (Datta and Ray, 2006), whereas the oldest known snakes are approximately 130 Ma (Rage, 1987). The oldest known fossil of the Sphenodontidae, the outgroup used in this study, dates to 228 Ma (Benton, 1990). Therefore, we fixed the age of the root of the tree at 228 Ma. Our methods of determining if the fossils represent appropriate calibration references are modeled on previous protocols (Near and Sanderson, 2004; Near et al., 2005) and are outlined in Burbrink and Lawson (2007). Fossils were removed one at a time, and dates for that node were reestimated using the other fossils. Fossil constraints were considered to be in agreement if the mean of the reestimations fell within one SD of the original estimate and were significantly correlated as a group.
We examined the 95% confidence interval for the bootstrap distribution of divergence dates to examine the two hypotheses concerning the origin of major snake lineages. All 95% confidence intervals were calculated using the basic standard normal, bootstrap t, BCa, and ABCq methods described above. To evaluate the suitability of substituting
for
Bi in the bootstrap t Equation (6), we examined the standard deviations (SD) of five additional double bootstrap replicate data sets of 1000 replicates. The mean SD of double bootstrap replicates was 4.46, versus 4.208 of the original SD (
) of the bootstrap distribution. The CV of the SD among the replicates was 0.0908. This indicates that
may be an adequate substitute for
Bi, yielding equivalent confidence intervals while requiring far less computational intensity. We used the attained significance level (ASL) to test for a pre-or post-K-T origin using the ABCq interval. The ASL was designed for the ABCq interval to represent the one-tailed alpha level for a hypothesized end-point (DiCiccio and Efron, 1996). Significant ASL end-point values are
(0.05, 0.95). Similarly, we also examined the Z-score to determine the equivalent P-values using the standard normal method. Additionally, we examined the right-left asymmetry around the point estimate (
) using the following shape formula from Diciccio and Efron (1996):
|
| (14) |
Completely symmetric distributions have a shape parameter of 1.00 and the standard normal intervals always produce this value because they assume a symmetric normal distribution a priori.
| Results and Discussion |
|---|
|
|
|---|
A distribution of divergence dates was produced at every node from the tree of extant snakes (Fig. 1) and all but one calibration reference fit within the dates predicted for them when cross-validated with other fossils (Table 2). The offending fossil, C3 (Table 1), fell slightly outside of the first SD margin, though this had little effect on the dates when values were reestimated in the absence of this constraint. The nonparametric bootstrap distribution of dates at every node is asymmetric. In contrast to a symmetric distribution with a shape of 1.00, it is clear that the Colubroidea are skewed towards older dates, yielding a shape value of 2.81 (Figs. 2 and 3; Table 3). The Alethinophidia are only slightly skewed with a shape of 0.827 (Figs. 2 and 4). At all nodes, the standard normal method that assumes the bootstrap distribution is symmetric with a shape of 1.00 will not accurately estimate the width of a 95% confidence interval. It then follows that the ability to determine if 95% of the time the interval contains the actual divergence date given the phylogenetic/chronological methods used to produce the distribution may be hampered when using the standard normal method.
|
Point and mean estimates from our analyses indicate that the Colubroidea diverged from the Acrochordoidea between the Selandian of the Middle Paleocene (58.7 Ma) and Ypresian of the Early Eocene (55.5 Ma; point and mean estimates, respectively; Fig. 3; Table 3). The upper confidence interval (UCI) using the standard normal method was 63.7 Ma, rejecting a pre K-T divergence with an ASL of
0.991. However, z0 was equal to 0.899, indicating a negative skew in the bootstrap distribution of dates at this node (Fig. 3). The UCI of the Bootstrap t, BCa, and ABCq intervals were 68.5, 75.1, and 71.1, respectively. The ASL of the ABCq interval for a hypothesized end-point of 65.5 Ma was 0.753, failing to reject a pre-K-T origin for the most diverse and widely distributed group of snakes. The shape of the ABCq interval was 2.81, extending approximately 1.8 times farther to the right than to the left.
|
|
|
The Alethinophidia appear to have diverged from the Scolecophidia approximately 133 Ma, during the Hauterivian of the early Cretaceous. The point estimate and the mean estimate for the divergence date from the nonparametric bootstrap distribution for the origin of this group were 132.89 and 134.38 Ma, respectively. The standard interval (95% of the bootstrap distribution of dates) yields a range of 118 to 151 Ma, which includes a Jurassic origin for this group (ASL = 0.908). Using the PL point estimate to calculate the standard interval produces an upper credible interval of 149.3 Ma. This interval does not take into account that the distribution is asymmetric (Figs. 2 and 4; Table 3); the bootstrap distribution is positively skewed relative to
For the hypotheses concerning the origin of two major snake groups, both the mean divergence dates from the bootstrap distribution and the point estimates indicate that each group would have originated in the era expected given their fossil record; the Colubroidea in the Tertiary and the Alethinophidia in the Cretaceous (Fig. 1). However, with all studies there should be a measure of uncertainty associated with the production of divergence dates when relying on non–clock-like molecular phylogenetic estimates and the proper placement of fossil calibration points. Although these systematic errors may be difficult to overcome or even assess, further compounding these problems by incorrectly estimating confidence intervals is undesirable. We have demonstrated that assuming the distribution of dates at a node of interest is symmetric may not include hypotheses concerning the maximum age of a group. For the most diverse groups of snakes, the standard normal confidence interval did not include the hypothesis that the Colubroidea originated prior to the K-T boundary. Although fossil evidence agrees with this conclusion, the skewed distribution of dates from these trees cannot rule out that this group did not originate prior to the K-T boundary and survived the cataclysm that effectively eliminated nearly 76% of life on Earth (Pope et al., 1998). According to the methods discussed here (bootstrap t, BCa, and ABCq), the date of origin of the Colubroidea is found in an interval that crosses the K-T boundary. Although the interpretations of Bayesian credible intervals are different, it should be noted that as Bayesian relaxed-clock models become more widely available, it is imperative that confidence intervals are reported using the proper calculations for the prior distributions for those methods as well. For instance, the relaxed-clock models proposed by Drummond et al. (2006) involve the use of uncorrelated log-normal or exponential prior distributions on node ages, and proper confidence intervals (such as the 95% highest posterior density given in the program Beast) should be calculated using the appropriate methods based on the resulting statistical distribution of the posterior estimate.
When compared to the standard normal distribution, each of these methods improves the estimation of the 95% CI for a skewed distribution. Recommending one of these methods for divergence dating estimates over all others may be difficult because most of these methods provide credible estimates, and all have been demonstrated to be second-order correct. However, the ABCq provides the benefits of compensation for skew while reducing computational burden (DiCiccio and Efron, 1996). With respect to computational costs and the possible problems with an overly wide estimation of the 95% CI, the bootstrap t is less desirable than the other two nonpivotal methods (DiCiccio and Efron, 1996). Among the two nonpivotal methods, ABCq gives the end-points directly as functions of the five parameters
,
,
,
, and
, which can be quickly estimated from a bootstrap data set. For the Colubroidea (Fig. 3), it appears that all of the methods that account for skewing give little coverage to the lower portion of the probability mass. This arises as an artifact of these methods that calculate confidence intervals using the PL point estimate in contrast to the standard normal method, which uses the mean of the bootstrap distribution to estimate intervals. Disparities between the use of the point estimate or mean of the bootstrap distribution contribute to the discrepancy in the lower confidence interval. In the case of the Alethinophidia, the difference between the point and the mean estimate is only 1.49 Ma out of estimates > 132 Ma (Fig. 4). For the Colubroidea, the difference is 3.2 Ma for estimates > 55 Ma, which affects the lower confidence intervals, particularly when using the ABCq and BCa methods. This is not necessarily a problem with the methods used to correct for skewing. Here, nonparametric bootstraps were calculated in order to produce confidence intervals around the PL point estimate. This value is, using all available sequence data, the best estimate of the date of divergence at a particular node. The bootstrap estimate of the mean is based on underparameterized estimates of the date and thus contains less information about the uncertainty in the distribution.
|
The discrepancy between the two estimates likely arises as a result of branch length replicates of zero value, which are dropped from analysis in the program r8s, as well as the constraint of even the shortest replicated branch lengths to satisfy a minimum age requirement, contributing to the primary skew of the overall distribution. This disparity between the point estimate and mean of the bootstrap distribution is, however, a crucial point. The purpose of the nonparametric methods is to generate confidence intervals for the MLE of a parameter, here the point estimate, from r8s, of a divergence time based on the best available model for the sequence data. The information about the variance in the original data set is extracted via nonparametric bootstrapping of the branch lengths, and this derived information is then used to develop confidence intervals for that maximum likelihood point estimate. This accounts for the deviation of the inferred confidence intervals from the primary mass of the bootstrap distribution and for the decreased utility of the quantile method; the purpose of calculating these intervals is not to highlight the 95% highest probability density of the bootstrap data set (as standard 95% quantiles would), but to use the information derived from the bootstrap data set to generate 95% confidence intervals around the best estimate of the mean. The use of the standard estimate or the 95% quantiles does not accurately answer the question posed by the researcher; i.e., "What interval accurately describes the uncertainty in the MLE estimate?" Some degree of skewing is always expected when inferring the distribution of dates using nonparametric bootstraps a posterior probability distribution of branch lengths. As has been suggested by other authors, a primary weakness of PL is that the use of multiple minimum constraints with only broad maxima tends to overestimate divergence dates (Hugall et al., 2007), leading to skewing towards older dates seen in many of our nodes. Uncalibrated trees would presumably not exhibit such skew; however, the addition of absolute bounds on some nodes will introduce asymmetry in the distribution of inferred dates, and this should be accounted for using proper methods. In the example presented here, the distribution of dates at every node examined for this phylogeny had some degree of skewing, from the moderate (0.82) to the extreme (2.81). It follows that this situation would not be unexpected in other studies estimating divergence dates. It is therefore advised that either a pivotal or nonpivotal method that accounts for skewing be used to correctly estimate confidence intervals in order to reliably address hypotheses that consider the divergence of a group of interest.
Accounting for skew is crucial for any study utilizing penalized likelihood methods for divergence time estimation; for accurate testing of any hypotheses, confidence intervals need to be generated for the PL point estimate, rather than the bootstrap distribution itself. In summary, the quantile method (i.e., the central 95% of the bootstrap distribution) does provide a confidence interval, of sorts, for the date estimate. However, this interval can be inflated and imprecise due to the inherently skewed nature of divergence time estimates. Stochasticity in the variance of rates of molecular evolution and uncertainty in the fossil record will most likely confound accurate inference of divergence time. However, this error should not be compounded by the use of insufficient statistics to quantify estimates of error, the one aspect that can be controlled by the researcher. The use of these nonparametric confidence intervals will allow researchers to minimize error attributable to incorrect assumptions about the underlying distribution of date estimates while allowing for the greatest accuracy and precision possible for inferring divergence times. In addition, the information provided about the skewness of the underlying data set may be of importance to researchers, as it reflects the behavior and variance of both the data and the method used to infer dates on a tree.
| Acknowledgment |
|---|
|
|
|---|
We would like to thank John J. Wiens for reviewing an early draft of this manuscript.
Both authors contributed equally to this manuscript.
| References |
|---|
|
|
|---|
-
Alvarez L. W., Alvarez W., Asaro F., Michel H. V. Extraterrestrial cause for the Cretaceous-Tertiary boundary extinction. Science (1980) 208:1095–1108.
Barker F. K., Cibois A., Schikler P., Feinstein J., Cracraft J. Phylogeny and diversification of the largest avian radiation. Proc. Natl. Acad. Sci. USA (2004) 101:11040–11045.
Benton M. J. Phylogeny of the major tetrapod groups: Morphological data and divergence dates. J. Mol. Evol. (1990) 30:409–424.[CrossRef][Web of Science][Medline]
Bossuyt F., Brown R. M., Hillis D. M., Cannatella D. C., Milinkovitch M. C. Phylogeny and biogeography of a cosmopolitan frog radiation: Late Cretaceous diversification resulted in continent-scale endemism in the family Ranidae. Syst. Biol. (2006) 55:579–594.
Britten R. J. Rates of DNA sequence evolution differ between taxonomic groups. Science (1986) 231:1393–1398.
Burbrink F. T., Lawson R. How and when did ratsnakes disperse into the New World from the Old World? Mol. Phylogenet. Evol. (2007) 43:173–189.[CrossRef]
Carpenter J. R., Bithell J. F. Bootstrap confidence intervals: When? which? what?—A practical guide for medical statisticians. Stat. Med. (2000) 19:1141–1164.[CrossRef][Web of Science][Medline]
Clark J. S. Models for ecological data. (2007) Princeton, New Jersey: Princeton University Press.
Clement W. L., Tebbitt M. C., Forrest L. L., Blair J. E., Brouillet L., Eriksson T., Swensen S. M. Phylogenetic position and biogeography of Hillebrandia sandwicensis (Begoniaceae): A rare Hawaiian relict. Syst. Biol. (2004) 91:905–917.
Cutler D. J. Estimating divergence times in the presence of an overdispersed molecular clock. Mol. Biol. Evol. (2000) 17:1647–1660.
Danilov I. G., Averianov A. O. A new species of Calamagras Cope, 1873 (Serpentes, Boidae, Erycinae) from the early Eocene of Kirghizia. Geodiversitas (1999) 21:85–91.
Datta P. M., Ray S. Earliest lizard from the Late Triassic (Carnian) of India. J. Vert. Paleontol. (2006) 26:795–800.[CrossRef]
Davison A. C., Hinkley D. V. Bootstrap methods and their application. (1997) Cambridge, UK: Cambridge University Press.
DiCiccio T. J., Efron B. More accurate confidence intervals in exponential families. Biometrika. (1992) 79:231–245.
DiCiccio T. J., Efron B. Bootstrap confidence intervals (with discussion). Stat. Sci. (1996) 11:189–228.[CrossRef][Web of Science]
Dowling H. G., Highton R., Maha G. C., Maxson L. R. Biochemical evaluation of colubrid snake phylogeny. J. Zool. Lond. (1983) 201:309–329.
Drummond A. J., Rambaut A. (2003) BEAST v1.0. Available at http://evolve.zoo.ox.ac.uk/beast/.
Drummond A. J., Ho S. Y. W, Phillips M. J., Rambaut A. Relaxed phylogenetics and dating with confidence. PLoS Biol. (2006) 4:1–12.[CrossRef][Web of Science]
Efron B. Bootstrap methods: Another look at the jackknife. Ann. Stat. (1979) 7:1–26.[CrossRef][Web of Science]
Efron B. Nonparametric standard error and confidence intervals. Can. J. Stat. (1981) 9:139–172.[CrossRef]
Efron B. Jackknife-after-bootstrap standard errors and influence functions (with discussion). J. R. Stat. Soc. B. (1992) 54:83–127.
Efron B., Tibshirani R. J. An introduction to the bootstrap (1993) New York: Chapman & Hall.
Felsenstein J. Confidence limits on phylogenies: An approach using the bootstrap. Evolution (1985) 39:783–791.[CrossRef][Web of Science]
Gelman A., Rubin D. B. Inference from iterative simulation using multiple sequences. Stat. Sci. (1992) 7:457–511.[CrossRef]
Hall P. The bootstrap and Edgeworth expansion (1992) New York: Springer-Verlag.
Head J. J., Holroyd P. A., Hutchison J. H., Ciochon R. I. First report of snakes (Serpentes) from the late middle Eocene Pondaung Formation, Myanmar. J. Vert. Paleo. (2005) 25:246–250.[CrossRef]
Hedges S. B. The number of replications needed for accurate estimation of the bootstrap p-value in phylogenetic studies. Mol. Biol. Evol. (1992) 9:366–369.[Web of Science][Medline]
Hillis D. M., Bull J. J. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. (1993) 42:182–192.
Holman J. A. Fossil snakes of North America: Origin, evolution, distribution, paleoecology (2000) Bloomington: Indiana University Press.
Huelsenbeck J. P., Ronquist F. MrBayes: Bayesian inference of phylogeny. Bioinformatics (2001) 17:754–755.
Hugall A. F., Foster R., Michael S., Lee Y. Calibration choice, rate smoothing, and the pattern of tetrapod diversification according to the long nuclear gene RAG-1. Syst. Biol. (2007) 56:543–563.
Ivanov M. The oldest known Miocene snake fauna from Central Europe: Merkur-North locality, Czech Republic. Acta Palaeontol. Pol. (2002) 47:513–534.[Web of Science]
Kass R. E., Raftery A. E. Bayes factors and model uncertainty. J. Am. Stat. Assoc. (1995) 90:773–795.[CrossRef][Web of Science]
Lawson R., Slowinski J. B., Crother B. I., Burbrink F. T. Phylogeny of Colubroidea (Serpentes): New evidence from mitochondrial and nuclear genes. Mol. Phylogenet. Evol. (2005) 37:581–601.[CrossRef][Web of Science][Medline]
Magallón S., Sanderson M. J. Absolute diversification rates in angiosperm clades. Evolution (2001) 55:1762–1780.[CrossRef][Web of Science][Medline]
Nagy Z. T., Joger U., Wink M., Glaw F., Vences M. Multiple colonization of Madagascar and Socotra by colubrid snakes: Evidence from nuclear and mitochondrial gene phylogenies. Proc. R. Soc. Lond. B. (2003) 270:2613–2621.[CrossRef][Medline]
Near T. J. Estimating divergence times of notothenioid fishes using a fossil-calibrated molecular clock. Antarctic Sci. (2004) 1:37–44.
Near T. J., Meylan P. A., Shaffer H. B. Assessing concordance of fossil calibration points in molecular clock studies: An example using turtles. Am. Nat. (2005) 165:137–146.[CrossRef][Web of Science][Medline]
Near T. J., Sanderson M. J. Assessing the quality of molecular divergence time estimates by fossil calibrations and fossil-based model selection. Phil. Trans. R. Soc. B. (2004) 359:1477–1483.[CrossRef][Medline]
Newton M. A., Raftery A. E. Approximate Bayesian inference with the weighted likelihood bootstrap. J. R. Stat. Soc. B. (1994) 56:3–48.
Ocampo A., Vajda V., Buffetaut E. Unravelling the Cretaceous-Paleogene (KT) catastrophe: Evidence from flora fauna and geology. In: Biological processes associated with impact events—Cockell C., Koeberl C., Gilmour I., eds. (2006) New York: Springer-Verlag. 203–227.
Pereira S. L., Baker A. J. A mitogenomic timescale for birds detects variable phylogenetic rates of molecular evolution and refutes the standard molecular clock. Mol. Biol. Evol. (2006) 23:1731–1740.
Pope K. O., D'Hondt S. L., Marshall C. R. Meteorite impact and the mass extinction of species at the Cretaceous/Tertiary boundary. Proc. Natl. Acad. Sci. USA. (1998) 95:11028–11029.
Pough H. F., Andrews R. M., Cadle J. E., Crump M. L., Savitzky A. H., Wells K. D. Herpetology (2003) Upper Saddle River, New Jersey: Pearson Prentice Hall.
Rage J. C. Fossil history. In: Snakes: Ecology and evolutionary biology—Seigel R. A., Collins J. T., Novak S. S., eds. (1987) New York: MacMillan. 49–76.
Raup D. M., Sepkoski J. J. Periodicity of extinctions in the geologic past. Proc. Natl. Acad. Sci. USA. (1984) 81:801–805.
Ronquist F., Huelsenbeck J. P. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics (2003) 19:1572–1574.
Sanderson M. J. Estimating absolute rates of molecular evolution and divergence times: A penalized likelihood approach. Mol. Biol. Evol. (2002) 19:101–109.
Sanderson M. J. r8s: Inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics (2003) 19:301–302.
Sanderson M. J. r8s v1.70. (2004) Computer program and documentation available from http://ginger.ucdavis.edu/r8s/index.html.
Sanderson M. J., Doyle J. A. Sources of error and confidence intervals in estimating the age of angiosperms from rbcL and 18S rDNA data. Am. J. Bot. (2001) 88:1499–1516.
Sheehan P. M., Fastovsky D. E., Hoffmann R. G., Berghaus C. B., Gabriel D. L. Sudden extinction of the dinosaurs: Latest Cretaceous, upper Great Plains, USA. Science (1991) 254:835–839.
Simmons N. B. An Eocene big bang for bats. Science (2005) 307:527–528.
Springer M. S., Murphy W. J., Eizirik E., O'Brien S. J. Placental mammal diversification and the Cretaceous-Tertiary boundary. Proc. Natl. Acad. Sci. USA (2003) 100:1056–1061.
Stamatakis A. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics (2006) 22:2688–2690.
Swofford D. L. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4.0b10 (2002) Sunderland, Massachusetts: Sinauer Associates.
Szyndlar Z., Rage J. C. Fossil record of the true vipers. In: Biology of the vipers—Schuett G. W., Höggren M., Douglas M. E., Greene H. W., eds. (2002) Eagle Mountain, Colorado: Eagle Mountain Press. 419–444.
Tchernov E., Rieppel O., Zaher H., Polcyn M. J., Jacobs L. L. A fossil snake with limbs. Science (2000) 287:2010–2012.
Teeling E. C., Springer M. S., Madsen O., Bates P., O'Brien S. J., Murphy W. J. A molecular phylogeny for bats illuminates biogeography and the fossil record. Science (2005) 307:580–584.
Thorne J. L., Kishino H. Divergence time estimation and rate evolution with multilocus data sets. Syst. Biol. (2002) 51:689–702.
Thorne J. L., Kishino H. Estimation of divergence times from molecular sequence data. In: Statistical methods for molecular evolution—Nielsen R., ed. (2006) New York: Springer-Verlag. 223–258.
Thorne J. L., Kishino H., Painter I. S. Estimating the rate of evolution of the rate of molecular evolution. Mol. Biol. Evol. (1998) 15:1647–1657.[Abstract]
Thorpe R. S., Leadbeater D. L., Pook C. E. Molecular clock and geological dates: Cytochrome b of Anolis extremus substantially contradicts dating of Barbados emergence. Mol. Ecol. (2004) 14:2087–2096.
Wu C. I., Li W. H. Evidence for higher rates of nucleotide substitution in rodents than in man. Proc. Natl. Acad. Sci. USA (1985) 82:1741–1745.
Yesson C., Culham A. Phyloclimatic modeling: Combining phylogenetics and bioclimatic modeling. Syst. Biol. (2006) 55:785–802.
Zuckerkandl E., Pauling L. Molecular disease, evolution, and genetic heterogeneity. In: Horizons in biochemistry—Kasha M., Pullman B., eds. (1962) New York: Academic Press. 189–225.
This article has been cited by other articles:
![]() |
J. A. Schulte II and F. Moreno-Roark Live birth among Iguanian lizards predates Pliocene-Pleistocene glaciations Biol Lett, October 7, 2009; (2009) rsbl.2009.0707v1. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||





