Systematic Biology 2006 55(2):259-269; doi:10.1080/10635150500541599

© 2006 Society of Systematic Biologists

Inferring Complex DNA Substitution Processes on Phylogenies Using Uniformization and Data Augmentation

Ligia Mateiu1 and Bruce Rannala2

1 Department of Medical Genetics, University of Alberta, Edmonton, Alberta, Canada
2 Genome Center and Section of Evolution and Ecology, University of California Davis, One Shields Avenue, Davis, California 95616, USA; E-mail: brannala@ucdavis.edu

Edited by Ron Debry: Associate Editor


   Abstract

A new method is developed for calculating sequence substitution probabilities using Markov chain Monte Carlo (MCMC) methods. The basic strategy is to use uniformization to transform the original continuous-time Markov process into a Poisson substitution process and a discrete Markov chain of state transitions. An efficient MCMC algorithm is outlined for evaluating substitution probabilities by this approach, using a continuous gamma distribution to model site-specific rates. The method is applied to the problem of inferring branch lengths and site-specific rates from nucleotide sequences under a general time-reversible (GTR) model, and a computer program, BYPASSR, is developed. Simulations are used to examine the performance of the new program relative to an existing program, BASEML, that uses a discrete approximation for the gamma-distributed prior on site-specific rates. It is found that BASEML and BYPASSR are in close agreement when inferring branch lengths, regardless of the number of rate categories used, but that BASEML tends to underestimate high site-specific substitution rates, and to overestimate intermediate rates, when fewer than 50 rate categories are used. Rate estimates obtained using BASEML agree more closely with those of BYPASSR as the number of rate categories increases. Analyses of the posterior distributions of site-specific rates from BYPASSR suggest that a large number of taxa are needed to obtain precise estimates of site-specific rates, especially when rates are very high or very low. The method is applied to analyze 45 sequences of the alpha 2B adrenergic receptor gene (A2AB) from a sample of eutherian taxa. In general, the pattern expected for regions under negative selection is observed: third codon positions have the highest inferred rates, followed by first codon positions, and second codon positions have the lowest inferred rates. Several sites show exceptionally high substitution rates at second codon positions, which may represent the effects of positive selection.
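
The uniformization idea named in the abstract can be illustrated with a minimal Python sketch. It is not the BYPASSR algorithm, which samples substitution histories by MCMC; it only shows the underlying decomposition, in which a continuous-time process with rate matrix Q is rewritten as a Poisson process of rate mu plus a discrete chain R = I + Q/mu, so that P(t) is a Poisson-weighted sum of powers of R. The function names, the choice of mu as the largest exit rate, the truncation tolerance, and the Jukes-Cantor test matrix are all assumptions made for illustration.

```python
import math
import numpy as np


def uniformized_transition_matrix(Q, t, tol=1e-12, max_terms=10_000):
    """Approximate P(t) = exp(Q t) by a truncated uniformization series.

    Illustrative sketch only: P(t) = sum_n Poisson(n; mu*t) * R^n,
    with R = I + Q/mu and mu chosen as the largest exit rate (an assumption).
    """
    Q = np.asarray(Q, dtype=float)
    size = Q.shape[0]
    mu = float(np.max(-np.diag(Q)))  # uniformization rate
    if mu <= 0.0 or t <= 0.0:
        return np.eye(size)

    R = np.eye(size) + Q / mu        # discrete chain; allows "virtual" self-jumps
    weight = math.exp(-mu * t)       # Poisson probability of zero events
    R_power = np.eye(size)           # R^0
    P = weight * R_power
    cumulative = weight

    n = 0
    # Add terms until the neglected Poisson tail mass falls below tol.
    while 1.0 - cumulative > tol and n < max_terms:
        n += 1
        weight *= mu * t / n         # Poisson(n) from Poisson(n - 1)
        R_power = R_power @ R
        P += weight * R_power
        cumulative += weight
    return P


def jc69_rate_matrix():
    """Jukes-Cantor rate matrix scaled to one expected substitution per unit time."""
    Q = np.full((4, 4), 1.0 / 3.0)
    np.fill_diagonal(Q, -1.0)
    return Q


def jc69_closed_form(t):
    """Known analytic JC69 transition probabilities, used only as a check."""
    decay = math.exp(-4.0 * t / 3.0)
    P = np.full((4, 4), 0.25 - 0.25 * decay)
    np.fill_diagonal(P, 0.25 + 0.75 * decay)
    return P


if __name__ == "__main__":
    P = uniformized_transition_matrix(jc69_rate_matrix(), t=0.5)
    print(np.max(np.abs(P - jc69_closed_form(0.5))))
```

Because exp(-mu*t) underflows when mu*t is very large, this naive series suits only moderate branch lengths. The same decomposition applies to a GTR rate matrix, with a site-specific rate multiplying t.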

Keywords: Bayesian phylogenetic inference; Markov process; Metropolis-Hastings algorithm; molecular evolution; site-specific rates

Received July 15, 2005; Revised October 25, 2005; Accepted October 25, 2005

