Systematic Biology Advance Access originally published online on June 29, 2009
Systematic Biology 2009 58(2):199-210; doi:10.1093/sysbio/syp015

© Society of Systematic Biologists

Statistical Comparison of Nucleotide, Amino Acid, and Codon Substitution Models for Evolutionary Analysis of Protein-Coding Sequences

Tae-Kun Seo1,* and Hirohisa Kishino2

1 Professional Programme for Agricultural Bioinformatics
2 Laboratory of Biometrics and Bioinformatics, Graduate School of Agricultural and Life Sciences, University of Tokyo, 1-1-1 Yayoi Bunkyo-Ku, Tokyo 113-8657, Japan

* Correspondence to be sent to: Professional Programme for Agricultural Bioinformatics, Graduate School of Agricultural and Life Sciences, University of Tokyo, 1-1-1 Yayoi Bunkyo-Ku, Tokyo 113-8657, Japan; E-mail: seo{at}iu.a.u-tokyo.ac.jp.


Abstract

Statistical models for the evolution of molecular sequences play an important role in the study of evolutionary processes. For the evolutionary analysis of protein-coding sequences, 3 types of evolutionary models are available: 1) nucleotide, 2) amino acid, and 3) codon substitution models. Selecting appropriate models can greatly improve the estimation of phylogenies and divergence times and the detection of positive selection. Although much attention has been paid to the comparisons among the same types of models, relatively little attention has been paid to the comparisons among the different types of models. Additionally, because such models have different data structures, comparison of those models using conventional model selection criteria such as Akaike information criterion (AIC) or Bayesian information criterion (BIC) is not straightforward. Here, we suggest new procedures to convert models of the above-mentioned 3 types to 64-dimensional models with nucleotide triplet substitution. These conversion procedures render it possible to statistically compare the models of these 3 types by using AIC or BIC. By analyzing divergent and conserved interspecific mammalian sequences and intraspecific human population data, we show the superiority of the codon substitution models and discuss the advantages and disadvantages of the models of the 3 types.
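To make the comparison step concrete, the following is a minimal sketch of how AIC and BIC scores could be computed and compared once every model has been expressed on a common 64-state triplet data structure, so that the log-likelihoods refer to the same data. It uses only the standard definitions AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n. The function names, the log-likelihoods, the parameter counts, and the sample size below are illustrative assumptions, not values or code from the article, and the conversion procedure itself is not shown.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike information criterion: -2 ln L + 2k (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * num_params


def bic(log_likelihood, num_params, sample_size):
    """Bayesian information criterion: -2 ln L + k ln n (lower is better)."""
    return -2.0 * log_likelihood + num_params * math.log(sample_size)


def score_models(models, sample_size):
    """Return {name: (AIC, BIC)} for a dict of {name: (ln L, k)}."""
    return {
        name: (aic(ll, k), bic(ll, k, sample_size))
        for name, (ll, k) in models.items()
    }


# Hypothetical inputs, made up for illustration only. The log-likelihoods
# are assumed to have been computed on the same triplet (64-state) data,
# which is what makes the criteria comparable across model types.
example_models = {
    "nucleotide": (-10500.0, 10),
    "amino_acid": (-10400.0, 20),
    "codon": (-10200.0, 12),
}
example_num_triplet_sites = 300  # assumed sample size n

scores = score_models(example_models, example_num_triplet_sites)
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in sorted(scores.items()):
    print(f"{name:12s} AIC={a:.1f} BIC={b:.1f}")
print("best by AIC:", best_by_aic)
print("best by BIC:", best_by_bic)
```

The choice of n for BIC (here, the number of triplet sites) is itself an assumption of this sketch; other conventions are possible.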

Keywords: AIC; amino acid model; BIC; codon model; likelihood ratio test; model comparison; nucleotide model

Received June 7, 2008; Revised August 26, 2008; Accepted January 5, 2009


Associate Editor: Marc Suchard

